ELKI CSV parser problems
I have converted an .arff file to a .csv file using the tool in Weka. I can't use the ArffParser as the parser in ELKI.
Which parser should I use? The default, NumberVectorLabelParser, gives me an ArrayIndexOutOfBoundsException:
Running: -verbose -verbose -dbc.in /home/db/lisbet/datasets/without ids/try 2/calling them .txt using parser/lymphography_withoutdupl_norm_1ofn.csv -dbc.parser NumberVectorLabelParser -algorithm outlier.lof.LOF -lof.k 2 -evaluator outlier.OutlierROCCurve -rocauc.positive yes
Task failed
java.lang.ArrayIndexOutOfBoundsException: 47
    at de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser.getTypeInformation(NumberVectorLabelParser.java:337)
    at de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser.buildMeta(NumberVectorLabelParser.java:242)
    at de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser.nextEvent(NumberVectorLabelParser.java:211)
    at de.lmu.ifi.dbs.elki.datasource.bundle.MultipleObjectsBundle.fromStream(MultipleObjectsBundle.java:242)
    at de.lmu.ifi.dbs.elki.datasource.parser.AbstractStreamingParser.asMultipleObjectsBundle(AbstractStreamingParser.java:89)
    at de.lmu.ifi.dbs.elki.datasource.InputStreamDatabaseConnection.loadData(InputStreamDatabaseConnection.java:91)
    at de.lmu.ifi.dbs.elki.database.StaticArrayDatabase.initialize(StaticArrayDatabase.java:119)
    at de.lmu.ifi.dbs.elki.workflow.InputStep.getDatabase(InputStep.java:62)
    at de.lmu.ifi.dbs.elki.KDDTask.run(KDDTask.java:108)
    at de.lmu.ifi.dbs.elki.application.KDDCLIApplication.run(KDDCLIApplication.java:60)
    at [...]
My .csv file looks like this:
'lymphatics = deformed','lymphatics = displaced','lymphatics = arched','lymphatics = normal','block_of_affere = yes','block_of_affere = no','bl_of_lymph_c = no','bl_of_lymph_c = yes','bl_of_lymph_s = no','bl_of_lymph_s = yes','by_pass = no','by_pass = yes','extravasates = yes','extravasates = no','regeneration_of = no','regeneration_of = yes','early_uptake_in = yes','early_uptake_in = no','changes_in_lym = oval','changes_in_lym = round','changes_in_lym = bean','defect_in_node = lacunar','defect_in_node = lac_central','defect_in_node = lac_margin','defect_in_node = no','changes_in_node = lac_central','changes_in_node = lacunar','changes_in_node = no','changes_in_node = lac_margin','changes_in_stru = faint','changes_in_stru = drop_like','changes_in_stru = stripped','changes_in_stru = coarse','changes_in_stru = diluted','changes_in_stru = grainy','changes_in_stru = no','changes_in_stru = reticular','special_forms = vesicles','special_forms = no','special_forms = chalices','dislocation_of = no','dislocation_of = yes','exclusion_of_no = yes','exclusion_of_no = no',lym_nodes_dimin,lym_nodes_enlar,no_of_nodes_in,outlier
1,0,0,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0.333333,0.285714,no
0,1,0,0,1,0,1,0,1,0,1,0,0,1,1,0,1,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,0.333333,0.142857,no
There are 11 parsers available. Maybe the problem is the data, which may be too large for the parser.
Thank you, this is a bug in the ELKI CSV parser.
It did not expect the class column to have a label.
So if you remove the ,outlier part of the first line (or the first line completely), it should read the file fine.
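The workaround can be applied by hand in a text editor, but for several files a small script may be easier. The following is a minimal Python sketch, not part of ELKI or Weka; the function names `drop_class_label` and `fix_csv` are made up for illustration, and it assumes the class label is the last entry of the header line and is not quoted.

```python
def drop_class_label(header_line, label="outlier"):
    """Remove a trailing ',<label>' from the header line, if present."""
    stripped = header_line.rstrip("\r\n")
    suffix = "," + label
    if stripped.endswith(suffix):
        return stripped[: -len(suffix)]
    return stripped


def fix_csv(lines, drop_header=False, label="outlier"):
    """Return the lines with the header shortened, or removed entirely.

    Assumption: the first line is the header and all other lines are data.
    """
    lines = [line.rstrip("\r\n") for line in lines]
    if not lines:
        return lines
    if drop_header:
        # Option 2 from the answer: drop the first line completely.
        return lines[1:]
    # Option 1 from the answer: only remove the ',outlier' part.
    return [drop_class_label(lines[0], label)] + lines[1:]
```

To use it, read the file with `open(path).readlines()`, pass the result to `fix_csv`, and write the returned lines, joined with newlines, to a new file that is then given to -dbc.in.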
I pushed a change that makes this more robust here (it will still lose the label though, because ELKI has support for column labels on numerical columns, but not for string label columns).
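To see the mismatch between header labels and numerical columns in a given file, one can compare the number of header entries with the number of numeric values in the first data row. This is a rough Python sketch with made-up helper names (`is_number`, `compare_header_to_row`). It does not reproduce ELKI's parsing logic, and it assumes that no header entry contains a comma inside its quotes.

```python
def is_number(token):
    """Return True if the token can be parsed as a floating point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def compare_header_to_row(header_line, data_line):
    """Return (number of header labels, number of numeric values in the row)."""
    header = header_line.strip().split(",")
    row = data_line.strip().split(",")
    numeric = sum(1 for token in row if is_number(token))
    return len(header), numeric
```

For the file shown in the question this gives 48 header labels against 47 numeric values, since the last value in each row is the string no. That is consistent with the index 47 reported in the exception.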