C# - Removing line breaks from XML data before converting to CSV
I'm using the following snippet in a C# WPF application to convert XML data to CSV.
string text = File.ReadAllText(file);
text = "<root>" + text + "</root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(text);
StreamWriter write = new StreamWriter(filename1);
XmlNodeList rows = doc.GetElementsByTagName("xml");
foreach (XmlNode row in rows)
{
    List<string> children = new List<string>();
    foreach (XmlNode child in row.ChildNodes)
    {
        children.Add(child.InnerText.Trim());
    }
    write.WriteLine(string.Join(",", children.ToArray()));
}
However, I've run into a situation. The input XML data looks like the following (sorry, you have to scroll horizontally to see how the data looks in its raw format):
<xml><header>1.0,770162,20121009133435,3,</header>20121009133435,721,5,1,0,0,0,00:00,00:00,<event>00032134826064957,4627,</event><drug>1,1872161156,7,0,10000</drug><dose>1,0,5000000,0,10000000,0</dose><carearea>1 </carearea><encounter></encounter><advisory>keep simple or spell tham out. reason not case please press on button when trying activate device codes available on list</advisory><caregiver></caregiver><patient></patient><location>20121009133435,00-1d-71-0a-71-80,-66</location><route></route><site></site><power>0,50</power></xml>
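Now, the problem I'm encountering is that the output looks like this (given below). Since this is a CSV file, I want the output for each record in one single row. How do I go about removing the line breaks from the raw data so that the output ends up on a single horizontal line? I'm a bit lost on how to approach this situation. Would Replace(System.Environment.NewLine, "") work? Any help is appreciated!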
1.0,770162,20121009133435,3,,20121009133435,721,5,1,0,0,0,00:00,00:00,,00032134826064957,4627,1,,1872161156,7,0,10000,1,0,5000000,0,10000000,0,1 ,,keep simple or spell tham out. reason not case please press on button when trying activate device codes available on list,,,20121009133435,00-1d-71-0a-71-80,-66,,,0,50
Edit:
Also note that the input file has several thousand lines like the ones shown below:
<xml><header>1.0,770162,20121009133435,3,</header>20121009133435,721,5,1,0,0,0,00:00,00:00,<event>00032134826064957,4627,</event><drug>1,1872161156,7,0,10000</drug><dose>1,0,5000000,0,10000000,0</dose><carearea>1 </carearea><encounter></encounter><advisory>keep simple or spell tham out. reason not case please press on button when trying activate device codes available on list</advisory><caregiver></caregiver><patient></patient><location>20121009133435,00-1d-71-0a-71-80,-66</location><route></route><site></site><power>0,50</power></xml> <xml><header>2.0,773162,20121009133435,3,</header>20121004133435,761,5,1,0,0,0,00:00,00:00,<event>00032134826064957,4627,</event><drug>1,18735166156,7,0,10000</drug><dose>1,0,5000000,0,10000000,0</dose><carearea>1 </carearea><encounter></encounter><advisory>keep simple or spell tham out. reason not case please press on button when trying activate device codes available on list</advisory><caregiver></caregiver><patient></patient><location>20121009133435,00-1d-71-0a-71-80,-66</location><route></route><site></site><power>0,50</power></xml> .. goes on
Try:
children.Add(Regex.Replace(child.InnerText, "\\s+", " "));
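This shouldn't depend on a specific newline character, and it also gets rid of the 4 spaces in between every line. \s is the regex for whitespace, and + means one or more occurrences. Note that each run of whitespace is replaced with a single space rather than removed outright.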
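For reference, here is a minimal sketch of how that replacement might slot into the conversion loop from the question. The class and method names (XmlRowFlattener, CollapseWhitespace, ToCsvLine) are made up for illustration and are not from the original post. It also assumes that trimming each value after collapsing whitespace is acceptable, as the question's own code already calls Trim().

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper; the names are illustrative, not from the original post.
public static class XmlRowFlattener
{
    // Replace every run of whitespace (spaces, tabs, \r, \n) with one space,
    // then trim the ends. This does not care which newline style the file uses.
    public static string CollapseWhitespace(string value)
    {
        return Regex.Replace(value, @"\s+", " ").Trim();
    }

    // Build one CSV line from the child nodes of a single <xml> element.
    // Assumption: values are joined with plain commas, as in the question,
    // with no quoting or escaping.
    public static string ToCsvLine(XmlNode row)
    {
        var children = new List<string>();
        foreach (XmlNode child in row.ChildNodes)
        {
            children.Add(CollapseWhitespace(child.InnerText));
        }
        return string.Join(",", children);
    }
}
```

In the original loop, write.WriteLine(XmlRowFlattener.ToCsvLine(row)) would then replace the inner foreach. By contrast, Replace(System.Environment.NewLine, "") only matches the newline sequence of the machine running the code, so it can miss line breaks in a file produced on another platform, and it leaves the indentation spaces in place.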