Use Java regex to parse an XML file
For some reason I cannot use SAX or DOM parsers, so I need to parse it with regex.
I want to extract values as key-value pairs (the key being the content of tag1, the value being the content of tag3). Some of the keys don't have values, and I have to ignore those keys.
XML file:
<main tag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element><element><tag1>key2</tag1><tag2>not intrested</tag2></element><element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></main tag>
The above XML file with indentation:
    <main tag>
      <element>
        <tag1>key1</tag1>
        <tag2>not intrested</tag2>
        <tag3>value1</tag3>
      </element>
      <element>
        <tag1>key2</tag1>
        <tag2>not intrested</tag2>
      </element>
      <element>
        <tag1>key3</tag1>
        <tag2>not intrested</tag2>
        <tag3>value3</tag3>
      </element>
    </main tag>
So from the above file I need to extract key1-value1 and key3-value3, ignoring key2 because it doesn't have a value.
Using Matcher:
    final Pattern pattern = Pattern.compile("<tag1>(.+?)</tag1>.*<tag3>(.+?)</tag3>");
    final Matcher matcher = pattern.matcher(xmlString); // xmlString holds the XML above
    matcher.find();
    System.out.println(matcher.group(1)); // gives key1
    System.out.println(matcher.group(2)); // gives value3 instead of value1
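The value3 result comes from the greedy `.*` in the middle of the pattern: it runs to the end of the input and backtracks only as far as the last `<tag3>`. One possible way around it, as a rough sketch and not a general XML solution, is to stop the middle part from stepping over a closing `</element>`. The class name `ElementScopedRegex` and the method `extract` are made up for illustration, and the sketch assumes keys and values contain no nested tags.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class ElementScopedRegex {
    // [^<]+ keeps each capture inside its own tag (assumes no nested tags).
    // (?:(?!</element>).)*? advances lazily, one character at a time, but the
    // negative lookahead refuses to step over a closing </element>.
    private static final Pattern PAIR = Pattern.compile(
            "<tag1>([^<]+)</tag1>(?:(?!</element>).)*?<tag3>([^<]+)</tag3>",
            Pattern.DOTALL); // DOTALL so the indented, multi-line form also matches

    static Map<String, String> extract(String xml) {
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps document order
        Matcher matcher = PAIR.matcher(xml);
        while (matcher.find()) {
            pairs.put(matcher.group(1), matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String xml = "<maintag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element>"
                + "<element><tag1>key2</tag1><tag2>not intrested</tag2></element>"
                + "<element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></maintag>";
        System.out.println(extract(xml)); // {key1=value1, key3=value3}
    }
}
```

For key2 the pattern cannot reach a `<tag3>` before `</element>`, so that start position fails and the search moves on to key3.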
Give this pattern a try:
"<(tag[13])>(.+?)</tag[13]>"
Usage:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexExample {
        public static void main(String[] args) throws Exception {
            String xmlString = "<maintag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element><element><tag1>key2</tag1><tag2>not intrested</tag2></element><element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></maintag>";
            // Group 1 is the tag name (tag1 or tag3), group 2 is its content
            Matcher matcher = Pattern.compile("<(tag[13])>(.+?)</tag[13]>").matcher(xmlString);
            while (matcher.find()) {
                System.out.println(matcher.group(1) + " " + matcher.group(2));
            }
        }
    }
Results:
    tag1 key1
    tag3 value1
    tag1 key2
    tag1 key3
    tag3 value3
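The output above still lists key2, so the matches need to be paired up afterwards to get the key-value pairs the question asks for. A minimal sketch of that pairing step: remember the last tag1 seen and only store it once a tag3 follows. The names `PairTagMatches`, `extract` and `pendingKey` are hypothetical, and the sketch assumes tag1 always comes before tag3 within an element.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class PairTagMatches {
    static Map<String, String> extract(String xml) {
        // Same pattern as in the answer: group 1 = tag name, group 2 = content
        Matcher matcher = Pattern.compile("<(tag[13])>(.+?)</tag[13]>").matcher(xml);
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps document order
        String pendingKey = null; // last tag1 content that has no value yet
        while (matcher.find()) {
            if (matcher.group(1).equals("tag1")) {
                // A new key replaces an unpaired one, which drops keys without values
                pendingKey = matcher.group(2);
            } else if (pendingKey != null) {
                pairs.put(pendingKey, matcher.group(2));
                pendingKey = null;
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String xml = "<maintag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element>"
                + "<element><tag1>key2</tag1><tag2>not intrested</tag2></element>"
                + "<element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></maintag>";
        System.out.println(extract(xml)); // {key1=value1, key3=value3}
    }
}
```

If an element could contain a tag3 without a tag1, this would attach that value to a leftover key from an earlier element, so it only fits input shaped like the sample.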
Non-regex:
Or you can use Document (from the org.w3c.dom package) and DocumentBuilderFactory (from javax.xml.parsers).
Something like:
    import java.io.ByteArrayInputStream;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    public class DomExample {
        public static void main(String[] args) throws Exception {
            String xmlString = "<maintag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element><element><tag1>key2</tag1><tag2>not intrested</tag2></element><element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></maintag>";
            Document xmlDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new ByteArrayInputStream(xmlString.getBytes("utf-8"))));
            Node rootNode = xmlDocument.getFirstChild();
            if (rootNode.hasChildNodes()) {
                // Each element child node
                NodeList elementsList = rootNode.getChildNodes();
                for (int i = 0; i < elementsList.getLength(); i++) {
                    if (elementsList.item(i).hasChildNodes()) {
                        // Each tag child node of the element node
                        NodeList tagsList = elementsList.item(i).getChildNodes();
                        for (int i2 = 0; i2 < tagsList.getLength(); i2++) {
                            Node tagNode = tagsList.item(i2);
                            if (tagNode.getNodeName().matches("tag1|tag3")) {
                                System.out.println(tagNode.getNodeName() + " " + tagNode.getTextContent());
                            }
                        }
                    }
                }
            }
        }
    }
Results:
    tag1 key1
    tag3 value1
    tag1 key2
    tag1 key3
    tag3 value3
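If the DOM route turns out to be an option, the pairing can be done per element so that key2 is skipped. This is a sketch under the same assumptions as the sample (each `element` holds at most one tag1 and one tag3); `DomKeyValues` and `extract` are made-up names, and parse errors are simply wrapped in an unchecked exception.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original answer.
class DomKeyValues {
    static Map<String, String> extract(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps document order
        NodeList elements = doc.getElementsByTagName("element");
        for (int i = 0; i < elements.getLength(); i++) {
            Element element = (Element) elements.item(i);
            NodeList keys = element.getElementsByTagName("tag1");
            NodeList values = element.getElementsByTagName("tag3");
            // Keep the pair only when this element has both a key and a value
            if (keys.getLength() > 0 && values.getLength() > 0) {
                pairs.put(keys.item(0).getTextContent(), values.item(0).getTextContent());
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String xml = "<maintag><element><tag1>key1</tag1><tag2>not intrested</tag2><tag3>value1</tag3></element>"
                + "<element><tag1>key2</tag1><tag2>not intrested</tag2></element>"
                + "<element><tag1>key3</tag1><tag2>not intrested</tag2><tag3>value3</tag3></element></maintag>";
        System.out.println(extract(xml)); // {key1=value1, key3=value3}
    }
}
```

Looking up tag1 and tag3 inside each `element` keeps the pairing local, so whitespace between tags and the order of the child tags do not matter.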