Parsing XML in Java: four methods (DOM, SAX, JDOM, DOM4j), plus XPath

【introduction】

Java offers many techniques for parsing XML; the mainstream ones are DOM, SAX, JDOM, and DOM4j. This article introduces how each of the four is used, their advantages and disadvantages, and a simple performance test.

First, [basic knowledge: a primer]

SAX and DOM are two approaches to parsing XML documents. They are only interfaces, with no concrete implementation, so on their own they cannot parse anything. JAXP is likewise just an API: it wraps the SAX and DOM interfaces and provides DocumentBuilderFactory/DocumentBuilder and SAXParserFactory/SAXParser (the Xerces parser is used by default).
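To make the role of JAXP concrete, here is a minimal sketch that obtains a DOM parser and a SAX parser through the JAXP factories and prints the class names of the implementations behind them. The class name JaxpFactories and the inline document are assumptions made only for this illustration; the implementation classes printed depend on the JDK and classpath in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpFactories {

    // Hypothetical inline document, used only to keep the sketch self-contained.
    static final byte[] XML =
            "<university name=\"pku\"><college name=\"c1\"/></university>"
                    .getBytes(StandardCharsets.UTF_8);

    /** Parses XML with whatever DOM implementation JAXP selects; returns the root element name. */
    public static String rootNameViaDom() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(XML));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Feeds the same bytes to whatever SAX implementation JAXP selects; true if parsing completes. */
    public static boolean parsesViaSax() {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(XML), new DefaultHandler());
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The concrete class names reveal which parser implementation backs the JAXP API.
        System.out.println(DocumentBuilderFactory.newInstance().getClass().getName());
        System.out.println(SAXParserFactory.newInstance().getClass().getName());
        System.out.println(rootNameViaDom());
        System.out.println(parsesViaSax());
    }
}
```

Application code only ever names the javax.xml.parsers factories, which is why the underlying parser can be swapped without touching it.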

Second, [DOM, SAX, JDOM, DOM4j simple use introduction]

1、【DOM(Document Object Model) 】

DOM is an interface defined by the W3C. The parser reads the entire XML document into memory and builds a DOM tree; the program then operates on the individual nodes (Node).
Sample XML document:

<?xml version="1.0" encoding="UTF-8"?>  
<university name="pku">  
    <college name="c1">  
        <class name="class1">  
            <student name="stu1" sex='male' age="21" />  
            <student name="stu2" sex='female' age="20" />  
            <student name="stu3" sex='female' age="20" />  
        </class>  
        <class name="class2">  
            <student name="stu4" sex='male' age="19" />  
            <student name="stu5" sex='female' age="20" />  
            <student name="stu6" sex='female' age="21" />  
        </class>  
    </college>  
    <college name="c2">  
        <class name="class3">  
            <student name="stu7" sex='male' age="20" />  
        </class>  
    </college>  
    <college name="c3">  
    </college>  
</university>  

The XML document above is saved as test.xml and is the document used by all the code below (the file is placed in the src directory, so after compilation it is also on the classpath in the classes directory).

package test.xml;  
  
import java.io.File;  
import java.io.FileNotFoundException;  
import java.io.FileOutputStream;  
import java.io.IOException;  
import java.io.InputStream;  
  
import javax.xml.parsers.DocumentBuilder;  
import javax.xml.parsers.DocumentBuilderFactory;  
import javax.xml.parsers.ParserConfigurationException;  
import javax.xml.transform.Transformer;  
import javax.xml.transform.TransformerConfigurationException;  
import javax.xml.transform.TransformerException;  
import javax.xml.transform.TransformerFactory;  
import javax.xml.transform.dom.DOMSource;  
import javax.xml.transform.stream.StreamResult;  
  
import org.w3c.dom.Document;  
import org.w3c.dom.Element;  
import org.w3c.dom.Node;  
import org.w3c.dom.NodeList;  
import org.w3c.dom.Text;  
import org.xml.sax.SAXException;  
  
/** 
 * Read and write XML with DOM
 * @author whwang 
 */  
public class TestDom {  
      
    public static void main(String[] args) {  
        read();  
        //write();  
    }  
      
    public static void read() {  
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();  
        try {  
            DocumentBuilder builder = dbf.newDocumentBuilder();  
            InputStream in = TestDom.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.parse(in);  
            // root <university>  
            Element root = doc.getDocumentElement();  
            if (root == null) return;  
            System.err.println(root.getAttribute("name"));  
            // all college node  
            NodeList collegeNodes = root.getChildNodes();  
            if (collegeNodes == null) return;  
            for(int i = 0; i < collegeNodes.getLength(); i++) {  
                Node college = collegeNodes.item(i);  
                if (college != null && college.getNodeType() == Node.ELEMENT_NODE) {  
                    System.err.println("\t" + college.getAttributes().getNamedItem("name").getNodeValue());  
                    // all class node  
                    NodeList classNodes = college.getChildNodes();  
                    if (classNodes == null) continue;  
                    for (int j = 0; j < classNodes.getLength(); j++) {  
                        Node clazz = classNodes.item(j);  
                        if (clazz != null && clazz.getNodeType() == Node.ELEMENT_NODE) {  
                            System.err.println("\t\t" + clazz.getAttributes().getNamedItem("name").getNodeValue());  
                            // all student node  
                            NodeList studentNodes = clazz.getChildNodes();  
                            if (studentNodes == null) continue;  
                            for (int k = 0; k < studentNodes.getLength(); k++) {  
                                Node student = studentNodes.item(k);  
                                if (student != null && student.getNodeType() == Node.ELEMENT_NODE) {  
                                    System.err.print("\t\t\t" + student.getAttributes().getNamedItem("name").getNodeValue());  
                                    System.err.print(" " + student.getAttributes().getNamedItem("sex").getNodeValue());  
                                    System.err.println(" " + student.getAttributes().getNamedItem("age").getNodeValue());  
                                }  
                            }  
                        }  
                    }  
                }  
            }  
        } catch (ParserConfigurationException e) {  
            e.printStackTrace();  
        } catch (FileNotFoundException e) {  
            e.printStackTrace();  
        } catch (SAXException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
          
    }  
      
    public static void write() {  
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();  
        try {  
            DocumentBuilder builder = dbf.newDocumentBuilder();  
            InputStream in = TestDom.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.parse(in);  
            // root <university>  
            Element root = doc.getDocumentElement();  
            if (root == null) return;  
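            // modify an attribute
            root.setAttribute("name", "tsu");
            NodeList collegeNodes = root.getChildNodes();
            if (collegeNodes != null) {
                // NodeList is live, so walk it backwards: removing a child then never skips a sibling
                for (int i = collegeNodes.getLength() - 1; i >= 0; i--) {
                    // delete c1 and c2, add a class to c3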
                    Node college = collegeNodes.item(i);  
                    if (college.getNodeType() == Node.ELEMENT_NODE) {  
                        String collegeName = college.getAttributes().getNamedItem("name").getNodeValue();  
                        if ("c1".equals(collegeName) || "c2".equals(collegeName)) {  
                            root.removeChild(college);  
                        } else if ("c3".equals(collegeName)) {  
                            Element newChild = doc.createElement("class");  
                            newChild.setAttribute("name", "c4");  
                            college.appendChild(newChild);  
                        }  
                    }  
                }  
            }  
            // add a node
            Element addCollege = doc.createElement("college");  
            addCollege.setAttribute("name", "c5");  
            root.appendChild(addCollege);  
            Text text = doc.createTextNode("text");  
            addCollege.appendChild(text);  
              
            // save the modified document to a file
            TransformerFactory transFactory = TransformerFactory.newInstance();
            Transformer transFormer = transFactory.newTransformer();
            DOMSource domSource = new DOMSource(doc);
            File file = new File("src/dom-modify.xml");
            // FileOutputStream creates the file, or truncates it if it already exists
            FileOutputStream out = new FileOutputStream(file);
            try {
                transFormer.transform(domSource, new StreamResult(out));
            } finally {
                // always release the file handle
                out.close();
            }
            System.out.println(file.getAbsolutePath());
        } catch (ParserConfigurationException e) {  
            e.printStackTrace();  
        } catch (SAXException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        } catch (TransformerConfigurationException e) {  
            e.printStackTrace();  
        } catch (TransformerException e) {  
            e.printStackTrace();  
        }  
    }  
}  

The code can be made simpler with a little modification. In particular, getChildNodes() returns an empty NodeList rather than null when a node has no children, so the repeated null checks are not needed.
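One possible simplification is to replace the three nested loops with a single recursive method. The sketch below is an assumption about what such a rewrite could look like, not the author's code; the class name DomWalk and the shortened inline document are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomWalk {

    // Hypothetical, shortened version of test.xml so the sketch runs on its own.
    static final String XML =
            "<university name=\"pku\">\n"
          + "  <college name=\"c1\">\n"
          + "    <class name=\"class1\">\n"
          + "      <student name=\"stu1\" sex=\"male\" age=\"21\"/>\n"
          + "    </class>\n"
          + "  </college>\n"
          + "  <college name=\"c3\"/>\n"
          + "</university>";

    /** Appends one line per element (its name attribute), indented by depth. */
    static void walk(Node node, String prefix, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skips the whitespace text nodes between elements
        }
        out.append(prefix).append(((Element) node).getAttribute("name")).append('\n');
        NodeList children = node.getChildNodes(); // never null, possibly empty
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), prefix + "\t", out);
        }
    }

    /** Parses the inline document and returns the indented outline. */
    public static String describe() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), "", out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The recursion handles any nesting depth, so the same method works if the document gains another level below student.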

2、【SAX (Simple API for XML) 】

SAX does not need to load the entire document into memory. It is an event-driven API (Observer pattern): users register handlers only for the events they care about. SAX provides the EntityResolver, DTDHandler, ContentHandler, and ErrorHandler interfaces, which receive entity-resolution events, DTD events, document-content events, and error events respectively. Much like the adapter classes in AWT, SAX also provides DefaultHandler, a default implementation of all four interfaces whose methods do essentially nothing. In general you extend DefaultHandler and override only the methods for the events you are interested in.

Sample code:

package test.xml;  
  
import java.io.IOException;  
import java.io.InputStream;  
  
import javax.xml.parsers.ParserConfigurationException;  
import javax.xml.parsers.SAXParser;  
import javax.xml.parsers.SAXParserFactory;  
  
import org.xml.sax.Attributes;  
import org.xml.sax.InputSource;  
import org.xml.sax.Locator;  
import org.xml.sax.SAXException;  
import org.xml.sax.SAXParseException;  
import org.xml.sax.helpers.DefaultHandler;  
  
/** 
 * 
 * @author whwang 
 */  
public class TestSAX {  
  
    public static void main(String[] args) {  
        read();  
        write();  
    }  
      
    public static void read() {  
        try {  
            SAXParserFactory factory = SAXParserFactory.newInstance();  
            SAXParser parser = factory.newSAXParser();  
            InputStream in = TestSAX.class.getClassLoader().getResourceAsStream("test.xml");  
            parser.parse(in, new MyHandler());  
        } catch (ParserConfigurationException e) {  
            e.printStackTrace();  
        } catch (SAXException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
    }  
      
    public static void write() {  
        System.err.println("Pure SAX is powerless for write operations");
    }

}

// Override the event handling methods that are of interest to you
class MyHandler extends DefaultHandler {  
  
    @Override  
    public InputSource resolveEntity(String publicId, String systemId)  
            throws IOException, SAXException {  
        return super.resolveEntity(publicId, systemId);  
    }  
  
    @Override  
    public void notationDecl(String name, String publicId, String systemId)  
            throws SAXException {  
        super.notationDecl(name, publicId, systemId);  
    }  
  
    @Override  
    public void unparsedEntityDecl(String name, String publicId,  
            String systemId, String notationName) throws SAXException {  
        super.unparsedEntityDecl(name, publicId, systemId, notationName);  
    }  
  
    @Override  
    public void setDocumentLocator(Locator locator) {  
        super.setDocumentLocator(locator);  
    }  
  
    @Override  
    public void startDocument() throws SAXException {  
        System.err.println("Start parsing the document");
    }

    @Override
    public void endDocument() throws SAXException {
        System.err.println("End of parsing");
    }  
  
    @Override  
    public void startPrefixMapping(String prefix, String uri)  
            throws SAXException {  
        super.startPrefixMapping(prefix, uri);  
    }  
  
    @Override  
    public void endPrefixMapping(String prefix) throws SAXException {  
        super.endPrefixMapping(prefix);  
    }  
  
    @Override  
    public void startElement(String uri, String localName, String qName,  
            Attributes attributes) throws SAXException {  
        System.err.print("Element: " + qName + ", attr: ");  
        print(attributes);  
    }  
  
    @Override  
    public void endElement(String uri, String localName, String qName)  
            throws SAXException {  
        super.endElement(uri, localName, qName);  
    }  
  
    @Override  
    public void characters(char[] ch, int start, int length)  
            throws SAXException {  
        super.characters(ch, start, length);  
    }  
  
    @Override  
    public void ignorableWhitespace(char[] ch, int start, int length)  
            throws SAXException {  
        super.ignorableWhitespace(ch, start, length);  
    }  
  
    @Override  
    public void processingInstruction(String target, String data)  
            throws SAXException {  
        super.processingInstruction(target, data);  
    }  
  
    @Override  
    public void skippedEntity(String name) throws SAXException {  
        super.skippedEntity(name);  
    }  
  
    @Override  
    public void warning(SAXParseException e) throws SAXException {  
        super.warning(e);  
    }  
  
    @Override  
    public void error(SAXParseException e) throws SAXException {  
        super.error(e);  
    }  
  
    @Override  
    public void fatalError(SAXParseException e) throws SAXException {  
        super.fatalError(e);  
    }  
      
    private void print(Attributes attrs) {  
        if (attrs == null) return;  
        System.err.print("[");  
        for (int i = 0; i < attrs.getLength(); i++) {  
            System.err.print(attrs.getQName(i) + " = " + attrs.getValue(i));  
            if (i != attrs.getLength() - 1) {  
                System.err.print(", ");  
            }  
        }  
        System.err.println("]");  
    }  
}  
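The handler above overrides every callback for completeness. In practice a handler usually overrides only what it needs, as in the following sketch, which counts student elements by sex. The class names SaxCount and StudentCounter and the inline document are assumptions for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCount {

    // Hypothetical inline document so the sketch runs without test.xml.
    static final String XML =
            "<university name=\"pku\"><college name=\"c1\"><class name=\"class1\">"
          + "<student name=\"stu1\" sex=\"male\" age=\"21\"/>"
          + "<student name=\"stu2\" sex=\"female\" age=\"20\"/>"
          + "<student name=\"stu3\" sex=\"female\" age=\"20\"/>"
          + "</class></college></university>";

    /** Overrides only startElement; every other callback keeps DefaultHandler's default behaviour. */
    static class StudentCounter extends DefaultHandler {
        int male;
        int female;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (!"student".equals(qName)) {
                return; // not an element this handler cares about
            }
            if ("male".equals(attrs.getValue("sex"))) {
                male++;
            } else {
                female++;
            }
        }
    }

    /** Returns {maleCount, femaleCount} for the inline document. */
    public static int[] count() {
        try {
            StudentCounter counter = new StudentCounter();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), counter);
            return new int[] {counter.male, counter.female};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] c = count();
        System.out.println("male=" + c[0] + ", female=" + c[1]);
    }
}
```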

3、【JDOM】

JDOM is very similar to DOM, but it is a pure Java API for processing XML. Its API makes heavy use of the Java Collections classes, and it uses concrete classes rather than interfaces. JDOM does not contain a parser of its own; it typically uses a SAX2 parser to parse and validate the input XML document (although it can also take a previously constructed DOM representation as input). It includes converters that output a JDOM representation as a SAX2 event stream, a DOM model, or an XML text document.

Sample code:

package test.xml;  
  
import java.io.File;  
import java.io.FileOutputStream;  
import java.io.IOException;  
import java.io.InputStream;  
import java.util.List;  
  
import org.jdom.Attribute;  
import org.jdom.Document;  
import org.jdom.Element;  
import org.jdom.JDOMException;  
import org.jdom.input.SAXBuilder;  
import org.jdom.output.XMLOutputter;  
  
/** 
 * Read and write XML with JDOM
 * @author whwang 
 */  
public class TestJDom {  
    public static void main(String[] args) {  
        //read();  
        write();  
    }  
      
    public static void read() {  
        try {  
            boolean validate = false;  
            SAXBuilder builder = new SAXBuilder(validate);  
            InputStream in = TestJDom.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.build(in);  
            // get the root node <university>
            Element root = doc.getRootElement();  
            readNode(root, "");  
        } catch (JDOMException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
    }  
      
    @SuppressWarnings("unchecked")  
    public static void readNode(Element root, String prefix) {  
        if (root == null) return;  
        // get the attributes
        List<Attribute> attrs = root.getAttributes();  
        if (attrs != null && attrs.size() > 0) {  
            System.err.print(prefix);  
            for (Attribute attr : attrs) {  
                System.err.print(attr.getValue() + " ");  
            }  
            System.err.println();  
        }  
        // get the child elements
        List<Element> childNodes = root.getChildren();  
        prefix += "\t";  
        for (Element e : childNodes) {  
            readNode(e, prefix);  
        }  
    }  
      
    public static void write() {  
        boolean validate = false;  
        try {  
            SAXBuilder builder = new SAXBuilder(validate);  
            InputStream in = TestJDom.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.build(in);  
            // get the root node <university>
            Element root = doc.getRootElement();
            // modify an attribute
            root.setAttribute("name", "tsu");
            // delete all college children
            boolean isRemoved = root.removeChildren("college");
            System.err.println(isRemoved);
            // add a new college
            Element newCollege = new Element("college");  
            newCollege.setAttribute("name", "new_college");  
            Element newClass = new Element("class");  
            newClass.setAttribute("name", "ccccc");  
            newCollege.addContent(newClass);  
            root.addContent(newCollege);  
            XMLOutputter out = new XMLOutputter();
            File file = new File("src/jdom-modify.xml");
            // FileOutputStream creates the file, or truncates it if it already exists
            FileOutputStream fos = new FileOutputStream(file);
            try {
                out.output(doc, fos);
            } finally {
                // always release the file handle
                fos.close();
            }
        } catch (JDOMException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
    }  
      
}  

4、【DOM4j】

dom4j is widely regarded as one of the best XML libraries for Java (Hibernate and Sun's JAXM both use dom4j to parse XML). It offers many features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for large or streaming documents.

Sample code:

package test.xml;  
  
import java.io.File;  
import java.io.FileWriter;  
import java.io.IOException;  
import java.io.InputStream;  
import java.util.List;  
  
import org.dom4j.Attribute;  
import org.dom4j.Document;  
import org.dom4j.DocumentException;  
import org.dom4j.DocumentHelper;  
import org.dom4j.Element;  
import org.dom4j.ProcessingInstruction;  
import org.dom4j.VisitorSupport;  
import org.dom4j.io.SAXReader;  
import org.dom4j.io.XMLWriter;  
  
/** 
 * Read and write XML with dom4j
 * @author whwang 
 */  
public class TestDom4j {  
    public static void main(String[] args) {  
        read1();  
        //read2();  
        //write();  
    }  
  
    public static void read1() {  
        try {  
            SAXReader reader = new SAXReader();  
            InputStream in = TestDom4j.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = reader.read(in);  
            Element root = doc.getRootElement();  
            readNode(root, "");  
        } catch (DocumentException e) {  
            e.printStackTrace();  
        }  
    }  
      
    @SuppressWarnings("unchecked")  
    public static void readNode(Element root, String prefix) {  
        if (root == null) return;  
        // get the attributes
        List<Attribute> attrs = root.attributes();  
        if (attrs != null && attrs.size() > 0) {  
            System.err.print(prefix);  
            for (Attribute attr : attrs) {  
                System.err.print(attr.getValue() + " ");  
            }  
            System.err.println();  
        }  
        // get the child elements
        List<Element> childNodes = root.elements();  
        prefix += "\t";  
        for (Element e : childNodes) {  
            readNode(e, prefix);  
        }  
    }  
      
    public static void read2() {  
        try {  
            SAXReader reader = new SAXReader();  
            InputStream in = TestDom4j.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = reader.read(in);  
            doc.accept(new MyVistor());  
        } catch (DocumentException e) {  
            e.printStackTrace();  
        }  
    }  
      
    public static void write() {  
        try {  
            // create an XML document
            Document doc = DocumentHelper.createDocument();
            Element university = doc.addElement("university");
            university.addAttribute("name", "tsu");
            // add a comment
            university.addComment("This is the root node");
            Element college = university.addElement("college");  
            college.addAttribute("name", "cccccc");  
            college.setText("text");  
              
            File file = new File("src/dom4j-modify.xml");  
            if (file.exists()) {  
                file.delete();  
            }  
            file.createNewFile();  
            XMLWriter out = new XMLWriter(new FileWriter(file));  
            out.write(doc);  
            out.flush();  
            out.close();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
    }  
}  
  
class MyVistor extends VisitorSupport {  
    public void visit(Attribute node) {  
        System.out.println("Attribute: " + node.getName() + "="
                + node.getValue());  
    }  
  
    public void visit(Element node) {  
        if (node.isTextOnly()) {  
            System.out.println("Element: " + node.getName() + "="  
                    + node.getText());  
        } else {  
            System.out.println(node.getName());  
        }  
    }  
  
    @Override  
    public void visit(ProcessingInstruction node) {  
        System.out.println("PI:" + node.getTarget() + " " + node.getText());  
    }  
}  

Third, [performance test]

Environment: AMD 4400+ CPU (2.0+ GHz), JDK 6.0
JVM options: -Xms400m -Xmx400m
XML file size: 10.7 MB
Results:
DOM: >581297 ms
SAX: 8829 ms
JDOM: 581297 ms
DOM4j: 5309 ms
The times include I/O. This is only a simple test, so treat the numbers as a rough reference.
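The original test program is not shown. As a rough idea of how such a measurement can be made, here is a minimal sketch that times DOM and SAX (the two parsers bundled with the JDK) on a synthetic in-memory document. Everything in it, including the class name ParseTiming, the generated document, and its size, is an assumption for illustration. It is not the benchmark behind the numbers above, and a single un-warmed run like this is only indicative.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ParseTiming {

    /** Builds a synthetic document containing the given number of student elements. */
    public static byte[] generate(int students) {
        StringBuilder sb = new StringBuilder(
                "<university name=\"pku\"><college name=\"c1\"><class name=\"class1\">");
        for (int i = 0; i < students; i++) {
            sb.append("<student name=\"stu").append(i).append("\" sex=\"male\" age=\"20\"/>");
        }
        sb.append("</class></college></university>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Builds the whole DOM tree, then counts student elements. */
    public static int domCount(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getElementsByTagName("student").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Streams through the document with SAX, counting student start tags. */
    public static int saxCount(byte[] xml) {
        final int[] n = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes attrs) {
                            if ("student".equals(qName)) {
                                n[0]++;
                            }
                        }
                    });
            return n[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = generate(200000); // size chosen arbitrarily for the sketch
        long t0 = System.nanoTime();
        int dom = domCount(xml);
        long t1 = System.nanoTime();
        int sax = saxCount(xml);
        long t2 = System.nanoTime();
        System.out.println("DOM: " + dom + " students, " + (t1 - t0) / 1000000 + " ms");
        System.out.println("SAX: " + sax + " students, " + (t2 - t1) / 1000000 + " ms");
    }
}
```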

Fourth, [comparison]

1、【DOM】

The DOM is a tree-based structure that usually needs to load the entire document and construct the DOM tree before it can start working.

Advantages:

  1. Since the entire tree is in memory, random access to the xml document is possible.
  2. Can modify the xml document
  3. Compared to sax, dom is also simpler to use.

Disadvantages:

  1. The entire document must be parsed once
  2. Because the entire document needs to be loaded into memory, it is costly for large documents
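The random-access and modification advantages listed above can be sketched entirely in memory: jump straight to a node, change the tree, and serialize the result to a string. The class name DomModify and the inline document are assumptions for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomModify {

    // Hypothetical inline document so the sketch runs without test.xml.
    static final String XML = "<university name=\"pku\"><college name=\"c1\"/></university>";

    /** Renames the university, replaces college c1 with c5, and returns the serialized tree. */
    public static String modify() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            root.setAttribute("name", "tsu");

            // Random access: go directly to the first college element and remove it.
            Node c1 = root.getElementsByTagName("college").item(0);
            root.removeChild(c1);

            // Add a new college element with a text child.
            Element c5 = doc.createElement("college");
            c5.setAttribute("name", "c5");
            c5.appendChild(doc.createTextNode("text"));
            root.appendChild(c5);

            // Serialize the modified tree to a string instead of a file.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(modify());
    }
}
```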

2、【SAX】

SAX works like streaming media: it is event-driven, so there is no need to load the entire document into memory. Users only need to listen for the events that interest them.

Advantages:

  1. No need to load the entire xml document into memory, so it consumes less memory
  2. Several handlers can process the same event stream (for example by chaining XMLFilter instances)

Disadvantages:

  1. Cannot randomly access nodes in xml
  2. Cannot modify document
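Because SAX offers no random access, a handler that needs context (for example, which college a student belongs to) has to keep that state itself. A minimal sketch, assuming a stack of ancestor names is enough; the class name SaxPath and the inline document are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPath {

    // Hypothetical inline document so the sketch runs without test.xml.
    static final String XML =
            "<university name=\"pku\">"
          + "<college name=\"c1\"><class name=\"class1\">"
          + "<student name=\"stu1\"/><student name=\"stu2\"/>"
          + "</class></college>"
          + "<college name=\"c2\"><class name=\"class3\">"
          + "<student name=\"stu7\"/>"
          + "</class></college>"
          + "</university>";

    /** Returns one "university/college/class/student" path per student, built from name attributes. */
    public static List<String> studentPaths() {
        final Deque<String> ancestors = new ArrayDeque<String>();
        final List<String> paths = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                String name = attrs.getValue("name");
                if (name == null) {
                    name = qName; // fall back to the tag name; ArrayDeque rejects null
                }
                if ("student".equals(qName)) {
                    paths.add(String.join("/", ancestors) + "/" + name);
                }
                ancestors.addLast(name); // remember where we are
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                ancestors.removeLast(); // leaving the element
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), handler);
            return paths;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String path : studentPaths()) {
            System.out.println(path);
        }
    }
}
```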

3、【JDOM】

JDOM is a pure Java API for processing XML. Its API uses a lot of Collections.

Advantages:

  1. Has the advantages of the DOM approach
  2. Follows Java conventions, as SAX does

Disadvantages:

  1. Has the disadvantages of the DOM approach

4、【DOM4J】

Of the four XML parsing approaches, DOM4j is the best all-rounder, combining ease of use with good performance.

Fifth, [a short aside: XPath]

XPath is a language for finding information in an XML document; it can be used to navigate through the elements and attributes of a document. XPath is a major element of the W3C XSLT standard, and both XQuery and XPointer are built on XPath expressions, so understanding XPath is the foundation for many advanced XML applications.
XPath plays a role similar to SQL for databases or selectors in jQuery: it lets developers easily pick out what they need from a document. (dom4j also supports XPath.)

Sample code:

package test.xml;  
  
import java.io.IOException;  
import java.io.InputStream;  
  
import javax.xml.parsers.DocumentBuilder;  
import javax.xml.parsers.DocumentBuilderFactory;  
import javax.xml.parsers.ParserConfigurationException;  
import javax.xml.xpath.XPath;  
import javax.xml.xpath.XPathConstants;  
import javax.xml.xpath.XPathExpression;  
import javax.xml.xpath.XPathExpressionException;  
import javax.xml.xpath.XPathFactory;  
  
import org.w3c.dom.Document;  
import org.w3c.dom.NodeList;  
import org.xml.sax.SAXException;  
  
public class TestXPath {  
  
    public static void main(String[] args) {  
        read();  
    }  
      
    public static void read() {  
        try {  
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();  
            DocumentBuilder builder = dbf.newDocumentBuilder();  
            InputStream in = TestXPath.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.parse(in);  
            XPathFactory factory = XPathFactory.newInstance();  
            XPath xpath = factory.newXPath();  
            // select the name attribute of all class elements
            // introduction to XPath syntax: http://w3school.com.cn/xpath/
            XPathExpression expr = xpath.compile("//class/@name");
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println("name = " + nodes.item(i).getNodeValue());
            }
        } catch (XPathExpressionException e) {  
            e.printStackTrace();  
        } catch (ParserConfigurationException e) {  
            e.printStackTrace();  
        } catch (SAXException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
    }  
      
}  
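XPath can do more than select attributes by path. The following sketch shows a predicate and the count() function evaluated through the same javax.xml.xpath API. The class name XPathQueries, its method names, and the inline document are assumptions for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathQueries {

    // Hypothetical inline document so the sketch runs without test.xml.
    static final String XML =
            "<university name=\"pku\"><college name=\"c1\"><class name=\"class1\">"
          + "<student name=\"stu1\" sex=\"male\" age=\"21\"/>"
          + "<student name=\"stu2\" sex=\"female\" age=\"20\"/>"
          + "<student name=\"stu3\" sex=\"female\" age=\"19\"/>"
          + "</class></college></university>";

    private static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
    }

    /** Uses the XPath count() function; the NUMBER return type maps to a Double. */
    public static int studentCount() {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Double n = (Double) xpath.evaluate("count(//student)", parse(), XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Uses a predicate to select the names of students whose age exceeds the given value. */
    public static List<String> namesOlderThan(int age) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//student[@age > " + age + "]/@name", parse(), XPathConstants.NODESET);
            List<String> names = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(studentCount());
        System.out.println(namesOlderThan(20));
    }
}
```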

Sixth, [supplement]

Note how the four parsing methods handle text nodes:

1. With DOM, node.getChildNodes() returns all child nodes of the node, and text nodes (including the whitespace between elements) are returned as Node objects too.

For example:

<?xml version="1.0" encoding="UTF-8"?>  
<university name="pku">  
    <college name="c1">  
        <class name="class1">  
            <student name="stu1" sex='male' age="21" />  
            <student name="stu2" sex='female' age="20" />  
            <student name="stu3" sex='female' age="20" />  
        </class>  
    </college>  
</university>  

Test program:

package test.xml;  
  
import java.io.FileNotFoundException;  
import java.io.IOException;  
import java.io.InputStream;  
import java.util.Arrays;  
  
import javax.xml.parsers.DocumentBuilder;  
import javax.xml.parsers.DocumentBuilderFactory;  
import javax.xml.parsers.ParserConfigurationException;  
  
import org.w3c.dom.Document;  
import org.w3c.dom.Element;  
import org.w3c.dom.Node;  
import org.w3c.dom.NodeList;  
import org.xml.sax.SAXException;  
  
/** 
 * dom read and write xml 
 * @author whwang 
 */  
public class TestDom2 {  
      
    public static void main(String[] args) {  
        read();  
    }  
      
    public static void read() {  
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();  
        try {  
            DocumentBuilder builder = dbf.newDocumentBuilder();  
            InputStream in = TestDom2.class.getClassLoader().getResourceAsStream("test.xml");  
            Document doc = builder.parse(in);  
            // root <university>  
            Element root = doc.getDocumentElement();  
            if (root == null) return;  
//          System.err.println(root.getAttribute("name"));  
            // all college node  
            NodeList collegeNodes = root.getChildNodes();  
            if (collegeNodes == null) return;  
            System.err.println("university child nodes:" + collegeNodes.getLength());
            System.err.println("Subnodes are as follows:");
            for(int i = 0; i < collegeNodes.getLength(); i++) {  
                Node college = collegeNodes.item(i);  
                if (college == null) continue;  
                if (college.getNodeType() == Node.ELEMENT_NODE) {  
                    System.err.println("\tElement node:" + college.getNodeName());
                } else if (college.getNodeType() == Node.TEXT_NODE) {
                    System.err.println("\ttext node:" + Arrays.toString(college.getTextContent().getBytes()));
                }  
            }  
        } catch (ParserConfigurationException e) {  
            e.printStackTrace();  
        } catch (FileNotFoundException e) {  
            e.printStackTrace();  
        } catch (SAXException e) {  
            e.printStackTrace();  
        } catch (IOException e) {  
            e.printStackTrace();  
        }  
          
    }  
}  

The output is:

university child nodes:3
Subnodes are as follows:
        text node:[10, 9]
        Element node:college
        text node:[10]

The ASCII code of \n is 10 and that of \t is 9 (the output implies that the original file was indented with tabs rather than the spaces shown in the listing). The result may be surprising: university has neither 1 nor 2 child nodes, but 3. What are these three child nodes? To see more clearly, change the XML document to:

<?xml version="1.0" encoding="UTF-8"?>  
<university name="pku">11  
    <college name="c1">  
        <class name="class1">  
            <student name="stu1" sex='male' age="21" />  
            <student name="stu2" sex='female' age="20" />  
            <student name="stu3" sex='female' age="20" />  
        </class>  
    </college>22  
</university>  

Running the same program again, the output is:

university child nodes:3
Subnodes are as follows:
        text node:[49, 49, 10, 9]
        Element node:college
        text node:[50, 50, 10]

The ASCII code of the number 1 is 49, and the ASCII code of the number 2 is 50.
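A common way to cope with these whitespace text nodes is to filter children by node type. The sketch below reproduces the three-child result with a tab-indented inline document and then keeps only the element children; the class name DomWhitespace and the helper childElements are assumptions for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomWhitespace {

    // Hypothetical document: newline + tab before <college>, newline after it.
    static final String XML =
            "<university name=\"pku\">\n\t<college name=\"c1\"/>\n</university>";

    static Element root() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts all children, whitespace text nodes included. */
    public static int childNodeCount() {
        return root().getChildNodes().getLength();
    }

    /** Returns the bytes of the first child, which is the whitespace text node. */
    public static String firstTextBytes() {
        return Arrays.toString(
                root().getFirstChild().getTextContent().getBytes(StandardCharsets.UTF_8));
    }

    /** Keeps only children whose node type is ELEMENT_NODE. */
    public static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<Element>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(childNodeCount());
        System.out.println(firstTextBytes());
        System.out.println(childElements(root()).size());
    }
}
```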

2. SAX behaves the same way as DOM: the text between elements, whitespace included, is reported to the handler. You can see it by overriding the public void characters(char[] ch, int start, int length) method.
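A minimal sketch of such an override, assuming it is enough to collect all reported characters into one buffer (the parser may deliver a run of text in several chunks, so the handler appends rather than assigns). The class name SaxText and the inline document are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.helpers.DefaultHandler;

public class SaxText {

    // Hypothetical document mirroring the modified example: text "11" and "22" plus whitespace.
    static final String XML =
            "<university name=\"pku\">11\n\t<college name=\"c1\"/>22\n</university>";

    /** Returns every character SAX reports through characters(), in document order. */
    public static String collectText() {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), handler);
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Print the character codes so the whitespace is visible.
        for (char c : collectText().toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```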

3. In JDOM, node.getChildren() returns only child elements; it does not include text nodes (whether or not the node contains text). To get the text of a node, call node.getText(), which returns the node's text, including whitespace characters such as \n and \t.

4. DOM4j behaves the same way as JDOM.

reference:
http://www.docin.com/p-78963650.html
