XML parsing and the dom4j API explained

Keywords: Java, XML

1. XML parsing

XML parsing covers both reading and writing XML documents.

1.1 XML parsing approaches

  • DOM
  • SAX

1.1.1 DOM parsing

DOM is the official W3C standard.
A DOM parser reads the entire XML document into memory and builds a tree of nodes (the DOM tree) from it.

Advantages: any node can be accessed and modified, so the document can be both read and written.
Disadvantages: a large file takes up a lot of memory and is slow to process, and it may even cause an out-of-memory error.
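
To make the trade-off concrete, here is a minimal sketch of DOM-style parsing. It uses the JDK's built-in JAXP classes (javax.xml.parsers) rather than dom4j, so it runs without any extra jar. The class name DomSketch, the method studentNames, and the sample XML are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DomSketch {

    // Parses the whole document into an in-memory tree, then queries the tree.
    static List<String> studentNames(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // The complete tree is available here, so any node can be looked up.
            NodeList nodes = doc.getElementsByTagName("name");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not the article's students.xml
        String xml = "<students><student><name>Li Si</name></student>"
                + "<student><name>Zhang San</name></student></students>";
        System.out.println(studentNames(xml)); // prints [Li Si, Zhang San]
    }
}
```

Because the whole tree stays in memory after parsing, the same Document could also be modified, which is what makes DOM readable and writable at the cost of memory.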

1.1.2 SAX parsing (Simple API for XML)

SAX is a community (de facto) standard rather than a W3C one.
It is a read-only way of parsing.
The document is processed sequentially: each part is handled as soon as it is read, without loading the whole file first.
It is a relatively lightweight parsing approach.

Advantages: reading is efficient.
Disadvantages: it can only read, not write, and it is harder to program against.
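
As a rough illustration of why SAX is read-only and harder to program, the sketch below collects the text of every name element with the JDK's built-in SAX parser. The class names SaxSketch and NameHandler and the sample XML are assumptions made for this example; the handler has to track by hand which element it is currently inside.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {

    // Receives callbacks as the parser streams through the document.
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <name> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("name".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (current != null) {
                current.append(ch, start, length); // text may arrive in several chunks
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                names.add(current.toString());
                current = null;
            }
        }
    }

    static List<String> studentNames(String xml) {
        try {
            NameHandler handler = new NameHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not the article's students.xml
        String xml = "<students><student><name>Li Si</name></student>"
                + "<student><name>Zhang San</name></student></students>";
        System.out.println(studentNames(xml)); // prints [Li Si, Zhang San]
    }
}
```

No tree is ever built here: once an event has been delivered, the parser moves on, which keeps memory use low but leaves nothing to modify or write back.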

1.2 XML parsing libraries

DOM and SAX, described above, are parsing approaches. The libraries below implement them:

  • JAXP: provided officially by Sun. It is somewhat awkward to use. It is part of Java SE and ships with the JDK, so no extra jar has to be imported (just be aware of it).

  • dom4j (the focus here): very powerful, supporting both DOM and SAX.
    The name reads as "DOM for Java" (covering both SAX and DOM).

  • JDOM

1.2.1 Parsing with dom4j

To use dom4j, first import its jar package.
The dom4j distribution contains dom4j-1.6.1.jar; add this file to the project.

  1. Get a parser
    SAXReader reader = new SAXReader();
  2. Read an XML file and get the corresponding Document
    Document document = reader.read(new File("books.xml"));
  3. Get the root element
    Element root = document.getRootElement();
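
The three steps above can be put together as a small complete program. This is only a sketch: it assumes the dom4j jar is on the classpath, the class name Dom4jQuickStart and the method rootName are made up, and it reads from an in-memory string through a StringReader instead of books.xml so that no file is needed.

```java
import java.io.StringReader;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Dom4jQuickStart {

    // Runs the three steps (parser, Document, root element) and returns the root's name.
    static String rootName(String xml) {
        try {
            SAXReader reader = new SAXReader();                      // 1. get a parser
            Document document = reader.read(new StringReader(xml)); // 2. read the XML
            Element root = document.getRootElement();               // 3. get the root element
            return root.getName();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical content standing in for books.xml
        System.out.println(rootName("<books><book>Java</book></books>")); // prints books
    }
}
```

To read a real file, pass new File("books.xml") to reader.read as in step 2.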

2. dom4j API details

2.1 Methods

  1. elements()
    Gets all child elements of the current element (for example, the root element) and returns a java.util.List.
  2. element(String name)
    Gets the first child element with the specified name.
    Returns an Element.
  3. getText()
    Gets the text content of the element.
  4. DocumentHelper
    A helper class used for creating things.
    To create a new element:
    DocumentHelper.createElement("type")
  5. add()
    Adds a node, such as a child element, to the element.
  6. Add an attribute
    addAttribute(String name, String value);
    If the attribute already exists, its value is modified.
  7. Get an attribute
    Element.attribute(String name); returns an Attribute object.
  8. Delete an attribute
    Element.remove(Attribute a);
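
The methods listed above can be tried out together in one place. The following is a minimal sketch, assuming the dom4j jar is on the classpath; the class name Dom4jApiSketch, the method demo, and the sample XML are invented for illustration, and DocumentHelper.parseText is used only to avoid needing a file.

```java
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Dom4jApiSketch {

    // Exercises element(), getText(), createElement(), add(),
    // addAttribute(), attribute() and remove() on one small document.
    static String demo(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            Element root = document.getRootElement();

            Element student = root.element("student");       // first <student> child
            String name = student.element("name").getText();  // text inside <name>

            Element address = DocumentHelper.createElement("address");
            address.setText("Beijing");
            student.add(address);                              // append the new child

            student.addAttribute("id", "a2");                  // id already exists, so it is overwritten
            Attribute id = student.attribute("id");
            String idValue = id.getValue();
            student.remove(id);                                // delete the attribute

            boolean removed = student.attribute("id") == null;
            return name + "|" + idValue + "|" + removed;
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        String xml = "<students><student id=\"a1\"><name>Li Si</name></student></students>";
        System.out.println(demo(xml)); // prints Li Si|a2|true
    }
}
```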

2.2 Writing back with dom4j

XMLWriter writer = new XMLWriter(new FileWriter(file));
writer.write(document);
writer.close();

2.2.1 Encoding problems when writing back

Cause: the XML file is UTF-8, but FileWriter writes with the system default encoding. On a Chinese-language system that default is typically GBK (on JDKs before 18, where the default follows the OS locale), so the bytes written do not match the encoding the XML declares.
Solution:

  1. XMLWriter writer = new XMLWriter(new OutputStreamWriter(new FileOutputStream(file), "utf-8"));
    Use a character stream whose encoding is specified explicitly.
  2. Use the OutputFormat class provided by dom4j to specify the encoding.
    OutputFormat format = OutputFormat.createPrettyPrint(); // pretty-printed output
    format.setEncoding("utf-8"); // specify the encoding
    XMLWriter writer = new XMLWriter(new FileOutputStream(file), format);
    writer.write(document);
    writer.close();
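
The underlying mismatch can be reproduced with plain JDK classes, without dom4j. The sketch below is an illustration under assumptions: the class name EncodingSketch and the method write are made up, and ISO-8859-1 stands in for a non-UTF-8 platform default such as GBK, because it is guaranteed to exist on every JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Writes text through a character stream with an explicit charset
    // and returns the raw bytes that would end up in the file.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e

        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        byte[] other = write(text, StandardCharsets.ISO_8859_1); // stand-in for a non-UTF-8 default

        System.out.println(utf8.length + " vs " + other.length); // prints 5 vs 4

        // A reader that trusts the UTF-8 declaration only recovers the text
        // when the bytes really were written as UTF-8.
        System.out.println(text.equals(new String(utf8, StandardCharsets.UTF_8)));  // prints true
        System.out.println(text.equals(new String(other, StandardCharsets.UTF_8))); // prints false
    }
}
```

This is why both solutions above pin the encoding explicitly instead of relying on whatever the platform default happens to be.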
    

2.2.2 Code demonstration

students.xml

<?xml version="1.0" encoding="UTF-8"?>
<students>
    <student id="a1003C">
        <name>Li Si</name>
        <age>18</age>
        <gender>female</gender>
    </student>
    <student id="a1003C">
        <name>Zhang San</name>
        <age>18</age>
        <gender>female</gender>
    </student>
    <student id="a1003C">
        <name>Zhang San</name>
        <age>18</age>
        <gender>female</gender>
    </student>
</students>

1) Get the root element

DemoTest

import java.io.File;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class DemoTest {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File("students.xml"));
        Element root = document.getRootElement(); // get the root element
        System.out.println(root.getName());       // print the name of the root element
    }

}

2) Get one child element or all child elements

DemoTest2

import java.io.File;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class DemoTest2 {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File("students.xml"));
        Element root = document.getRootElement(); // get the root element
        // System.out.println(root.getName());
        // Element stu = root.element("student"); // first <student> child only
        // System.out.println(stu.getName());
        List<Element> list = root.elements();     // all child elements of the root
        System.out.println(list.size());
        Element element = list.get(1);            // the second <student>
        Element name = element.element("name");
        System.out.println(name.getText());
    }

}

3) Add a node with attributes and write the data back

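import java.io.File;
import java.io.FileOutputStream;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class DemoTest3 {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File("students.xml"));
        Element root = document.getRootElement();
        List<Element> list = root.elements();
        // Create <address id="10">Beijing</address>
        Element address = DocumentHelper.createElement("address");
        address.setText("Beijing");
        address.addAttribute("id", "10");
        list.get(1).add(address); // append it to the second <student>
        // Write the data back
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("students.xml")), OutputFormat.createPrettyPrint());
        writer.write(document);
        writer.close();
    }

}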

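4) Modify the content of a node (its text content and attribute value)

import java.io.File;
import java.io.FileOutputStream;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class DemoTest4 {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File("students.xml"));
        Element root = document.getRootElement();
        List<Element> list = root.elements();
        Element student2 = list.get(1);
        // Requires the <address> element added in step 3; otherwise element() returns null
        Element address = student2.element("address");
        address.setText("Tianjin");       // change the text content
        address.addAttribute("id", "20"); // the attribute exists, so its value is replaced
        // Write the data back
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("students.xml")), OutputFormat.createPrettyPrint());
        writer.write(document);
        writer.close();
    }

}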

5) Delete a node

import java.io.*;
import java.util.List;
import org.dom4j.*;
import org.dom4j.io.*;

public class DemoTest5 {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File("students.xml"));
        Element root = document.getRootElement();
        List<Element> list = root.elements();
        Element student2 = list.get(1);
        student2.remove(student2.element("address")); // remove the <address> child
        // Write the data back
        XMLWriter writer = new XMLWriter(new FileOutputStream(new File("students.xml")), OutputFormat.createPrettyPrint());
        writer.write(document);
        writer.close();
    }

}

Posted by Horizon88 on Tue, 07 Sep 2021 19:33:06 -0700