Java Web Learning Diary - SAX Parsing XML

Keywords: Java xml encoding

1. How SAX parses XML documents:

DOM parsing builds a tree in memory that mirrors the hierarchical structure of the XML document, wrapping each element, attribute, and piece of text in an object. Its advantage is that adding, deleting, and modifying nodes is very convenient. Its disadvantage is that a very large file can exhaust memory. SAX parsing, by contrast, is event-driven and parses as it reads: it works through the document from top to bottom and, each time it reaches a start tag, a run of text, or an end tag, it calls the matching handler method and passes in the tag name or the text. Its advantage is that it does not exhaust memory and makes query operations easy to implement. Its disadvantage is that it cannot add, delete, or modify content.
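To make the contrast concrete, here is a minimal sketch that reads the same small document both ways. The class name DomVsSaxSketch and the inline sample XML are assumptions made up for this illustration; only the JAXP calls themselves come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSaxSketch {
    // Hypothetical sample document, used only for this illustration.
    static final String XML = "<Crew><Captain>Luffy</Captain><Chef>Yamaguchi</Chef></Crew>";

    // DOM: the whole document is loaded into an in-memory tree before any query runs.
    public static String firstCaptainWithDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("Captain").item(0).getTextContent();
    }

    // SAX: the parser reports each start tag as it reads it; no tree is kept.
    public static List<String> elementNamesWithSax() throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstCaptainWithDom());   // Luffy
        System.out.println(elementNamesWithSax());   // [Crew, Captain, Chef]
    }
}
```

The DOM version can jump straight to any node because the tree is already built. The SAX version only sees elements in the order they appear in the document, which is why it suits queries over large files but not edits.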

2. Before using the SAX parser in JAXP, you need to obtain a parser:

1) Call the SAXParserFactory.newInstance() method to get the factory that creates SAX parsers

2) Call the newSAXParser() method of the factory object to get a SAXParser object

3) Call the parse() method of the parser object to parse the XML document. This method takes two parameters: the XML document and the event handler object. That is: parse(File f, DefaultHandler dh)

Or: parse(String uri, DefaultHandler dh)

It is important to note that before parsing an XML document, we need to define our own class that extends DefaultHandler and overrides the three methods startElement(), characters(), and endElement().
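The three steps and the custom handler can be sketched together as follows. This is only an illustration: SaxStepsSketch, TraceHandler, and traceOf are hypothetical names, and the XML is written to a temporary file purely so that the parse(File, DefaultHandler) overload can be shown.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStepsSketch {
    // Records each callback as a short string so the order of events is visible.
    static class TraceHandler extends DefaultHandler {
        final StringBuilder trace = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            trace.append("start:").append(qName).append(' ');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only text such as indentation between tags.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                trace.append("text:").append(text).append(' ');
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            trace.append("end:").append(qName).append(' ');
        }
    }

    public static String traceOf(String xml) throws Exception {
        File file = File.createTempFile("sax-sketch", ".xml");
        try {
            Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1: factory
            SAXParser parser = factory.newSAXParser();                 // step 2: parser
            TraceHandler handler = new TraceHandler();
            parser.parse(file, handler);                               // step 3: parse(File, DefaultHandler)
            return handler.trace.toString().trim();
        } finally {
            file.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(traceOf("<Crew><Captain>Luffy</Captain></Crew>"));
    }
}
```

Running main on that one-line document should show the start, text, and end events nested in the same order as the tags appear in the file.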

3. Consider the following XML document:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<OnePiece>
    <StrawHatPirates>
        <Captain>Luffy</Captain>
        <Chef>Yamaguchi</Chef>
        <Navigator>Namei</Navigator>
        <ShipDoctor>Joba</ShipDoctor>
        <Musician>Brooke</Musician>
        <Capabilities>Luffy</Capabilities>
        <Capabilities>Joba</Capabilities>
        <Capabilities>Brooke</Capabilities>
    </StrawHatPirates>
    <StrawHatPirates>
        <ViceCaptain>Solon</ViceCaptain>
        <Archaeologist>Robin</Archaeologist>
        <Sniper>Utop</Sniper>
        <Shipman>Frankie</Shipman>
        <Capabilities>Robin</Capabilities>
    </StrawHatPirates>
</OnePiece>

Requirements: Get the content of the first <Capabilities> tag

The code is as follows:

package cn.roger.Jaxp;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/*
 * Parsing an XML document using SAX in JAXP
 */
public class SaxTest1 {
    public static void main(String[] args) throws Exception {
        // Create the parser factory, then the parser
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        // Create the event handler and parse the document
        MyHandler dh = new MyHandler();
        parser.parse("src/OnePiece.xml", dh);
    }
}

class MyHandler extends DefaultHandler {
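    boolean flag = false; // true once a <Capabilities> start tag has been seen
    int index = 0;        // number of <Capabilities> elements seen so far

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // The name must match the tag in the XML document exactly
        if (qName.equals("Capabilities")) {
            flag = true;
            index++;
        }
    }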

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (flag && index == 1) {
            System.out.println(new String(ch, start, length));
            flag = false;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // System.out.print("</" + qName + ">");
    }

}
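One detail the handler above relies on is that characters() delivers the whole text in a single call. The SAX API allows a parser to split one run of text into several chunks, so a slightly more defensive variant buffers the text and reads it in endElement(). The sketch below is an assumption-labelled alternative, not the original author's code; FirstElementTextSketch, FirstTextHandler, and firstText are hypothetical names, and nested elements with the same name are not handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class FirstElementTextSketch {
    // Collects the text of the first element whose name equals 'target'.
    static class FirstTextHandler extends DefaultHandler {
        private final String target;
        private final StringBuilder buffer = new StringBuilder();
        private boolean inside = false; // true between the target's start and end tags
        private String result = null;   // set once the first match has closed

        FirstTextHandler(String target) {
            this.target = target;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // Only the first occurrence is of interest; later ones are ignored.
            if (result == null && qName.equals(target)) {
                inside = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append every chunk, since the text may arrive in pieces.
            if (inside) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (inside && qName.equals(target)) {
                result = buffer.toString().trim();
                inside = false;
            }
        }

        String getResult() {
            return result;
        }
    }

    public static String firstText(String xml, String elementName) throws Exception {
        FirstTextHandler handler = new FirstTextHandler(elementName);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.getResult();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Crew><Capabilities>Luffy</Capabilities><Capabilities>Joba</Capabilities></Crew>";
        System.out.println(firstText(xml, "Capabilities")); // Luffy
    }
}
```

Resetting the flag in endElement() rather than in characters() keeps the element boundary as the single place where the collected text is considered complete.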

Result: the program prints the text content of the first <Capabilities> element.

Posted by EZbb on Sat, 18 May 2019 22:48:56 -0700