URI Source Code Analysis

Keywords: Fragment Java ascii

Before reading on, you should already know what a URI is and how URIs differ from URLs.
See: The difference between URI, URL and URN

A URI reference consists of up to three parts: a scheme, a scheme-specific part, and a fragment identifier. In general:
[scheme:]scheme-specific-part[#fragment]
If the scheme is omitted, the URI reference is relative. If the fragment identifier is omitted, the URI reference is a pure URI.
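
As a rough illustration of these three parts, the sketch below parses a few URI references with java.net.URI and prints each part. The class name UriParts and the sample strings are made up for this example.

```java
import java.net.URI;

class UriParts {
    // Summarizes the three top-level parts of a URI reference.
    static String describe(String s) {
        URI u = URI.create(s);
        return "scheme=" + u.getScheme()
                + ", ssp=" + u.getSchemeSpecificPart()
                + ", fragment=" + u.getFragment()
                + ", absolute=" + u.isAbsolute();
    }

    public static void main(String[] args) {
        // Has a scheme and a fragment: an absolute URI reference.
        System.out.println(describe("http://localhost:8080/index.html#top"));
        // No scheme: a relative URI reference.
        System.out.println(describe("docs/index.html"));
        // Has a scheme but no fragment.
        System.out.println(describe("urn:isbn:0451450523"));
    }
}
```

isAbsolute() returns true exactly when the scheme is present, which matches the rule that a reference without a scheme is relative.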

URI is a more general concept than URL: it covers not only uniform resource locators (URLs) but also uniform resource names (URNs). In practice, most URIs in use are URLs. In Java, a URI is represented by the java.net.URI class. The URI class can only identify resources and parse URIs; it cannot retrieve the resource that a URI identifies (a URN, for example, does not say where the resource is located).

Constructors

public URI(String str) throws URISyntaxException {
    new Parser(str).parse(false);
}

public URI(String scheme, String host, String path, String fragment)
    throws URISyntaxException{
    this(scheme, null, host, -1, path, null, fragment);
}

public URI(String scheme,
           String authority,
           String path, String query, String fragment)
    throws URISyntaxException{
....
}

public URI(String scheme,
           String userInfo, String host, int port,
           String path, String query, String fragment)
     throws URISyntaxException{
...
}

public URI(String scheme, String ssp, String fragment) throws URISyntaxException {
    new Parser(toString(scheme, ssp, null, null, null, -1, null, null, fragment)).parse(false);
}

The URI class provides five constructors:

  1. Constructs a URI by parsing the given URI string.
  2. Mainly for hierarchical URIs. Builds the URI from a scheme, host, path and fragment identifier.
  3. Mainly for hierarchical URIs. Builds the URI from a scheme, authority, path, query and fragment identifier.
  4. Mainly for hierarchical URIs. Builds the URI from a scheme, user information, host, port, path, query and fragment identifier.
  5. Mainly for non-hierarchical (opaque) URIs. Builds the URI from a scheme, scheme-specific part and fragment identifier.
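
A minimal sketch of the five constructors in use, in the same order as the list above. The class name UriConstructors, the host names and the mail address are placeholders chosen for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriConstructors {
    // Builds one URI with each of the five constructors and returns their string forms.
    static String[] buildAll() {
        try {
            return new String[] {
                // 1. Parse a complete URI string.
                new URI("http://localhost:8080/index.html?id=1#top").toString(),
                // 2. scheme, host, path, fragment
                new URI("http", "localhost", "/index.html", "top").toString(),
                // 3. scheme, authority, path, query, fragment
                new URI("http", "localhost:8080", "/index.html", "id=1", "top").toString(),
                // 4. scheme, userInfo, host, port, path, query, fragment
                new URI("http", null, "localhost", 8080, "/index.html", "id=1", "top").toString(),
                // 5. scheme, scheme-specific part, fragment (opaque URI)
                new URI("mailto", "someone@example.com", null).toString()
            };
        } catch (URISyntaxException e) {
            // Not expected for these hard-coded, well-formed inputs.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String s : buildAll()) {
            System.out.println(s);
        }
    }
}
```

Constructors 3 and 4 produce the same string here, because the authority "localhost:8080" is just the host and port written together.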

create method

public static URI create(String str) {
    try {
        return new URI(str);
    } catch (URISyntaxException x) {
        throw new IllegalArgumentException(x.getMessage(), x);
    }
}

If you are sure the URI string is well-formed, you can use the static factory method create instead. It does not throw the checked URISyntaxException; a malformed string results in an unchecked IllegalArgumentException that wraps the original error.
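
The difference can be sketched as follows. The class name UriCreateDemo and its helper methods are hypothetical; the malformed input contains a space, which is not a legal character in a URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriCreateDemo {
    // Reports which exception (if any) the checked constructor raises.
    static String viaConstructor(String s) {
        try {
            new URI(s);
            return "none";
        } catch (URISyntaxException e) {
            return "URISyntaxException";
        }
    }

    // Reports which exception (if any) the create factory method raises.
    static String viaCreate(String s) {
        try {
            URI.create(s);
            return "none";
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException";
        }
    }

    public static void main(String[] args) {
        String bad = "http://localhost:8080/a b";
        System.out.println(viaConstructor(bad));
        System.out.println(viaCreate(bad));
        System.out.println(viaCreate("http://localhost:8080/index.html"));
    }
}
```

create is convenient for constants known to be valid; for user input, the constructor forces the caller to handle the syntax error explicitly.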

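Opaque URIs

URIs are usually hierarchical, but some are opaque. A hierarchical URI is either an absolute URI whose scheme-specific part begins with "/", or a relative URI (one without a scheme). An opaque URI is an absolute URI whose scheme-specific part does not begin with "/". A hierarchical URI can contain a scheme, authority (user information, host, port), path, query and fragment, any of which may be absent; an opaque URI contains only three parts: scheme, scheme-specific part and fragment.
For example: mailto:jijianshuai@infcn.com.cn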

public boolean isOpaque() {
    return path == null;
}

isOpaque simply checks whether path is null. The parser leaves path unset only for an absolute URI whose scheme-specific part does not begin with "/", so a null path means the URI is opaque.
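
A short sketch of how isOpaque behaves for a few inputs. The class name OpaqueDemo and the sample URIs are illustrative only.

```java
import java.net.URI;

class OpaqueDemo {
    static boolean isOpaque(String s) {
        return URI.create(s).isOpaque();
    }

    public static void main(String[] args) {
        String[] samples = {
            "mailto:someone@example.com",       // scheme present, no "/" after the colon
            "urn:isbn:0451450523",              // same shape: opaque
            "http://localhost:8080/index.html", // "/" after the colon: hierarchical
            "file:/tmp/a.txt",                  // hierarchical, no authority
            "user/userInfo.html"                // relative (no scheme): hierarchical
        };
        for (String s : samples) {
            URI u = URI.create(s);
            // getPath() is null exactly when the URI is opaque.
            System.out.println(s + " -> opaque=" + u.isOpaque() + ", path=" + u.getPath());
        }
    }
}
```

The last sample shows that a missing leading "/" alone does not make a URI opaque; a relative reference has no scheme and is always treated as hierarchical.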




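The URI constructor parses the string with: new Parser(str).parse(false);

The parser first looks for a scheme. If a scheme is present and the character right after the ":" is not "/", the URI is opaque, and everything up to an optional "#" is stored as the scheme-specific part.
Otherwise the URI is hierarchical and parseHierarchical is called. If the remaining input starts with "//", parseHierarchical calls parseAuthority to parse the authority, and then parses the path and query.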

Getting the parts of a URI

  1. Get the scheme
    public String getScheme()
  2. Get the scheme-specific part
    public String getSchemeSpecificPart()
    public String getRawSchemeSpecificPart()
  3. Get the fragment identifier
    public String getFragment()
    public String getRawFragment()
  4. Get the authority
    public String getAuthority()
    public String getRawAuthority()
    The authority consists of the user information, the server address (domain name or IP) and the port, for example:
    user:password@localhost:80
  5. Get the server address (domain name or IP)
    public String getHost()
  6. Get the path
    public String getPath()
    public String getRawPath()
    The path includes the directory structure and the file part, for example: /dir/index.html
  7. Get the port
    public int getPort()
    Returns -1 if the URI has no port.
  8. Get the query string
    public String getQuery()
    public String getRawQuery()
  9. Get the user information
    public String getUserInfo()
    public String getRawUserInfo()

If the URI is opaque, only items 1 to 3 return meaningful values; the other getters return null (getPort returns -1).
If the URI is hierarchical, all of the parts can be obtained.

The methods with Raw in their name return the part exactly as it appears in the URI, still percent-encoded (non-ASCII and other illegal characters have to be encoded). The methods without Raw return the decoded value.

getScheme, getHost and getPort have no Raw variants because these three parts cannot contain encoded characters.
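
To make the Raw and non-Raw distinction concrete, here is a small sketch. The class name UriGetters, its helper methods and the sample URIs are assumptions made for this example; only %20 (an encoded space) is used so that the output does not depend on the console encoding.

```java
import java.net.URI;

class UriGetters {
    // Returns the raw (still percent-encoded) and the decoded path side by side.
    static String paths(String s) {
        URI u = URI.create(s);
        return u.getRawPath() + " | " + u.getPath();
    }

    // Returns the port, or -1 when the URI does not specify one.
    static int port(String s) {
        return URI.create(s).getPort();
    }

    public static void main(String[] args) {
        URI u = URI.create("http://user@localhost:8080/my%20dir/index.html?q=a%20b#sec%201");
        System.out.println("scheme:    " + u.getScheme());
        System.out.println("authority: " + u.getAuthority());
        System.out.println("userInfo:  " + u.getUserInfo());
        System.out.println("host:      " + u.getHost());
        System.out.println("port:      " + u.getPort());
        System.out.println("path:      " + u.getRawPath() + " | " + u.getPath());
        System.out.println("query:     " + u.getRawQuery() + " | " + u.getQuery());
        System.out.println("fragment:  " + u.getRawFragment() + " | " + u.getFragment());

        // For an opaque URI the hierarchical getters have nothing to return.
        URI opaque = URI.create("mailto:someone@example.com");
        System.out.println(opaque.getHost() + " " + opaque.getPort() + " " + opaque.getPath());
    }
}
```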

resolve method

The resolve method converts a relative URI into an absolute URI by resolving it against a base URI. For example:

URI a = URI.create("http://localhost:8080/index.html");
URI b = URI.create("user/userInfo.html");
URI c = a.resolve(b);
System.out.println(c);

This resolves b against the base URI a.

The printed result is: http://localhost:8080/user/userInfo.html

The source code is as follows:

public URI resolve(URI uri) {
    return resolve(this, uri);
}

private static URI resolve(URI base, URI child) {
    // check if child if opaque first so that NPE is thrown
    // if child is null.
    if (child.isOpaque() || base.isOpaque())
        return child;

    // 5.2 (2): Reference to current document (lone fragment)
    if ((child.scheme == null) && (child.authority == null)
        && child.path.equals("") && (child.fragment != null)
        && (child.query == null)) {
        if ((base.fragment != null) && child.fragment.equals(base.fragment)) {
            return base;
        }
        URI ru = new URI();
        ru.scheme = base.scheme;
        ru.authority = base.authority;
        ru.userInfo = base.userInfo;
        ru.host = base.host;
        ru.port = base.port;
        ru.path = base.path;
        ru.fragment = child.fragment;
        ru.query = base.query;
        return ru;
    }

    // 5.2 (3): Child is absolute
    if (child.scheme != null)
        return child;
    URI ru = new URI();             // Resolved URI
    ru.scheme = base.scheme;
    ru.query = child.query;
    ru.fragment = child.fragment;

    // 5.2 (4): Authority
    if (child.authority == null) {
        ru.authority = base.authority;
        ru.host = base.host;
        ru.userInfo = base.userInfo;
        ru.port = base.port;

        String cp = (child.path == null) ? "" : child.path;
        if ((cp.length() > 0) && (cp.charAt(0) == '/')) {
            // 5.2 (5): Child path is absolute
            ru.path = child.path;
        } else {
            // 5.2 (6): Resolve relative path
            ru.path = resolvePath(base.path, cp, base.isAbsolute());
        }
    } else {
        ru.authority = child.authority;
        ru.host = child.host;
        ru.userInfo = child.userInfo;
        ru.host = child.host;
        ru.port = child.port;
        ru.path = child.path;
    }

    // 5.2 (7): Recombine (nothing to do here)
    return ru;
}

The method works as follows:

  1. If either the child or the base is opaque, return the child unchanged.
  2. Check whether the child consists only of a fragment identifier (no scheme, no authority, no query, and an empty path). If so, go to 2.1; otherwise go to 3.
    2.1 If the child's fragment equals the base's fragment, return the base itself.
    2.2 Otherwise build a new URI that copies every part of the base except the fragment, which is taken from the child, and return it.
  3. If the child's scheme is not null, the child is already an absolute URI, so return the child unchanged.
  4. Otherwise build the absolute URI from the base's scheme and the child's query and fragment. If the child has no authority, the authority comes from the base, and the path is either the child's path (when it starts with "/") or the child's path resolved against the base's path. If the child has its own authority, both the authority and the path come from the child.
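
The branches can be exercised with a few inputs, one per step. This is only a sketch: the class name ResolveDemo, the helper method and all sample URIs are made up for illustration.

```java
import java.net.URI;

class ResolveDemo {
    // Resolves child against base and returns the result as a string.
    static String resolve(String base, String child) {
        return URI.create(base).resolve(URI.create(child)).toString();
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/docs/index.html?x=1#top";
        String[] children = {
            "mailto:someone@example.com", // step 1: opaque child, returned unchanged
            "#bottom",                    // step 2.2: base with the child's fragment
            "https://example.com/a",      // step 3: already absolute, returned unchanged
            "/img/logo.png",              // step 4: absolute path replaces the base path
            "user/userInfo.html",         // step 4: relative path resolved against /docs/
            "//example.org/x"             // step 4: child supplies its own authority
        };
        for (String c : children) {
            System.out.println(c + " -> " + resolve(base, c));
        }
    }
}
```

Note that only the lone-fragment case keeps the base's query; in step 4 the query always comes from the child.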

relativize method

The relativize method converts an absolute URI into a URI that is relative to a given base URI.

URI a = URI.create("http://localhost:8080/");
URI b = URI.create("http://localhost:8080/index.html");
URI c = a.relativize(b);
System.out.println(c);

This computes b relative to a.

The printed result is: index.html

private static URI relativize(URI base, URI child) {
    // check if child if opaque first so that NPE is thrown
    // if child is null.
    if (child.isOpaque() || base.isOpaque())
        return child;
    if (!equalIgnoringCase(base.scheme, child.scheme)
        || !equal(base.authority, child.authority))
        return child;

    String bp = normalize(base.path);
    String cp = normalize(child.path);
    if (!bp.equals(cp)) {
        if (!bp.endsWith("/"))
            bp = bp + "/";
        if (!cp.startsWith(bp))
            return child;
    }

    URI v = new URI();
    v.path = cp.substring(bp.length());
    v.query = child.query;
    v.fragment = child.fragment;
    return v;
}

The method works as follows:

  1. If either the child or the base is opaque, return the child unchanged. A non-hierarchical URI has no relative form.
  2. If the two URIs differ in scheme (compared case-insensitively) or in authority, return the child unchanged.
  3. Normalize both paths. If they are not equal and the base path does not end with "/", append "/" to it.
  4. If the child's path does not start with the base path, return the child unchanged.
  5. Otherwise build and return a new URI whose path is the child's path with the base path prefix removed, keeping the child's query and fragment.
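
A few inputs that hit the different outcomes of these steps. As before, this is a sketch: the class name RelativizeDemo, the helper method and the sample URIs are invented for the example.

```java
import java.net.URI;

class RelativizeDemo {
    // Relativizes child against base and returns the result as a string.
    static String relativize(String base, String child) {
        return URI.create(base).relativize(URI.create(child)).toString();
    }

    public static void main(String[] args) {
        // Child lies under the base path: the common prefix is stripped,
        // and the child's query and fragment are kept.
        System.out.println(relativize("http://localhost:8080/docs/",
                "http://localhost:8080/docs/api/index.html?x=1#top"));

        // Base path without a trailing "/": one is appended before comparing.
        System.out.println(relativize("http://localhost:8080/docs",
                "http://localhost:8080/docs/api/index.html"));

        // Different authority: the child is returned unchanged.
        System.out.println(relativize("http://localhost:8080/docs/",
                "http://example.org/docs/api/index.html"));

        // Child path is not under the base path: returned unchanged.
        System.out.println(relativize("http://localhost:8080/docs/",
                "http://localhost:8080/other/a.html"));
    }
}
```

relativize never produces "../" segments; when the child is not located under the base path, it simply hands the child back.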

Posted by damanic on Sun, 09 Jun 2019 19:13:18 -0700