Unidbg linker source code analysis (Part 2)

Outline

In the previous article, we analyzed the loadInternal method in the AndroidElfLoader class. It is very long, and reading it takes some patience. In this article, let's analyze the remaining details of the linker in Unidbg

align

In the code analyzed so far, the ARM.align method is called in many places for page alignment. Let's take a look at how it is implemented

public static Alignment align(long addr, long size, long alignment) {
    // addr: starting address
    // size: length of the region in bytes
    // alignment: page size (the mask trick below requires a power of two)
    long mask = -alignment;
    // Compute the right boundary and round it up to the next page boundary
    long right = addr + size;
    right = (right + alignment - 1) & mask;
    // addr is the left boundary; round it down to a page boundary
    addr &= mask;
    size = right - addr;
    // This line looks redundant: once both boundaries are aligned, their difference is already aligned, so aligning again changes nothing
    size = (size + alignment - 1) & mask;
    // Wrap the result in an Alignment object
    return new Alignment(addr, size);
}

public class Alignment {
    public final long address;
    public final long size;

    public Alignment(long address, long size) {
        this.address = address;
        this.size = size;
    }
}
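
To make the rounding concrete, here is a small standalone sketch of the same arithmetic. AlignDemo and Region are names made up for this example, not Unidbg classes, and the addresses are arbitrary sample values. It assumes the alignment is a power of two, as a page size is.

```java
// Standalone sketch of the page-alignment arithmetic (not Unidbg code).
final class AlignDemo {
    // Hypothetical stand-in for Unidbg's Alignment class
    static final class Region {
        final long address;
        final long size;

        Region(long address, long size) {
            this.address = address;
            this.size = size;
        }
    }

    // alignment must be a power of two, otherwise -alignment is not a usable mask
    static Region align(long addr, long size, long alignment) {
        long mask = -alignment;                             // 0x1000 -> 0xFFFFFFFFFFFFF000
        long right = (addr + size + alignment - 1) & mask;  // round the end up
        long left = addr & mask;                            // round the start down
        return new Region(left, right - left);              // the difference is already aligned
    }

    public static void main(String[] args) {
        // A region inside a single page expands to that whole page
        Region a = align(0x40001234L, 0x100L, 0x1000L);
        System.out.printf("0x%x 0x%x%n", a.address, a.size); // 0x40001000 0x1000

        // A region crossing a page boundary expands to two pages
        Region b = align(0x40001ff0L, 0x20L, 0x1000L);
        System.out.printf("0x%x 0x%x%n", b.address, b.size); // 0x40001000 0x2000
    }
}
```

The second call shows why both boundaries are rounded: a 0x20-byte region that straddles a page boundary needs two full pages mapped.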

resolveLibrary

How does Unidbg load dependent libraries? Let's look at the following code, which is the dependency-loading part we analyzed in the previous article

// The modules field stores all loaded libraries. Here we check whether this So has already been loaded
LinuxModule loaded = modules.get(neededLibrary);
if (loaded != null) {
    // If it is already loaded, increment its reference count and record it in neededLibraries
    loaded.addReferenceCount();
    neededLibraries.put(FilenameUtils.getBaseName(loaded.name), loaded);
    continue;
}
// If the dependency has not been loaded, look for its file, starting in the directory of the current So
LibraryFile neededLibraryFile = libraryFile.resolveLibrary(emulator, neededLibrary);

// If it is not found there, fall back to the library resolver
if (libraryResolver != null && neededLibraryFile == null) {
    neededLibraryFile = libraryResolver.resolveLibrary(emulator, neededLibrary);
}
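
The lookup order in the snippet above can be sketched as a small filesystem-free model. Everything here is made up for illustration: LookupOrderDemo is not a Unidbg class, the Set stands in for the files next to the current So, and the Map stands in for whatever the library resolver can provide.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of the two-stage dependency lookup (not Unidbg code).
final class LookupOrderDemo {
    // Returns a description of where the library was found, or null if nowhere
    static String resolve(String name, Set<String> siblingFiles, Map<String, String> resolverIndex) {
        // Stage 1: a file with that name next to the current So wins
        if (siblingFiles.contains(name)) {
            return "sibling:" + name;
        }
        // Stage 2: only consulted when stage 1 failed and a resolver is set
        if (resolverIndex != null) {
            return resolverIndex.get(name);
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> siblings = new HashSet<>(Arrays.asList("libfoo.so"));
        Map<String, String> resolver = Collections.singletonMap("libc.so", "resolver:libc.so");

        System.out.println(resolve("libfoo.so", siblings, resolver)); // sibling:libfoo.so
        System.out.println(resolve("libc.so", siblings, resolver));   // resolver:libc.so
        System.out.println(resolve("libbar.so", siblings, resolver)); // null
        System.out.println(resolve("libc.so", siblings, null));       // null
    }
}
```

The last call shows what happens when no resolver has been set: a system library such as libc.so cannot be found at all.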

First, let's look at the libraryFile.resolveLibrary method

public LibraryFile resolveLibrary(Emulator<?> emulator, String soName) {
    // Look for a file with the given name in the directory of the current So
    File file = new File(elfFile.getParentFile(), soName);
    // If no readable file exists there, return null
    return file.canRead() ? new ElfLibraryFile(file, is64Bit) : null;
}

Now look at the second lookup

if (libraryResolver != null && neededLibraryFile == null) {
    neededLibraryFile = libraryResolver.resolveLibrary(emulator, neededLibrary);
}

This code uses libraryResolver to resolve the So file. This libraryResolver is the one we created ourselves with

memory.setLibraryResolver(new AndroidResolver(23));

So how does it resolve a library? Let's keep looking

public LibraryFile resolveLibrary(Emulator<?> emulator, String libraryName) {
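    // Without a needed list, nothing is resolved
    if (needed == null) {
        return null;
    }

    // A non-empty needed list restricts resolution to the libraries it names
    if (!needed.isEmpty() && !needed.contains(libraryName)) {
        return null;
    }
    // Call the static overload below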
    return resolveLibrary(emulator, libraryName, sdk);
}

static LibraryFile resolveLibrary(Emulator<?> emulator, String libraryName, int sdk) {
    final String lib = emulator.is32Bit() ? "lib" : "lib64";
    // Look the library up as a classpath resource under the path below. If you open one of these paths, you will see that they are system libraries. Note that '+' in the name is replaced with 'p'
    String name = "/android/sdk" + sdk + "/" + lib + "/" + libraryName.replace('+', 'p');
    URL url = AndroidResolver.class.getResource(name);
    if (url != null) {
        return new URLibraryFile(url, libraryName, sdk, emulator.is64Bit());
    }
    return null;
}
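
To see which resource path a given library name maps to, the string construction can be pulled out into a standalone sketch. ResourcePathDemo and resourcePath are names made up for this example. Only the path expression is taken from the code above, and whether a given path actually exists inside the Unidbg jar is not checked here.

```java
// Standalone sketch of the resource path construction (not Unidbg code).
final class ResourcePathDemo {
    static String resourcePath(String libraryName, int sdk, boolean is32Bit) {
        final String lib = is32Bit ? "lib" : "lib64";
        // Same expression as in the static resolveLibrary: '+' becomes 'p'
        return "/android/sdk" + sdk + "/" + lib + "/" + libraryName.replace('+', 'p');
    }

    public static void main(String[] args) {
        System.out.println(resourcePath("libc.so", 23, true));    // /android/sdk23/lib/libc.so
        System.out.println(resourcePath("libc.so", 23, false));   // /android/sdk23/lib64/libc.so
        System.out.println(resourcePath("libc++.so", 23, true));  // /android/sdk23/lib/libcpp.so
    }
}
```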

VirtualModule

A virtual module is registered when the target So depends on another So that has little effect on the analysis or is not really used at all. It prevents the missing dependency from causing an error

Unidbg provides two virtual modules. You can also write your own by extending VirtualModule

  • libandroid.so
  • libjnigraphics.so

Both come with some simple handling already implemented

If the target So we want to analyze depends on libjnigraphics.so, and that library does not affect our analysis or is hardly used, we can do the following

new JniGraphics(emulator, vm).register(memory);

Then the target So loads without errors. Let's see how Unidbg handles this

public Module register(Memory memory) {
    if (name == null || name.trim().length() < 1) {
        throw new IllegalArgumentException("name is empty");
    }
    if (symbols.isEmpty()) {
        throw new IllegalArgumentException("symbols is empty");
    }

    if (log.isDebugEnabled()) {
        log.debug(String.format("Register virtual module[%s]: (%s)", name, symbols));
    }
    return memory.loadVirtualModule(name, symbols);
}

The code above calls the memory.loadVirtualModule method, which ends up in the loadVirtualModule method of AndroidElfLoader

public Module loadVirtualModule(String name, Map<String, UnidbgPointer> symbols) {
    LinuxModule module = LinuxModule.createVirtualModule(name, symbols, emulator);
    modules.put(name, module);
    if (maxSoName == null || name.length() > maxSoName.length()) {
        maxSoName = name;
    }
    return module;
}

The approach is simple and blunt: create a LinuxModule and put it directly into modules. The modules map holds every So that has been loaded, so when the target So is loaded, the dependency check finds the virtual module and no error is raised
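
Why registering in advance is enough can be shown with a tiny model of the modules map. This is a simplified sketch, not Unidbg code: VirtualModuleDemo and FakeModule are invented names, and a miss simply throws here, whereas the real loader goes on to look for a file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of how a pre-registered virtual module satisfies a dependency.
final class VirtualModuleDemo {
    // Hypothetical stand-in for LinuxModule
    static final class FakeModule {
        final String name;
        final boolean virtual;
        int referenceCount;

        FakeModule(String name, boolean virtual) {
            this.name = name;
            this.virtual = virtual;
        }
    }

    private final Map<String, FakeModule> modules = new LinkedHashMap<>();

    // Models loadVirtualModule: the module goes straight into the map
    FakeModule loadVirtualModule(String name) {
        FakeModule module = new FakeModule(name, true);
        modules.put(name, module);
        return module;
    }

    // Models the "already loaded?" check at the top of the dependency loop
    FakeModule resolveDependency(String neededLibrary) {
        FakeModule loaded = modules.get(neededLibrary);
        if (loaded != null) {
            loaded.referenceCount++;
            return loaded;
        }
        // Assumption of this model only: treat a miss as a hard failure
        throw new IllegalStateException(neededLibrary + " is not loaded");
    }

    public static void main(String[] args) {
        VirtualModuleDemo loader = new VirtualModuleDemo();
        loader.loadVirtualModule("libjnigraphics.so");

        FakeModule dep = loader.resolveDependency("libjnigraphics.so");
        System.out.println(dep.name + " virtual=" + dep.virtual + " refs=" + dep.referenceCount);
    }
}
```

Because the lookup is by name only, the order matters: the virtual module has to be registered before the target So is loaded.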

initFunctions call

That covers the parts of loading that were left over from the previous article. Now let's analyze the rest of the process, continuing with a method we met last time

protected final LinuxModule loadInternal(LibraryFile libraryFile, boolean forceCallInit) {
    // libraryFile wraps the So file to be loaded
    try {
        // Call the single-argument loadInternal overload to do the actual loading
        LinuxModule module = loadInternal(libraryFile);
        // Resolve symbols (relocation)
        resolveSymbols(!forceCallInit);
        // callInitFunction defaults to true
        if (callInitFunction || forceCallInit) {
            // Call initialization function
            for (LinuxModule m : modules.values().toArray(new LinuxModule[0])) {
                boolean forceCall = (forceCallInit && m == module) || m.isForceCallInit();
                if (callInitFunction) {
                    m.callInitFunction(emulator, forceCall);
                } else if (forceCall) {
                    m.callInitFunction(emulator, true);
                }
                m.initFunctionList.clear();
            }
        }
        // Add reference count
        module.addReferenceCount();
        return module;
    } catch (IOException e) {
        throw new IllegalStateException(e);
    }
}

So far we have analyzed only one statement of this method, and that alone filled the whole previous article. Let's continue from there

resolveSymbols(!forceCallInit);

We won't go into this method in detail. What it does is process the relocations that are still pending, so that the remaining symbol references are finally resolved

Remember that while the So was being loaded, the init functions were only added to a list stored in the LinuxModule object and were not executed yet? Let's look at the following code

// First check callInitFunction (true by default) || forceCallInit (passed in as a parameter)
if (callInitFunction || forceCallInit) {
    // Traverse all modules
    for (LinuxModule m : modules.values().toArray(new LinuxModule[0])) {
        // forceCall is true in two cases
        // 1. The module is the one we are loading and the forceCallInit parameter is true
        // 2. The module itself has its forceCallInit flag set (isForceCallInit() returns true)
        boolean forceCall = (forceCallInit && m == module) || m.isForceCallInit();

        // Call initialization function
        if (callInitFunction) {
            m.callInitFunction(emulator, forceCall);
        } else if (forceCall) {
            m.callInitFunction(emulator, true);
        }

        // Remove all initialization functions under the module
        m.initFunctionList.clear();
    }
}

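From the analysis above we can see that, by default, the forceCallInit parameter does not decide whether initialization is attempted. The callInitFunction flag is true by default, so m.callInitFunction(...) is invoked on every module anyway, and forceCallInit only determines whether the module is initialized even with unresolved symbols. For forceCallInit to become the deciding switch, default initialization has to be disabled first by calling

memory.disableCallInitFunction();

Next, let's continue with the callInitFunction method of the module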

void callInitFunction(Emulator<?> emulator, boolean mustCallInit) throws IOException {
    // If initialization is not forced and there are still unresolved symbols, log them and skip initialization
    if (!mustCallInit && !unresolvedSymbol.isEmpty()) {
        for (ModuleSymbol moduleSymbol : unresolvedSymbol) {
            log.info("[" + name + "]" + moduleSymbol.getSymbol().getName() + " symbol is missing before init relocationAddr=" + moduleSymbol.getRelocationAddr());
        }
        return;
    }
    // Otherwise run the init functions one by one, in order. initFunctionList is the list of init functions we analyzed in the previous article
    while (!initFunctionList.isEmpty()) {
        InitFunction initFunction = initFunctionList.remove(0);
        initFunction.call(emulator);
    }
}
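
The interaction between the callInitFunction flag, the forceCallInit parameter and unresolved symbols is easier to see with all four flag combinations side by side. The following is a simplified model, not Unidbg code: InitCallDemo, FakeModule and scenario are invented names, init functions are plain strings, and the two sample modules (a dependency without unresolved symbols and a target with unresolved symbols) are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the init-function decision logic (not Unidbg code).
final class InitCallDemo {
    // Hypothetical stand-in for LinuxModule
    static final class FakeModule {
        final String name;
        final boolean forceCallInit;
        final boolean hasUnresolvedSymbols;
        final List<String> initFunctionList = new ArrayList<>();

        FakeModule(String name, boolean forceCallInit, boolean hasUnresolvedSymbols, String... inits) {
            this.name = name;
            this.forceCallInit = forceCallInit;
            this.hasUnresolvedSymbols = hasUnresolvedSymbols;
            initFunctionList.addAll(Arrays.asList(inits));
        }

        // Models callInitFunction: skip if not forced and symbols are missing
        void callInitFunction(boolean mustCallInit, List<String> ran) {
            if (!mustCallInit && hasUnresolvedSymbols) {
                return;
            }
            while (!initFunctionList.isEmpty()) {
                ran.add(name + ":" + initFunctionList.remove(0));
            }
        }
    }

    // Models the loop in loadInternal and returns the init functions that ran
    static List<String> runInit(List<FakeModule> modules, FakeModule target,
                                boolean callInitFunction, boolean forceCallInit) {
        List<String> ran = new ArrayList<>();
        if (callInitFunction || forceCallInit) {
            for (FakeModule m : modules) {
                boolean forceCall = (forceCallInit && m == target) || m.forceCallInit;
                if (callInitFunction) {
                    m.callInitFunction(forceCall, ran);
                } else if (forceCall) {
                    m.callInitFunction(true, ran);
                }
                m.initFunctionList.clear();
            }
        }
        return ran;
    }

    // Fresh sample modules per run, because the init lists are consumed
    static List<String> scenario(boolean callInitFunction, boolean forceCallInit) {
        FakeModule dep = new FakeModule("dep", false, false, "init_a");
        FakeModule target = new FakeModule("target", false, true, "init_b");
        return runInit(Arrays.asList(dep, target), target, callInitFunction, forceCallInit);
    }

    public static void main(String[] args) {
        System.out.println(scenario(true, false));  // [dep:init_a]
        System.out.println(scenario(true, true));   // [dep:init_a, target:init_b]
        System.out.println(scenario(false, true));  // [target:init_b]
        System.out.println(scenario(false, false)); // []
    }
}
```

In this model, the target's init function is skipped in the first scenario only because of its unresolved symbols. Note also that the list is cleared in every case, so an init function that was skipped is not retried on a later load.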

Summary

That's all for this analysis. We have now covered the module loading part of Unidbg. In fact, the 'Linker' in Unidbg is much easier to analyze once you have studied the Linker part of the Android source code. I believe that after reading these two articles, you will be able to handle most of the problems that come up when Unidbg loads a So. As usual, V: roysue

Posted by MatthewBJones on Fri, 29 Oct 2021 03:13:10 -0700