What does the Runtime do during App startup? (objc4-779.1)

Keywords: Swift iOS Attribute OS X

Knowledge preparation

First, consider a scenario: we tap an app icon on the screen, and a moment later the app is fully displayed and ready to use. During this process the system, the runtime, and our own code do a lot of work. Many excellent blog posts describe it in detail, for example "In depth understanding of the startup process of iOS App" by "for yourself and for the future". Most of that article's content comes from one of Apple's official WWDC 2016 videos. If you are not familiar with the App startup process, it is worth going through both.

What did Runtime do during startup?

First, we need to know the steps involved in the startup process

  • 1. dyld loads the executable
  • 2. Load dynamic libraries (recursively, until all dependencies are loaded)
  • 3. Rebase (fix up pointers to addresses inside the Mach-O image, which are shifted by ASLR)
  • 4. Bind (resolve pointers to symbols outside the Mach-O image, for example in other dynamic libraries)
  • 5. ObjC setup (class registration, category handling, etc.)
  • 6. Initializers (call +load methods, C/C++ static initializers, and functions marked __attribute__((constructor)))
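
To make step 6 concrete, here is a minimal sketch of the two non-ObjC kinds of initializer. It assumes a GCC or Clang toolchain (`__attribute__((constructor))` is a compiler extension), and `startupLog`, `markedAsConstructor` and `StaticInit` are hypothetical names used only for illustration. Both initializers have already run by the time main() starts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical log used only for this illustration.
std::vector<std::string> &startupLog() {
    // A function-local static is constructed on first use, so it is safe
    // to call this from code that runs before main().
    static std::vector<std::string> log;
    return log;
}

// GCC/Clang extension: run this function while the image is initialized.
__attribute__((constructor))
static void markedAsConstructor() {
    startupLog().push_back("attribute-constructor");
}

// A C++ static initializer: the constructor of a global object.
struct StaticInit {
    StaticInit() { startupLog().push_back("static-initializer"); }
};
static StaticInit gStaticInit;

// Number of initializers that have run so far.
std::size_t initializersRunSoFar() {
    return startupLog().size();
}

// Intended to be called from main(): both initializers ran before it.
void checkInitializersRan() {
    assert(initializersRunSoFar() == 2);
}
```

The sketch deliberately does not rely on the relative order of the two initializers, only on the fact that both run before main().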

This post analyzes step 5 in detail.

Registering for dyld notifications (objc-os.mm, line 913)

void _objc_init(void)
{
    // ... various initialization steps omitted

    // Register the callbacks for mapping, loading (initializing) and unmapping images. dyld creates an ImageLoader for each image during App startup, so the runtime clearly cooperates with dyld here and does its own part of the image loading work
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

From this we can conclude that, apart from the necessary environment initialization, the main work the Runtime does during App startup is map_images and load_images
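
The registration handshake can be sketched as a plain callback registry. This is a simplified model, not dyld's API: `ImageNotifier` and `demoNotifier` are hypothetical, and the real dyld reports mapped images in batches rather than one by one.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the loader side of the handshake.
class ImageNotifier {
public:
    using Callback = std::function<void(const std::string &image)>;

    // Similar in spirit to _dyld_objc_notify_register: called exactly once.
    void registerCallbacks(Callback mapped, Callback init, Callback unmapped) {
        assert(!registered_ && "callbacks may only be registered once");
        mapped_ = std::move(mapped);
        init_ = std::move(init);
        unmapped_ = std::move(unmapped);
        registered_ = true;
    }

    // The loader maps an image first, then runs its initializers.
    void loadImage(const std::string &image) {
        assert(registered_);
        mapped_(image);
        init_(image);
    }

    void unloadImage(const std::string &image) {
        assert(registered_);
        unmapped_(image);
    }

private:
    Callback mapped_, init_, unmapped_;
    bool registered_ = false;
};

// Registers three logging callbacks and replays one load/unload cycle.
std::vector<std::string> demoNotifier() {
    std::vector<std::string> events;
    ImageNotifier notifier;
    notifier.registerCallbacks(
        [&](const std::string &img) { events.push_back("map_images:" + img); },
        [&](const std::string &img) { events.push_back("load_images:" + img); },
        [&](const std::string &img) { events.push_back("unmap_image:" + img); });
    notifier.loadImage("libFoo.dylib");
    notifier.unloadImage("libFoo.dylib");
    return events;
}
```

The point of the design is that the loader does not need to know anything about Objective-C; it only promises to call back at the right moments.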

map_images: handling the images mapped by dyld (objc-runtime-new.mm, line 2938)

void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

// This function is relatively long (about 150 lines), so only the key code is kept
void 
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    // ... omitted: hList, hCount, totalClasses and unoptimizedTotalClasses
    // are computed here from the mhCount headers passed in by dyld

    // If at least one image was collected into hList
    if (hCount > 0) {
        // Here we reach the core logic
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }
}
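
The split between map_images (takes the lock) and map_images_nolock (does the work) is a common pattern. A minimal sketch of it, assuming standard C++ primitives instead of the runtime's own lock types; `Registry`, `addImages` and `addImagesNoLock` are hypothetical names:

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical runtime state guarded by a single lock.
struct Registry {
    std::mutex lock;
    bool lockHeld = false;     // debug flag so helpers can check the lock
    std::vector<int> images;
};

// Does the real work; the caller must already hold registry.lock.
int addImagesNoLock(Registry &registry, const std::vector<int> &newImages) {
    assert(registry.lockHeld);         // plays the role of assertLocked()
    if (newImages.empty()) return 0;   // nothing to do
    registry.images.insert(registry.images.end(),
                           newImages.begin(), newImages.end());
    return static_cast<int>(newImages.size());
}

// Public entry point: hold the lock for the whole call, then delegate.
int addImages(Registry &registry, const std::vector<int> &newImages) {
    std::lock_guard<std::mutex> guard(registry.lock);  // released on return
    registry.lockHeld = true;
    int added = addImagesNoLock(registry, newImages);
    registry.lockHeld = false;
    return added;
}
```

Keeping the locked wrapper tiny makes it easy to see that the lock is always released, while the _nolock function can be reused by callers that already hold the lock.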

Core logic 1: _read_images (objc-runtime-new.mm, line 3244)

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
	// image header information
    header_info *hi;
    uint32_t hIndex;
    size_t count;
    size_t i;
    Class *resolvedFutureClasses = nil;
    size_t resolvedFutureClassCount = 0;
    static bool doneOnce;
    bool launchTime = NO;
    TimeLogger ts(PrintImageTimes);

    runtimeLock.assertLocked();

// Define the EACH_HEADER macro so that a for loop can easily visit every header_info
#define EACH_HEADER \
    hIndex = 0;         \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++

    if (!doneOnce) {
        doneOnce = YES;
        launchTime = YES;

#if SUPPORT_NONPOINTER_ISA
        // Disable non-pointer isa under some conditions.

# if SUPPORT_INDEXED_ISA
        // Disable nonpointer isa if any image contains old Swift code
        // Handles code built with Swift versions older than 3.0
        for (EACH_HEADER) {
            if (hi->info()->containsSwift()  &&
                hi->info()->swiftUnstableVersion() < objc_image_info::SwiftVersion3)
            {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app or a framework contains Swift code "
                                 "older than Swift 3.0");
                }
                break;
            }
        }
# endif

// This part only applies to macOS (OS X), so it can be skipped when reading for iOS
# if TARGET_OS_OSX
        // Disable non-pointer isa if the app is too old
        // (linked before OS X 10.11)
        
        if (dyld_get_program_sdk_version() < DYLD_MACOSX_VERSION_10_11) {
            DisableNonpointerIsa = true;
            if (PrintRawIsa) {
                _objc_inform("RAW ISA: disabling non-pointer isa because "
                             "the app is too old (SDK version " SDK_FORMAT ")",
                             FORMAT_SDK(dyld_get_program_sdk_version()));
            }
        }

        // Disable non-pointer isa if the app has a __DATA,__objc_rawisa section
        // New apps that load old extensions may need this.
        for (EACH_HEADER) {
            if (hi->mhdr()->filetype != MH_EXECUTE) continue;
            unsigned long size;
            if (getsectiondata(hi->mhdr(), "__DATA", "__objc_rawisa", &size)) {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app has a __DATA,__objc_rawisa section");
                }
            }
            break;  // assume only one MH_EXECUTE image
        }
# endif

#endif

        if (DisableTaggedPointers) {
            disableTaggedPointers();
        }
        // Initialize the tagged pointer obfuscator, which randomizes the encoding of tagged pointers per process (a hardening measure in the same spirit as ASLR)
        initializeTaggedPointerObfuscator();

		// Print loading class quantity information
        if (PrintConnecting) {
            _objc_inform("CLASS: found %d classes during launch", totalClasses);
        }
		
        // Prepare the hash map used for class registration. NXMapTable is kept at most 3/4 full, so the requested capacity is the class count multiplied by 4/3. That way the classes found at launch fit without the table having to grow (growth usually doubles the capacity, as in the hash table used for associated objects)
        // namedClasses
        // Preoptimized classes don't go in this table.
        // 4/3 is NXMapTable's load factor
        int namedClassesSize = 
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);

        ts.log("IMAGE TIMES: first time tasks");
    }

    // Register every SEL in the selector hash table, which is a different table from the class table prepared in the previous step
    // Fix up @selector references
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                // Register SEL operations
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }

    ts.log("IMAGE TIMES: fix up selector references");

    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();


    // Traverse the class list of every image and read each class with readClass(), which registers it. Classes are not realized at this point
    for (EACH_HEADER) {
        if (! mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }
		
        // Read the pointer to the class list from the header information
        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

    ts.log("IMAGE TIMES: discover classes");

    // Fix up remapped classes
    // Class list and nonlazy class list remain unmapped.
    // Class refs and super refs are remapped for message dispatching.
    if (!noClassesRemapped()) {
        for (EACH_HEADER) {
            Class *classrefs = _getObjc2ClassRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
            // fixme why doesn't test future1 catch the absence of this?
            classrefs = _getObjc2SuperRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
        }
    }

    ts.log("IMAGE TIMES: remap classes");

// Repair legacy objc_msgSend_fixup call sites
#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) {
        message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
        if (count == 0) continue;

        if (PrintVtables) {
            _objc_inform("VTABLES: repairing %zu unsupported vtable dispatch "
                         "call sites in %s", count, hi->fname());
        }
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }

    ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
#endif

    bool cacheSupportsProtocolRoots = sharedCacheSupportsProtocolRoots();
	
    // Traverse every protocol list and add the protocols to the protocol hash table
    // Discover protocols. Fix up protocol refs.
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        ASSERT(cls);
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->hasPreoptimizedProtocols();

        // At launch, skip images whose protocols are already preoptimized in the dyld shared cache. The shared cache holds the system dynamic libraries in prelinked form, which speeds up image loading and therefore App startup
        if (launchTime && isPreoptimized && cacheSupportsProtocolRoots) {
            if (PrintProtocols) {
                _objc_inform("PROTOCOLS: Skipping reading protocols in image: %s",
                             hi->fname());
            }
            continue;
        }

        bool isBundle = hi->isBundle();

        // Read the protocols emitted by the compiler and initialize them
        protocol_t * const *protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map, 
                         isPreoptimized, isBundle);
        }
    }

    ts.log("IMAGE TIMES: discover protocols");

    // Fix up @protocol references
    // Preoptimized images may have the right 
    // answer already but we don't know for sure.
    // In other words: repair the @protocol references. Preoptimized images are probably already correct, but that is not guaranteed
    for (EACH_HEADER) {
        // At launch time, we know preoptimized image refs are pointing at the
        // shared cache definition of a protocol.  We can skip the check on
        // launch, but have to visit @protocol refs for shared cache images
        // loaded later.
        if (launchTime && cacheSupportsProtocolRoots && hi->isPreoptimized())
            continue;
        protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapProtocolRef(&protolist[i]);
        }
    }

    ts.log("IMAGE TIMES: fix up @protocol references");


    // Traverse the category lists of every image and process them
    // Discover categories.
    for (EACH_HEADER) { // For each image, process its category lists
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        // Lambda that processes one category list of the current image
        auto processCatlist = [&](category_t * const *catlist) {
            // Traverse every category in the list
            for (i = 0; i < count; i++) {
                category_t *cat = catlist[i];
                Class cls = remapClass(cat->cls);
                locstamped_category_t lc{cat, hi};
                // No corresponding class found
                if (!cls) {
                    // Category's target class is missing (probably weak-linked).
                    // Ignore the category.
                    if (PrintConnecting) {
                        _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                     "missing weak-linked target class",
                                     cat->name, cat);
                    }
                    continue;
                }
                
                // Process this category.
                if (cls->isStubClass()) {
                    // Stub classes are never realized.
                    // Stub classes do not know their metaclass until they are initialized,
                    // so we have to add categories with class methods or properties to the stub itself.
                    // methodizeClass() will find them and add them to the metaclass as appropriate.
                    if (cat->instanceMethods ||
                        cat->protocols ||
                        cat->instanceProperties ||
                        cat->classMethods ||
                        cat->protocols ||
                        (hasClassProperties && cat->_classProperties))
                    {
                        objc::unattachedCategories.addForClass(lc, cls);
                    }
                } else {
                   // If the target class is already realized, attach the category's contents to it right away;
                   // otherwise record the category so that it is attached when the class is realized
                    if (cat->instanceMethods ||  cat->protocols
                        ||  cat->instanceProperties)
                    {
                        if (cls->isRealized()) {
                            attachCategories(cls, &lc, 1, ATTACH_EXISTING);
                        } else {
                            objc::unattachedCategories.addForClass(lc, cls);
                        }
                    }
                    
                    // Same logic as above, but for the metaclass (the block above handles the ordinary class),
                    // so the category's class methods and class properties are added to the metaclass
                    if (cat->classMethods  ||  cat->protocols
                        ||  (hasClassProperties && cat->_classProperties))
                    {
                        if (cls->ISA()->isRealized()) {
                            attachCategories(cls->ISA(), &lc, 1, ATTACH_EXISTING | ATTACH_METACLASS);
                        } else {
                            objc::unattachedCategories.addForClass(lc, cls->ISA());
                        }
                    }
                }
            }
        };
        processCatlist(_getObjc2CategoryList(hi, &count));
        processCatlist(_getObjc2CategoryList2(hi, &count));
    }

    ts.log("IMAGE TIMES: discover categories");

    // Category discovery MUST BE Late to avoid potential races
    // when other threads call the new category code before
    // this thread finishes its fixups.

    // +load handled by prepare_load_methods()

    // Realize all non-lazy classes (classes with +load methods or static instances)
    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) {
        classref_t const *classlist = 
            _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;
			
			// Register class information into hash table
            addClassTableEntry(cls);
			
			// Handling swift classes
            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }

    ts.log("IMAGE TIMES: realize non-lazy classes");

    // Realize newly-resolved future classes, in case CF manipulates them
    if (resolvedFutureClasses) {
        for (i = 0; i < resolvedFutureClassCount; i++) {
            Class cls = resolvedFutureClasses[i];
            if (cls->isSwiftStable()) {
                _objc_fatal("Swift class is not allowed to be future");
            }
            realizeClassWithoutSwift(cls, nil);
            cls->setInstancesRequireRawIsaRecursively(false/*inherited*/);
        }
        free(resolvedFutureClasses);
    }

    ts.log("IMAGE TIMES: realize future classes");

    if (DebugNonFragileIvars) {
        realizeAllClasses();
    }

	// Print pre optimization information
    // Print preoptimization statistics
    if (PrintPreopt) {
        static unsigned int PreoptTotalMethodLists;
        static unsigned int PreoptOptimizedMethodLists;
        static unsigned int PreoptTotalClasses;
        static unsigned int PreoptOptimizedClasses;

        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) {
                _objc_inform("PREOPTIMIZATION: honoring preoptimized selectors "
                             "in %s", hi->fname());
            }
            else if (hi->info()->optimizedByDyld()) {
                _objc_inform("PREOPTIMIZATION: IGNORING preoptimized selectors "
                             "in %s", hi->fname());
            }

            classref_t const *classlist = _getObjc2ClassList(hi, &count);
            for (i = 0; i < count; i++) {
                Class cls = remapClass(classlist[i]);
                if (!cls) continue;

                PreoptTotalClasses++;
                if (hi->hasPreoptimizedClasses()) {
                    PreoptOptimizedClasses++;
                }
                
                const method_list_t *mlist;
                if ((mlist = ((class_ro_t *)cls->data())->baseMethods())) {
                    PreoptTotalMethodLists++;
                    if (mlist->isFixedUp()) {
                        PreoptOptimizedMethodLists++;
                    }
                }
                if ((mlist=((class_ro_t *)cls->ISA()->data())->baseMethods())) {
                    PreoptTotalMethodLists++;
                    if (mlist->isFixedUp()) {
                        PreoptOptimizedMethodLists++;
                    }
                }
            }
        }

        _objc_inform("PREOPTIMIZATION: %zu selector references not "
                     "pre-optimized", UnfixedSelectors);
        _objc_inform("PREOPTIMIZATION: %u/%u (%.3g%%) method lists pre-sorted",
                     PreoptOptimizedMethodLists, PreoptTotalMethodLists, 
                     PreoptTotalMethodLists
                     ? 100.0*PreoptOptimizedMethodLists/PreoptTotalMethodLists 
                     : 0.0);
        _objc_inform("PREOPTIMIZATION: %u/%u (%.3g%%) classes pre-registered",
                     PreoptOptimizedClasses, PreoptTotalClasses, 
                     PreoptTotalClasses 
                     ? 100.0*PreoptOptimizedClasses/PreoptTotalClasses
                     : 0.0);
        _objc_inform("PREOPTIMIZATION: %zu protocol references not "
                     "pre-optimized", UnfixedProtocolReferences);
    }

#undef EACH_HEADER
}
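
The 4/3 sizing of the class table is plain integer arithmetic. A small sketch under stated assumptions: `requestedCapacity` mirrors the expression in _read_images, while `roundUpToPowerOfTwo` is an assumption about how the table picks its real bucket count, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Capacity requested for `classCount` entries: the same integer expression
// shape as `totalClasses * 4 / 3` in _read_images.
std::size_t requestedCapacity(std::size_t classCount) {
    return classCount * 4 / 3;
}

// ASSUMPTION: the table rounds the requested capacity up to a power of two.
std::size_t roundUpToPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// True when `entries` occupy at most 3/4 of `capacity` (no floating point).
bool fitsWithoutGrowing(std::size_t entries, std::size_t capacity) {
    assert(capacity > 0);
    return entries * 4 <= capacity * 3;
}
```

For 300 classes the request is 400 slots, which is exactly 3/4 full. For 100 classes integer division gives 133, which on its own would be slightly over 3/4 full; under the rounding assumption the table gets 256 buckets and has plenty of headroom.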

A summary of _read_images:

  • 1. Initialize the tagged pointer obfuscator
  • 2. Initialize the hash map used for class registration
  • 3. Register all SELs in the selector hash table
  • 4. Discover all classes and read them with readClass()
  • 5. Remap all class references
  • 6. Traverse all protocol lists, initialize the protocols and add them to the protocol hash table
  • 7. Process all categories and attach them to the corresponding class (both the ordinary class and the metaclass)
  • 8. Realize all non-lazy classes (classes with +load methods or static instances)
  • 9. Print preoptimization statistics
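
Step 3 of the summary, selector registration, is essentially string interning: equal names end up as one canonical pointer, so later comparisons are pointer comparisons. A minimal sketch of that idea, not the runtime's implementation; `SelectorTable` and `fixupSelectorRefs` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical selector table: one canonical string per selector name.
class SelectorTable {
public:
    // Returns the canonical pointer for `name`, inserting it on first use.
    // Pointers to std::unordered_set elements stay valid across rehashing.
    const std::string *registerName(const std::string &name) {
        assert(!name.empty());
        return &*names_.insert(name).first;
    }

private:
    std::unordered_set<std::string> names_;
};

// Rewrites each reference to the canonical pointer, like the loop over
// sels[i] in _read_images. Returns how many references were changed.
int fixupSelectorRefs(SelectorTable &table, const std::string **refs,
                      int count) {
    int fixed = 0;
    for (int i = 0; i < count; i++) {
        const std::string *sel = table.registerName(*refs[i]);
        if (refs[i] != sel) {
            refs[i] = sel;
            fixed++;
        }
    }
    return fixed;
}
```

After the fix-up, two references to the same name hold the same pointer, and running the fix-up a second time changes nothing.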

From the source analysis above, we can see that the main task of _read_images is preparatory work: it registers selectors, classes, protocols and categories up front, which makes the subsequent load_images step, and the program as a whole, more convenient and efficient.
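
The attach-or-defer decision that _read_images makes for categories can be modeled in a few lines. This is a toy model with hypothetical types (`ClassModel`, `CategoryModel`, `RuntimeModel`); it ignores metaclasses, protocols and properties and keeps only method names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct ClassModel {
    bool realized = false;
    std::vector<std::string> methods;
};

struct CategoryModel {
    std::string targetClass;
    std::vector<std::string> methods;
};

class RuntimeModel {
public:
    std::map<std::string, ClassModel> classes;

    // Called once per discovered category.
    void discoverCategory(const CategoryModel &cat) {
        auto it = classes.find(cat.targetClass);
        if (it == classes.end()) return;  // missing (weak-linked) class: ignore
        if (it->second.realized) {
            attach(it->second, cat);      // class is ready: attach right away
        } else {
            unattached_[cat.targetClass].push_back(cat);  // attach later
        }
    }

    // Realizing a class also attaches the categories that were waiting for it.
    void realize(const std::string &name) {
        ClassModel &cls = classes.at(name);
        assert(!cls.realized);
        cls.realized = true;
        for (const CategoryModel &cat : unattached_[name]) attach(cls, cat);
        unattached_.erase(name);
    }

private:
    static void attach(ClassModel &cls, const CategoryModel &cat) {
        // Category methods go in front, so they are found before the
        // class's own methods of the same name.
        cls.methods.insert(cls.methods.begin(),
                           cat.methods.begin(), cat.methods.end());
    }

    std::map<std::string, std::vector<CategoryModel>> unattached_;
};
```

Deferring the work for unrealized classes is what keeps launch cheap: a category on a class that is never used never has to be merged.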

Core logic 2: load_images (objc-runtime-new.mm, line 2955)

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Find all the +load methods. These are the +load methods we often implement in our own OC classes
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Execute all the +load methods that were found
    call_load_methods();
}

Core logic 2.1: prepare_load_methods finds and records the classes and categories that have a +load method (objc-runtime-new.mm, line 3698)

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();
	
    // Get the list of non-lazy classes
    classref_t const *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        // Record every class that has a +load method in loadable_classes. schedule_class_load calls itself recursively on cls->superclass before recording cls, so superclasses are always recorded ahead of their subclasses
        schedule_class_load(remapClass(classlist[i]));
    }
    // Get the list of non-lazy categories
    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        // Record all categories with load methods in loadable_categories
        add_category_to_loadable_list(cat);
    }
}
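
The superclass-first recording done by schedule_class_load can be sketched as follows. It is a simplified model of the recursion, with a hypothetical `LoadableClass` type; the `scheduled` flag plays the role of the runtime's "already scheduled" flag on the class.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, stripped-down class record.
struct LoadableClass {
    std::string name;
    LoadableClass *superclass;
    bool hasLoad;            // does the class implement +load?
    bool scheduled = false;  // has it already been visited?

    LoadableClass(std::string n, LoadableClass *s, bool l)
        : name(std::move(n)), superclass(s), hasLoad(l) {}
};

// Appends cls and its ancestors to `loadable`, ancestors first.
void scheduleClassLoad(LoadableClass *cls,
                       std::vector<LoadableClass *> &loadable) {
    if (!cls) return;            // walked past the root class
    if (cls->scheduled) return;  // already recorded via another subclass

    // Recurse BEFORE recording cls: this is what guarantees
    // superclass-first ordering in the list.
    scheduleClassLoad(cls->superclass, loadable);
    assert(!cls->superclass || cls->superclass->scheduled);

    if (cls->hasLoad) loadable.push_back(cls);  // only classes with +load
    cls->scheduled = true;
}
```

With a chain Root <- Mid <- Leaf where only Root and Leaf implement +load, scheduling Leaf yields the list [Root, Leaf], and scheduling Root again afterwards adds nothing.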

Core logic 2.2: call_load_methods calls +load on the classes and categories recorded in core logic 2.1 (objc-loadmethod.mm, line 337)

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;
	
	// Auto release pool
    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

In core logic 2.1, schedule_class_load recurses into the superclass before recording the class itself, so classes with +load are appended to the array in the order parent class's parent -> parent class -> subclass.
When the +load methods are executed in core logic 2.2, the array is traversed from the first element to the last. Because all class +load methods are executed before any category +load methods, the overall execution order is parent class, subclass, category.
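
The resulting order can be reproduced with a small queue model of call_load_methods. It is a sketch, not the runtime's code: `LoadQueue` and `demoLoadOrder` are hypothetical, +load methods are modeled as closures, and the check that a category's class is already loadable is left out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct LoadQueue {
    std::vector<std::function<void()>> classLoads;     // superclass-first
    std::vector<std::function<void()>> categoryLoads;
    bool loading = false;

    void callLoadMethods() {
        if (loading) return;  // re-entrant call: the outermost call finishes
        loading = true;
        bool moreCategories;
        do {
            // 1. Drain class +loads, front to back, until none are left.
            while (!classLoads.empty()) {
                std::vector<std::function<void()>> batch;
                batch.swap(classLoads);  // a +load may enqueue more work
                for (auto &load : batch) load();
            }
            // 2. Run the pending category +loads once.
            std::vector<std::function<void()>> cats;
            cats.swap(categoryLoads);
            for (auto &load : cats) load();
            moreCategories = !categoryLoads.empty();
            // 3. Repeat if running +load added classes or categories.
        } while (!classLoads.empty() || moreCategories);
        loading = false;
    }
};

// Queues Parent, Child and one category, then runs everything.
std::vector<std::string> demoLoadOrder() {
    std::vector<std::string> log;
    LoadQueue q;
    q.classLoads.push_back([&] { log.push_back("Parent +load"); });
    q.classLoads.push_back([&] { log.push_back("Child +load"); });
    q.categoryLoads.push_back([&] { log.push_back("Child(Category) +load"); });
    q.callLoadMethods();
    assert(q.classLoads.empty() && q.categoryLoads.empty());
    return log;
}
```

Because the class queue is always drained before the category queue is touched, a category's +load can rely on its class's +load having already run.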

This completes the analysis of what the Runtime does during App startup

Summary

  • Hash maps are used heavily during this preparatory work, just as they are for associated objects. The data structure is popular with developers precisely because it is so efficient. In our own development we can also convert complex data relationships into a hash map or dictionary. This costs some space, but O(1) lookup time is well worth it
  • ASLR (address space layout randomization) is a security technique against buffer overflow attacks. By randomizing the layout of the heap, the stack, shared library mappings and so on, it makes it harder for an attacker to predict target addresses and to locate the attack code directly, which helps prevent overflow attacks. The technique is used in all common operating systems. It is also why dyld has to fix up pointers in the Rebase and Bind steps during loading
  • The order in which +load methods run has always been a hot topic in iOS. Now we know why the order is parent class, subclass, category. The order of +load between different categories depends on the order in which dyld loads those categories. This can be controlled in Xcode, and there is plenty of material about it online
  • The loading code contains a lot of logic for old versions and for Swift, all of it there for historical reasons. We can see that frequent version changes have a large impact on how tidy, readable and well structured the code is. Apple's developers handle version compatibility with many macro definitions and conditional compilation, which is worth learning from
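
The Rebase step mentioned in the ASLR point above comes down to one subtraction and many additions. A minimal sketch with made-up addresses; `computeSlide` and `rebase` are hypothetical helpers, not dyld functions, and the sketch assumes the image lands at or above its preferred address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// slide = where the image actually landed - where it was linked to load.
std::uint64_t computeSlide(std::uint64_t actualBase,
                           std::uint64_t preferredBase) {
    assert(actualBase >= preferredBase);  // simplification for this sketch
    return actualBase - preferredBase;
}

// Rebase: every stored pointer to an address inside the image is shifted
// by the same slide.
void rebase(std::vector<std::uint64_t> &internalPointers,
            std::uint64_t slide) {
    for (std::uint64_t &p : internalPointers) p += slide;
}
```

For example, an image linked for 0x100000000 but loaded at 0x104000000 has a slide of 0x4000000, so a stored pointer 0x100001000 becomes 0x104001000. Bind is different in kind: it looks up symbols by name in other images instead of adding a constant.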

Posted by smerny on Tue, 10 Mar 2020 04:09:32 -0700