Parsing and Implementing a Simple webpack at a Glance

Keywords: node.js Webpack JSON less npm

Previously in This Series

Understanding the basic configuration of webpack at a glance
Advanced configuration and optimization of webpack at a glance

1. Introduction

This article describes how webpack works. It first analyzes the packaging process step by step and then implements a simplified webpack that simulates that process, with the goal of understanding the packaging flow more deeply. To make it clear that the result is only an imitation of the real tool, it is named web-pack.

2. Some features of webpack

  1. The configuration file for webpack is a .js file written in Node.js syntax. It exports a configuration object using the commonjs2 convention, i.e. module.exports = {}. To obtain that object, webpack locates the configuration file and loads it with require().
  2. Every resource in a webpack project can be pulled in with require: an image, a css file, a scss file, and so on.
  3. A loader in webpack is a function that transforms source code, so it takes the source code as its parameter. Examples are converting ES6 to ES5, converting less to css, and converting css to js so that it can be injected into an html page. A plugin is a class with an apply() method, which is called to install the plugin; inside apply() the plugin can listen for events emitted by the compiler and do its work at the right moment.

3. Analysis of the Packaging Principle of webpack

Webpack simulates the require statement of Node.js with a custom function, __webpack_require__, that can run in both Node.js and the browser. Every require call in the source code is replaced with __webpack_require__. Starting from the entry file, webpack finds the entry file's dependencies and walks through them, and the path and corresponding source code of the entry file and of each dependency are stored as key-value pairs in a modules object. When the bundle runs, __webpack_require__ is first called with the id of the entry file; it takes the matching source code from the modules object and executes it. Because the require calls in that source code have been replaced, each __webpack_require__ call encountered along the way again takes the corresponding source code from the modules object and executes it. This packages all modules into one file while guaranteeing that the source code executes in the original order.

4. webpack Packaging Process Analysis

webpack startup file:

webpack first locates the webpack.config.js configuration file in the project and obtains the whole config object with require(configPath). It then creates the webpack compiler object, passing the config object as a parameter to the constructor of the Compiler class. Once the compiler has been created, its run() method is called to start the compilation.

Compiler constructor:

What the compiler constructor does: the config object passed in when the compiler is created is saved, and then two particularly important pieces of data are set up:
One is the id of the entry file, i.e. the path of the entry file relative to the root directory. The output file of a webpack build is an anonymous self-executing function that starts from the entry file by calling __webpack_require__(entryId), so webpack needs to know the path of the entry file.
The other is the modules object. Its property names are the paths, relative to the root directory, of the entry file and of all the files it depends on, because when a module calls __webpack_require__(relative path of some module), webpack uses that relative path to take the corresponding source code from the modules object and execute it. Its property values are functions whose body runs the source code of the current module through eval.

In summary, the modules object stores the mapping between the paths and the source code of the entry file and the modules it depends on. When the output file bundle.js runs, the anonymous self-executing function calls __webpack_require__(entryId), which finds the source code of the entry file in the modules object and executes it. Whenever a dependency is encountered while the entry file runs, __webpack_require__(dependId) is called, the source code for dependId is taken from the modules object and executed, and so on until all dependencies have been executed.

Another very important job of the compiler constructor is installing plugins. It iterates over the plugins array from the configuration file and calls the apply() method of each plugin object, passing in the compiler object. Through that compiler object the plugin can listen for events emitted by the compiler and so choose to do its work at a specific moment.

Compiler run:

The run() method of the compiler mainly does two things: buildModule and emitFile. buildModule receives the absolute path of the entry file, obtains the source code of the entry file from that path, and then parses the source code.
Obtaining the source code takes two steps: first the content is read directly from the file, then the file path is matched against the configured loader rules. If a rule matches, the content is handed to the corresponding loader functions for processing, and the final processed source code is returned once the loaders have finished.
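
The loader step can be sketched without touching the file system. In the example below the two loaders (upperCaseLoader, wrapInExportLoader) and the applyLoaders helper are hypothetical, and the rules hold the loader functions directly instead of module paths. It uses reduceRight rather than the recursive call shown later in this article, but the effect is the same: the chain runs from the last loader to the first.

```javascript
// Hypothetical stand-in loaders: each one is a function from source text to source text.
function upperCaseLoader(source) {
  return source.toUpperCase();
}
function wrapInExportLoader(source) {
  // Turns arbitrary text into a CommonJS module that exports the text
  return `module.exports = ${JSON.stringify(source)};`;
}

// Rules in the same shape the configuration uses, except that `use`
// holds the functions themselves instead of module paths.
const rules = [{ test: /\.txt$/, use: [wrapInExportLoader, upperCaseLoader] }];

function applyLoaders(modulePath, content, rules) {
  for (const { test, use } of rules) {
    if (test.test(modulePath)) {
      // Run the chain from the last loader to the first
      content = use.reduceRight((code, loader) => loader(code), content);
    }
  }
  return content;
}

const processed = applyLoaders("./src/hello.txt", "hello", rules);
const untouched = applyLoaders("./src/hello.js", "hello", rules);
console.log(processed); // module.exports = "HELLO";
console.log(untouched); // hello
```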
The main purpose of parsing is to convert the loader-processed source code into an abstract syntax tree (AST) and traverse it. Each require call found in the source code is replaced with webpack's own require function, __webpack_require__, and the path passed to require() is replaced with the path relative to the root directory. The source code is then regenerated from the modified AST. All dependencies of the module are collected during the traversal, and when parsing finishes the rewritten source code and the collected dependencies are returned. If there are dependencies, they are traversed in turn so that each dependent module also goes through buildModule(), until every dependency of the entry file has been built.
Once the entry file and the modules it depends on have been built, the output file can be emitted with emitFile. The output template is read first and then rendered with the entryId and the modules object. Rendering mainly means traversing the modules object to generate the argument object of the anonymous self-executing function, and filling in the id of the entry file for the __webpack_require__(entryId) call that the function makes when it runs.
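
The rendering step can be illustrated with plain string building. The sketch below is not the template used later in this article: renderBundle is a hypothetical helper, it does not use ejs, and it embeds each module's source with JSON.stringify instead of a template literal, which sidesteps backticks and backslash escapes inside the module source. The module sources are assumed to have already had require rewritten to __webpack_require__.

```javascript
const vm = require("vm");

// Hypothetical stand-in for the output template: builds the bundle text by hand.
function renderBundle(entryId, modules) {
  const entries = Object.keys(modules).map(
    (id) =>
      `  ${JSON.stringify(id)}: function (module, exports, __webpack_require__) {\n` +
      `    eval(${JSON.stringify(modules[id])});\n` +
      `  }`
  );
  return [
    "(function (modules) {",
    "  var installedModules = {};",
    "  function __webpack_require__(moduleId) {",
    "    if (installedModules[moduleId]) {",
    "      return installedModules[moduleId].exports;",
    "    }",
    "    var module = (installedModules[moduleId] = { exports: {} });",
    "    modules[moduleId].call(module.exports, module, module.exports, __webpack_require__);",
    "    return module.exports;",
    "  }",
    `  return __webpack_require__(${JSON.stringify(entryId)});`,
    "})({",
    entries.join(",\n"),
    "});",
  ].join("\n");
}

// Module sources as they would look after parsing: require already rewritten
const modules = {
  "./src/index.js":
    'const a = __webpack_require__("./src/a.js"); module.exports = "index uses " + a;',
  "./src/a.js": 'module.exports = "a";',
};

const bundle = renderBundle("./src/index.js", modules);
const result = vm.runInNewContext(bundle); // run the bundle as its own script
console.log(result); // index uses a
```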

5. Implement a simple webpack

(1) Make the web-pack command executable

To make the web-pack command executable, add a bin field to its package.json. The property name is the name of the command, web-pack, and the property value is the web-pack startup file, "./bin/index.js". After web-pack is installed globally, or after the npm link command is executed, a corresponding command is created in a directory such as /usr/local/bin (the exact location depends on the npm prefix), so that the web-pack command can be used globally. For example:

// package.json

{
    "bin": {
        "web-pack": "./bin/index.js"
    }
}

(2) Let the web-pack startup file execute directly from the command line

Although the web-pack command can now be run, the file linked to the command is "./bin/index.js"; in other words, entering web-pack executes the JS file "./bin/index.js". A JS file cannot be executed directly by the terminal, so the terminal has to be told that the file should run under node. To do that, add #!/usr/bin/env node as the first line of "./bin/index.js", so that the content of the file is executed in the node environment. For example:

// ./bin/index.js

#!/usr/bin/env node

(3) Obtain configuration files, create compilers and execute them

// ./bin/index.js

#!/usr/bin/env node
const path = require("path");
const config = require(path.resolve("webpack.config.js")); // Load webpack.config.js from the project root (the current working directory)
const Compiler = require("../lib/Compiler.js"); // Import the Compiler class
const compiler = new Compiler(config); // Create the compiler object, passing in the config object
compiler.run(); // Call the compiler's run() method to start the build

Compiler constructor

As mentioned earlier, the compiler's constructor mainly saves the config object, saves the id of the entry module, saves all module dependencies (the mapping from path to source code), and installs the plugins.

// ../lib/Compiler.js

class Compiler {
    constructor(config) {
        this.config = config; // (1) Save the configuration object
        this.entryId = undefined; // (2) Id of the entry module, assigned later in buildModule()
        this.modules = {}; // (3) All module dependencies (mapping from module path to source code)
        this.entry = config.entry; // Entry path, i.e. the entry file path set in the configuration file
        this.root = process.cwd(); // Working directory in which web-pack runs, i.e. the root of the project being packaged
        // (4) Iterate over the configured plugins and install them
        const plugins = this.config.plugins; // Plugins listed in the configuration
        if (Array.isArray(plugins)) {
            plugins.forEach((plugin) => {
                plugin.apply(this); // Call the plugin's apply() method to install it
            });
        }
    }
}

Compiler run() method

The compiler's run() method mainly carries out buildModule and emitFile. buildModule has to start from the entry file, so it is given the absolute path of the entry file. If the entry file has dependencies, buildModule() is called recursively to build the dependent modules. Because the id of the entry file also has to be saved, a flag is needed to tell buildModule whether the module passed in is the entry file.

// add run() method

class Compiler {
    run() {
        this.buildModule(path.resolve(this.root, this.entry), true); // Pass in the absolute path of the entry file; the second argument is true because this is the entry module
        this.emitFile(); // After the modules are built, emit the file, i.e. write the packaging result to the output file
    }
}

Implement the buildModule() method

buildModule mainly obtains the source code, parses it to get the rewritten source code and the dependencies of the current module, saves the rewritten source code in the modules object, and then traverses the dependencies and calls buildModule again for each of them. For example:

// add buildModule() method

class Compiler {
    buildModule(modulePath, isEntry) { // Build a module
        const source = this.getSource(modulePath); // Get the source code for the module's absolute path
        const moduleName = "./" + path.relative(this.root, modulePath); // Path of the module being built, relative to the root directory
        if (isEntry) { // If this is the entry module
            this.entryId = moduleName; // Save the relative path of the entry as entryId
        }
        const {sourceCode, dependencies} = this.parse(source, path.dirname(moduleName)); // Parse the source to get the rewritten source code and the dependency array of the current module
        this.modules[moduleName] = sourceCode; // Save the rewritten source code in modules
        dependencies.forEach((dep) => { // Traverse the dependencies of the current module and build each dependent module
            this.buildModule(path.join(this.root, dep), false); // A dependency is not the entry module, so pass false; entryId is left untouched
        });
    }
}
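
One limitation is worth noting: buildModule() as written has no guard against modules that require each other, so a circular dependency would recurse without end. The sketch below shows one possible guard. It is an assumption layered on top of the article's code, not part of it, and it replaces file reading and parsing with a hypothetical in-memory project object so that it runs on its own.

```javascript
// Hypothetical in-memory project: module id -> { source, dependencies }.
// index.js and a.js require each other, which forms a cycle.
const project = {
  "./src/index.js": { source: "/* index */", dependencies: ["./src/a.js"] },
  "./src/a.js": { source: "/* a */", dependencies: ["./src/index.js", "./src/b.js"] },
  "./src/b.js": { source: "/* b */", dependencies: [] },
};

const modules = {}; // same role as this.modules in the Compiler
let entryId;

function buildModule(moduleName, isEntry) {
  if (moduleName in modules) {
    return; // already built: stops the recursion on circular dependencies
  }
  if (isEntry) {
    entryId = moduleName;
  }
  const { source, dependencies } = project[moduleName];
  modules[moduleName] = source; // register before recursing so a cycle finds it
  dependencies.forEach((dep) => buildModule(dep, false));
}

buildModule("./src/index.js", true);
console.log(entryId); // ./src/index.js
console.log(Object.keys(modules)); // [ './src/index.js', './src/a.js', './src/b.js' ]
```

The same check also avoids rebuilding a module that several other modules depend on.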

Implement getSource() method to get source content

Obtaining the source code means reading the file content and then traversing the configured rules, matching the path of the source file against the test regular expression of each rule. If a rule matches, the content is handed to the corresponding loaders. If there are several loaders, they are executed one after another through a recursive call, starting from the last loader.

// add getSource() method

const fs = require("fs");

class Compiler {
    getSource(modulePath) {
        let content = fs.readFileSync(modulePath, "utf8"); // Read the source content
        const rules = this.config.module.rules; // Rules configured in the configuration file
        for (let i = 0; i < rules.length; i++) { // Traverse the rules
            const rule = rules[i];
            const {test, use} = rule;
            let len = use.length - 1; // Index of the last loader that handles the current file
            if (test.test(modulePath)) { // Match the rule against the path of the source file; on a match, hand the content to its loaders
                function startLoader() { // Run one loader
                    // Load the loader module; a loader is a function that takes the source content as its parameter
                    const loader = require(use[len--]);
                    content = loader(content);
                    if (len >= 0) { // If more loaders remain, continue with the next one
                        startLoader(); // Recursive call: loaders run from the last one to the first
                    }
                }
                startLoader(); // Start running the loaders
            }
        }
        return content; // Return the source content after loader processing
    }
}

Parse the source code and get the dependencies of the current module

Parsing converts the source code into an abstract syntax tree (AST) and then traverses it to find the call expression nodes for require. Each callee is renamed to __webpack_require__, and the argument of require, which is a string literal node, is replaced with the path relative to the root directory. When operating on AST nodes, a plain string cannot be assigned directly; a string literal node has to be generated from the string and used as the replacement. Every require node that is found also identifies a dependency of the current module, so the dependencies are collected and returned for later traversal.

// add parse() method

const babylon = require("babylon"); // Parses source code into an AST (babylon is the former name of @babel/parser)
const traverse = require("@babel/traverse").default; // Traverses AST nodes
const types = require("@babel/types"); // Builds AST nodes of various types
const generator = require("@babel/generator").default; // Converts an AST back to source code

class Compiler {
    parse(source, parentPath) {
        const dependencies = []; // Dependencies of the current module
        const ast = babylon.parse(source); // Parse the source code into an AST
        traverse(ast, {
            CallExpression(p) { // Visit every call expression to find require calls
                const node = p.node; // The corresponding node
                if (node.callee.name === "require") { // Replace require with webpack's own require function, __webpack_require__
                    node.callee.name = "__webpack_require__";
                    let moduleName = node.arguments[0].value; // Module name passed to require
                    if (moduleName) {
                        const extname = path.extname(moduleName) ? "" : ".js";
                        moduleName = moduleName + extname; // Add the .js suffix if the required module was written without one
                        moduleName = "./" + path.join(parentPath, moduleName);
                        dependencies.push(moduleName); // Record the dependency
                        // Replace the required path with the path relative to the root directory
                        node.arguments = [types.stringLiteral(moduleName)]; // A string literal node must be generated; arguments holds the node for the required file path
                    }
                }
            }
        });
        const sourceCode = generator(ast).code; // Regenerate Source
        return {sourceCode, dependencies};
    }
}
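
The path handling inside parse() can be examined in isolation. The sketch below pulls it into a hypothetical helper, resolveDependency, and uses path.posix (a substitution for the plain path module used above) so that the results contain forward slashes on every platform. The last call suggests that this simplified scheme only suits relative requests: a bare package name is resolved inside the source directory rather than in node_modules.

```javascript
const path = require("path");

// Hypothetical helper that mirrors the path handling inside parse().
// path.posix is used so the result has forward slashes on every platform.
function resolveDependency(parentPath, request) {
  const extname = path.posix.extname(request) ? "" : ".js"; // default suffix
  return "./" + path.posix.join(parentPath, request + extname);
}

console.log(resolveDependency("./src", "./a")); // ./src/a.js
console.log(resolveDependency("./src", "./index.less")); // ./src/index.less
console.log(resolveDependency("./src/pages", "../lib/util")); // ./src/lib/util.js
console.log(resolveDependency("./src", "lodash")); // ./src/lodash.js
```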

emitFile: emit the output file

Read the output template, which is an ejs template, pass in the entryId (the id of the entry file) and the modules object (the mapping from path to source code), render the template to get the final output, and write it to the output file, bundle.js.

// template.ejs

(function(modules) { // webpackBootstrap
     // The module cache
     var installedModules = {};
     // The require function
     function __webpack_require__(moduleId) {
         // Check if module is in cache
         if(installedModules[moduleId]) {
             return installedModules[moduleId].exports;
         }
         // Create a new module (and put it into the cache)
         var module = installedModules[moduleId] = {
             i: moduleId,
             l: false,
             exports: {}
         };
         // Execute the module function
         modules[moduleId].call(module.exports, module, module.exports, __webpack_require__);
         // Flag the module as loaded
         module.l = true;
         // Return the exports of the module
         return module.exports;
     }
     // Load entry module and return exports
     return __webpack_require__(__webpack_require__.s = "<%-entryId%>");
 })
 ({
    <%for(let key in modules) {%>
        "<%-key%>":
            (function(module, exports, __webpack_require__) {
                eval(`<%-modules[key]%>`);
            }),
        <%}%>
 });

// add emitFile() method

const ejs = require("ejs");
class Compiler {
    emitFile() { // Emit the packaged output file
        // Build the output file path
        const outputFile = path.join(this.config.output.path, this.config.output.filename);
        // Read the output file template
        const templateStr = this.getSource(path.join(__dirname, "template.ejs"));
        // Render the output file template
        const code = ejs.render(templateStr, {entryId: this.entryId, modules: this.modules});
        this.assets = {};
        this.assets[outputFile] = code;
        // Write the rendered code to the output file
        fs.writeFileSync(outputFile, this.assets[outputFile]);
    }
}

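The code does not check whether the output location exists. fs.writeFileSync creates the output file itself if it is missing, but it does not create missing parent directories, so the output directory must exist before web-pack is run.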
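
One way to remove that manual step is to create the output directory just before writing. The sketch below is a suggestion rather than part of the article's Compiler: writeOutput is a hypothetical helper, and the usage part writes into a throwaway directory under the system temp folder.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write the bundle, creating the output directory first.
function writeOutput(outputFile, code) {
  // recursive: true creates missing parents and does not fail if the directory exists
  fs.mkdirSync(path.dirname(outputFile), { recursive: true });
  fs.writeFileSync(outputFile, code);
}

// Usage with a throwaway directory under the system temp folder
const tmpRoot = fs.mkdtempSync(path.join(os.tmpdir(), "web-pack-"));
const outputFile = path.join(tmpRoot, "dist", "bundle.js");
writeOutput(outputFile, "console.log('bundle');");
console.log(fs.existsSync(outputFile)); // true
```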

Write a loader

For testing purposes, write a simple loader that handles css, namely style-loader. As already described, a loader is a function that receives source code and transforms it. Here the css source code is passed to style-loader. For the css to take effect it has to be placed inside a style tag, so the loader returns JS code that creates a style tag and puts the css source code into it. For example:

// style-loader

function loader(source) {
    const style = `
        let style = document.createElement("style");
        style.innerHTML = ${JSON.stringify(source)};
        document.head.appendChild(style);
    `;
    return style;
}
module.exports = loader;
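
To see what the loader produces, its output can be run outside a browser against a stand-in for document. In the sketch below, fakeDocument is a hypothetical minimal stub that only records what gets appended, and the loader is a copy of the function above. The check at the end shows that JSON.stringify preserves quotes and line breaks in the css.

```javascript
// Copy of the style-loader shown above
function loader(source) {
  return `
    let style = document.createElement("style");
    style.innerHTML = ${JSON.stringify(source)};
    document.head.appendChild(style);
  `;
}

// Minimal hypothetical stand-in for the browser's document object
const appended = [];
const fakeDocument = {
  createElement(tag) {
    return { tag, innerHTML: "" };
  },
  head: {
    appendChild(el) {
      appended.push(el);
    },
  },
};

const css = 'body { font-family: "Arial"; }\n.title { color: red; }';
const generated = loader(css);

// Run the generated code with `document` bound to the stand-in
new Function("document", generated)(fakeDocument);
console.log(appended[0].tag); // style
console.log(appended[0].innerHTML === css); // true
```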

Write a plugin

For testing purposes, here is a simple plugin skeleton. It does no real work, but it lets the plugin mechanism run correctly. As already described, a plugin is a class with an apply() method. webpack plugins are built on the tapable module, which provides various kinds of hooks. The compiler creates hook objects and emits events during compilation by calling the call() method of a hook object, and a plugin listens for those events to do its specific work.

// plugin.js

class Plugin {
    apply(compiler) {
        compiler.hooks.emit.tap("emit", function() { // Get the emit hook from the compiler object and listen for the emit event
            console.log("received emit hook.");
        });
    }
}
module.exports = Plugin;

tapable is based on the publish-subscribe pattern. Calling tap() registers an event by storing the handler function in an array, and calling call() traverses the stored handler functions and executes them in order, which is what emitting the event means.
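
The idea fits in a few lines. MiniSyncHook below is a hypothetical illustration of the publish-subscribe mechanism just described, not tapable's actual implementation; the compiler object is likewise a bare stand-in that only carries an emit hook.

```javascript
// Hypothetical minimal hook, for illustration only; not tapable's real implementation.
class MiniSyncHook {
  constructor() {
    this.taps = []; // registered handlers, in registration order
  }
  tap(name, fn) {
    this.taps.push({ name, fn }); // subscribe
  }
  call(...args) {
    this.taps.forEach(({ fn }) => fn(...args)); // publish
  }
}

const log = [];
const compiler = { hooks: { emit: new MiniSyncHook() } };

class Plugin {
  apply(compiler) {
    compiler.hooks.emit.tap("emit", () => log.push("received emit hook."));
  }
}

new Plugin().apply(compiler); // install the plugin
compiler.hooks.emit.call(); // the compiler emits the event
console.log(log); // [ 'received emit hook.' ]
```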

6. Complete compiler source code

const fs = require("fs");
const path = require("path");
// babylon converts source code to AST syntax tree
const babylon = require("babylon");
// @babel/traverse traverses AST nodes
const traverse = require("@babel/traverse").default;
// @babel/types generates a variety of AST nodes
const types = require("@babel/types");
// @babel/generator converts the AST grammar tree back to source
const generator = require("@babel/generator").default;

const ejs = require("ejs");

const {SyncHook} = require("tapable");

class Compiler {
    constructor(config) {
        this.config = config; // Save the configuration object
        // Path of the entry file
        this.entryId = undefined; // e.g. "./src/index.js"
        // Stores all module dependencies, i.e. the entry file and everything it depends on, since every module has to be executed
        this.modules = {};
        this.entry = config.entry; // Entry path, i.e. the entry file path set in the configuration file
        this.root = process.cwd(); // Working directory in which web-pack runs, i.e. the root of the project being packaged
        this.hooks = {
            entryOption: new SyncHook(),
            compile: new SyncHook(),
            afterCompile: new SyncHook(),
            afterPlugins: new SyncHook(),
            run: new SyncHook(),
            emit: new SyncHook(),
            done: new SyncHook()
        }
        // Traverse the configured plug-ins and install them
        const plugins = this.config.plugins; // Get the plugins used
        if(Array.isArray(plugins)) {
            plugins.forEach((plugin) => {
                plugin.apply(this); // Call the apply() method of the plugin
            });
        }
        this.hooks.afterPlugins.call(); // Hook after plugin installation
    }
    // Get the source code; while doing so, hand files that match a loader rule to the corresponding loaders
    getSource(modulePath) {
        console.log("get source start.");
        // Get Source Content
        let content = fs.readFileSync(modulePath, "utf8");
        // Traverse loader
        const rules = this.config.module.rules;
        for (let i = 0; i< rules.length; i++) {
            const rule = rules[i];
            const {test, use} = rule;
            let len = use.length -1;
            if (test.test(modulePath)) { // Match the loader configuration based on the path of the source file and hand it to the matching loader for processing
                function startLoader() {
                    // Introduce loader, loader is a function, and pass the source code content as a parameter to the loader function for processing
                    const loader = require(use[len--]);
                    content = loader(content);
                    // console.log(content);
                    if (len >= 0) { // Continue with the next loader if there are multiple loaders
                        startLoader();
                    }
                }
                startLoader();
            }
        }
        return content;
    }
    // Parse source content and get its dependencies
    parse(source, parentPath) {
        console.log("parse start.");
        console.log(`before parse ${source}`);
        // 1. Parse source code content into AST abstract grammar tree
        const ast = babylon.parse(source);
        // console.log(ast);
        const dependencies = []; // Save Module Dependencies
        // (2) Traversing the AST abstract grammar tree
        traverse(ast, {
            CallExpression(p) { // Find require statement
                const node = p.node; // Corresponding Node
                if (node.callee.name === "require") { // Replace require with webpack's own require function, __webpack_require__
                    node.callee.name = "__webpack_require__";
                    let moduleName = node.arguments[0].value; // Get require's module name
                    if (moduleName) {
                        const extname = path.extname(moduleName) ? "" : ".js";
                        moduleName = moduleName + extname; // If the module introduced does not have a suffix name written, add a suffix name to it
                        moduleName = "./" + path.join(parentPath, moduleName);
                        // console.log(moduleName);
                        dependencies.push(moduleName);
                        // Replace the required path with the path relative to the root directory
                        console.log(`moduleName is ${moduleName}`);
                        console.log(`types.stringLiteral(moduleName) is ${JSON.stringify(types.stringLiteral(moduleName))}`);
                        console.log(node);
                        console.log(node.arguments);
                        node.arguments = [types.stringLiteral(moduleName)];
                    }
                }
            }
        });
        // Regenerate source code after processing AST
        const sourceCode = generator(ast).code;
        console.log(`after parse ${sourceCode}`);
        // Return processed source and entry file dependencies
        return {sourceCode, dependencies};

    }
    // Get the source code (processed by the loaders), parse it to rewrite require calls and paths, collect the module's dependencies, then traverse the dependencies and build each of them
    buildModule(modulePath, isEntry) { // Build a module and its dependencies
        console.log("buildModule start.");
        console.log(`modulePath is ${modulePath}`);
        // Get module content, that is, source code
        const source = this.getSource(modulePath);
        // Get the relative path of the module
        const moduleName = "./" + path.relative(this.root, modulePath); // By subtracting the project root path from the absolute path of the module, you get the relative path of the module relative to the root directory
        if (isEntry) {
            this.entryId = moduleName; // Save relative path of entry as entryId
        }
        // Parse the source code, rewrite the dependency paths in it, and return the dependency list
        // console.log(path.dirname(moduleName)); // Strips the file name and returns the directory, e.g. "./src"
        const {sourceCode, dependencies} = this.parse(source, path.dirname(moduleName));
        console.log("source code");
        console.log(sourceCode);
        console.log(dependencies);
        this.modules[moduleName] = sourceCode; // Save Source
        // Recursive Find Dependencies
        dependencies.forEach((dep) => {
            this.buildModule(path.join(this.root, dep), false); // e.g. ("./src/a.js", false), ("./src/index.less", false)
        });
    }
    emitFile() { // Emit the packaged output file
        console.log("emit file start.");
        // Get Output File Path
        const outputFile = path.join(this.config.output.path, this.config.output.filename);
        // Get Output File Template
        const templateStr = this.getSource(path.join(__dirname, "template.ejs"));
        // Render Output File Template
        const code = ejs.render(templateStr, {entryId: this.entryId, modules: this.modules});
        this.assets = {};
        this.assets[outputFile] = code;
        // Write rendered code to output file
        fs.writeFileSync(outputFile, this.assets[outputFile]);
    }
    run() {
        this.hooks.compile.call(); // Hook before compilation
        // Pass in the absolute path of the entry file
        this.buildModule(path.resolve(this.root, this.entry), true);
        this.hooks.afterCompile.call(); // Hook after compilation
        // console.log(this.modules, this.entryId);
        this.emitFile();
        this.hooks.emit.call(); // Hook after the output file has been emitted
        this.hooks.done.call(); // Hook after packaging is complete
    }
}
module.exports = Compiler;

Posted by Perad on Tue, 10 Sep 2019 09:05:58 -0700