How does ThinkPHP 6.0 analyze IIS logs?

Keywords: Programming IIS

Scenario: a server runs IIS, and I want to know which other websites call one of its JavaScript files (index.js) each day. First, download all the IIS log files and put them into the runtime/log folder of a TP6.0 project.

public function checkIndexJs()
    {
        $file = root_path() . DIRECTORY_SEPARATOR . 'runtime' . DIRECTORY_SEPARATOR . 'log';
        $temp = scandir($file);
        // Domains found per log file, and across all log files
        $result = [];
        $resultAll = [];
        // Walk every entry in the log folder
        foreach ($temp as $v) {
            $log = $file . DIRECTORY_SEPARATOR . $v;
            // is_file() skips '.', '..' and any sub-directories
            if (is_file($log)) {
                // Open the log file for reading; skip it if it cannot be opened
                $info = fopen($log, "r");
                if ($info === false) {
                    continue;
                }
                // Read line by line; fgets() returns false at the end of the file
                while (($itemStr = fgets($info)) !== false) {
                    // Determine whether index.js is included
                    if (strpos($itemStr, 'index.js') !== false) {
                        // Group 2 captures the host; an optional ":port" may follow it
                        preg_match('/(https?):\/\/([\w\-]+(?:\.[\w\-]+)*)(?::\d+)?/i', $itemStr, $domain);
                        if (isset($domain[2])) {
                            // Domains already recorded for this log file
                            $a = $result[$v] ?? [];
                            // Record each domain only once per file
                            if (!in_array($domain[2], $a)) {
                                $result[$v][] = $domain[2];
                            }
                            if (!in_array($domain[2], $resultAll)) {
                                $resultAll[] = $domain[2];
                            }
                        }
                    }
                }
                fclose($info);
            }
        }
        dump($result, $resultAll);
    }
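
The in_array() checks above record only whether a domain appeared. If the number of calls matters too, one option is to use the domain as an array key and count hits. The following is a minimal sketch, not part of the original method: tallyDomain() is a hypothetical helper, and the file and domain names are made up.

```php
<?php
// Hypothetical helper: counts how often each domain appears per log file.
function tallyDomain(array &$counts, string $file, string $domain): void
{
    // Host names are case-insensitive, so normalise before counting
    $domain = strtolower($domain);
    $counts[$file][$domain] = ($counts[$file][$domain] ?? 0) + 1;
}

$counts = [];
// Made-up file and domain names, for illustration only
tallyDomain($counts, 'u_ex200313.log', 'www.example.com');
tallyDomain($counts, 'u_ex200313.log', 'WWW.example.com');
tallyDomain($counts, 'u_ex200314.log', 'blog.example.org');

// Unique domains across all files: merge the per-file arrays and keep the keys
$all = array_keys(array_merge(...array_values($counts)));

print_r($counts);
print_r($all);
```

Keyed lookups also avoid rescanning the whole array for every matching line, which can help when the logs are large.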

Goal and principle: loop over every file in the log directory, read every line, check whether it contains index.js, then extract the source domain and store it in a de-duplicated array. The result is the list of domains that requested index.js in each log file (by default IIS writes one log file per day), plus the combined list across all logs.
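
One caveat with matching the first URL anywhere in the line: IIS W3C logs can also carry URLs in other columns such as cs(User-Agent), which in the usual field order comes before cs(Referer). A more targeted approach is to read the #Fields: directive at the top of each log and pick the cs(Referer) column by name. The sketch below assumes the W3C extended format with space-separated fields; referrerHostsFromLines() is a hypothetical helper, and the sample log lines are invented for illustration.

```php
<?php
// Hypothetical helper: returns the unique referrer hosts of index.js requests,
// using the column names announced by the "#Fields:" directive.
function referrerHostsFromLines(iterable $lines): array
{
    $fields = [];
    $hosts = [];
    foreach ($lines as $line) {
        $line = trim($line);
        // "#Fields:" is 8 characters long; the column names follow it
        if (strpos($line, '#Fields:') === 0) {
            $fields = explode(' ', trim(substr($line, 8)));
            continue;
        }
        // Skip blank lines, other directives, and data seen before any header
        if ($line === '' || $line[0] === '#' || $fields === []) {
            continue;
        }
        $values = explode(' ', $line);
        if (count($values) !== count($fields)) {
            continue; // malformed line
        }
        $row = array_combine($fields, $values);
        if (!isset($row['cs-uri-stem'], $row['cs(Referer)'])
            || stripos($row['cs-uri-stem'], 'index.js') === false) {
            continue;
        }
        // An empty field is logged as "-", for which parse_url() finds no host
        $host = parse_url($row['cs(Referer)'], PHP_URL_HOST);
        if (is_string($host)) {
            $hosts[strtolower($host)] = true;
        }
    }
    return array_keys($hosts);
}

// Invented sample data; a real log usually has more columns
$sample = [
    '#Fields: date time cs-method cs-uri-stem cs(User-Agent) cs(Referer) sc-status',
    '2020-03-14 09:14:53 GET /static/index.js Mozilla/5.0+(compatible;+Bot/1.0;++http://bot.example/info) https://www.example.com/page.html 200',
    '2020-03-14 09:15:02 GET /static/index.js Mozilla/5.0 - 200',
    '2020-03-14 09:15:10 GET /static/other.js Mozilla/5.0 https://blog.example.org/ 200',
];

$hostsFound = referrerHostsFromLines($sample);
print_r($hostsFound);
```

In the first sample line the user agent itself contains a URL, so a whole-line regex would pick up bot.example, while the column lookup reads only the referrer. Inside checkIndexJs() the same helper could be fed the lines of each log file, for example from file($log).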

Posted by prawn_86 on Sat, 14 Mar 2020 09:14:53 -0700