Balanced match source code analysis

Keywords: Javascript JSON

0. Basic information

0.1 Usage

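The balanced-match library has one simple goal. Given an opening delimiter a and a closing delimiter b, it finds the first balanced pair in a string and splits the string into three parts: pre (the text before the pair), body (the text between the delimiters) and post (the text after the pair).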

0.2 Version: v2.0.0

The library is stable and rarely changes.

This article studies v2.0.0, the latest version at the time of writing.

0.3 Doc

The documentation lives in the README and is fairly concise.

Link: balanced-match - npm

1. Source code analysis

1.0 Source code project structure

The whole project is very small: a single index.js

1.1 Main entry

  • index.js (reading notes: /index.js/0_structure.js)
'use strict';

function balanced(a, b, str) {}

function maybeMatch(reg, str) {}

balanced.range = range;

function range(a, b, str) {}

module.exports = balanced;

The whole project consists of three functions. Two of them are exposed: balanced is the module export, and range is attached to it as balanced.range. maybeMatch stays internal.
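One detail in the skeleton is worth a second look: balanced.range = range appears before the declaration of range. This works because function declarations are hoisted in JavaScript. The sketch below reproduces only that ordering; the function bodies and return values are placeholders, not the library's code.

```javascript
'use strict';

function balanced() {
  return 'balanced called'; // placeholder body
}

// `range` is declared further down, but function declarations are hoisted,
// so the name is already bound when this assignment runs.
balanced.range = range;

function range() {
  return 'range called'; // placeholder body
}

console.log(typeof balanced.range);
console.log(balanced.range === range);
```

Had range been defined with const or let instead, the assignment would throw a ReferenceError at this position.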

1.2 balanced

Next, let's look at the main entry point in detail.

  • index.js (reading notes: /index.js/1_balanced.js)
/**
 * @param {string | RegExp} a
 * @param {string | RegExp} b
 * @param {string} str
 */
function balanced(a, b, str) {
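  // Convert RegExp delimiters into the concrete strings they match
  if (a instanceof RegExp) a = maybeMatch(a, str);
  if (b instanceof RegExp) b = maybeMatch(b, str);

  // Find the [start, end] indices of the first balanced pair
  const r = range(a, b, str);

  // Slice the string around the pair (undefined when there is no match)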
  return (
    r && {
      start: r[0],
      end: r[1],
      pre: str.slice(0, r[0]),
      body: str.slice(r[0] + a.length, r[1]),
      post: str.slice(r[1] + b.length),
    }
  );
}

balanced accepts either strings or regular expressions for a and b. Regular expressions are first converted to concrete strings with maybeMatch, then range locates the pair, and finally the result is sliced out of the original string. If range finds nothing, r is undefined and so is the return value.
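To see how the slices line up, here is a minimal sketch that isolates the return expression. The helper name assemble is hypothetical, and the [start, end] pairs are computed by hand rather than by range.

```javascript
// Hypothetical helper mirroring the return expression of balanced();
// `r` is the [start, end] pair that range() would produce.
function assemble(a, b, str, r) {
  return {
    start: r[0],
    end: r[1],
    pre: str.slice(0, r[0]),
    body: str.slice(r[0] + a.length, r[1]),
    post: str.slice(r[1] + b.length),
  };
}

// Single-character delimiters: the outer '{' is at index 3, its '}' at index 12.
const nested = assemble('{', '}', 'pre{in{nest}}post', [3, 12]);
console.log(nested);

// Multi-character delimiters: '<b>' starts at index 1, '</b>' at index 8.
const tagged = assemble('<b>', '</b>', 'x<b>bold</b>y', [1, 8]);
console.log(tagged);
```

start and end are the indices where the delimiters begin, which is why a.length and b.length have to be added when body and post are cut out.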

1.3 maybeMatch

  • index.js (reading notes: /index.js/2_maybeMatch.js)
/**
 * @param {RegExp} reg
 * @param {string} str
 */
function maybeMatch(reg, str) {
  // m[0] is the full text of the first match
  const m = str.match(reg);
  return m ? m[0] : null;
}

There is no elaborate regular-expression handling here. The function runs the pattern against the original string and, if it matches, returns the matched text, so the rest of the code only has to deal with plain strings. If nothing matches, it returns null.
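A short sketch of that conversion, using the function exactly as listed above. The sample string and patterns are made up for illustration.

```javascript
function maybeMatch(reg, str) {
  const m = str.match(reg);
  return m ? m[0] : null;
}

const sample = 'a{{b}}c';

// The pattern matches a run of opening braces; the matched text becomes the delimiter.
const firstOpen = maybeMatch(/\{+/, sample);

// No '[' in the sample, so there is nothing to convert.
const noMatch = maybeMatch(/\[/, sample);

console.log(firstOpen); // "{{"
console.log(noMatch);   // null
```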

1.4 range

The range function is the core of the library. It finds the first balanced pair of a and b, skipping over any nested pairs, and returns the indices of the two delimiters. It is walked through in several steps below.

  • index.js (reading notes: /index.js/3_range.js)
/**
 * @param {string} a
 * @param {string} b
 * @param {string} str
 */
function range(a, b, str) {
  let begs, beg, left, right, result;
  let ai = str.indexOf(a);
  let bi = str.indexOf(b, ai + 1);
  let i = ai;

  // Proceed only if a was found and some b occurs after it
  if (ai >= 0 && bi > 0) {
    // Identical delimiters cannot nest: the first two occurrences form the pair
    if (a === b) {
      return [ai, bi];
    }
    begs = [];
    left = str.length;

The guard ensures there is at least one candidate pair: a was found and some b occurs after it. If a and b are the same string, nesting is impossible because an opening delimiter cannot be told apart from a closing one, so the first two occurrences are returned immediately.
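The initial lookups can be tried in isolation. This sketch uses only String.prototype.indexOf with identical delimiters; the sample sentence is made up for illustration.

```javascript
const str = 'say "hi" now';
const a = '"';
const b = '"';

const ai = str.indexOf(a);         // first occurrence of a
const bi = str.indexOf(b, ai + 1); // first b strictly after that a

console.log([ai, bi]);                     // [4, 7]
console.log(str.slice(ai + a.length, bi)); // "hi"

// Without the ai + 1 offset, the search for b would land on the opening quote again.
console.log(str.indexOf(b) === ai); // true
```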

    while (i >= 0 && !result) {
      // i sits on an occurrence of a: push it and look up the next a
      if (i === ai) {
        begs.push(i);
        ai = str.indexOf(a, i + 1);
      } else if (begs.length === 1) {
        // i sits on a b and only one a is open: this b closes the outermost pair
        result = [begs.pop(), bi];
      } else {
        beg = begs.pop();
        if (beg < left) {
          // An inner pair was closed: remember the leftmost closed pair as a fallback
          left = beg;
          right = bi;
        }

        bi = str.indexOf(b, i + 1);
      }

      i = ai < bi && ai >= 0 ? ai : bi;
    }

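Next comes the main loop. At each step, i points at whichever comes first, the next a or the next b. Every a encountered is pushed onto the begs stack, and every b pops the most recent a. When a b arrives while only one a is on the stack, that b closes the outermost pair and the pair becomes the result.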
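The push and pop behaviour is easier to follow with a trace. The sketch below is a simplified stack model, not the library's code: it scans character by character instead of jumping with indexOf, assumes a and b are different strings, and omits the left/right fallback. The name traceStack is hypothetical.

```javascript
// Simplified model of the loop: push every a, pop on every b,
// stop when a b empties the stack.
function traceStack(a, b, str) {
  const steps = [];
  const begs = [];
  for (let i = 0; i < str.length; i++) {
    if (str.startsWith(a, i)) {
      begs.push(i);
      steps.push(`open@${i} stack=[${begs}]`);
    } else if (str.startsWith(b, i) && begs.length > 0) {
      const beg = begs.pop();
      steps.push(`close@${i} pairs with open@${beg}`);
      if (begs.length === 0) {
        return { result: [beg, i], steps };
      }
    }
  }
  return { result: undefined, steps };
}

const trace = traceStack('{', '}', '{a{b}c}');
console.log(trace.steps.join('\n'));
console.log(trace.result); // [0, 6]
```

For '{a{b}c}' the inner pair at indices 2 and 4 is closed first, and the b at index 6 then closes the a at index 0.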

    // Some a were never closed (more a than b): fall back to the leftmost closed pair
    if (begs.length) {
      result = [left, right];
    }
  }

  return result;
}

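When a occurs more often than b, some openings are never closed and the loop ends without a result. In that case the function falls back to the left and right values recorded above, which describe the leftmost pair that was actually closed. For example, '{{a}' yields [1, 3].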
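To check the different outcomes side by side, the snippet below repeats the range function from the listing above (so that it runs standalone) and calls it on a few inputs chosen for illustration.

```javascript
// Copy of range() from the listing above, repeated so this snippet runs on its own.
function range(a, b, str) {
  let begs, beg, left, right, result;
  let ai = str.indexOf(a);
  let bi = str.indexOf(b, ai + 1);
  let i = ai;

  if (ai >= 0 && bi > 0) {
    if (a === b) {
      return [ai, bi];
    }
    begs = [];
    left = str.length;

    while (i >= 0 && !result) {
      if (i === ai) {
        begs.push(i);
        ai = str.indexOf(a, i + 1);
      } else if (begs.length === 1) {
        result = [begs.pop(), bi];
      } else {
        beg = begs.pop();
        if (beg < left) {
          left = beg;
          right = bi;
        }
        bi = str.indexOf(b, i + 1);
      }
      i = ai < bi && ai >= 0 ? ai : bi;
    }

    if (begs.length) {
      result = [left, right];
    }
  }
  return result;
}

const balancedCase = range('{', '}', '{a{b}c}');  // every a is closed
const unbalancedCase = range('{', '}', '{{a}');   // one a is never closed
const leftmostWins = range('{', '}', '{{a}{b}');  // two closed pairs, one a left open
const noClose = range('{', '}', '{a');            // no b at all

console.log(balancedCase);   // [0, 6]
console.log(unbalancedCase); // [1, 3]
console.log(leftmostWins);   // [1, 3]
console.log(noClose);        // undefined
```

In the third input both '{a}' and '{b}' are closed, and the fallback picks '{a}' because its opening index is smaller.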

Other resources

Reference links

Title | Link
balanced-match - npm | https://www.npmjs.com/package/balanced-match
balanced-match - GitHub | https://github.com/juliangruber/balanced-match
String.prototype.indexOf() - MDN | https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/indexOf

Reading notes reference

https://github.com/superfreeeee/Blog-code/tree/main/source_code_research/balanced-match-2.0.0

Posted by mikebr on Tue, 16 Nov 2021 00:22:34 -0800