Regular expressions (concise and easy to understand)

Keywords: JavaScript ECMAScript regex

Why use regular expressions?

Regular expressions make fuzzy matching possible. In real development we often need to match a range of data rather than one exact value, for example when checking whether the email address a user entered is valid.

Apart from "@" and a few other fixed characters, we cannot predict which characters the user will type or how many. Exact matching cannot handle this, so fuzzy matching is the only option.

How to create regular expressions:

To use a regular expression in JS, you need to create a regular expression object.

  1. Constructor: new RegExp("regular body", "modifiers")

  2. Literal: /regular body/modifiers

Common modifiers: i ignores case, g matches globally.
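
As a minimal sketch of the two creation styles, the following builds the same pattern both ways. The variable names are illustrative only. Note that in the constructor form the pattern is written inside a string, so a backslash has to be doubled.

```javascript
// Literal form: pattern body between slashes, modifiers after the closing slash
const literal = /h\d/gi;

// Constructor form: pattern body and modifiers are passed as strings,
// so the backslash in \d must be escaped as "\\d"
const constructed = new RegExp("h\\d", "gi");

console.log(literal.source === constructed.source); // true
console.log(literal.flags === constructed.flags);   // true
console.log("H1 h2".match(constructed));            // [ 'H1', 'h2' ]
```

The constructor form is mostly useful when the pattern is only known at run time, for example when it is assembled from user input.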

Methods of the regular expression object:

1. test: checks whether a string matches the rule.
    Return value: true if it matches, false if it does not.
let reg = /hello/g;
let str = "hello,hanmeimei,hello";
// let str = "hi,hanmeimei";
console.log(reg.test(str));
2. exec: captures the matched content. It returns an array with these entries (or null when nothing matches):
    0: the matched text
    index: the position where the matched text starts
    input: the full input string
let reg = /hello/g;
let str = "hello,hanmeimei,hello";
console.log(reg.exec(str));
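
With the g modifier, a regular expression object remembers where the last match ended in its lastIndex property, so calling exec repeatedly walks through the matches one by one. A short sketch of that behaviour (the names helloPattern, greeting, and positions are made up for this example):

```javascript
const helloPattern = /hello/g;
const greeting = "hello,hanmeimei,hello";
const positions = [];

let result;
// exec returns null once there are no more matches, which ends the loop
while ((result = helloPattern.exec(greeting)) !== null) {
  positions.push(result.index);
}

console.log(positions); // [ 0, 16 ]
```

The same mechanism means that calling test twice on one g-flagged object can give different answers for the same string, so reuse such objects with care.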

Two kinds of fuzzy matching

Horizontal fuzzy matching

  1. Horizontal fuzzy matching means that the length of the string a regular expression matches is not fixed; several lengths are possible.

  2. It is implemented with quantifiers such as {m,n}, which means the preceding character appears at least m times and at most n times.

Case:

let string = "abc abbc abbbc abbbbc abbbbbc abbbbbbc";
let regex = /ab{2,5}c/g;
// /ab{2,5}c/ matches a string whose first character is "a",
// followed by 2 to 5 "b" characters, followed by the character "c". The test is as follows:
console.log(string.match(regex)); // => ["abbc", "abbbc", "abbbbc", "abbbbbc"]

[note]: the g after /ab{2,5}c/ in the code above is a modifier that stands for global matching: it finds every matching substring in order instead of stopping at the first one.

Vertical fuzzy matching

  1. Vertical fuzzy matching means that a given position in the matched string does not have to be one specific character; several characters are possible.

  2. It is implemented with character groups. For example, [abc] means the character can be any one of "a", "b", and "c".

Case:

let string = "a0b a1b a2b a3b a4b";
let regex = /a[123]b/g;
console.log(string.match(regex)); // => ["a1b", "a2b", "a3b"]

Using regular expressions with strings

match()

  • Captures the content in the string that matches the regular expression

  • With the g modifier, all matches are returned in one array. Without it, the return value is the same as that of exec

  • If nothing matches, null is returned

Case:

let str = "Lorem ipsum dolor sitip amet";
console.log(str.match(/ip/g)); // returns ["ip", "ip"]
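
The three outcomes listed above can be compared side by side. This is a small sketch using the same sentence; the variable names are illustrative.

```javascript
const phrase = "Lorem ipsum dolor sitip amet";

const single = phrase.match(/ip/);  // no g: first match only, with details like exec
const all = phrase.match(/ip/g);    // g: every match, text only
const none = phrase.match(/xyz/g);  // nothing matches

console.log(single[0], single.index); // ip 6
console.log(all);                     // [ 'ip', 'ip' ]
console.log(none);                    // null
```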

replace()

  • Replaces some characters with others. When the first argument is a string, only the first occurrence is replaced

  Case:

let str = "Lorem ipsum dolor sitip amet";
//Change ip to id
str = str.replace("ip","id");
console.log(str); // Result: Lorem idsum dolor sitip amet
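
Passing a string as the first argument replaces only the first occurrence, as the result above shows. As a sketch, a regular expression with the g modifier replaces every occurrence instead (text, firstOnly, and every are illustrative names):

```javascript
const text = "Lorem ipsum dolor sitip amet";

// A plain string pattern replaces only the first "ip"
const firstOnly = text.replace("ip", "id");
// A regular expression with the g modifier replaces all of them
const every = text.replace(/ip/g, "id");

console.log(firstOnly); // Lorem idsum dolor sitip amet
console.log(every);     // Lorem idsum dolor sitid amet
```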

split()

  • Splits a string at a separator, which can be a string or a regular expression

Case:

let str = "Lorem ipsum dolor sitip amet";
//Split the string at each occurrence of ip
let arr = str.split("ip");
// let arr = str.split(/ip/);
console.log(arr); // Result: ["Lorem ", "sum dolor sit", " amet"]
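
A regular expression separator is mainly useful when the delimiter varies. A small sketch, assuming a made-up input in which fields are separated by commas, semicolons, or runs of spaces:

```javascript
const raw = "apple, banana;cherry   date";

// Split on a comma or semicolon with optional spaces after it,
// or on a run of whitespace
const fields = raw.split(/[,;]\s*|\s+/);

console.log(fields); // [ 'apple', 'banana', 'cherry', 'date' ]
```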

search()

  • Finds the position of the first match

Case:

let str = "Lorem ipsum dolor sitip amet";
//Find the index of the first ip in the string above
let index = str.search("ip");
// let index = str.search(/ip/);
console.log(index); // 6
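
As a brief sketch of search with a real pattern, the following looks for the first "si" or "su" and also shows the value returned when nothing matches (the names are illustrative):

```javascript
const sentence = "Lorem ipsum dolor sitip amet";

const found = sentence.search(/s[iu]/); // first "si" or "su"
const missing = sentence.search(/xyz/); // no match anywhere

console.log(found);   // 8
console.log(missing); // -1
```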

Metacharacters

  1. Metacharacter: a basic regular expression symbol with a special meaning

  2. . wildcard: matches any single character (except a line break)

    • [range] matches any one character in the range

    • [a-z] lowercase letter, [A-Z] uppercase letter, [0-9] any digit

  3. [^range] is the inverse of [range]: it matches any single character that is not in the range.
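
The wildcard, a range, and a negated range can be compared on one input. This is only a sketch; the sample string and variable names are invented for illustration.

```javascript
const sample = "a0b a1b aXb a-b";

// [0-9] accepts any digit between "a" and "b"
const digitMiddle = sample.match(/a[0-9]b/g);
// [^0-9] accepts any single character except a digit
const nonDigitMiddle = sample.match(/a[^0-9]b/g);
// . accepts any single character other than a line break
const anyMiddle = sample.match(/a.b/g);

console.log(digitMiddle);    // [ 'a0b', 'a1b' ]
console.log(nonDigitMiddle); // [ 'aXb', 'a-b' ]
console.log(anyMiddle);      // [ 'a0b', 'a1b', 'aXb', 'a-b' ]
```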

Common abbreviations

  • \d matches a single digit, equivalent to [0-9]
  • \D matches a single non-digit, equivalent to [^0-9]
// \d: matches a single digit (equivalent to [0-9])
// true as long as the string contains at least one digit
let str = "1$"; // true
// let str = "111adasd1"; // true
console.log(/\d/.test(str));

// \D: matches a single non-digit (equivalent to [^0-9])
// false if the string is all digits; true as soon as it contains one non-digit character
let str = "1asd$"; // true
// let str = "1111"; // false
console.log(/\D/.test(str));
  • \w matches a single digit, letter, or underscore, equivalent to [a-zA-Z0-9_]
  • \W matches a single character that is not a digit, letter, or underscore, equivalent to [^a-zA-Z0-9_]
// \w: matches a single digit, letter, or underscore (equivalent to [a-zA-Z0-9_])
// true if the string contains a digit, letter, or underscore; false if it contains only special characters
let str = "$"; // false
// let str = "1"; // true
console.log(/\w/.test(str));

// \W: matches a single character that is not a digit, letter, or underscore (equivalent to [^a-zA-Z0-9_])
// true if the string contains any special character; false if it contains only digits, letters, or underscores
let str = "$"; // true
// let str = "1"; // false
console.log(/\W/.test(str));
  • \s matches a single whitespace character: space, line break (\n), or tab (\t)
  • \S matches a single non-whitespace character
// \s: matches a single whitespace character: space, line break (\n), or tab (\t)
// true if the string contains a space, line break, or tab; false otherwise
let str = "a"; // false
// let str = "1 "; // true
console.log(/\s/.test(str));

// \S: matches a single non-whitespace character
// false for an empty string; true as soon as the string contains any digit, letter, or symbol, even next to spaces
let str = ""; // false
// let str = "1 "; // true
console.log(/\S/.test(str));
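
To see how the shorthand classes divide up one input, the sketch below collects the characters each class matches. The entry string is an invented example containing a letter, an underscore, digits, a space, a symbol, and a tab.

```javascript
const entry = "a_1 $\t9";

const digits = entry.match(/\d/g);     // digits only
const wordChars = entry.match(/\w/g);  // letters, digits, underscore
const nonWord = entry.match(/\W/g);    // everything \w rejects
const whitespace = entry.match(/\s/g); // space and tab here

console.log(digits);     // [ '1', '9' ]
console.log(wordChars);  // [ 'a', '_', '1', '9' ]
console.log(nonWord);    // [ ' ', '$', '\t' ]
console.log(whitespace); // [ ' ', '\t' ]
```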

Quantifiers

  1. Quantifier metacharacter: written after an ordinary metacharacter or letter to specify how many times the preceding character may occur

  2. Quantifiers are also called repetition. Once you understand exactly what {m,n} means, you only need to remember a few shorthand forms

Common abbreviations

{m}

{m} means exactly m times; equivalent to {m,m}
let pattern = /ab{2}/g,
    // a occurs once, then b occurs exactly twice
    str = 'ab aabb aaabbbbbbbaaa';
console.log(str.match(pattern)); // => ["abb", "abb"]

{m,}

{m,} means at least m times, with no upper limit.
let pattern = /ab{2,}/g,
    // a occurs once, then b occurs at least twice with no upper limit
    str = 'ab aabb aaabbbbbbbaaa';
console.log(str.match(pattern)); // two matches: "abb", then "a" followed by the whole run of b's

{m,n}

{m,n} matches at least m times and at most n times.
let pattern = /ab{2,4}/g,
    // a occurs once, then b occurs at least 2 and at most 4 times
    str = 'ab aabb aaabbbbbaaa';
console.log(str.match(pattern)); // => ["abb", "abbbb"]

?

? is equivalent to {0,1}: the preceding character appears either zero times or once.
let pattern = /ab?/g,
    // a occurs once, then b occurs zero times or once
    str = 'ab aabb aaabbbbbbcc';
console.log(str.match(pattern)); // => ["ab", "a", "ab", "a", "a", "ab"]

*

* is equivalent to {0,}: matches any number of times (including zero).
let pattern = /ab*/g,
    // a occurs once, then b occurs any number of times (including zero)
    str = 'ab aabb aaabbbbbbcc';
console.log(str.match(pattern)); // six matches; each "a" takes all the b's that directly follow it

+

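+ is equivalent to {1,}: matches at least once.
let pattern = /ab+/g,
    // a occurs once, then b occurs at least once
    str = 'ab aabb aaabbbbbbcc';
console.log(str.match(pattern)); // three matches; only an "a" directly followed by b qualifies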

[note]: to apply a quantifier to several characters at once, wrap them in ().
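
A minimal sketch of what the parentheses change, using an invented sample string: without them the quantifier applies only to the last character, with them it applies to the whole group.

```javascript
const letters = "ab abab ababab";

// Without parentheses, {2} applies only to "b", so "abb" is required
const onlyB = letters.match(/ab{2}/g);
// With parentheses, {2} applies to the whole group "ab"
const wholeGroup = letters.match(/(ab){2}/g);

console.log(onlyB);      // null
console.log(wholeGroup); // [ 'abab', 'abab' ]
```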

Line start and line end anchors

[note]: when the line start and line end anchors are used together, the entire string must match the rule between them.

^ line start: the string must begin with the characters that follow it

//The line must begin with ccb
let pattern = /^ccb/,
    str = 'ccbcc';
console.log(pattern.test(str)); // true

$ line end: the string must end with the characters that precede it

//Used together, ^ and $ mean the whole string must match the rule between them
var pattern = /^ccb$/,
    str = 'ccb';
console.log(pattern.test(str)); // true
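
Using both anchors is the usual way to validate a whole input rather than find a fragment inside it. The sketch below assumes a hypothetical rule, "exactly six digits", chosen only for illustration.

```javascript
// Hypothetical rule for this sketch: exactly six digits and nothing else
const sixDigits = /^\d{6}$/;
// Same body without anchors: six digits anywhere in the string
const unanchored = /\d{6}/;

console.log(sixDigits.test("100080"));    // true
console.log(sixDigits.test("a100080b"));  // false
console.log(unanchored.test("a100080b")); // true
```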

Greedy and non-greedy matching

Greedy:

  1. When a regular expression matches a string, it matches as much as possible. This is the default behaviour and is called greedy mode.

//The regular expression /\d{2,5}/ matches a run of 2 to 5 consecutive digits: 2, 3, 4, or 5 digits in a row.

//But it is greedy and takes as many as it can. Given 6 digits it takes 5; given 3 digits it takes 3. As long as it is within the limit, the more the better.
var string = "123 1234 12345 123456";
var regex = /\d{2,5}/g;
console.log(string.match(regex)); // => ["123", "1234", "12345", "12345"]

Non-greedy:

  1. When a regular expression matches a string in non-greedy mode, it matches as little as possible.

    To make a quantifier non-greedy (lazy), add a "?" after it:

var string = "123 1234 12345 123456";
// /\d{2,5}?/ means that although 2 to 5 digits are allowed, it stops as soon as 2 are enough.
var regex = /\d{2,5}?/g;
console.log( string.match(regex) ); // => ["12", "12", "34", "12", "34", "12", "34", "56"]

  Lazy matching is achieved by adding a question mark after a quantifier, so the complete list of lazy forms is:

{m,n}?
{m,}?
??
+?
*?
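
The difference is easiest to see when a pattern has a clear end marker. In this sketch, with an invented markup string, the greedy form runs to the last ">" while the lazy form stops at the nearest one.

```javascript
const markup = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ grabs as much as possible, up to the last ">"
const greedyTags = markup.match(/<.+>/g);
// Lazy: .+? stops at the first ">" it can reach
const lazyTags = markup.match(/<.+?>/g);

console.log(greedyTags.length); // 1 (the whole string)
console.log(lazyTags);          // [ '<b>', '</b>', '<i>', '</i>' ]
```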

Posted by Dollar on Wed, 24 Nov 2021 11:36:37 -0800