C++ new feature: regular expressions

Keywords: Windows

Since C++11, regular expressions have been part of the standard library, which provides the functions regex_match, regex_search and regex_replace. With regular expressions you can easily process character sequences that follow a specific pattern. Using them requires a basic knowledge of regular expression syntax; readers who need it can start from the regular expression reference listed at the end of this article.

Differences between regular expressions and wildcards (such as those in the Windows file search box)

Wildcards are mainly used to manipulate file names.
Regular expressions operate primarily on text data.
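
To make the contrast concrete, here is a minimal sketch that expresses the wildcard *.txt as a roughly equivalent regular expression. The helper name matches_txt_wildcard is hypothetical and not part of the original article, and the sketch ignores platform details such as case-insensitive file names.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the article): true if `name` ends in ".txt",
// which is roughly what the wildcard *.txt selects in a file search box.
bool matches_txt_wildcard(const std::string& name) {
    // ".*" plays the role of the wildcard "*"; "\\." is a literal dot.
    static const std::regex txt_rgx(".*\\.txt");
    // regex_match requires the whole name to fit the pattern.
    return std::regex_match(name, txt_rgx);
}
```

Calling matches_txt_wildcard("notes.txt") returns true, while "notes.txt.bak" is rejected because the name does not end in .txt.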

regex

For convenience, all examples in this article use

// Adjacent string literals are concatenated, so no stray tabs end up in the text.
std::string poem("if you weeped for the missing sunset you would miss "
	"all the shining stars. someday,you will find the one,who will "
	"watch every sunrise with you until the sunset of your life");

as the match source. To find out whether poem contains words beginning with sun, we define the regular expression std::regex rgx("sun[^ ]*") or std::regex seg_rgx("(sun)([^ ]*)"). The two differ in one respect: the second form uses groups, splitting the match into two parts, sun and [^ ]*. When the two parts need to be handled separately, especially with regex_replace, the expression has to be written in this grouped form. Note also that std::cmatch and std::smatch are two typedefs:

typedef match_results<const char*> cmatch;
typedef match_results<string::const_iterator> smatch;
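
As a small illustration of how the grouped and ungrouped forms differ, the sketch below returns the whole first match followed by the text of each group. The function name first_match_parts is hypothetical; the regular expressions are the ones defined above.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole first match followed by each group's
// text, or an empty vector if `rgx` does not match anywhere in `text`.
std::vector<std::string> first_match_parts(const std::string& text,
                                           const std::regex& rgx) {
    std::vector<std::string> parts;
    std::smatch sm;  // match_results over std::string::const_iterator
    if (std::regex_search(text, sm, rgx)) {
        for (std::size_t i = 0; i < sm.size(); ++i) {
            parts.push_back(sm[i].str());  // sm[0] is the whole match
        }
    }
    return parts;
}
```

With std::regex("sun[^ ]*") the result holds only the matched word; with std::regex("(sun)([^ ]*)") it also holds the two group texts.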

regex_search

regex_search checks whether a piece of text contains a match for a given expression. For the function signatures, see the references at the end of the article; they are not reproduced here.

// Overload taking the whole string.
if (std::regex_search(poem, rgx)) {
	std::cout << "regex_search : found a word starting with sun in the poem" << std::endl;
}

// Overload taking an iterator range.
if (std::regex_search(poem.begin(), poem.end(), rgx)) {
	std::cout << "regex_search : found a word starting with sun in the poem" << std::endl;
}

Besides the two basic forms above, the overloads that take a std::cmatch or std::smatch parameter also return the details of the match.

std::smatch sm;  // receives the details of each match
auto temp_poem = poem;
std::cout << "reg search ......" << std::endl;
while (std::regex_search(temp_poem, sm, rgx)) {
	for (auto ref : sm) {
		std::cout << "    " << ref << std::endl;
	}
	temp_poem = sm.suffix().str();
}

temp_poem = poem;
std::cout << "seg_reg search ......" << std::endl;
while (std::regex_search(temp_poem, sm, seg_rgx)) {
	for (auto ref : sm) {
		std::cout << "    " << ref << std::endl;
	}
	temp_poem = sm.suffix().str();
}

The overload taking std::smatch stores the details of each match in sm. Because std::smatch is defined as match_results<string::const_iterator>, the text following the match can be retrieved with sm.suffix().str(); the search is then repeated on that remainder until no match is left. Iterating over sm yields the whole match first, followed by the text matched by each group.
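
The repeated search can also be written without copying the remaining text, by advancing an iterator to the start of the suffix. This is only a sketch of that variation, not the article's code; the function name all_matches is hypothetical, and the guard against empty matches is an addition for safety.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every non-overlapping match of `rgx` in `text`.
std::vector<std::string> all_matches(const std::string& text,
                                     const std::regex& rgx) {
    std::vector<std::string> found;
    std::smatch sm;
    auto begin = text.cbegin();
    while (std::regex_search(begin, text.cend(), sm, rgx)) {
        if (sm.length() == 0) {
            break;  // an empty match would never advance; stop instead of looping
        }
        found.push_back(sm[0].str());  // sm[0] is the whole match
        begin = sm.suffix().first;     // continue right after this match
    }
    return found;
}
```

Applied to the poem with rgx, this should collect the same three words as the loop above.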

reg search ......
    sunset
    sunrise
    sunset
seg_reg search ......
    sunset
    sun
    set
    sunrise
    sun
    rise
    sunset
    sun
    set

Both expressions match the same three words, but the grouped expression additionally yields the text matched by each group.

regex_match

regex_match is used in much the same way as regex_search, except that it succeeds only when the expression matches the entire input rather than some part of it. To show the difference more clearly, we define two more regular expressions: std::regex not_rgx("(?!.*shining)(.*)") and std::regex yes_rgx("(.*)(sunset)(.*)").

// not_rgx rejects any text containing "shining", so a failed match means the word is present.
if (!std::regex_match(poem.begin(), poem.end(), not_rgx)) {
	std::cout << "regex_match : found the word shining in the poem" << std::endl;
}

if (std::regex_match(poem, yes_rgx)) {
	std::cout << "regex_match : found a ......sunset......sunrise...... format in the poem" << std::endl;
}

if (std::regex_match(poem.begin(), poem.end(), yes_rgx)) {
	std::cout << "regex_match : found a ......sunset......sunrise...... format in the poem" << std::endl;
}

std::cmatch cm;  // sm is the std::smatch used in the earlier examples
std::regex_match(poem, sm, yes_rgx);
std::regex_match(poem.c_str(), cm, yes_rgx);

std::cout << "cm.size() == sm.size() ? (0 | 1) : " << (cm.size() == sm.size()) << std::endl;

The output is shown below. Note that when std::cmatch is used, the match source must be a C-style string (const char*).

regex_match : found the word shining in the poem
regex_match : found a ......sunset......sunrise...... format in the poem
regex_match : found a ......sunset......sunrise...... format in the poem
cm.size() == sm.size() ? (0 | 1) : 1
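
The key difference between the two functions is that regex_match must match the entire input while regex_search accepts a match anywhere in it. The sketch below shows this side by side, together with the std::cmatch form; the three helper names are hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// True only if `rgx` matches the entire `text`.
bool matches_whole(const std::string& text, const std::regex& rgx) {
    return std::regex_match(text, rgx);
}

// True if `rgx` matches anywhere inside `text`.
bool matches_somewhere(const std::string& text, const std::regex& rgx) {
    return std::regex_search(text, rgx);
}

// Number of sub-matches (whole match plus groups) reported through std::cmatch,
// which takes a C-style string as the match source. Returns 0 on no match.
std::size_t sub_match_count_c(const char* text, const std::regex& rgx) {
    std::cmatch cm;
    return std::regex_match(text, cm, rgx) ? cm.size() : 0;
}
```

For the text "the sunset fades" and the expression sun[^ ]*, matches_somewhere succeeds but matches_whole does not; for the text "sunset" both succeed.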

regex_replace

Many people find regex_replace confusing when they first read its documentation, and so did I when I first read the parameter description: what do $1, ..., $n do? They refer to the groups of the regular expression introduced at the beginning. Writing $n in the replacement string copies the text matched by group n into the result, so that part of the match is kept rather than replaced. In the extreme case, if the regular expression consists of exactly two groups and the replacement string is $1$2, the replacement has no effect: each match is replaced with itself. This design is useful when only part of a match needs to change: divide the regular expression into groups, then keep or replace each group separately.
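
A minimal sketch of the group references, assuming the grouped expression (sun)([^ ]*) from earlier; the function name replace_sun_words is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of (sun)([^ ]*) in `text`
// using the replacement format string `fmt`.
std::string replace_sun_words(const std::string& text, const std::string& fmt) {
    static const std::regex seg_rgx("(sun)([^ ]*)");
    // In `fmt`, $1 stands for the text of group 1 and $2 for group 2.
    return std::regex_replace(text, seg_rgx, fmt);
}
```

With the input "every sunrise", the format "$1$2" leaves the text unchanged, "$2$1" swaps the two parts to give "every risesun", and "$1down" keeps sun while replacing the rest to give "every sundown".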

std::cout << std::regex_replace(poem, rgx, "") << std::endl;
std::cout << std::regex_replace(poem, seg_rgx, "") << std::endl;

std::string replace_poem;
std::regex_replace(std::back_inserter(replace_poem), poem.begin(), poem.end(), rgx, "");
std::cout << "std::back_inserter mode : " << std::endl << replace_poem << std::endl;

std::cout << std::regex_replace(poem, seg_rgx, "$1down") << std::endl;
std::cout << std::regex_replace(poem, seg_rgx, "rain$2") << std::endl;
std::cout << std::regex_replace(poem, seg_rgx, "raindown") << std::endl;
std::cout << std::regex_replace(poem, seg_rgx, "[$1 - $2]", std::regex_constants::format_no_copy) << std::endl;

In the code above, rgx and seg_rgx each find three matches: sunset, sunrise and sunset. For each match, the groups capture the following:

Match     $1    $2
sunset    sun   set
sunrise   sun   rise
sunset    sun   set

So the replacement strings $1down, rain$2, raindown and [$1 - $2] have the following effects:

  1. Keep sun in all three matches and replace set, rise, set with down;
  2. Keep set, rise, set and replace sun with rain;
  3. Replace sunset, sunrise, sunset with raindown;
  4. Put a hyphen between sun and set, rise, set and enclose each match in brackets; because format_no_copy is passed, the text outside the matches is dropped.

The expected output is:

if you weeped for the missing  you would miss all the shining stars. someday,you will find the one,who will watch every  with you until the  of your life
if you weeped for the missing  you would miss all the shining stars. someday,you will find the one,who will watch every  with you until the  of your life
std::back_inserter mode :
if you weeped for the missing  you would miss all the shining stars. someday,you will find the one,who will watch every  with you until the  of your life
if you weeped for the missing sundown you would miss all the shining stars. someday,you will find the one,who will watch every sundown with you until the sundown of your life
if you weeped for the missing rainset you would miss all the shining stars. someday,you will find the one,who will watch every rainrise with you until the rainset of your life
if you weeped for the missing raindown you would miss all the shining stars. someday,you will find the one,who will watch every raindown with you until the raindown of your life
[sun - set][sun - rise][sun - set]
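
The last output line contains only the bracketed matches because of std::regex_constants::format_no_copy. The sketch below contrasts the default behaviour with that flag; the function name bracket_groups and its bool parameter are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites each match of (sun)([^ ]*) as "[$1 - $2]".
// When `matches_only` is true, text outside the matches is not copied.
std::string bracket_groups(const std::string& text, bool matches_only) {
    static const std::regex seg_rgx("(sun)([^ ]*)");
    const std::regex_constants::match_flag_type flags =
        matches_only ? std::regex_constants::format_no_copy
                     : std::regex_constants::format_default;
    return std::regex_replace(text, seg_rgx, "[$1 - $2]", flags);
}
```

For the input "a sunset b sunrise", passing false gives "a [sun - set] b [sun - rise]", while passing true gives "[sun - set][sun - rise]".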

Code

For the complete code, see c++ regular expression.

References

cplusplus
cppreference
regular expression

Posted by LHBraun on Mon, 13 Jan 2020 05:46:38 -0800