Question
I use various regexes to parse a C source file, line by line. First I read the entire content of the file into a string:
ifstream file_stream("commented.cpp",ifstream::binary);
std::string txt((std::istreambuf_iterator<char>(file_stream)),
std::istreambuf_iterator<char>());
Then I use a set of regexes, which should be applied in turn until a match is found. I will give only one here as an example:
vector<regex> rules = { regex("^//[^\n]*$") };
char * search =(char*)txt.c_str();
int position = 0, length = 0;
for (int i = 0; i < rules.size(); i++) {
cmatch match;
if (regex_search(search + position, match, rules[i],regex_constants::match_not_bol | regex_constants::match_not_eol))
{
position += ( match.position() + match.length() );
}
}
But it doesn't work. Instead of matching a comment only on the current line, it searches the whole string for the first match. I expected regex_constants::match_not_bol and regex_constants::match_not_eol to make regex_search treat ^ and $ as the start/end of a line only, not the start/end of the whole block. Here is my file:
commented.cpp:
#include <stdio.h>
//comment
The code should fail: my reasoning is that, with those options passed to regex_search, the match should fail, because the pattern should only be searched for in the first line:
#include <stdio.h>
But instead it searches the whole string and immediately finds //comment. I need help making regex_search match only in the current line. The options match_not_bol and match_not_eol do not help. Of course I could read the file line by line into a vector and then apply all the rules to each string in the vector, but that is very slow. I have tried it, and it takes too long to parse a big file that way. That is why I want to let the regex deal with newlines and use a position counter.
Answer 1:
If this is not what you want, please comment and I will delete the answer.
What you are doing is not a correct way of using a regex library.
So here is my advice for anyone who wants to use the std::regex library.
- Its default grammar is ECMAScript, which is somewhat weaker than what most modern regex libraries offer.
- It has plenty of bugs (these are just the ones I found):
  - the same regex but different results on Linux and Windows only C++
  - std::regex and ignoring flags
  - std::regex_match and lazy quantifier with strange behavior
- In some cases (I tested specifically with std::match_results) it is 200 times slower than std.regex in the D language.
- Its match flags are very confusing and mostly do not work (at least for me).
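On the match flags specifically: as far as the standard describes them, match_not_bol and match_not_eol do not switch ^ and $ to per-line anchors. They tell the engine that the first and last positions of the searched range should not be treated as line boundaries, so they make ^ and $ match less often, not more. A minimal sketch of that behaviour follows; the helper names found and demo_flags are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches somewhere in `text`
// when searched with the given match flags.
bool found(const std::string& text, const std::string& pattern,
           std::regex_constants::match_flag_type flags =
               std::regex_constants::match_default)
{
    return std::regex_search(text, std::regex(pattern), flags);
}

// Hypothetical self-check: each assertion documents one effect of the flags.
void demo_flags()
{
    using namespace std::regex_constants;

    // Without flags, ^ matches at the first character of the range.
    assert(found("//comment", "^//"));
    // match_not_bol: the first character is NOT treated as a line start.
    assert(!found("//comment", "^//", match_not_bol));

    // Without flags, $ matches at the end of the range.
    assert(found("x//", "//$"));
    // match_not_eol: the end of the range is NOT treated as a line end.
    assert(!found("x//", "//$", match_not_eol));
}
```

So passing both flags, as in the question, can only remove the two anchor positions at the ends of the buffer; it does not confine the search to one line.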
Conclusion: do not use it at all.
But if anyone still insists on using C++, then you can:
- use boost::regex from the Boost library, because:
  - it supports PCRE-style syntax
  - it has fewer bugs (I have not seen any)
  - it produces a smaller binary (I mean the executable file after compiling)
  - it is faster than std::regex
- use gcc version 7.1.0 and NOT below. The last bug I found is in version 6.3.0
- use clang version 3 or above
If you have been persuaded NOT to use C++, then you can:
- use D's regular expression library, std.regex, for large tasks, because it is:
  - Fast (see "Faster Command Line Tools in D")
  - Easy
  - Flexible drn
- use native pcre or pcre2, which are written in C: extremely fast but a little complicated
- use Perl for simple tasks, especially Perl one-liners
Answer 2:
#include <stdio.h>
//comment
The code should fail: my reasoning is that, with those options passed to regex_search, the match should fail, because the pattern should only be searched for in the first line:
#include <stdio.h>
But instead it searches the whole string and immediately finds //comment. I need help making regex_search match only in the current line.
Are you trying to match all // comments in a source code file, or only the first line?
The former can be done like this:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
int main()
{
    auto input = std::ifstream{"stream_union.h"};
    // Compile the pattern once, outside the loop.
    const auto pattern = std::regex(R"(//)");
    for (auto line = std::string{}; std::getline(input, line); )
    {
        // Print every line that contains a // comment.
        if (std::regex_search(line, pattern))
            std::cout << line << '\n';
    }
    std::cout << std::endl;
    return EXIT_SUCCESS;
}
And the latter can be done like this:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
int main()
{
    auto input = std::ifstream{"stream_union.h"};
    auto line = std::string{};
    // Read only the first line of the file.
    std::getline(input, line);
    const auto pattern = std::regex(R"(//)");
    // Fail if the first line does not contain a // comment.
    if (!std::regex_search(line, pattern)) { return EXIT_FAILURE; }
    std::cout << line << std::endl;
    return EXIT_SUCCESS;
}
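If for any reason you need the position of the match, pass a std::smatch to regex_search and call position(0) on it to get the offset within the line; tellg() on the stream reports how far into the file you have read.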
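One way to keep the whole file in a single string and still restrict each search to the current line is to pass regex_search an iterator range that ends at the next '\n'. The sketch below assumes that approach; matching_lines and example are hypothetical names, and the rule is the one from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the 0-based indices of the lines in `text`
// on which `rule` finds a match. No line is copied out of the buffer.
std::vector<std::size_t> matching_lines(const std::string& text,
                                        const std::regex& rule)
{
    std::vector<std::size_t> hits;
    std::size_t line_no = 0;
    auto line_begin = text.cbegin();
    while (line_begin != text.cend()) {
        // The current line is the range [line_begin, line_end).
        auto line_end = std::find(line_begin, text.cend(), '\n');
        // The search cannot see past line_end, so ^ and $ refer to this line.
        if (std::regex_search(line_begin, line_end, rule))
            hits.push_back(line_no);
        if (line_end == text.cend())
            break;                  // last line had no trailing newline
        line_begin = line_end + 1;  // step over the '\n'
        ++line_no;
    }
    return hits;
}

// Usage on the file from the question: only line 1 (the comment) matches.
void example()
{
    const std::regex rule("^//[^\n]*$");
    const auto hits = matching_lines("#include <stdio.h>\n//comment\n", rule);
    assert(hits.size() == 1 && hits[0] == 1);
}
```

Each search works on a sub-range of the original buffer, so the cost of building one string per line is avoided. If the match itself is needed, the regex_search overload that takes a std::smatch together with the same two iterators should behave the same way.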
Source: https://stackoverflow.com/questions/46087665/std-regex-search-to-match-only-current-line