Grep for lines not beginning with “//”

Posted by a 夏天 on 2021-02-11 12:11:54

Question


I'm trying but failing to write a regex to grep for lines that do not begin with "//" (i.e. C++-style comments). I'm aware of the "grep -v" option, but I am trying to learn how to pull this off with regex alone. I've searched and found various answers on grepping for lines that don't begin with a character, and even one on how to grep for lines that don't begin with a string, but I'm unable to adapt those answers to my case, and I don't understand what my error is.

> cat bar.txt
hello
//world
> cat bar.txt | grep "(?!\/\/)"
-bash: !\/\/: event not found

I'm not sure what this "event not found" is about. One of the answers I found used paren-question mark-exclamation-string-paren, which I've done here, and which still fails.

> cat bar.txt | grep "^[^\/\/].+"
(no output)

Another answer I found used a caret within square brackets and explained that this syntax meant "search for the absence of what's in the square brackets (other than the caret)". I think the ".+" means "one or more of anything", but I'm not sure if that's correct and, if it is correct, what distinguishes it from ".*".

In a nutshell: how can I construct a regex to pass to grep to search for lines that do not begin with "//" ?

To be even more specific, I'm trying to search for lines that have "#include" that are not preceded by "//".

Thank you.


Answer 1:


The "-bash:" prefix on the error tells you that the message comes from bash (your shell), not from grep. Bash sees the ! and attempts history expansion: it tries to substitute the most recent command you entered that begins with \/\/. To avoid this, escape the ! or use single quotes. To see ! in action, try !cat, which re-runs the last command you entered that began with cat.

You don't need to escape /; it has no special meaning in regular expressions. You also don't need to write a complicated regular expression to invert a match. Rather, just supply the -v argument to grep. Most of the time simple is better. And you also don't need to cat the file to grep. Just give grep the file name, e.g.

grep -v "^//" bar.txt | grep "#include"

If you're really hung up on using a single regular expression, a simple one would look like the following, which matches the commented-out #include lines (match start of line ^, any amount of whitespace [[:space:]]*, exactly two forward slashes /{2}, any number of any characters .*, followed by #include):

grep -E "^[[:space:]]*/{2}.*#include" bar.txt
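To see how the two commands in this answer behave, here is a minimal sketch. The sample file contents and the variable names (sample, active, commented) are made up for illustration; only the two grep commands come from the answer itself.

```shell
# Hypothetical sample file; the contents are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  '#include <stdio.h>' \
  '//#include <old.h>' \
  '  // #include <older.h>' \
  'int main(void) { return 0; }' > "$sample"

# Drop lines that start with "//", then keep the ones mentioning #include.
active=$(grep -v "^//" "$sample" | grep "#include")

# The single-regex command selects the commented-out includes instead.
commented=$(grep -E "^[[:space:]]*/{2}.*#include" "$sample")

printf 'kept by the -v pipeline:\n%s\n' "$active"
printf 'matched by the single regex:\n%s\n' "$commented"
rm -f "$sample"
```

Note that the indented comment survives the -v pipeline, because ^// only matches slashes in the first column; allowing leading whitespace in the -v pattern would close that gap.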



Answer 2:


  1. You're using a negative lookahead, which is a PCRE feature and requires the -P option.
  2. Your negative lookahead won't work without a start anchor.
  3. This will of course require GNU grep.
  4. You must use single quotes to use ! in your regex; otherwise bash attempts history expansion with the text after the !, which is the reason for the "!\/\/: event not found" error.

So you can use:

grep -P '^(?!\h*//)' file
hello

\h* matches zero or more horizontal whitespace characters.

Without -P, or with a non-GNU grep, you can use grep -v:

grep -v '^[[:blank:]]*//' file
hello
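The two forms can be compared side by side with a small sketch. The sample lines and variable names below are assumptions made for illustration, and the PCRE branch only runs if the installed grep accepts -P.

```shell
# Hypothetical sample input; the lines are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' 'hello' '//world' '    // indented comment' 'a = b / c;' > "$sample"

# Portable form: invert a match on optional blanks followed by "//".
posix_out=$(grep -v '^[[:blank:]]*//' "$sample")
printf 'grep -v:\n%s\n' "$posix_out"

# PCRE form, attempted only when this grep supports -P.
# Single quotes keep an interactive bash from treating "!" as history expansion.
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  pcre_out=$(grep -P '^(?!\h*//)' "$sample")
  printf 'grep -P:\n%s\n' "$pcre_out"
fi
rm -f "$sample"
```

On this input both forms should keep the same two lines, since a lone slash inside a line does not count as a leading comment marker.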



Answer 3:


To find #include lines that are not preceded by // (or /* …), you can use:

grep '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]'

The regex looks for start of line, optional spaces, #, optional spaces, include, optional spaces and either " or <. It will find all #include lines except lines such as #include MACRO_NAME, which are legitimate but rare, and screwball cases such as:

#/*comment*/include/*comment*/<stdio.h>
#\
include\
<stdio.h>

If you have to deal with software containing such notations, (a) you have my sympathy and (b) fix the code to a more orthodox style before hunting the #include lines. The regex will also pick up false positives such as:

/* Do not include this:
#include <does-not-exist.h>
*/

You could omit the final [[:space:]]*["<] with minimal chance of confusion, which will then pick up the macro name variant.


To find lines that do not start with a double slash, use -v (to invert the match) and '^//' to look for slashes at the start of a line:

grep -v '^//'
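As a rough check of the #include regex from this answer, the sketch below counts matches with and without the trailing ["<] part. The sample source lines and the variable names (strict, loose) are hypothetical.

```shell
# Hypothetical C source lines, invented for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  '#include <stdio.h>' \
  '  #  include "local.h"' \
  '// #include <disabled.h>' \
  '#include CONFIG_HEADER' \
  'int x; /* not an include */' > "$sample"

# Full pattern: requires a quote or angle bracket after "include".
strict=$(grep -c '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]' "$sample")

# Shortened pattern: also accepts the macro-name variant.
loose=$(grep -c '^[[:space:]]*#[[:space:]]*include' "$sample")

printf 'strict matches: %s\nloose matches: %s\n' "$strict" "$loose"
rm -f "$sample"
```

Neither pattern matches the line that begins with //, because the pattern allows only whitespace between the start of the line and the #.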



Answer 4:


You have to use the -P (Perl-compatible regex) option, and anchor the lookahead at the start of the line:

cat bar.txt | grep -P '^(?!//)'



Answer 5:


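For the lines not beginning with "//", you could use (^[^/]{2}.*$) with grep -E, although this pattern is stricter than the requirement: it also rejects lines that start with a single slash, lines whose second character is a slash, and lines shorter than two characters.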
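A bracket expression such as [^/] matches exactly one character, so [^/]{2} demands two leading non-slash characters. One possible way to express "does not begin with //" without -v or lookahead is an alternation; the sketch below is an illustration under that assumption, with made-up sample lines and variable names (narrow, exact).

```shell
# Hypothetical sample input; the lines are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' 'hello' '//world' '/single' 'a/b' '/' > "$sample"

# Pattern from this answer: needs two leading characters, neither a slash.
narrow=$(grep -E '^[^/]{2}.*$' "$sample")

# Alternation: first character is not "/", or "/" is followed by a
# non-slash, or the line is empty or a lone "/".
exact=$(grep -E '^([^/]|/[^/]|/?$)' "$sample")

printf 'narrow pattern:\n%s\n' "$narrow"
printf 'alternation:\n%s\n' "$exact"
rm -f "$sample"
```

The alternation is harder to read than grep -v '^//', which is the trade-off the other answers point out.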




Answer 6:


If you don't like grep -v for this then you could just use awk:

awk '!/^\/\//' file

Since awk supports compound conditions instead of just regexps, it's often easier to specify what you want to match with awk than grep, e.g. to search for a and b in any order with grep:

grep -E 'a.*b|b.*a'

while with awk:

awk '/a/ && /b/'
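The same compound-condition idea can be applied to the original question of finding #include lines that are not commented out. This is a sketch under assumed sample data; the file contents and the variable name awk_out are hypothetical, and [ \t]* is used here for leading whitespace.

```shell
# Hypothetical sample file; the contents are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  '#include <stdio.h>' \
  '// #include <disabled.h>' \
  '   //#include <indented.h>' \
  'int main(void) { return 0; }' > "$sample"

# Keep lines that do NOT start with optional blanks plus "//"
# AND that contain "#include".
awk_out=$(awk '!/^[ \t]*\/\// && /#include/' "$sample")

printf '%s\n' "$awk_out"
rm -f "$sample"
```

Because each condition is a separate regex, the negated one stays simple and no lookahead is needed.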


Source: https://stackoverflow.com/questions/36110886/grep-for-lines-not-beginning-with
