how to find a search term in source code

北荒 2020-11-30 14:54

I'm looking for a way to search for a given term in a project's C/C++ code, while ignoring any occurrences in comments and strings.

As the code base is rather larg

3 answers
  •  情深已故
    2020-11-30 15:08

    The robust way to do this should be with cscope (http://cscope.sourceforge.net/) in line-oriented mode, using the "Find this C symbol" option. I haven't used that on a variety of C standards, though, so if it doesn't work for you, or if you can't get cscope, then do this:

    find . -type f -print |
    while IFS= read -r file
    do
        sed 's/a/aA/g; s/__/aB/g; s/#/aC/g' "$file" |
        gcc -P -E - |
        sed 's/aC/#/g; s/aB/__/g; s/aA/a/g' |
        awk -v file="$file" -v OFS=': ' '/\<float\>/{print file, $0}'
    done
    

    The first sed replaces every hash (#) and __ with a unique placeholder string, so that the preprocessor doesn't expand #include, etc., and we can restore them after preprocessing. It rewrites every a as aA first, so the placeholders aB and aC cannot collide with anything already in the text.
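As a quick check of the escaping step, the sketch below runs a couple of sample lines (invented for illustration) through the first sed and then the second, with no gcc in between, to show that the round trip restores the text exactly.

```shell
# Sample input, invented for illustration: a directive, a double-underscore
# identifier, and a name that already contains the placeholder-like text "aC".
sample='#include <stdio.h>
__attribute__((unused)) float aCat;'

# Step 1: escape. Every "a" becomes "aA" first, so "aB" and "aC" are free to
# stand in for "__" and "#" without clashing with existing text.
escaped=$(printf '%s\n' "$sample" | sed 's/a/aA/g; s/__/aB/g; s/#/aC/g')

# Step 2: undo the substitutions in the reverse order.
restored=$(printf '%s\n' "$escaped" | sed 's/aC/#/g; s/aB/__/g; s/aA/a/g')

printf '%s\n' "$escaped"
printf '%s\n' "$restored"
```

The escaped form contains no # at all, which is why the preprocessor leaves the former directives alone.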

    gcc -P -E runs only the preprocessor, which strips out the comments; -P suppresses the linemarker lines it would otherwise emit.

    The second sed reverses the first, turning the placeholder strings that we previously added back into #, __, and a.

    The awk does the actual search: it looks for float within word boundaries and, if found, prints the file name plus the matching line. This uses GNU awk for the word-boundary operators \< and \>.
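Where GNU awk is not available, one possible substitute (an assumption of this sketch, not part of the answer above) is to spell the boundaries out as "start of line or a non-identifier character" on each side of the term. The sample lines are invented for illustration.

```shell
# Sample lines: two real uses of the term and two identifiers that merely contain it.
sample='float x;
int floater;
my_float = 1;
y = (float)z;'

# Match "float" only when it is not glued to another identifier character.
matches=$(printf '%s\n' "$sample" |
    awk '/(^|[^A-Za-z0-9_])float([^A-Za-z0-9_]|$)/')

printf '%s\n' "$matches"
```

This keeps "float x;" and "y = (float)z;" and drops the floater and my_float lines.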

    The 2nd sed's job COULD be done as part of the awk command but I like the symmetry of the 2 seds.

    Unlike cscope, this sed/gcc/sed/awk approach will NOT avoid false matches within strings, but hopefully there are very few of those and you can weed them out manually while post-processing anyway.
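For comparison, the cscope route mentioned at the top might look like the sketch below. It assumes cscope is installed and relies on its documented -R (recurse into subdirectories), -L (do a single search with line-oriented output) and -0 (the "Find this C symbol" field) options. The function name find_c_symbol is made up for illustration, and, as said above, this has not been tried across C or C++ dialects.

```shell
# Hypothetical helper: list occurrences of a C symbol under the current directory.
# cscope builds its cross-reference file (cscope.out) in the current directory.
find_c_symbol() {
    cscope -R -L -0 "$1"
}

# Usage, run from the project root:
#   find_c_symbol float
```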

    It will not work for file names that contain newlines. If you have those, you can put the body in a script and execute it as find . -type f -print0 | xargs -0 script.
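The difference between the two ways of passing file names can be seen with a small sketch. It creates a scratch directory holding a single file whose name contains a newline; the sh -c one-liner is only a stand-in for the real script and reports how many arguments it received.

```shell
# Scratch directory with one file whose name contains a newline (demo only).
demo=$(mktemp -d)
: > "$demo/odd
name.c"

# Newline-delimited output: the single file shows up as two lines, so a
# "while read" loop would see two bogus names.
count_lines=$(find "$demo" -type f -print | wc -l | tr -d ' ')

# NUL-delimited output: xargs -0 hands the name over as one intact argument.
count_args=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

printf 'lines from -print: %s\n' "$count_lines"
printf 'arguments from -print0: %s\n' "$count_args"

rm -rf "$demo"
```

A script used this way would loop over "$@" instead of reading names from standard input.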

    Modify the gcc command line by adding whatever C or C++ version you are using, e.g. -ansi.
