grep | 易学教程

Merging word counts with Bash and Unix

Question: I made a Bash script that extracts words from a text file with grep and sed, sorts them with sort, counts the repetitions with wc, and then sorts again by frequency. The example output looks like this:

    12 the
    7 code
    7 with
    7 add
    5 quite
    3 do
    3 well
    1 quick
    1 can
    1 pick
    1 easy

Now I'd like to merge all words with the same frequency into one line, like this:

    12 the
    7 code with add
    5 quite
    3 do well
    1 quick can pick easy

Is there any way to do that with Bash and the standard Unix toolset? Or
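The merge asked about in this question can be sketched with awk. This is a minimal sketch, not the accepted answer: the function name merge_counts is hypothetical, and it assumes the input is already sorted by frequency with exactly one "COUNT WORD" pair per line, as in the example output.

```shell
#!/usr/bin/env bash
# merge_counts: hypothetical helper name, not from the original question.
# Reads "COUNT WORD" lines (already sorted by count) on stdin and joins
# consecutive words that share the same count onto one output line.
merge_counts() {
  awk '
    NR == 1 || $1 != prev {    # first line, or the count changed
      if (NR > 1) print line   # flush the previous group
      line = $1 " " $2         # start a new group with count and word
      prev = $1
      next
    }
    { line = line " " $2 }     # same count: append the word
    END { if (NR > 0) print line }
  '
}

# Demo with the sample data from the question.
printf '%s\n' '12 the' '7 code' '7 with' '7 add' '5 quite' '3 do' '3 well' \
  '1 quick' '1 can' '1 pick' '1 easy' | merge_counts
```

Because only consecutive lines are compared, the helper relies on the earlier sort step; unsorted input would produce several lines for the same count.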

Remove words from a subtitle file that aren't in a wordlist (of common words)

Question: I have some subtitle files, and I don't intend to learn every single word in them; there is no need to learn hard terms like cleidocranial, dysplasia... I found this script here: Remove words from a cell that aren't in a list. But I have no idea how to modify it or run it. (I'm using Linux.) Here is an example:

Subtitle file (.srt):

    2
    00:00:13,000 --> 00:00:15,000
    People with cleidocranial dysplasia are good.

Wordlist of 3000 common words (.txt):

    ...
    people
    with
    are
    good
    .

Only extract those words from a list that include no repeating letters, using regex

Question: I have a large word list file with one word per line. I would like to filter out the words with repeated letters.

INPUT:

    abducts
    abe
    abeam
    abel
    abele

OUTPUT:

    abducts
    abe
    abel

I'd like to do this using a regex (grep, Perl or Python). Is that possible?

Answer 1: It's much easier to write a regex that matches words that do have repeating letters, and then negate the match:

    my @input = qw(abducts abe abeam abel abele);
    my @output = grep { not /(\w).*\1/ } @input;

(This code assumes that @input
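The same negated-match idea can be sketched with grep itself, since the question mentions grep as an option. The helper name no_repeats is hypothetical, and the sketch uses . rather than \w, assuming each input line holds exactly one word as the question states.

```shell
#!/usr/bin/env bash
# no_repeats: hypothetical helper name. Prints only those lines of stdin
# in which no character occurs more than once.
no_repeats() {
  # \(.\) captures one character; .*\1 requires that same character to
  # appear again later on the line. -v inverts the match, so only lines
  # without a repeated character are printed.
  grep -v '\(.\).*\1'
}

# Demo with the word list from the question.
printf '%s\n' abducts abe abeam abel abele | no_repeats
```

The comparison is case-sensitive as written, so "Abba" would count "A" and "a" as different characters.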

why questionmark comes in the end of filename when i create .txt file through shell script? [duplicate]

Question: This question already has answers here: Shell Scripting unwanted '?' character at the end of file name (2 answers). Closed 4 years ago.

I am writing a shell script that is supposed to create one text file. When I do this, a question mark appears at the end of the file name. What is the reason? I am trying the methods below in a Bash script:

    1) grep ERROR a1* > text.txt
    2) touch text.txt

With both methods, instead of text.txt, a file named text.txt? is generated. What should I do to overcome
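A frequent cause of this symptom is a script saved with Windows (CRLF) line endings: the carriage return at the end of each line becomes part of the file name, and ls displays it as "?". The excerpt above does not confirm that this is what happened here, so the following is only a sketch under that assumption; the helper names has_crlf and strip_cr are hypothetical.

```shell
#!/usr/bin/env bash
# has_crlf and strip_cr are hypothetical helper names.

# Succeeds if any line of the file "$1" ends with a carriage return.
has_crlf() {
  grep -q "$(printf '\r')\$" "$1"
}

# Writes the file "$1" to stdout with every carriage return removed.
strip_cr() {
  tr -d '\r' < "$1"
}

# Demo: a one-line script saved with a CRLF line ending.
demo=$(mktemp)
printf 'touch text.txt\r\n' > "$demo"
if has_crlf "$demo"; then
  echo "carriage returns found, writing cleaned copy"
  strip_cr "$demo" > "$demo.fixed"
fi
rm -f "$demo" "$demo.fixed"
```

If the check reports nothing, the trailing character has a different origin and this cleanup will not help.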

grep multiple patterns single file argument list too long

Question: I am currently searching for multiple patterns in a file. The file is 90 GB in size, and I am searching on a particular field (positions 6-17 in each line). I am trying to get all the lines that contain any of a particular list of numbers. The current syntax I am using is:

    grep '^.\{6\}0000000012345\|^.\{6\}0000000012543' somelargeFile.txt > outputFile.txt

For a small number of patterns this works. For a large number of patterns I get the "Argument list too long" error. One alternative I have
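One common way around the "Argument list too long" error is to put the patterns in a file, one per line, and pass that file to grep with -f. The sketch below is not taken from the truncated question; the helper name search_with_pattern_file and the file names patterns.txt and data.txt are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# search_with_pattern_file: hypothetical helper.
#   $1 = file holding one pattern per line
#   $2 = file to search
search_with_pattern_file() {
  # -f reads the patterns from a file, so they never appear on the
  # command line and cannot exceed the argument-length limit.
  grep -f "$1" "$2"
}

# Demo with small stand-ins for the pattern list and the large data file.
workdir=$(mktemp -d)
printf '%s\n' '^.\{6\}0000000012345' '^.\{6\}0000000012543' > "$workdir/patterns.txt"
printf '%s\n' 'ABCDEF0000000012345 keep' 'ABCDEF0000000099999 drop' \
  'UVWXYZ0000000012543 keep' > "$workdir/data.txt"
search_with_pattern_file "$workdir/patterns.txt" "$workdir/data.txt"
rm -rf "$workdir"
```

This removes the command-line limit only; how fast grep handles a very long pattern list against a 90 GB file is a separate matter and would need measuring.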

Using grep from python console

Question: Using Python, how can I make this happen?

    python_shell$> print myPhone.print_call_log() | grep 555

The only thing close that I've seen is using the "ipython console", assigning output to a variable, and then using a .grep() function on that variable. This is not really what I'm after. I want pipes and grepping on anything in the output (including errors/info).

Answer 1: Python's interactive REPL doesn't have grep, nor process pipelines, since it's not a Unix shell. You need to work with Python objects

How does bgrep work?

Question: I am studying the command bgrep found here. I run

    bgrep "fafafa" test_27.6.2015.bin | less -M

on the binary data called test_27.6.2015.bin but I get

    test_27.6.2015.bin: 00005ee4
    test_27.6.2015.bin: 0000bd3c

I would expect to get matches containing the term fafafafa. Two matches is the correct number of matches. These hex numbers are probably of some segment containing fafafafa. How does bgrep form its search result?

Answer 1: bgrep's search results are formatted this way: printf("%s: %08llx
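If the hex number on each result line is the byte offset of the match within the file (an assumption; the truncated answer above shows only the start of the format string), the bytes at that position can be inspected with od. The helper name bytes_at is hypothetical, and the demo uses a small generated file rather than the asker's test_27.6.2015.bin.

```shell
#!/usr/bin/env bash
# bytes_at: hypothetical helper.
#   $1 = hex offset as printed in the result line (e.g. 00005ee4)
#   $2 = file to read
#   $3 = number of bytes to show
# Prints the bytes at that offset as one run of hex digits.
bytes_at() {
  # $((0x...)) converts the hex offset to decimal for od -j (skip);
  # -N limits the byte count, -t x1 prints each byte as two hex digits,
  # and -A n suppresses the address column.
  od -A n -t x1 -j "$((0x$1))" -N "$3" "$2" | tr -d ' \n'
  echo
}

# Demo: four 0xfa bytes placed at offset 4 of a small sample file.
sample=$(mktemp)
printf 'abcd\372\372\372\372efgh' > "$sample"
bytes_at 00000004 "$sample" 4    # prints fafafafa
rm -f "$sample"
```

Run against the real file with an offset such as 00005ee4, this would show whether the bytes at that position begin with the searched pattern.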