grep

grep command with a lookahead pattern does not select anything

只愿长相守 posted on 2020-01-21 14:39:07
Question: I was trying to use the following grep command:

    grep '(.*)(?=(png|html|jpg|js|css)(?:\s*))(png|html|jpg|js|css.*\s)' file

The file contains the following:

    http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html
    https://manage.bostonglobe.com/cs/mc/login.aspx?p1=BGFooter
    https://www.bostonglobe.com/bgcs
    /newsletters?p1=BGFooter_Newsletters
    https://bostonglobe.custhelp.com/app/home?p1=BGFooter
    https://bostonglobe.custhelp.com/app/answers/list?p1=BGFooter
    /tools/help/stafflist?p1=BGFooter
    https:
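The excerpt is cut off before any answer, so the asker's intended output is unknown. One likely reason the command selects nothing: without options, grep reads the pattern as a basic regular expression, in which `(?=` has no special meaning, and GNU grep only understands Perl-style lookaheads when given `-P`. The following is a minimal sketch, assuming GNU grep built with PCRE support; the temporary file and the simplified pattern are illustrative and are not the asker's.

```shell
# Hypothetical sample input: three of the URLs quoted in the question.
urls=$(mktemp)
printf '%s\n' \
  'http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html' \
  'https://manage.bostonglobe.com/cs/mc/login.aspx?p1=BGFooter' \
  'https://bostonglobe.custhelp.com/app/home?p1=BGFooter' > "$urls"

# Default (basic) regex: "(?=" is matched literally, so no line is selected.
# "|| true" keeps the script going, because grep exits with status 1 on no match.
bre_count=$(grep -c '.*(?=html)' "$urls" || true)

# Perl-compatible regex (-P): "(?=...)" is a zero-width lookahead.
# -o prints only the matched text, here everything before a trailing ".html".
pcre_match=$(grep -oP '.*(?=\.html$)' "$urls")

echo "BRE matching lines: $bre_count"
echo "PCRE match: $pcre_match"
rm -f "$urls"
```

Because the lookahead consumes no characters, the `.html` suffix is required for a match but is left out of what `-o` prints.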

How to make grep [A-Z] independent of locale?

谁说胖子不能爱 posted on 2020-01-21 07:07:10
Question: I was doing some everyday grepping and suddenly discovered that something seemingly trivial does not work:

    $ echo T | grep [A-Z]

No match. How come T is not within A-Z range? I changed the regex a tiny bit:

    $ echo T | grep [A-Y]

A match! Whoa! How is T within A-Y but not within A-Z? Apparently this is because my environment is set to Estonian locale where Y is at the end of the alphabet but Z is somewhere in the middle: ABCDEFGHIJKLMNOPQRSŠZŽTUVWÕÄÖÜXY

    $ echo $LANG
    et_EE.UTF-8

This all came
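The excerpt stops before any answer. A common way to take the locale out of the picture is to run grep in the C locale for that one command, where a range such as `[A-Z]` follows byte order. Another option is the POSIX class `[[:upper:]]`, which stays locale-aware but does not depend on collation order. A minimal sketch of both, not taken from the original thread:

```shell
# Force the C locale for this command only; there [A-Z] is exactly the ASCII uppercase letters.
# The pattern is quoted so the shell cannot expand it as a filename glob.
c_locale_match=$(echo T | LC_ALL=C grep '[A-Z]')

# POSIX character class: matches whatever the current locale treats as an uppercase letter.
class_match=$(echo T | grep '[[:upper:]]')

echo "C locale: $c_locale_match"
echo "[[:upper:]]: $class_match"
```

Setting `LC_ALL=C` in front of the command affects only that grep invocation, so the rest of the session keeps its Estonian settings.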

Why is this “can't break line” warning from grep of gcc man page?

爷,独闯天下 posted on 2020-01-21 05:09:43
Question: I was trying to find a line ending with -s with the following command but got warnings:

    $ man gcc | grep '\-s$'
    <standard input>:4808: warning [p 54, 13.2i]: can't break line
    $ man gcc | egrep '\-s$'
    <standard input>:4808: warning [p 54, 13.2i]: can't break line

Below is my development environment:

    $ uname -a
    Linux localhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u1 (2015-12-14) x86_64 GNU/Linux
    $ gcc --version
    gcc (Debian 4.9.2-10) 4.9.2
    Copyright (C) 2014 Free Software Foundation,
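No answer survives in this excerpt. The message format (`<standard input>:LINE: warning ...`) is that of a troff/groff diagnostic, so the warning most likely comes from the formatter that man runs, written to standard error, rather than from grep. The sketch below illustrates that reading; `fake_man` is a hypothetical stand-in for `man gcc`, used so the example runs without the gcc manual installed.

```shell
# Hypothetical stand-in for `man gcc`: page text on stdout, a formatter warning on stderr.
fake_man() {
  echo "<standard input>:4808: warning [p 54, 13.2i]: can't break line" >&2
  printf '%s\n' '-s' '-static' 'Remove all symbol table and relocation information.'
}

# 2>/dev/null discards only the stderr of fake_man, so the warning never reaches the terminal.
# -e marks the next argument as the pattern, so its leading "-" is not read as an option.
hits=$(fake_man 2>/dev/null | grep -e '-s$')

echo "$hits"
```

With the real manual the same shape would be `man gcc 2>/dev/null | grep -e '-s$'`.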

“'\w' is an unrecognized escape” in grep

自闭症网瘾萝莉.ら posted on 2020-01-21 03:02:56
Question: I'm using grep in some projects in R (which uses a perl=TRUE flag) and for the life of me I can't figure out why R keeps throwing errors. My query is as follows:

    d$SomeColumn[grep("(?ix)<VNW[^;]*;(dis|dat)> \w*<N\(", d$Right, perl=TRUE)] <- 1

However, R throws the following error:

    Error: '\w' is an unrecognized escape in character string starting ""<VNW[^;]*;(dis|dat)> \w"

Answer 1: You need to escape the backslashes one more time in R, because the string parser consumes one level of backslashes before the regex engine sees the pattern.

    d$SomeColumn[grep("(?ix)<VNW[^;]*;(dis|dat)> \\w*<N\\(", d

Remove all rows where length of string is more than n

人盡茶涼 posted on 2020-01-20 19:30:31
Question: I have a dataframe m and I want to remove all the rows where the f_name column has an entry longer than 3 characters. I assume I can use something similar to

    m <- m[-grep("nchar(m$f_name)>3", m$f_name]

Answer 1: To reword your question slightly, you want to retain rows where entries in f_name have length of 3 or less. So how about:

    subset(m, nchar(as.character(f_name)) <= 3)

Answer 2: Try this:

    m[!nchar(as.character(m$f_name)) > 3, ]

Answer 3: For those looking for a tidyverse approach, you can use dplyr::filter:

    m

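The Shell grep Command Explained

十年热恋 posted on 2020-01-20 13:13:35
grep stands for "global regular expression print". It is one of the most powerful text-search commands on Linux and is commonly used to check whether text files contain strings that match a given pattern. The command reads text line by line, matches each line against a regular expression, and prints the lines that match.

Command format

    grep [option] "string_to_find" filename

Common options:
(1) -i: ignore case in the search string
(2) -v: invert the match, i.e. print the lines that do not match
(3) -n: print line numbers
(4) -l: print the names of files that match the pattern; the opposite option is -L
(5) -q: quiet mode; print nothing and report the result through the exit status
(6) -c: count the matching lines (lines, not individual occurrences)
(7) -A n: print each matching line plus the n lines after it
(8) -B n: print each matching line plus the n lines before it
(9) --color=auto: highlight the matched text in color
(10) -o: print only the matched part of each line
(11) -r: search recursively when the filename argument is a directory
(12) -e: the regular expression that follows this option is a pattern to match; the option can be repeated to give several patterns
(13) --include: restrict the search to the specified files; --exclude: skip the specified files; --exclude-dir: skip the specified directories
(14) -Z: end each output file name with a '\0' byte instead of a newline

Source: https://www.cnblogs.com/tingxin/p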
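Several of the options above can be tried on a small file. The following is a minimal sketch; the temporary file and its four lines are made up for the demonstration and do not come from the original article.

```shell
# Hypothetical sample file used only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'Error: disk full' 'info: started' 'error: timeout' 'info: done' > "$sample"

# -i ignores case; -c counts the matching lines.
error_lines=$(grep -ic 'error' "$sample")

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'timeout' "$sample")

# -v inverts the match; combined with -c it counts lines that do NOT contain "info".
non_info=$(grep -vc 'info' "$sample")

# -o prints only the matched text, one match per output line.
info_hits=$(grep -o 'info' "$sample" | wc -l | tr -d ' ')

# -A 1 prints each matching line plus one following line of context.
context=$(grep -A 1 'started' "$sample")

# -q prints nothing; the exit status tells whether a match was found.
if grep -q 'done' "$sample"; then found=yes; else found=no; fi

echo "$error_lines $numbered $non_info $info_hits $found"
echo "$context"
rm -f "$sample"
```

Note that `-c` reports lines, so a line containing the pattern twice still counts once; piping `-o` into `wc -l` is one way to count individual occurrences instead.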

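Linux Quick Reference

99封情书 posted on 2020-01-20 10:45:49
Contents: Files (viewing files, file operations, finding files, compression and extraction); Memory and processes (processes, system information, disk information); Logs (log processing); Other (other operations)

Files

Viewing files

    # Show hidden files
    ls -al
    # Show file and directory names that contain a digit
    ls *[0-9]*
    # Count the lines in a file
    wc -l filename
    cat xxx | wc -l
    # Count the regular files in a directory
    ls -l | grep '^-' | wc -l
    # Show the first 10 lines of a file
    head -n 10 xxx.txt
    # Starting at line 3000, show 1000 lines (lines 3000 to 3999)
    cat filename | tail -n +3000 | head -n 1000
    # Show lines 1000 through 3000
    cat filename | head -n 3000 | tail -n +1000
    # Show the size of the current directory
    du -sh
    # Show the size of each entry in the current directory, sorted by size
    # (sort -h understands the human-readable suffixes that du -h prints)
    du -sh * | sort -h
    # Show the size of a specific file
    du -sk filename
    # Find all jar files in the current directory
    ls -l | grep '\.jar$'

File operations

    # Copy a file
    cp source dest
    # Recursively copy an entire directory
    cp -r sourceFolder targetFolder
    cp -R /home/jenkins_home/. /var/html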