string-matching

Extract strings in a text file using grep

纵饮孤独 submitted on 2019-12-11 19:21:19
Question: I have file.txt with names one per line, as shown below:
ABCB8
ABCC12
ABCC3
ABCC4
AHR
ALDH4A1
ALDH5A1
....
I want to grep each of these from an input.txt file. Manually, I do this one at a time as grep "ABCB8" input.txt > output.txt. Could someone help me automatically grep all the strings in file.txt from input.txt and write the results to output.txt?
Answer 1: for line in `cat text.txt`; do grep $line input.txt >> output.txt; done
Contents of text.txt :
ABCB8
ABCC12
ABCC3
ABCC4
AHR
ALDH4A1
ALDH5A1
Edit : A
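The loop in the answer runs grep once per name. grep can also read all patterns from a file in a single pass with grep -F -f file.txt input.txt > output.txt, where -f reads one pattern per line and -F treats them as fixed strings. The same idea as a minimal Python sketch is below; the function name grep_fixed and the sample input lines are made up for illustration. Unlike the loop, it keeps lines in input order and emits each matching line only once.

```python
def grep_fixed(patterns, lines):
    """Return the lines that contain at least one pattern as a plain substring."""
    # Skip empty patterns: an empty string is a substring of every line.
    wanted = [p for p in patterns if p]
    return [line for line in lines if any(p in line for p in wanted)]


# Hypothetical data standing in for file.txt and input.txt.
patterns = ["ABCB8", "ABCC12", "AHR"]
lines = ["gene ABCB8 chr7", "gene TP53 chr17", "AHR receptor", "ABCC1 only"]

matches = grep_fixed(patterns, lines)
print("\n".join(matches))
```

With real files, the two lists would come from reading file.txt and input.txt line by line with the trailing newlines stripped.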

Join dataframes based on partial string-match between columns

故事扮演 submitted on 2019-12-11 17:07:57
Question: I have a dataframe of movies, and I want to check whether they are present in another df.
after_h.sample(10, random_state=1)
     movie                   year  ratings
108  Mechanic: Resurrection  2016  4.0
206  Warcraft                2016  4.0
106  Max Steel               2016  3.5
107  Me Before You           2016  4.5
I want to check whether the above movies are present in another df.
   FILM                            Votes
0  Avengers: Age of Ultron (2015)  4170
1  Cinderella (2015)               950
2  Ant-Man (2015)                  3000
3  Do You Believe? (2015)          350
4  Max Steel (2016)                560
I want something like this as my final output
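The desired output is cut off in the excerpt, so the following is only a sketch under one assumption: that a row matches when the FILM value, with its trailing "(year)" removed, equals the movie title and the years agree. It uses plain Python tuples instead of dataframes to stay self-contained; split_title_year and join_on_title are illustrative names, not part of any library.

```python
import re

# Matches "Some Title (2016)" and captures the title and the four-digit year.
YEAR_SUFFIX = re.compile(r"^(?P<title>.*?)\s*\((?P<year>\d{4})\)\s*$")


def split_title_year(film):
    """Split 'Max Steel (2016)' into ('Max Steel', 2016); year is None if absent."""
    m = YEAR_SUFFIX.match(film)
    if m:
        return m.group("title"), int(m.group("year"))
    return film.strip(), None


def join_on_title(movies, films):
    """Left-join movies (title, year, rating) with films (FILM, votes)."""
    votes_by_key = {}
    for film, votes in films:
        title, year = split_title_year(film)
        votes_by_key[(title.lower(), year)] = votes
    # Movies without a counterpart get None in the votes column.
    return [
        (title, year, rating, votes_by_key.get((title.lower(), year)))
        for title, year, rating in movies
    ]


movies = [
    ("Mechanic: Resurrection", 2016, 4.0),
    ("Warcraft", 2016, 4.0),
    ("Max Steel", 2016, 3.5),
    ("Me Before You", 2016, 4.5),
]
films = [
    ("Avengers: Age of Ultron (2015)", 4170),
    ("Cinderella (2015)", 950),
    ("Ant-Man (2015)", 3000),
    ("Do You Believe? (2015)", 350),
    ("Max Steel (2016)", 560),
]

result = join_on_title(movies, films)
for row in result:
    print(row)
```

In pandas, the same approach would amount to deriving a cleaned title column from FILM and merging on it.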

regex to separate HTML GET parameters

荒凉一梦 submitted on 2019-12-11 13:43:22
Question: How can I use a regular expression to separate GET parameters in a URI and extract a certain one? Specifically, I'm trying to get just the v= part of a YouTube watch URI. I've come up with youtube.com\/watch\?(\w+=[\w-]+&?)*(v=[\w-]+)&?*(\w+=[\w-]+&?)* , but that looks awfully repetitive. Is there a better (shorter?) way to do this?
Answer 1: A simplified regex : ^(?:http://www.)?youtube.[^/]+?/watch?(. ?)(v=([^&]+))(. )$
Answer 2: I know there are a lot of similar questions out there, but none has
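As an alternative to a hand-written regex, a URL parser can split the query string into parameters regardless of their order. The sketch below uses Python's standard urllib.parse module; the function name youtube_video_id and the example URL (including its video ID) are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def youtube_video_id(url):
    """Return the value of the v= query parameter, or None if it is missing."""
    query = urlparse(url).query      # e.g. "feature=related&v=abc-123_XY&t=10"
    values = parse_qs(query).get("v")  # parse_qs maps each name to a list of values
    return values[0] if values else None


# Hypothetical watch URL; v may appear anywhere among the parameters.
url = "http://www.youtube.com/watch?feature=related&v=abc-123_XY&t=10"
video_id = youtube_video_id(url)
print(video_id)
```

This avoids repeating the "other parameters" pattern on both sides of v=, which is what made the original expression repetitive.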

Comparing two base64 image strings and removing matches?

白昼怎懂夜的黑 submitted on 2019-12-11 11:47:55
Question: I'm not sure whether what I'm trying to do will work out, or is even possible. Basically, I'm creating a remote-desktop-type app that captures the screen as a JPEG image and sends it to the client app for display. I want to reduce the amount of data sent each time by comparing the image to the previous one and sending only the differences. For example:
var bitmap = new Bitmap(1024, 720);
string oldBase = "";
using (var stream = new MemoryStream())
using (var graphics = Graphics.FromImage(bitmap))
{
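One caveat, stated as an assumption rather than a measured result: JPEG compression tends to spread a small on-screen change across much of the encoded stream, so comparing two base64 strings of JPEG data is unlikely to yield small diffs. A common alternative is to compare the raw, uncompressed pixel buffers block by block and send only the blocks that changed. The Python sketch below shows that idea on plain byte strings; changed_blocks, apply_blocks, and the block size are illustrative choices, not part of the original code.

```python
def changed_blocks(old, new, block_size=4096):
    """Return (offset, chunk) pairs for every block of `new` that differs from `old`."""
    changes = []
    for offset in range(0, len(new), block_size):
        chunk = new[offset:offset + block_size]
        if old[offset:offset + block_size] != chunk:
            changes.append((offset, chunk))
    return changes


def apply_blocks(old, changes, new_length):
    """Rebuild the new buffer on the receiving side from the old one plus the changes."""
    # Truncate or zero-pad the old buffer to the new length, then patch it.
    buf = bytearray(old[:new_length].ljust(new_length, b"\0"))
    for offset, chunk in changes:
        buf[offset:offset + len(chunk)] = chunk
    return bytes(buf)


# Tiny stand-in for two consecutive frames of raw pixel data.
old_frame = bytes(16)
new_frame = bytearray(old_frame)
new_frame[5] = 255
new_frame = bytes(new_frame)

delta = changed_blocks(old_frame, new_frame, block_size=4)
print(delta)
restored = apply_blocks(old_frame, delta, len(new_frame))
```

The changed blocks themselves can still be compressed before sending.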

Subset a df using partial match with multiple criteria

北城余情 submitted on 2019-12-11 08:37:49
Question: This is the dataset:
company <- c("Coca-Cola Inc.", "DF, CocaCola", "COCA-COLA", "PepsiCo Inc.", "Beverages Distribution")
brand <- c("Coca-Cola Zero","N/A", "Coca-Cola", "Pepsi", "soft drink")
vol <- c("2456","1653", "19", "2766", "167")
data <-data.frame(company, brand, vol)
data
Which results in:
  company                 brand           vol
1 Coca-Cola Inc.          Coca-Cola Zero  2456
2 DF, CocaCola            N/A             1653
3 COCA-COLA               CocaCola        19
4 PepsiCo Inc.            Pepsi           2766
5 Beverages Distribution  soft drink      167
Let's say, this is imported
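The rest of the question is cut off, so the exact criteria are unknown. As a sketch under the assumption that the goal is to keep every row whose company or brand mentions some spelling of Coca-Cola, one case-insensitive pattern with an optional separator covers the variants in the sample data. The pattern and the variable names below are illustrative.

```python
import re

# "coca", an optional space or hyphen, then "cola"; case-insensitive.
COCA_COLA = re.compile(r"coca[\s-]?cola", re.IGNORECASE)

company = ["Coca-Cola Inc.", "DF, CocaCola", "COCA-COLA", "PepsiCo Inc.", "Beverages Distribution"]
brand = ["Coca-Cola Zero", "N/A", "Coca-Cola", "Pepsi", "soft drink"]
vol = ["2456", "1653", "19", "2766", "167"]

rows = [{"company": c, "brand": b, "vol": v} for c, b, v in zip(company, brand, vol)]

# Keep a row when either column matches the pattern.
subset = [
    row for row in rows
    if COCA_COLA.search(row["company"]) or COCA_COLA.search(row["brand"])
]
for row in subset:
    print(row)
```

In R, the equivalent would likely be a grepl call with ignore.case = TRUE applied to each column and combined with |.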

Mathematica Lists - Search Level Two and Return Level One?

☆樱花仙子☆ submitted on 2019-12-11 07:48:31
Question: I need to string match in the second level of a list but have true cases returned at the first level (there's information in the first level that I need to categorize the returns).
First /@ GatherBy[#, #[[3]] &] &@
 Cases[#, x_List /; MemberQ[x, s_String /; StringMatchQ[s, ("*PHYSICAL EXAMINATION*"), IgnoreCase -> True]], {2}] &@
 Cases[MemoizeTable["Diagnostic_table.txt"], {_, 11111, __}]
The GatherBy command at the top is just organizing all the entries by date so I don't get any duplicates.
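The general shape of "test at level two, return at level one" can be sketched in Python as a filter over the outer list whose predicate looks inside the nested sublists. The data layout below (date, ID, list of note strings) is a guess for illustration only and is not taken from Diagnostic_table.txt; entries_mentioning is a made-up name.

```python
def entries_mentioning(entries, phrase):
    """Return the top-level entries that have a nested list containing the phrase."""
    needle = phrase.lower()

    def mentions(sub):
        # Only nested lists are searched; scalars at level one are ignored.
        return isinstance(sub, (list, tuple)) and any(
            isinstance(s, str) and needle in s.lower() for s in sub
        )

    return [entry for entry in entries if any(mentions(sub) for sub in entry)]


# Hypothetical records: [date, patient id, [notes...]].
entries = [
    ["2019-01-02", 11111, ["note", "Physical Examination: normal"]],
    ["2019-01-03", 11111, ["lab", "CBC"]],
]

hits = entries_mentioning(entries, "PHYSICAL EXAMINATION")
print(hits)
```

The key point is that the match condition runs on the inner elements while the filter keeps or drops whole outer entries, so the first-level information stays attached to each hit.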

Excel Return multiple partial string matches in array(s)

ぐ巨炮叔叔 submitted on 2019-12-11 06:37:21
Question:
Challenge: To have a formula (most likely an array formula) that will return multiple partial matches from a column/row.
Parameters:
- Cannot use INDIRECT , as this is not scalable and will break if data is moved or inserted
- Formula must be expandable with minimum effort (i.e., drag the corner in the direction you need to expand it to display the next partial match)
Notes:
- Unfortunately the INDEX / MATCH function combination does not work for a partial string match
- I have a solution based on

Optimizing near-duplicate value search

泄露秘密 submitted on 2019-12-11 06:12:30
Question: I'm trying to find near-duplicate values in a set of fields in order to allow an administrator to clean them up. There are two criteria that I am matching on:
- One string is wholly contained within the other, and is at least 1/4 of its length
- The strings have an edit distance less than 5% of the total length of the two strings
The Pseudo-PHP code:
foreach($values as $value){
  $matches = array();
  foreach($values as $match){
    if( ( $value['length'] < $match['length'] && $value['length'] * 4 >
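The two criteria can be sketched in Python as follows. This is an illustration of the rules as worded in the question, not the asker's code: the function names are made up, "at least 1/4" is read as >=, and the edit distance is plain Levenshtein distance. One cheap optimization is included: the edit distance can never be smaller than the difference in lengths, so pairs whose lengths differ too much skip the expensive computation.

```python
from itertools import combinations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def is_near_duplicate(a, b):
    shorter, longer = sorted((a, b), key=len)
    # Criterion 1: containment, with the shorter string at least 1/4 of the longer.
    if shorter in longer and len(shorter) * 4 >= len(longer):
        return True
    # Criterion 2: edit distance below 5% of the combined length.
    threshold = 0.05 * (len(a) + len(b))
    if len(longer) - len(shorter) >= threshold:
        return False  # the distance is at least the length difference
    return levenshtein(a, b) < threshold


def near_duplicate_pairs(values):
    """Check each unordered pair once instead of both (a, b) and (b, a)."""
    return [(a, b) for a, b in combinations(values, 2) if is_near_duplicate(a, b)]


values = ["Main Street", "Main Street North", "Elm Road"]
pairs = near_duplicate_pairs(values)
print(pairs)
```

Iterating over unordered pairs halves the work compared with the nested foreach, although the comparison count remains quadratic in the number of values.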

How to detect substrings from multiple lists within a string in R

天大地大妈咪最大 submitted on 2019-12-11 04:09:22
Question: I am trying to find out whether a string called "values" contains substrings from two different lists. This is my current code:
for (i in 1:length(value)){
  for (j in 1:length(city)){
    if (str_detect(value[i],(city[j]))) == TRUE){
      for (k in 1:length(school)){
        if (str_detect(value[i],(school[j]))) == TRUE){
          ...........................................................
        }
      }
    }
  }
}
city and school are separate vectors of different length, each containing string elements.
city <- ("Madrid", "London"
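The nested loops can be flattened by searching each list independently and combining the two results. The Python sketch below shows that structure; the school names and the sample strings are invented for illustration because the question's vectors are cut off, and find_city_and_school is a made-up name. Note that in the quoted loop the inner test indexes school with j rather than k, which may or may not be intended.

```python
def find_city_and_school(value, cities, schools):
    """Return the first city and the first school found in value (None if absent)."""
    city = next((c for c in cities if c in value), None)
    school = next((s for s in schools if s in value), None)
    return city, school


cities = ["Madrid", "London"]
schools = ["Business School", "Law School"]  # hypothetical entries

values = [
    "London Business School alumni",
    "Madrid city guide",
    "Paris Law School",
]

results = [find_city_and_school(v, cities, schools) for v in values]
for value, found in zip(values, results):
    print(value, "->", found)

# Rows where both lists produced a hit.
both = [v for v, (city, school) in zip(values, results) if city and school]
```

Searching the two lists separately also means the school search no longer depends on a city having matched first, which makes each condition easier to test.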

CLOSED!! How can I detect the type from a string in Scala?

空扰寡人 submitted on 2019-12-11 04:04:21
Question: I'm trying to parse CSV files, and I need to determine the type of each field starting from its string value. For example:
val row: Array[String] = Array("1/1/06 0:00","3108 OCCIDENTAL DR","3","3C","1115")
This is what I would get:
row(0) --> Date
row(1) --> String
row(2) --> Int
etc.
How can I do this?
------------------------------------ SOLUTION ------------------------------------
This is the solution I've found to recognize the fields String, Date, Int, Double and Boolean. I hope that
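The posted solution is cut off, so the following is not the author's code. It is a minimal Python sketch of one common approach: try the most specific types first and fall back to String. The date format is an assumption read off the single sample value "1/1/06 0:00" (month/day/two-digit year, then hour:minute); real data may need more formats or a day-first order. All names are illustrative.

```python
import re
from datetime import datetime

INT_RE = re.compile(r"[+-]?\d+")
DOUBLE_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+)([eE][+-]?\d+)?")
# Assumed format for values such as "1/1/06 0:00"; extend as needed.
DATE_FORMATS = ("%m/%d/%y %H:%M",)


def is_date(text):
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            pass
    return False


def infer_type(field):
    """Classify a CSV field as Boolean, Int, Double, Date, or String."""
    text = field.strip()
    if text.lower() in ("true", "false"):
        return "Boolean"
    if INT_RE.fullmatch(text):
        return "Int"
    if DOUBLE_RE.fullmatch(text):
        return "Double"
    if is_date(text):
        return "Date"
    return "String"


row = ["1/1/06 0:00", "3108 OCCIDENTAL DR", "3", "3C", "1115"]
types = [infer_type(field) for field in row]
print(types)
```

In Scala, the same structure is often written with scala.util.Try around each conversion, chained with orElse.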