string-matching | 易学教程

Extract numbers from String

I have to parse a String to create a PathSegmentCollection. The string is composed of numbers separated by commas and/or any whitespace (newline, tab, etc.), and the numbers can also be written in scientific notation. This is an example:

"9.63074,9.63074 -5.55708e-006 0 ,0 1477.78"

And the points are: P1(9.63074, 9.63074), P2(-5.55708e-006, 0), P3(0, 1477.78)

To extract the numbers I use a regular expression:

Dim RgxDouble As New Regex("[+-]?\b[0-9]+(\.[0-9]+)?(e[+-]?[0-9]+)?\b")
Dim Matches As MatchCollection = RgxDouble.Matches(.Value)
Dim PSegmentColl As New PathSegmentCollection
Dim
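For comparison, here is a minimal Python sketch of the same idea: pull every number out of the string, then pair the values up as (x, y) points. The pattern `NUMBER` and the function `extract_points` are illustrative names, not part of the original VB.NET code, and plain tuples stand in for the PathSegmentCollection.

```python
import re

# Hypothetical pattern: optional sign, digits with an optional fraction
# (or a bare fraction such as ".5"), then an optional exponent.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def extract_points(text):
    """Return the numbers found in `text`, grouped as (x, y) tuples."""
    values = [float(token) for token in NUMBER.findall(text)]
    if len(values) % 2:
        raise ValueError("odd number of coordinates")
    # Even-indexed values are x coordinates, odd-indexed values are y.
    return list(zip(values[0::2], values[1::2]))


points = extract_points("9.63074,9.63074 -5.55708e-006 0 ,0 1477.78")
print(points)
```

Because the pattern only looks for the numbers themselves, the mix of commas, spaces, tabs and newlines between them does not need to be described at all.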

C++ match string in file and get line number

I have a file with the top 1000 baby names. I want to ask the user for a name, search the file, and tell the user what rank that name has among boy names and among girl names. If it isn't in the boy names or the girl names, the program tells the user it's not among the popular names for that gender. The file is laid out like this:

Rank Boy-Names Girl-Names
1 Jacob Emily
2 Michael Emma
. . .

Desired output for input Michael would be:

Michael is 2nd most popular among boy names.

If Michael is not in girl names it should say:

Michael is not among the most popular girl names

Though if it was, it would say:
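The lookup described above can be sketched in Python (the question itself asks for C++, so this only illustrates the logic). The names `find_ranks`, `ordinal` and `describe` are hypothetical helpers, and the sketch assumes each data row is exactly "rank boy-name girl-name" separated by whitespace.

```python
def find_ranks(lines, name):
    """Return (boy_rank, girl_rank) for `name`; None where it is absent."""
    boy_rank = girl_rank = None
    for line in lines:
        parts = line.split()
        # Skip the header row and anything that is not "rank boy girl".
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        rank, boy, girl = int(parts[0]), parts[1], parts[2]
        if boy == name and boy_rank is None:
            boy_rank = rank
        if girl == name and girl_rank is None:
            girl_rank = rank
    return boy_rank, girl_rank


def ordinal(n):
    """Format 1 as '1st', 2 as '2nd', 11 as '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def describe(name, rank, gender):
    if rank is None:
        return f"{name} is not among the most popular {gender} names"
    return f"{name} is {ordinal(rank)} most popular among {gender} names."


sample = ["Rank Boy-Names Girl-Names", "1 Jacob Emily", "2 Michael Emma"]
boy, girl = find_ranks(sample, "Michael")
print(describe("Michael", boy, "boy"))
print(describe("Michael", girl, "girl"))
```

With a real file, `lines` could simply be the open file object, since iterating over it yields one line at a time.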

find lines from one file in another

Question: So I have a file1.txt with a list of names, and a file2.txt with another list of names, and I need a list of the names that are in both files. I tried

grep -f file1.txt file2.txt > newlist.txt

but for some reason it isn't working, and newlist.txt has names that are not in file1. Does anyone know why this is happening and what I could do to get only the names that are on both lists? Thank you.

Answer 1: If file1.txt and file2.txt are sorted, you could use 'comm':

comm -12 file1.txt file2.txt > newlist.txt

If the names in each list are unique, then you can find their intersection as follows: sort
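One possible cause of the extra names is that grep -f treats every line of file1.txt as a pattern that may match anywhere in a line, so "Carl" also selects "Carla"; stray carriage returns or blank lines can cause similar surprises. As a hedged alternative, here is a small Python sketch that compares whole, stripped lines. The function name `common_names` and the sample data are made up for illustration.

```python
def common_names(first, second):
    """Return the names present in both iterables, in the order of `first`."""
    # Strip whitespace (including "\r" from Windows line endings) and
    # ignore blank lines, then compare whole names only.
    wanted = {line.strip() for line in second if line.strip()}
    seen = set()
    result = []
    for line in first:
        name = line.strip()
        if name and name in wanted and name not in seen:
            seen.add(name)
            result.append(name)
    return result


# Stand-ins for the contents of file1.txt and file2.txt.
file1 = ["Ann\n", "Bob\n", "Carla\n"]
file2 = ["Carl\n", "Bob\r\n", "Ann\n"]
both = common_names(file1, file2)
print(both)
```

Unlike comm, this does not require either list to be sorted, at the cost of holding one list in memory as a set.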

Identifying substrings based on complex rules

Question: Assume I have text strings that look something like this: A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3 Here I want to identify sequences of markers (A is a marker, I3 is a marker, etc.) that lead up to a subsequence consisting only of IX markers (i.e. I1, I2, or I3) that contains an I3. This subsequence can have a length of 1 (i.e. be a single I3 marker) or it can be of unlimited length, but it always needs to contain at least one I3 marker, and can only contain IX markers. In the
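The question is cut off here, so the full rule set is unknown. As a sketch of just the part that is stated (runs made only of IX markers that contain at least one I3), the markers can be grouped into consecutive IX and non-IX runs. The function name `ix_runs_with_i3` and the choice to report each run with its starting index are assumptions made for illustration.

```python
from itertools import groupby

IX_MARKERS = {"I1", "I2", "I3"}


def ix_runs_with_i3(text, sep="-"):
    """Return (start_index, run) for each maximal IX run that contains I3."""
    markers = text.split(sep)
    runs = []
    position = 0
    # groupby splits the marker list into alternating IX / non-IX runs.
    for is_ix, group in groupby(markers, key=lambda m: m in IX_MARKERS):
        run = list(group)
        if is_ix and "I3" in run:
            runs.append((position, run))
        position += len(run)
    return runs


example = "A-B-C-I1-I2-D-E-F-I1-I3-D-D-D-D-I1-I1-I2-I1-I1-I3-I3"
found = ix_runs_with_i3(example)
print(found)
```

The markers that "lead up to" each run can then be sliced out of the marker list using the reported start index, under whatever definition the full question intends.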

C#: How to Delete the matching substring between 2 strings?

If I have two strings, say string1="Hello Dear c'Lint" and string2="Dear", I want to compare the strings first and delete the matching substring. The result for the above pair is "Hello  c'Lint" (i.e., two spaces between "Hello" and "c'Lint"). For simplicity, we'll assume that string2 is a substring of string1 (I mean string1 will contain string2).

Sumeet

Do this only:

string string1 = textBox1.Text;
string string2 = textBox2.Text;
string string1_part1=string1.Substring(0, string1.IndexOf(string2));
string string1_part2=string1.Substring( string1.IndexOf(string2)+string2
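The same "cut out the first occurrence" logic can be sketched in Python for comparison with the C# answer. The helper name `remove_first` is hypothetical; it splits the string around the match in the same way as the two Substring calls.

```python
def remove_first(text, part):
    """Delete the first occurrence of `part` from `text`, if present."""
    index = text.find(part)
    if index == -1:
        return text  # nothing to remove
    # Keep everything before the match and everything after it.
    return text[:index] + text[index + len(part):]


result = remove_first("Hello Dear c'Lint", "Dear")
print(repr(result))
```

Python's built-in `text.replace(part, "", 1)` gives the same result in one call; the explicit version just mirrors the index arithmetic.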

Shortest Repeating Sub-String

I am looking for an efficient way to extract the shortest repeating substring. For example:

input1 = 'dabcdbcdbcdd'
output1 = 'bcd'
input2 = 'cbabababac'
output2 = 'ba'

I would appreciate any answer or information related to the problem. Also, in this post, people suggest that we can use a regular expression like re=^(.*?)\1+$ to find the smallest repeating pattern in the string. But such an expression does not work in Python and always returns a non-match (I am new to Python and perhaps I am missing something?).

--- follow up ---

Here the criterion is to look for shortest non-overlap pattern whose
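A likely reason the quoted expression never matches these inputs is its anchors: ^ and $ require the whole string to consist of repetitions, and neither example does. The sketch below drops the anchors and searches for the leftmost block that is immediately followed by a copy of itself. This is only one reading of the (truncated) criterion, and `first_repeating_block` is an illustrative name.

```python
import re

# Non-greedy group followed by one or more copies of itself, unanchored.
REPEAT = re.compile(r"(.+?)\1+")


def first_repeating_block(text):
    """Return the shortest block repeated back to back at the leftmost
    position where any such repetition starts, or None if there is none."""
    match = REPEAT.search(text)
    return match.group(1) if match else None


print(first_repeating_block("dabcdbcdbcdd"))
print(first_repeating_block("cbabababac"))
# The anchored form only matches strings made entirely of repeats.
print(re.search(r"^(.*?)\1+$", "dabcdbcdbcdd"))
```

Note that "leftmost" takes priority over "shortest" here: for 'abcabcxx' this returns 'abc' rather than 'x', so a different search would be needed if the globally shortest block is wanted.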

R String match for address using stringdist, stringdistmatrix

I have two large datasets, one with around half a million records and the other with around 70K. These datasets contain addresses. I want to check whether any of the addresses in the smaller dataset are present in the larger one. As you would imagine, addresses can be written in different ways and in different cases, so it is quite annoying to see no match when there should have been one, and a match when there should not have been. I did some research and found the stringdist package, which can be used for this. However, I am stuck, and I feel I am not using it to its fullest capabilities and some
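The question is about R's stringdist package; the sketch below swaps in Python's standard-library difflib purely to illustrate the general approach of normalizing first and then fuzzy matching. The helpers `normalize` and `best_match`, the 0.85 cutoff, and the sample addresses are all assumptions, not values from the question.

```python
import difflib
import re


def normalize(address):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(cleaned.split())


def best_match(address, candidates, cutoff=0.85):
    """Return the candidate most similar to `address`, or None."""
    # Map normalized text back to the original spelling.
    lookup = {normalize(c): c for c in candidates}
    hits = difflib.get_close_matches(
        normalize(address), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


large = ["12 MAIN ST", "45 Oak Avenue"]
print(best_match("12 Main St.", large))
print(best_match("45 Oak Avenu", large))
print(best_match("99 Elm Road", large))
```

Comparing every one of 70K addresses against half a million candidates this way would be slow, so in practice some form of blocking (for example, only comparing addresses that share a postcode) is usually applied before the fuzzy step.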
