pattern-matching | 易学教程

Implementation of string pattern matching using Suffix Array and LCP(-LR)

Question: Over the last few weeks I have tried to figure out how to efficiently find a string pattern within another string. I found that, for a long time, the most efficient approach was to use a suffix tree. However, since that data structure is very expensive in space, I looked further into suffix arrays, which use far less space. Several papers, such as "Suffix Arrays: A new method for on-line string searches" (Manber & Myers, 1993), state that searching for a substring can be realised
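To make the idea of searching with a suffix array concrete, here is a minimal Python sketch. It is not the LCP-LR method from the Manber & Myers paper: it uses a naive suffix array construction and a plain binary search that compares up to len(pattern) characters per step. The function names build_suffix_array and find_occurrences are hypothetical and chosen only for illustration.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    # This is slow for large inputs and is meant for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    # All suffixes that start with `pattern` form one contiguous block in
    # the suffix array, so two binary searches locate its boundaries.
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Return the match positions in text order.
    return sorted(suffix_array[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))  # [1, 3]
```

Each binary search step costs at most len(pattern) character comparisons. The LCP-based refinements discussed in the question aim to avoid re-comparing characters that are already known to match.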

Split files based on file content and pattern matching

Question: I need help formatting a txt file using bash on Linux. The file looks like the following: it always has a line of the form Rate: Sth, followed by the details in a very specific format. I'd like to split the file into one file per rate. In this example I'd like to have 3 files, each containing the corresponding line that says what the Rate value was. How would you approach this? line No. Main Text 1 Rate: GBP 2 12/01/1999,90.5911501,Validated ..... ..... 210 18/01/1999,90.954996
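The question asks for bash, but the splitting logic can be sketched in Python as follows. This is only a sketch under assumptions: every section starts with a line beginning with "Rate:", and the output file name pattern rate_<value>.txt is hypothetical. The helper names split_by_rate and write_sections are also made up for illustration.

```python
import os


def split_by_rate(lines):
    # Group lines into sections keyed by the value after "Rate:".
    # Lines that appear before the first "Rate:" line are ignored.
    sections = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("Rate:"):
            current = line.split(":", 1)[1].strip()
            # Keep the "Rate: ..." line as the first line of its section.
            sections.setdefault(current, [line])
        elif current is not None:
            sections[current].append(line)
    return sections


def write_sections(sections, directory="."):
    # Write one file per rate; the file name pattern is an assumption.
    for rate, rows in sections.items():
        path = os.path.join(directory, f"rate_{rate}.txt")
        with open(path, "w") as handle:
            handle.write("\n".join(rows) + "\n")


# Made-up sample in the shape described by the question.
sample = [
    "Rate: GBP",
    "12/01/1999,90.5911501,Validated",
    "Rate: USD",
    "12/01/1999,1.5,Validated",
]
print(list(split_by_rate(sample)))  # ['GBP', 'USD']
```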

Problems with Pattern matching, implementing SplitAt in scala

Question: I am trying to implement Scala's splitAt using pattern matching, and this is what I have so far: def split[T](someIndex:Int,someList:List[T]):(List[T],List[T]) = { def splitHelper[T](currentIndex:Int,someList:List[T],headList:List[T]):(List[T],List[T])= { (currentIndex,someList) match { case (someIndex,x::tail) => (x::headList,tail) case (currentIndex,x::y) => splitHelper(currentIndex+1,y,x::headList) case _ => (headList,headList) } } splitHelper(0,someList,List[T]()) } The compiler is

RegEx with preg_match to find and replace a SIMILAR string

Question: I am using regular expressions with preg_replace() to find and replace a sentence in a piece of text. The $search_string contains plain text + html tags + elements. The problem is that only sometimes the elements convert to white space at run time, making it difficult to find and replace using str_replace(). So I'm trying to build a pattern that is equal to the search string and will match anything like it, whether or not it contains the elements. For example: $search

Python: How to get multiple elements inside square brackets

Question: I have a string/pattern like this: [xy][abc] I am trying to get the values contained inside the square brackets: xy abc There are never brackets inside brackets. Invalid: [[abc][def]] So far I've got this: import re pattern = "[xy][abc]" x = re.compile("\[(.*?)\]") m = x.search(pattern) inner_value = m.group(1) print inner_value But this gives me only the inner value of the first pair of square brackets. Any ideas? I don't want to use string split functions; I'm sure it's possible somehow with RegEx
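One possible approach, sketched here in Python 3, is to use re.findall instead of search: search stops at the first match, while findall returns the captured group of every non-overlapping match. The function name bracket_contents is hypothetical.

```python
import re

# Non-greedy capture of everything between "[" and the next "]".
BRACKETED = re.compile(r"\[(.*?)\]")


def bracket_contents(text):
    # With exactly one capture group, findall returns a list of the
    # captured strings, one per match, in order of appearance.
    return BRACKETED.findall(text)


print(bracket_contents("[xy][abc]"))  # ['xy', 'abc']
```

This relies on the stated guarantee that brackets are never nested. For nested input such as [[abc][def]] the result would not be meaningful.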

Regex - nested patterns - within outer pattern but exclude inner pattern

Question: I have a file with the content below. <td> ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} </td> I want to match 'ReplaceMe' if it is in the td tag, but NOT if it is in the ${ ... } expression. Can I do this with regex? Currently have: sed '/\${.*?ReplaceMe.*?}/!s/ReplaceMe/REPLACED/g' data.txt Answer 1: This is not possible. Regex can be used for Type-3 Chomsky languages (regular languages). Your sample code, however, is a Type-2 Chomsky language (context-free language). Pretty much as soon as
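For the restricted case shown in the sample, a common workaround can be sketched in Python. It assumes that ${ ... } expressions are never nested and never contain a closing brace, so it does not handle the general context-free case the answer refers to, and it does not check that the text sits inside a td tag. The pattern matches either a whole ${ ... } expression or the bare word, and only the bare word is replaced. The function name replace_outside_expressions is hypothetical.

```python
import re

# Alternation order matters: a complete ${...} expression is consumed
# first, so any "ReplaceMe" inside it is never seen as a separate match.
TOKEN = re.compile(r"\$\{[^}]*\}|ReplaceMe")


def replace_outside_expressions(text, replacement="REPLACED"):
    def substitute(match):
        token = match.group(0)
        # Leave ${...} expressions untouched; replace the bare word.
        return token if token.startswith("${") else replacement

    return TOKEN.sub(substitute, text)


line = "<td> ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} </td>"
print(replace_outside_expressions(line))
# <td> ${ dontReplaceMe } REPLACED ${dontReplaceMeEither} </td>
```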

How do I delete a matching line and the previous one?

Question: I need to delete a matching line and the one before it. E.g., in the file below I need to remove lines 1 & 2. I tried grep -v -B 1 "page. of. " 1.txt and expected it not to print the matching lines and their context. I also looked at "How do I delete a matching line, the line above and the one below it, using sed?" but could not understand the sed usage. ---1.txt-- **document 1** -> 1 **page 1 of 2** -> 2 testoing testing super crap blah **document 1** **page 2 of 2** Answer 1: You want to do something very
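The intended behaviour can be sketched in Python rather than grep or sed. This is only an illustration under assumptions: the sketch removes every line that matches the pattern together with the line directly before it (not just the first such pair), and the regular expression used in the example is a guess at what "page. of. " was meant to match. The function name drop_match_and_previous is hypothetical.

```python
import re


def drop_match_and_previous(lines, pattern):
    regex = re.compile(pattern)
    # First pass: collect the indices of matching lines and of the
    # line immediately before each match.
    drop = set()
    for index, line in enumerate(lines):
        if regex.search(line):
            drop.add(index)
            if index > 0:
                drop.add(index - 1)
    # Second pass: keep everything that was not marked.
    return [line for index, line in enumerate(lines) if index not in drop]


sample = [
    "**document 1**",
    "**page 1 of 2**",
    "testing",
    "**document 1**",
    "**page 2 of 2**",
]
print(drop_match_and_previous(sample, r"page \d+ of \d+"))  # ['testing']
```

Collecting indices first, instead of popping from the output while iterating, keeps the result correct when two matching lines follow each other directly.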

How to match over self in an enum?

Question: I have an enum: enum Expr { Lit(u32), Var(Id), Ass(Id, u32), Add(u32, u32), Sub(u32, u32), Mul(u32, u32), } I'm trying to implement a method: impl Expr { fn eval(&self, env: &mut Env) -> Result<u32, String> { use Expr::*; match *self { Lit(l) => Ok(l), Var(id) => env.lookup(&id).ok_or_else(|| format!("undefined var {:?}", id)), Ass(id, v) => { env.assign(id, v); Ok(v) } Add(f, s) => Ok(f + s), Sub(f, s) => Ok(f - s), Mul(f, s) => Ok(f * s), } } } but I'm getting the following error: error

Using 'case' in PairRDDFunctions.reduceByKey()

Question: This is the syntax for the method reduceByKey: def reduceByKey(func: (V, V) ⇒ V): RDD[(K, V)] In a word count program I am practicing with, I see this code: val counts = words.map(word => (word, 1)).reduceByKey{case (x, y) => x + y} The application also works with (x, y) instead of case (x, y). What is the use of case here? I also checked the answer from @ghik here, but was not able to understand it. Answer 1: Scala supports multiple ways of defining anonymous functions. The "case" version is a so-called Pattern

Regex to match the longest repeating substring

Question: I'm writing a regular expression to check whether there is a substring that contains at least 2 repeats of some pattern next to each other. I compare the result of the regex match with the original string; if they are equal, there is such a pattern. Better said by example: 1010 contains the pattern 10, and it occurs 2 times in a continuous series. On the other hand, 10210 wouldn't have such a pattern, because those 10s are not adjacent. What's more, I need to find the longest pattern possible, and its length is at least 1. I
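One way to approach the "longest adjacent repeat" requirement, sketched in Python, is a backreference inside a lookahead. The expression (.+)\1 matches some unit immediately followed by a copy of itself. Wrapping it in a lookahead lets the search try every start position, and the greedy .+ yields the longest unit at each position. The function name longest_adjacent_repeat is hypothetical, and the sketch assumes single-line input, since the dot does not match newlines by default.

```python
import re

# Zero-width lookahead so that candidate positions may overlap.
# Group 1 is the longest unit starting at that position which is
# immediately followed by an identical copy of itself.
ADJACENT_REPEAT = re.compile(r"(?=(.+)\1)")


def longest_adjacent_repeat(text):
    units = [match.group(1) for match in ADJACENT_REPEAT.finditer(text)]
    # Pick the longest unit over all start positions, or None if the
    # string contains no adjacent repeat at all.
    return max(units, key=len) if units else None


print(longest_adjacent_repeat("1010"))   # 10
print(longest_adjacent_repeat("10210"))  # None
```

Backtracking over every start position can be slow on long inputs, so this is best treated as a starting point rather than a tuned solution.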