tokenize

How to get data between quotes in Java?

浪子不回头ぞ posted on 2019-11-26 16:34:30
I have these lines of text, and the number of quoted words varies from line to line:

    Here just one "comillas"
    But I also could have more "mas" values in "comillas" and that "is" the "trick"
    I was thinking in a method that return "a" list of "words" that "are" between "comillas"

How do I obtain the data between the quotes? The result should be:

    comillas
    mas, comillas, trick
    a, words, are, comillas

You can use a regular expression to fish out this sort of information:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Capture everything between a pair of double quotes.
    Pattern p = Pattern.compile("\"([^\"]*)\"");
    Matcher m = p.matcher(line);
    while (m.find()) {
        System.out.println(m.group(1));
    }

This example assumes
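The same pattern carries over to other regex engines. As a minimal sketch in Python (not the Java code from the answer; the name `quoted_words` is made up for illustration), `re.findall` returns the contents of the capture group for every match on a line:

```python
import re

# Same pattern as in the Java answer: an opening quote, a captured run of
# non-quote characters, then the closing quote.
QUOTED = re.compile(r'"([^"]*)"')


def quoted_words(line):
    """Return the substrings found between pairs of double quotes (hypothetical helper)."""
    return QUOTED.findall(line)


if __name__ == "__main__":
    samples = [
        'Here just one "comillas"',
        'But I also could have more "mas" values in "comillas" and that "is" the "trick"',
    ]
    for line in samples:
        print(", ".join(quoted_words(line)))
```

This returns every quoted word, so the second sample line also yields "is", which the expected result in the question leaves out.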

Tokenizing strings in C

守給你的承諾、 posted on 2019-11-26 15:24:59
I have been trying to tokenize a string using a space as the delimiter, but it doesn't work. Does anyone have a suggestion as to why?

Edit: I am tokenizing with strtok(string, " "); the code looks like this:

    pch = strtok(str, " ");
    while (pch != NULL) {
        printf("%s\n", pch);
        pch = strtok(NULL, " ");
    }

gbjbaanb: Do it like this:

    char s[256];
    strcpy(s, "one two three");
    char *token = strtok(s, " ");
    while (token) {
        printf("token: %s\n", token);
        token = strtok(NULL, " ");
    }

Note: strtok modifies the string it is tokenising, so the string cannot be a const char*.

Evan Teran: Here's an example of

Nested strtok function problem in C [duplicate]

戏子无情 posted on 2019-11-26 14:43:16
This question already has an answer here: Using strtok() in a loop in C? (3 answers)

I have a string like this:

    a;b;c;d;e
    f;g;h;i;j
    1;2;3;4;5

and I want to parse it element by element. I used nested strtok calls, but it only splits the first line and then sets the token pointer to NULL. How can I overcome this? Here is the code:

    token = strtok(str, "\n");
    while(token != NULL && *token != EOF) {
        char a[128], b[128];
        strcpy(a,token);
        strcpy(b,a);
        printf("a:%s\n",a);
        char *token2 = strtok(a,";");
        while(token2 != NULL) {
            printf("token2 %s\n",token2);
            token2 = strtok(NULL,";");
        }
        strcpy(token,b);
        token =
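A common cause of this symptom is that strtok keeps a single hidden position between calls, so an inner tokenizing loop overwrites the state the outer loop relies on; POSIX provides strtok_r, which takes an explicit save pointer, for such cases. To show the intended two-level result independently of C, here is a minimal Python sketch (the name `parse_rows` is made up for illustration) in which each level of splitting keeps its own state:

```python
def parse_rows(text, row_sep="\n", field_sep=";"):
    """Split text into rows, then each row into fields (hypothetical helper)."""
    rows = []
    for row in text.split(row_sep):
        if not row:  # skip empty rows, e.g. one produced by a trailing newline
            continue
        rows.append(row.split(field_sep))
    return rows


if __name__ == "__main__":
    data = "a;b;c;d;e\nf;g;h;i;j\n1;2;3;4;5"
    for fields in parse_rows(data):
        print(fields)
```

Unlike strtok, which treats consecutive delimiters as one, `str.split` keeps empty fields, so "a;;b" yields three fields here.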

How to best split CSV strings in Oracle 9i

可紊 posted on 2019-11-26 14:41:37
I want to be able to split CSV strings in Oracle 9i. I've read the following article, http://www.oappssurd.com/2009/03/string-split-in-oracle.html, but I didn't understand how to make it work. Here are some of my questions about it: Would this work in Oracle 9i? If not, why not? Is there a better way of splitting CSV strings than the solution presented there? Do I need to create a new type? If so, do I need specific privileges for that? Can I declare the type within the function?

Here's a string tokenizer for Oracle that's a little more straightforward than that page, but no

How do I read input character-by-character in Java?

笑着哭i posted on 2019-11-26 14:31:50
I am used to the C-style getchar(), but there seems to be nothing comparable in Java. I am building a lexical analyzer, and I need to read the input character by character. I know I can use a Scanner to read a token or a line and then walk through it char by char, but that seems unwieldy for strings spanning multiple lines. Is there a way to just get the next character from the input buffer in Java, or should I just plug away with the Scanner class? The input is a file, not the keyboard.

Use Reader.read(). A return value of -1 means end of stream; else, cast to char. This
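The read-one-character-until-end-of-stream loop looks much the same in other languages. As a minimal Python sketch of the idea (not Java's Reader API; the name `chars` is made up for illustration), a text stream's `read(1)` returns an empty string at end of stream, which plays the role of Java's -1:

```python
import io


def chars(stream):
    """Yield characters one at a time until end of stream (hypothetical helper)."""
    while True:
        ch = stream.read(1)  # text streams return "" once the input is exhausted
        if ch == "":
            break
        yield ch


if __name__ == "__main__":
    # io.StringIO stands in for a file opened in text mode.
    source = io.StringIO("x = 1\ny = 2")
    for ch in chars(source):
        print(repr(ch))
```

Newlines arrive as ordinary characters, so input spanning multiple lines needs no special handling in the lexer loop.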

How to use stringstream to separate comma separated strings [duplicate]

会有一股神秘感。 posted on 2019-11-26 14:05:26
This question already has an answer here: How do I iterate over the words of a string? (76 answers)

I've got the following code:

    std::string str = "abc def,ghi";
    std::stringstream ss(str);
    std::string token;
    while (ss >> token) {
        printf("%s\n", token.c_str());
    }

The output is:

    abc
    def,ghi

So operator>> on a stringstream separates strings by whitespace but not by comma. Is there any way to modify the above code so that I get the following result?

    input : "abc,def,ghi"
    output :
    abc
    def
    ghi

jrok:

    #include <iostream>
    #include <sstream>
    std::string input = "abc,def,ghi";
    std::istringstream ss(input);
    std

How do I tokenize a string sentence in NLTK?

蹲街弑〆低调 posted on 2019-11-26 12:45:17
Question: I am using nltk, and I want to create my own custom texts just like the default ones in nltk.books. However, so far I have only got as far as writing something like

    my_text = ['This', 'is', 'my', 'text']

I'd like to find a way to input my "text" as:

    my_text = "This is my text, this is a nice way to input text."

Which method, Python's or NLTK's, allows me to do this? And more importantly, how can I dismiss punctuation symbols?

Answer 1: This is actually on the main page of nltk.org:

    >>> import nltk
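For the punctuation part of the question, one option is a plain regular expression. The following is a minimal sketch using only Python's standard `re` module (it is not NLTK's API, and the name `simple_tokenize` is made up for illustration); it keeps runs of word characters and drops everything else. NLTK's `nltk.tokenize.RegexpTokenizer` can be given a pattern in a similar way.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens, discarding punctuation (hypothetical helper)."""
    # \w+ matches runs of letters, digits and underscores; commas, periods
    # and other punctuation never match, so they are simply skipped.
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    my_text = "This is my text, this is a nice way to input text."
    print(simple_tokenize(my_text))
```

A caveat of this pattern is that apostrophes also act as separators, so "don't" becomes the two tokens "don" and "t".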

How to convert CSV to a table in Oracle

*爱你&永不变心* posted on 2019-11-26 12:30:35
How can I make a package that returns results in table format when passed CSV values?

    select * from table(schema.mypackage.myfunction('one, two, three'))

should return

    one
    two
    three

I tried something from Ask Tom, but that only works with SQL types. I am using Oracle 11g. Is there something built in?

The following works; invoke it as

    select * from table(splitter('a,b,c,d'))

    create or replace function splitter(p_str in varchar2) return sys.odcivarchar2list
    is
      v_tab sys.odcivarchar2list := new sys.odcivarchar2list();
    begin
      with cte as (select level ind from dual connect by level <=regexp_count(p

Tokenizing and sorting with XSLT 1.0

我们两清 posted on 2019-11-26 11:35:42
Question: I have a delimited string (delimited by spaces in my example below) that I need to tokenize, sort, and then join back together, and I need to do all of this using XSLT 1.0. How would I do that? I know I need to use xsl:sort somehow, but everything I've tried so far has given me some sort of error.

For example, if I run the code at the bottom of this posting, I get this:

    strawberry blueberry orange raspberry lime lemon

What would I do if I wanted to get this instead?

    blueberry lemon lime orange

C++ Templates Angle Brackets Pitfall - What is the C++11 fix?

让人想犯罪 __ posted on 2019-11-26 11:30:28
Question: In C++11, this is now valid syntax:

    vector<vector<float>> MyMatrix;

whereas previously it had to be written like this (notice the space):

    vector<vector<float> > MyMatrix;

My question is: what fix does the standard use to allow the first version? Could it be as simple as making > a token instead of >>? If that's not it, what does not work with this approach? I consider forms like myTemplate< x>>3 > a non-problem, since you can disambiguate them by writing myTemplate<(x>>3)>.