delimiter

How to use python csv module for splitting double pipe delimited data

為{幸葍}努か 提交于 2019-11-29 10:32:25
I have data that looks like this: "1234"||"abcd"||"a1s1". I am trying to read and write it using Python's csv reader and writer. Since the csv module's delimiter is limited to a single character, is there any way to retrieve the data cleanly? I cannot afford to remove the empty columns, as it is a massive data set that has to be processed within a time limit. Any thoughts will be helpful. The docs and experimentation confirm that only single-character delimiters are allowed. Since csv.reader accepts any object that supports the iterator protocol, you can use a generator expression to replace each || with |, and then feed
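The generator approach described in the answer can be sketched as follows. This is a minimal illustration, not the original answerer's code: the helper name read_double_pipe and the sample rows are assumptions, and the replacement would also alter any || that occurs inside a quoted field.

```python
import csv
import io


def read_double_pipe(lines):
    """Yield parsed rows from an iterable of '||'-delimited text lines.

    Assumption: '||' never appears inside a quoted field, because the
    replacement below is applied to the whole line before parsing.
    """
    # Lazily collapse the two-character delimiter into a single '|'.
    single_pipe = (line.replace("||", "|") for line in lines)
    # csv.reader accepts any iterable of strings, including a generator.
    yield from csv.reader(single_pipe, delimiter="|", quotechar='"')


# Hypothetical sample data in the format shown in the question.
sample = io.StringIO('"1234"||"abcd"||"a1s1"\n"5678"||""||"z9"\n')
rows = list(read_double_pipe(sample))
print(rows)
```

Because the generator is lazy, the file is still processed one line at a time, and empty columns are kept as empty strings.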

How do I explode an integer

时光总嘲笑我的痴心妄想 posted on 2019-11-29 09:18:05
The answer to this could be easy, but I'm very new to programming, so be gentle. I'm at work trying to do a quick fix for one of our customers. I want to get the total number of digits in an integer, and then explode the integer: rx_freq = 1331000000 (10 digits) gives $array[0] = 1, $array[1] = 3, ..., $array[9] = 0; rx_freq = 990909099 (9 digits) gives $array[0] = 9, $array[1] = 9, ..., $array[8] = 9. I'm not able to use explode, as this function needs a delimiter. I've searched ye olde Google and Stack Overflow. Basically: how do I explode an integer without a delimiter, and how do I find the number of digits in
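The question is about PHP, but the underlying idea (convert the integer to a string, then take one character per digit) is language-independent. Here is a minimal sketch of it in Python. The function name explode_int is hypothetical, and non-negative integers are assumed.

```python
def explode_int(number):
    """Return the decimal digits of a non-negative integer as a list of ints."""
    # str() gives the decimal representation; each character is one digit.
    return [int(ch) for ch in str(number)]


rx_freq = 1331000000
digits = explode_int(rx_freq)
print(len(digits))  # number of digits in rx_freq
print(digits)       # one list entry per digit, most significant first
```

The length of the resulting list is the digit count, so both parts of the question come from the same conversion.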

awk and special brackets delimiters

左心房为你撑大大i posted on 2019-11-29 08:44:14
I have data in the following format: .......{INFO1}.....[INFO2].... For awk it should be really simple to pick out the INFO1 and INFO2 parts, but I'm really struggling with it. I have managed to get the [INFO2] part by using awk -F'[][]' '{ print $2 }' but INFO1 just will not match for me. How do I specify {} as delimiters? Just use [][{}] to say that any of [, ], { or } can act as a field separator: awk -F"[][{}]" '{print ...}' file. In general, you write -F"[PATTERNS]". Test: $ echo ".......{INFO1}.....[INFO2]...." | awk -F"[][{}]" '{print $2}' INFO1 $ echo ".......{INFO1}..
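The same character-class idea can be illustrated outside awk. The sketch below uses Python's re.split with an equivalent class. The escaped form [\[\]{}] is chosen here for readability, and the variable names are assumptions. Note that awk fields are numbered from 1 while Python lists start at 0.

```python
import re

# Any one of '[', ']', '{' or '}' acts as a field separator,
# mirroring awk -F"[][{}]".
SEPARATORS = re.compile(r"[\[\]{}]")

line = ".......{INFO1}.....[INFO2]...."
fields = SEPARATORS.split(line)

# awk's $2 corresponds to fields[1], and $4 to fields[3].
print(fields[1])
print(fields[3])
```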

sed rare-delimiter (other than & | / ?…)

两盒软妹~` posted on 2019-11-29 06:06:27
I have to apply the Unix command sed to a string that can contain all kinds of characters (#, !, /, ?, &, @, | and so on). Is there a complex delimiter (one made of two characters, for example) that would let me get past this error: sed: -e expression #1, char 22: unknown option to `s'? Thanks in advance. There is no option for multi-character expression delimiters in sed, but I doubt you need one. The delimiter character should not occur in the pattern, but if it appears in the string being processed, it's not a problem. And unless you're doing something extremely

How to use cut with multiple character delimiter? unix

て烟熏妆下的殇ゞ posted on 2019-11-29 05:22:59
Question: My file looks like this: abc ||| xyz ||| foo bar hello world ||| spam ham jam ||| blah blah I want to extract a specific column, e.g. I could have done: sed 's/\s|||\s/\\t/g' file | cut -f1 But is there another way of doing that? Answer 1: Since | is a regex metacharacter, it needs to be escaped as \\| or put in square brackets as [|]. You can do this: awk -F' \\|\\|\\| ' '{print $1}' file Some other variations that work as well: awk -F' [|][|][|] ' '{print $1}' file awk -F' [|]{3} ' '{print $1}' file
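For comparison, a multi-character delimiter needs no escaping when the split is a plain string split rather than a regex. The Python sketch below makes some assumptions: the two sample lines are a guess at how the flattened file excerpt was laid out, and split_columns is a hypothetical helper.

```python
def split_columns(line, delimiter=" ||| "):
    """Split one line on a literal multi-character delimiter."""
    # str.split treats the delimiter as plain text, so '|' is not special.
    return line.rstrip("\n").split(delimiter)


# Assumed layout of the sample file: two lines, three columns each.
lines = [
    "abc ||| xyz ||| foo bar\n",
    "hello world ||| spam ham jam ||| blah blah\n",
]

first_column = [split_columns(line)[0] for line in lines]
print(first_column)
```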

str.format() raises KeyError

半城伤御伤魂 posted on 2019-11-29 04:20:29
Question: The following code raises a KeyError exception:

addr_list_formatted = []
addr_list_idx = 0
for addr in addr_list:  # addr_list is a list
    addr_list_idx = addr_list_idx + 1
    addr_list_formatted.append(""" "{0}" { "gamedir" "str" "address" "{1}" } """.format(addr_list_idx, addr))

Why? I am using Python 3.1. Answer 1: The problem is those { and } characters you have there that don't specify a key for formatting. You need to double them up, so change your code to: addr_list_formatted.append(""" "{0}" {{
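The brace doubling described in the answer can be sketched as a complete example. The addresses and the helper name format_entry are hypothetical, and the template is kept on one line for brevity.

```python
def format_entry(index, address):
    """Format one entry; literal braces are written as '{{' and '}}'."""
    # '{0}' and '{1}' are replacement fields. The doubled braces are
    # emitted by str.format as single literal '{' and '}' characters.
    template = '"{0}" {{ "gamedir" "str" "address" "{1}" }}'
    return template.format(index, address)


addr_list = ["10.0.0.1:27015", "10.0.0.2:27015"]  # hypothetical addresses
addr_list_formatted = [
    format_entry(idx, addr) for idx, addr in enumerate(addr_list, start=1)
]
print(addr_list_formatted[0])
```

Using enumerate with start=1 replaces the manual counter from the question without changing the numbering.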

MySQL LOAD DATA INFILE: works, but unpredictable line terminator

只谈情不闲聊 posted on 2019-11-29 01:04:24
Question: MySQL has a nice CSV import function, LOAD DATA INFILE. I have a large dataset that needs to be imported from CSV on a regular basis, so this feature is exactly what I need. I've got a working script that imports my data perfectly, except that I don't know in advance what the end-of-line terminator will be. My SQL code currently looks something like this: LOAD DATA INFILE '{fileName}' INTO TABLE {importTable} FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n'

Least used delimiter character in normal text < ASCII 128

纵然是瞬间 posted on 2019-11-28 21:56:14
Question: For coding reasons which would horrify you (I'm too embarrassed to say), I need to store a number of text items in a single string. I will delimit them using a character. Which character is best to use for this, i.e. which character is the least likely to appear in the text? It must be printable and probably below 128 in ASCII to avoid locale issues. Answer 1: Assuming that for some embarrassing reason you can't use CSV, I'd say go with the data. Take some sample data, and do a simple character count
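The character count suggested in the answer could look something like the sketch below. It makes several assumptions: candidates are limited to ASCII punctuation (printable and below 128), the sample texts are made up, and least_used_delimiters is a hypothetical helper, not something from the original answer.

```python
from collections import Counter
import string


def least_used_delimiters(samples, limit=5):
    """Rank printable ASCII punctuation by how rarely it occurs in samples."""
    # Count every character across all sample texts.
    counts = Counter(ch for text in samples for ch in text)
    # A Counter returns 0 for characters it never saw, so unused
    # candidates sort first; ties are broken by character order.
    ranked = sorted(string.punctuation, key=lambda ch: (counts[ch], ch))
    return ranked[:limit]


# Made-up sample data standing in for the real text items.
samples = ["Hello, world!", "a|b; c", "price: $5 (approx.)"]
candidates = least_used_delimiters(samples)
print(candidates)
```

A zero count on a sample only suggests that a character is rare, so the chosen delimiter should still be checked against each new item before it is stored.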

Hive load CSV with commas in quoted fields

感情迁移 posted on 2019-11-28 15:36:45
Question: I am trying to load a CSV file into a Hive table like so: CREATE TABLE mytable ( num1 INT, text1 STRING, num2 INT, text2 STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ","; LOAD DATA LOCAL INPATH '/data.csv' OVERWRITE INTO TABLE mytable; The CSV is delimited by a comma (,) and looks like this: 1, "some text, with comma in it", 123, "more text" This will return corrupt data since there is a ',' in the first string. Is there a way to set a text delimiter or make Hive ignore the ',' in

How do I use a dot as a delimiter?

不打扰是莪最后的温柔 posted on 2019-11-28 14:12:14
import java.util.Scanner;

public class Test {
    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        input.useDelimiter(".");
        String given = input.next();
        System.out.println(given);
    }
}

When I run the above code, type in asdf. and then press Enter, I get nothing. It works fine with "," ";" "\"" "\\\\" or whatever, but just not with "." ... So is there something special about a dot, or is it just a problem with the Eclipse IDE or whatever? Scanner uses a regular expression (regex) as its delimiter, and a dot . in regex is a special character that matches any character except line separators.
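The effect of an unescaped dot is not specific to Java and can be reproduced with any regex-based split. The sketch below uses Python's re module purely as an illustration. The variable names are assumptions, and the Java code itself would need its own escaping (for example the string "\\." or Pattern.quote(".")).

```python
import re

text = "asdf."

# Unescaped: '.' matches every character, so each character is consumed
# as a delimiter and only empty tokens remain.
unescaped = re.split(".", text)

# Escaped: only a literal dot is treated as the delimiter.
escaped = re.split(r"\.", text)

# re.escape builds the escaped pattern from an arbitrary literal string.
quoted = re.split(re.escape("."), text)

print(unescaped)
print(escaped)
print(quoted)
```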