delimiter

How to programmatically guess whether a CSV file is comma or semicolon delimited

為{幸葍}努か submitted on 2019-11-30 01:51:54
Question: In most cases, CSV files are text files with records delimited by commas. However, sometimes these files come semicolon delimited. (Excel will use semicolon delimiters when saving CSVs if the regional settings have the decimal separator set to a comma -- this is common in Europe. Ref: http://en.wikipedia.org/wiki/Comma-separated_values#Application_support) My question is: what is the best way for a program to guess whether a file is comma or semicolon separated? e.g. a line like 1,1;1
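A common way to automate the guess is to let Python's csv.Sniffer inspect a sample of the file and report the dialect, falling back to a simple character count if sniffing fails. A minimal sketch (the file name and the candidate delimiters are illustrative assumptions):

    import csv

    def guess_delimiter(path, candidates=",;"):
        """Guess the field delimiter of a CSV file using csv.Sniffer."""
        with open(path, newline="") as f:
            sample = f.read(4096)  # a few KB of the file is usually enough
        try:
            return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
        except csv.Error:
            # Sniffing failed: fall back to the candidate that occurs most often
            return max(candidates, key=sample.count)

    # print(guess_delimiter("export.csv"))  # -> ',' or ';'

Note that an ambiguous line such as 1,1;1 can defeat any heuristic, which is why the fallback is only a best guess.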

Hive load CSV with commas in quoted fields

和自甴很熟 submitted on 2019-11-29 20:15:22
I am trying to load a CSV file into a Hive table like so: CREATE TABLE mytable ( num1 INT, text1 STRING, num2 INT, text2 STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ","; LOAD DATA LOCAL INPATH '/data.csv' OVERWRITE INTO TABLE mytable; The CSV is delimited by a comma (,) and looks like this: 1, "some text, with comma in it", 123, "more text" This returns corrupt data, since there is a ',' in the first string. Is there a way to set a text delimiter or make Hive ignore the ',' inside quoted strings? I can't change the delimiter of the CSV since it gets pulled from an external source. The problem
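The usual fix is a quote-aware SerDe (e.g. Hive's OpenCSVSerde) rather than FIELDS TERMINATED BY. If changing the table definition is not an option, another workaround is to rewrite the file with a delimiter that never occurs in the data before loading it. A Python sketch of that pre-processing step (the file names and the '|' delimiter are assumptions):

    import csv

    # Rewrite the comma-delimited, quoted CSV as pipe-delimited so that
    # Hive's simple FIELDS TERMINATED BY '|' parses each field correctly.
    with open("data.csv", newline="") as src, \
         open("data_pipe.csv", "w", newline="") as dst:
        # skipinitialspace lets the quote after ", " still act as a quote character
        reader = csv.reader(src, skipinitialspace=True)
        writer = csv.writer(dst, delimiter="|", quoting=csv.QUOTE_MINIMAL)
        for row in reader:
            writer.writerow(row)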

How to split String before first comma?

为君一笑 submitted on 2019-11-29 18:57:41
Question: I have an overriding method that receives a String in the format "abc,cde,def,fgh". I want to split the string into two parts: the String before the first comma and the String after the first comma. My overriding method is: @Override protected void onPostExecute(String addressText) { placeTitle.setText(addressText); } How do I split the string into two parts so that I can use them to set the text of two different TextViews? Answer 1: You may use the following code snippet String str ="abc
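The underlying idea is a single split at the first comma. A tiny Python illustration of that idea (in Java the equivalent would be addressText.split(",", 2) or an indexOf/substring pair):

    address_text = "abc,cde,def,fgh"            # example value from the question

    before, after = address_text.split(",", 1)  # split only at the first comma
    print(before)   # abc
    print(after)    # cde,def,fgh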

Issues converting csv to xls in Java? Only core Java experience needed - question not related to import

亡梦爱人 submitted on 2019-11-29 18:17:18
First of all, I understand that it's unusual to want to up-convert like this, but please bear with me. We get these CSV files via website export and have no option to get them in any other form. Now, on to the question: I have some old code that does this conversion for me. It basically reads each line, then picks out each value between the commas. This worked great for some samples I converted, but with the samples actually given, some values were out of place. I opened the files in Notepad++ and realized that some of the cells themselves contained commas. CSV files
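The misplaced values come from splitting every line on every comma, including commas inside quoted cells; a quote-aware CSV parser avoids this. A small Python comparison of the two behaviours (the sample line is made up; in Java a CSV library such as OpenCSV would play the same role):

    import csv
    import io

    line = '1,"a cell, with a comma",123,"more text"'

    naive  = line.split(",")                       # breaks the quoted cell into two pieces
    proper = next(csv.reader(io.StringIO(line)))   # respects the quotes

    print(naive)   # ['1', '"a cell', ' with a comma"', '123', '"more text"']
    print(proper)  # ['1', 'a cell, with a comma', '123', 'more text']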

Import CSV File Error : Column Value containing column delimiter

社会主义新天地 submitted on 2019-11-29 18:09:37
I am trying to import a CSV file into SQL Server using SSIS. Here's an example of how the data looks: Student_Name,Student_DOB,Student_ID,Student_Notes,Student_Gender,Student_Mother_Name Joseph Jade,2005-01-01,1,Good listener,Male,Amy Amy Jade,2006-01-01,1,Good in science,Female,Amy .... The CSV columns do not contain text qualifiers (quotation marks). I created a simple SSIS package to import it into SQL, but sometimes the data in SQL ended up looking like this: Student_Name Student_DOB Student_ID Student_Notes Student_Gender Student_Mother_Name Ali Jade 2004-01-01 1 Good listener Bad in science Male,Lisa
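Because there are no text qualifiers and only the Student_Notes column can contain stray commas, one recovery approach is to take a fixed number of fields from each end of the line and re-join whatever is left over in the middle. A Python sketch of that idea (the column positions are assumptions taken from the sample header):

    def parse_student_row(line, leading=3, trailing=2):
        """Split a row where only the 4th column (Student_Notes) may contain commas."""
        parts = line.rstrip("\n").split(",")
        head  = parts[:leading]                          # Name, DOB, ID
        tail  = parts[len(parts) - trailing:]            # Gender, Mother_Name
        notes = ",".join(parts[leading:len(parts) - trailing])
        return head + [notes] + tail

    row = "Ali Jade,2004-01-01,1,Good listener, Bad in science,Male,Lisa"
    print(parse_student_row(row))
    # ['Ali Jade', '2004-01-01', '1', 'Good listener, Bad in science', 'Male', 'Lisa']

In SSIS the same split could be applied from a Script Component after reading each line as a single column, but that is one design choice among several, not the only option.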

What is the difference between `sep` and `delimiter` attributes in pandas.read_csv() method?

╄→尐↘猪︶ㄣ submitted on 2019-11-29 15:08:50
What is the difference between the sep and delimiter arguments of pandas.read_csv()? And in what situation would I choose one over the other? The documentation mentions Python's builtin sniffer tool, and under delimiter it says "alternative argument name for sep", so why can't we have just one argument? Confirmation that they are the same thing can be found in the source code: # Alias sep -> delimiter. if delimiter is None: delimiter = sep I agree with the other answer that it is best to stick to sep. It seems to be more commonly used, and it is more consistent with
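A quick demonstration that the two arguments behave identically, plus the sniffer behaviour the documentation alludes to (sep=None with the Python engine lets csv.Sniffer detect the delimiter); the data here is made up:

    import io
    import pandas as pd

    data = "a;b;c\n1;2;3\n4;5;6\n"

    df1 = pd.read_csv(io.StringIO(data), sep=";")          # sep ...
    df2 = pd.read_csv(io.StringIO(data), delimiter=";")    # ... and delimiter give the same result
    df3 = pd.read_csv(io.StringIO(data), sep=None, engine="python")  # auto-detect via csv.Sniffer

    assert df1.equals(df2) and df1.equals(df3)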

Stored Procedures Using MySQL Workbench

若如初见. submitted on 2019-11-29 14:42:12
I'm very new to the environment, and I have a question about a line that's added to the end of my code. The guide I'm following is: http://net.tutsplus.com/tutorials/an-introduction-to-stored-procedures/ If anyone has a better one on MySQL stored procedures, I'm all ears. Before I ask, this is the environment I'm using: OS: Windows 7 / WAMP (MySQL 5.5.24) / MySQL Workbench. I'm instructed to define a delimiter; in my case I'm sticking with the default '$$'. The stored procedure I created is: DELIMITER $$ CREATE PROCEDURE test.`p2` () LANGUAGE SQL DETERMINISTIC COMMENT 'Adds "nson" to first and

How to parse a csv that uses ^A (i.e. \001) as the delimiter with spark-csv?

蹲街弑〆低调 submitted on 2019-11-29 14:09:07
Question: I'm terribly new to Spark, Hive, big data, Scala and all that. I'm trying to write a simple function that takes an sqlContext, loads a CSV file from S3 and returns a DataFrame. The problem is that this particular CSV uses the ^A (i.e. \001) character as the delimiter, and the dataset is huge, so I can't just do a "s/\001/,/g" on it. Besides, the fields might contain commas or other characters I might otherwise use as a delimiter. I know that the spark-csv package that I'm using has a delimiter option,
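With spark-csv the delimiter option can be given the control character directly, so no pre-processing of the file is needed. A PySpark-flavoured sketch in the Spark 1.x style the question describes (the bucket path and the header/schema-inference options are assumptions):

    def load_ctrl_a_csv(sqlContext, path):
        """Load a CSV whose fields are separated by the ^A control character (0x01)."""
        return (sqlContext.read
                .format("com.databricks.spark.csv")
                .option("delimiter", "\u0001")   # ^A, i.e. chr(1), as the field separator
                .option("header", "false")
                .option("inferSchema", "true")
                .load(path))

    # df = load_ctrl_a_csv(sqlContext, "s3n://some-bucket/some/path/data.csv")  # hypothetical path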

Windows Batch: How to keep empty lines with loop for /f

三世轮回 submitted on 2019-11-29 11:38:45
I'm trying to find out how to keep empty lines when I read a file with a for /f loop. for /f "tokens=1* delims=[" %%i in ('type "test1.txt" ^| find /v /n ""') do ( SET tmp=%%i echo !tmp! >> test2.txt ) Apparently this works for everybody else, but in my case it does not. For instance, if the content of test1.txt is: Hello I come from France I live in Paris I'm sorry I don't know English, could we speak French please? If it doesn't bother you Thank you The result in test2.txt will be: [1 [2 [3 [4 [5 [6 [7 If I remove the "1" next to the star "*", the result is: [1]Hello I come from France [2]I live in Paris

Splitting textfile into sections with special delimiter line - python

谁说我不能喝 submitted on 2019-11-29 11:14:18
I have an input file such as: This is a text block start This is the end And this is another with more than one line and another line. The desired task is to read the file in sections delimited by some special line, in this case an empty line, e.g. [out]: [['This is a text block start', 'This is the end'], ['And this is another', 'with more than one line', 'and another line.']] I have been getting the desired output by doing so: def per_section(it): """ Read a file and yield sections using empty line as delimiter """ section = [] for line in it: if line.strip('\n'): section.append(line)
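For reference, a self-contained sketch of the same generator pattern, which groups lines between blank lines and makes sure the final section (with no trailing blank line) is not lost; the input file name in the usage comment is hypothetical:

    def per_section(lines):
        """Yield lists of stripped lines, using blank lines as section delimiters."""
        section = []
        for line in lines:
            if line.strip():                  # non-blank line: part of the current section
                section.append(line.rstrip("\n"))
            elif section:                     # blank line: close the current section
                yield section
                section = []
        if section:                           # emit the last section as well
            yield section

    # with open("input.txt") as f:
    #     print(list(per_section(f)))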