MySQL has a nice CSV import function, LOAD DATA INFILE.
I have a large dataset that needs to be imported from CSV on a regular basis, so this feature is exactly what I need. I've got a working script that imports my data perfectly.
...except I don't know in advance what the end-of-line terminator will be.
My SQL code currently looks something like this:
LOAD DATA INFILE '{fileName}'
INTO TABLE {importTable}
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
( {fieldList} );
This works great for some import files.
However, the import data is coming from multiple sources. Some of them have the \n terminator; others have \r\n. I can't predict which one I'll have.
Is there a way using LOAD DATA INFILE to specify that my lines may be terminated with either \n or \r\n? How do I deal with this?
I'd just pre-process it. A global search/replace to change \r\n to \n done from a command line tool as part of the import process should be simple and performant.
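As a rough illustration of that pre-processing step, here is a minimal Python sketch. The function name `normalize_line_endings` and the file paths are assumptions made for the example, not part of the original import script. It streams the file in binary mode, so it should cope with large files without loading them into memory.

```python
def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, rewriting every CRLF line ending as LF."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Iterating a binary file yields chunks that end at b"\n",
        # so a CRLF pair is never split across two iterations.
        for line in src:
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
            dst.write(line)
```

After this runs, the output file can be loaded with LINES TERMINATED BY '\n' regardless of where the original came from.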
You can specify the line separator as '\n' and, if necessary, remove a trailing '\r' from the last field during loading.
For example, suppose we have the file 'entries.txt'. Its line separator is '\r\n', except after the line ITEM2 | CLASS3 | DATE2, where the separator is '\n':
COL1 | COL2 | COL3
ITEM1 | CLASS1 | DATE1
ITEM2 | CLASS3 | DATE2
ITEM3 | CLASS1 | DATE3
ITEM4 | CLASS2 | DATE4
CREATE TABLE statement:
CREATE TABLE entries(
column1 VARCHAR(255) DEFAULT NULL,
column2 VARCHAR(255) DEFAULT NULL,
column3 VARCHAR(255) DEFAULT NULL
);
Our LOAD DATA INFILE query:
LOAD DATA INFILE 'entries.txt' INTO TABLE entries
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(column1, column2, @var)
SET column3 = TRIM(TRAILING '\r' FROM @var);
Show results:
SELECT * FROM entries;
+---------+----------+---------+
| column1 | column2 | column3 |
+---------+----------+---------+
| ITEM1 | CLASS1 | DATE1 |
| ITEM2 | CLASS3 | DATE2 |
| ITEM3 | CLASS1 | DATE3 |
| ITEM4 | CLASS2 | DATE4 |
+---------+----------+---------+
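To see why this copes with mixed terminators, the following Python sketch imitates the same steps outside the database: split on LF, skip the header, and strip a trailing CR from the last field only. It is an illustration of the idea, not how MySQL parses the file internally; `parse_rows` and `sample` are names made up for the example.

```python
def parse_rows(raw_text, field_sep="|"):
    """Split on LF, skip the header line, trim a trailing CR from the last field."""
    rows = []
    lines = raw_text.split("\n")
    for line in lines[1:]:                       # like IGNORE 1 LINES
        if not line:
            continue                             # nothing after the final terminator
        fields = line.split(field_sep)
        fields[-1] = fields[-1].rstrip("\r")     # like TRIM(TRAILING '\r' FROM @var)
        rows.append(fields)
    return rows


# Mixed terminators: CRLF everywhere except after the ITEM2 line.
sample = (
    "COL1|COL2|COL3\r\n"
    "ITEM1|CLASS1|DATE1\r\n"
    "ITEM2|CLASS3|DATE2\n"
    "ITEM3|CLASS1|DATE3\r\n"
)
rows = parse_rows(sample)
```

Lines that ended in '\r\n' leave a stray '\r' on the last field, which the trim removes; lines that ended in '\n' are unaffected.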
I'm assuming you need to do this through MySQL only, not through a programming language. Before running LOAD DATA, convert the file to Windows format \r\n (CR LF), for example with Notepad++ if you have it. Then run the LOAD DATA query, making sure it uses LINES TERMINATED BY '\r\n'.

Edit:
Editors are often unsuitable for converting larger files. For larger files, the following commands are often used on Windows and Linux:
1) To convert to Windows format on Windows:
TYPE [unix_file] | FIND "" /V > dos_file
2) To convert to Windows format on Linux:
unix2dos [file]
Other commands are also available.
A Windows-format file can be converted to Unix format by simply removing all ASCII CR (\r) characters:
tr -d '\r' < inputfile > outputfile
grep -PL '\r$' myfile.txt # lists the file if it is in Unix format (LF terminated)
grep -Pl '\r$' myfile.txt # lists the file if it is in Windows format (CRLF terminated)
On Linux/Unix, the file command detects the type of end-of-line (EOL) used, so the line-ending type can also be checked with that command.
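If the import is already driven by a script, the same check can be done there and used to pick the terminator for the query. The Python sketch below is one possible approach, assuming that sampling the first 64 KB is representative of the whole file; `detect_line_terminator` and `lines_clause` are hypothetical helper names.

```python
def detect_line_terminator(path, sample_size=65536):
    """Return CRLF if most sampled line endings are CRLF, otherwise LF."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    crlf = sample.count(b"\r\n")
    lf_only = sample.count(b"\n") - crlf   # LF characters not preceded by CR
    return "\r\n" if crlf > lf_only else "\n"


def lines_clause(terminator):
    """Render the terminator as a LINES TERMINATED BY clause for LOAD DATA INFILE."""
    escaped = terminator.replace("\r", "\\r").replace("\n", "\\n")
    return "LINES TERMINATED BY '" + escaped + "'"
```

The returned clause can then be substituted into the LOAD DATA INFILE statement in place of a hard-coded terminator. A file with genuinely mixed endings would still need the conversion or trimming approaches described in the other answers.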
You could also look into one of the data integration packages out there. Talend Open Studio has very flexible data input routines. For example, you could process the file with one set of delimiters, catch the rejects, and process them another way.
If the first load has 0 rows, run the same statement with the other line terminator. This should be doable with some basic counting logic.
At least it all stays in SQL, and if it works the first time you win. It could also cause less of a headache than re-scanning all the rows and removing a particular character.
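The counting logic could look roughly like the Python sketch below. It tries '\r\n' first, on the assumption that an LF-only file then looks like a single line, which IGNORE 1 LINES skips, giving 0 rows; trying '\n' first on a CRLF file would load rows with a stray '\r' instead of failing. `run_statement` is a hypothetical callable standing in for whatever database client is in use: it is assumed to execute one SQL string and return the number of rows loaded.

```python
LOAD_TEMPLATE = (
    "LOAD DATA INFILE '{file_name}' INTO TABLE {table} "
    "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
    "LINES TERMINATED BY '{terminator}' IGNORE 1 LINES"
)


def load_with_fallback(run_statement, file_name, table):
    """Try CRLF first, then LF; return (terminator_used, rows_loaded)."""
    # The terminators are written as SQL escape sequences, not raw characters.
    for terminator in ("\\r\\n", "\\n"):
        sql = LOAD_TEMPLATE.format(
            file_name=file_name, table=table, terminator=terminator
        )
        rows = run_statement(sql)   # hypothetical: executes sql, returns row count
        if rows > 0:
            return terminator, rows
    return None, 0
```

The file and table names are interpolated directly, as in the templated statement in the question, so they should come from trusted configuration rather than user input.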
Why not first just take a peek at how the lines end?
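// Peek at the first few lines and report which terminator characters each one contains.
$handle = fopen('inputFile.csv', 'r');
$i = 0;
if ($handle) {
    while (($buffer = fgets($handle)) !== false) {
        // Only the tail of the line matters for the terminator check.
        $s = substr($buffer, -50);
        echo $s;
        echo preg_match('/\r/', $s) ? 'cr ' : '-- ';
        echo preg_match('/\n/', $s) ? 'nl<br>' : '--<br>';
        // Stop after a handful of lines.
        if ($i++ > 5) {
            break;
        }
    }
    fclose($handle);
}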
You can use LINES STARTING BY to distinguish ordinary line endings inside the text from the start of a new row:
LOAD DATA LOCAL INFILE '/home/laptop/Downloads/field3-utf8.csv'
IGNORE INTO TABLE Field
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '^'
LINES STARTING BY '^'
TERMINATED BY '\r\n'
(Id, Form_id, Name, Value)
For ordinary CSV files that use " as the enclosing character, it would be:
...
LINES STARTING BY '"'
...
Source: https://stackoverflow.com/questions/10935219/mysql-load-data-infile-works-but-unpredictable-line-terminator