MySQL has a nice CSV import function, LOAD DATA INFILE.
I have a large dataset that needs to be imported from CSV on a regular basis, so this feature is
Why not first just take a peek at how the lines end?
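$handle = fopen('inputFile.csv', 'r');
$i = 0;
if ($handle) {
    // Inspect the tail of the first few lines and report which terminators appear.
    while (($buffer = fgets($handle)) !== false) {
        $s = substr($buffer, -50);   // last 50 bytes, including the line ending
        echo $s;
        echo preg_match('/\r/', $s) ? 'cr ' : '-- ';
        echo preg_match('/\n/', $s) ? 'nl<br>' : '--<br>';
        if ($i++ > 5) {
            break;                   // a handful of lines is enough
        }
    }
    fclose($handle);
}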
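If you want the peek to give you numbers rather than something to eyeball, here is a minimal Python sketch of the same idea. The function name `count_line_endings` and the `max_lines` cap are my own assumptions, not anything MySQL or the snippet above provides.

```python
from collections import Counter
from itertools import islice


def count_line_endings(path, max_lines=1000):
    """Tally the terminators of the first max_lines lines of a file."""
    counts = Counter()
    # Binary mode stops Python from translating CRLF to LF while reading.
    with open(path, "rb") as handle:
        # Iterating a binary file splits on LF only, so a CR before it is preserved.
        for line in islice(handle, max_lines):
            if line.endswith(b"\r\n"):
                counts["crlf"] += 1
            elif line.endswith(b"\n"):
                counts["lf"] += 1
            else:
                counts["none"] += 1  # final line with no terminator
    return counts
```

If both the "crlf" and "lf" counts come back non-zero, the file mixes terminators and a single LINES TERMINATED BY value will not fit every row.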
I'd just pre-process it. A global search/replace to change \r\n to \n done from a command line tool as part of the import process should be simple and performant.
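As a rough sketch of that pre-processing step, assuming a small Python script is acceptable as the command line tool (the function name `normalize_line_endings` and both path arguments are hypothetical):

```python
def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, rewriting every CRLF terminator as LF.

    Returns the number of lines that were converted.
    """
    converted = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Binary iteration splits on LF only and streams line by line,
        # so the whole file never has to fit in memory.
        for line in src:
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
                converted += 1
            dst.write(line)
    return converted
```

After this runs, the output file can be loaded with LINES TERMINATED BY '\n' alone. It works on raw bytes, so it does not care about the file's character encoding, but it will also rewrite a CRLF that sits inside a quoted field.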
You can use LINES STARTING BY to distinguish ordinary line endings inside field text from the start of a new row:
LOAD DATA LOCAL INFILE '/home/laptop/Downloads/field3-utf8.csv'
IGNORE INTO TABLE Field
FIELDS
    TERMINATED BY ';'
    OPTIONALLY ENCLOSED BY '^'
LINES
    STARTING BY '^'
    TERMINATED BY '\r\n'
(Id, Form_id, Name, Value);
For typical CSV files that use " as the enclosing character, it becomes:
...
LINES STARTING BY '"'
...
If the first load returns 0 rows, run the same statement again with the other line terminator. This should be doable with some basic counting logic.
At least it all stays in SQL, and if it works the first time, you win. It could also cause less of a headache than re-scanning all the rows and removing a particular character.
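The counting logic could look something like the Python sketch below. `run_load` is a hypothetical callable standing in for whatever executes the LOAD DATA statement with a given terminator and returns the number of rows inserted; it is not a real driver function. The `rows > 0` check is taken straight from the suggestion above, so treat it as an assumption and verify how your file behaves with the wrong terminator.

```python
def load_with_fallback(run_load, terminators=("\r\n", "\n")):
    """Try each line terminator in turn.

    Returns (terminator, row_count) for the first load that inserts rows,
    or (None, 0) if every attempt comes back empty.
    """
    for terminator in terminators:
        # run_load is hypothetical: it should execute LOAD DATA with
        # LINES TERMINATED BY set to `terminator` and return the row count.
        rows = run_load(terminator)
        if rows > 0:
            return terminator, rows
    return None, 0
```

If the target table is not empty beforehand, `run_load` would need to report only the rows added by that attempt, for example from the affected-row count of the statement.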
You could also look into one of the data integration packages out there. Talend Open Studio has very flexible data input routines. For example, you could process the file with one set of delimiters, catch the rejects, and process those another way.
You can specify '\n' as the line separator and, where necessary, strip the trailing '\r' from the last field during loading.
For example:
Suppose we have the file 'entries.txt'. Its line separator is '\r\n', except after the line ITEM2 | CLASS3 | DATE2, where the separator is '\n':
COL1 | COL2 | COL3
ITEM1 | CLASS1 | DATE1
ITEM2 | CLASS3 | DATE2
ITEM3 | CLASS1 | DATE3
ITEM4 | CLASS2 | DATE4
CREATE TABLE statement:
CREATE TABLE entries(
    column1 VARCHAR(255) DEFAULT NULL,
    column2 VARCHAR(255) DEFAULT NULL,
    column3 VARCHAR(255) DEFAULT NULL
);
Our LOAD DATA INFILE query:
LOAD DATA INFILE 'entries.txt' INTO TABLE entries
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(column1, column2, @var)
SET column3 = TRIM(TRAILING '\r' FROM @var);
Show results:
SELECT * FROM entries;
+---------+----------+---------+
| column1 | column2  | column3 |
+---------+----------+---------+
| ITEM1   | CLASS1   | DATE1   |
| ITEM2   | CLASS3   | DATE2   |
| ITEM3   | CLASS1   | DATE3   |
| ITEM4   | CLASS2   | DATE4   |
+---------+----------+---------+