Bulk insert, SQL Server 2000, unix linebreaks

旧时难觅i 2020-12-13 13:05

I am trying to bulk insert a .csv file that uses Unix line breaks into a database. The command I am running is:

BULK INSERT table_name
FROM 'C:\file.csv'
WITH 
( 
         


        
8 Answers
  • 2020-12-13 13:41

    It comes down to this. Unix uses LF (Ctrl-J), while MS-DOS/Windows uses CR/LF (Ctrl-M/Ctrl-J).

    When you use '\n' on Unix, it gets translated to an LF character. On MS-DOS/Windows it gets translated to CR/LF. When your import runs on the Unix-formatted file, it sees only an LF. Hence, it's often easier to run the file through unix2dos first. But as you said in your original question, you don't want to do this (I'll assume there is a good reason why you can't).
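
To check which convention a given file actually uses before importing it, the terminators can be counted directly. The following is a minimal Python sketch run outside SQL Server; the function name and the sample data are hypothetical and only illustrate the LF versus CR/LF difference described above.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR/LF pairs, bare LF and bare CR in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf   # LF not preceded by CR
    cr_only = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf_only, "CR": cr_only}


# Hypothetical two-row samples, one per convention.
unix_sample = b"1,alice\n2,bob\n"
dos_sample = b"1,alice\r\n2,bob\r\n"

print(count_line_endings(unix_sample))  # {'CRLF': 0, 'LF': 2, 'CR': 0}
print(count_line_endings(dos_sample))   # {'CRLF': 2, 'LF': 0, 'CR': 0}
```

For a real file, the bytes would come from something like open(path, "rb").read(), where path is whatever file is being imported.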

    Why can't you do:

    (ROWTERMINATOR = CHAR(10))
    

    Probably because when the SQL code is parsed, char(10) is not replaced with the LF character (because it's already encased in single quotes). Or perhaps it's being interpreted as:

    (ROWTERMINATOR =
         )
    

    What happens when you echo out the contents of @bulk_cmd?

  • 2020-12-13 13:42

    Thanks to all who have answered but I found my preferred solution.

    When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown below.

    DECLARE @bulk_cmd varchar(1000)
    SET @bulk_cmd = 'BULK INSERT table_name
    FROM ''C:\file.csv''
    WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = '''+CHAR(10)+''')'
    EXEC (@bulk_cmd)
    

    Why you can't say BULK INSERT ...(ROWTERMINATOR = CHAR(10)) is beyond me. It doesn't look like you can evaluate any expressions in the WITH section of the command.

    What the above does is build the command as a string and execute it, neatly sidestepping the need to create an additional file or go through extra steps.
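
To make it easier to see what text EXEC ends up receiving, the same concatenation can be mimicked outside SQL Server. This is only an illustrative Python sketch: build_bulk_cmd is a hypothetical helper, chr(10) stands in for T-SQL's CHAR(10), and no quote escaping is attempted, so it should not be fed untrusted input.

```python
LF = chr(10)  # same character that T-SQL CHAR(10) produces


def build_bulk_cmd(table: str, path: str,
                   field_term: str = ",", row_term: str = LF) -> str:
    """Assemble the statement text with a literal LF between the quotes."""
    return (
        f"BULK INSERT {table}\n"
        f"FROM '{path}'\n"
        f"WITH (FIELDTERMINATOR = '{field_term}', ROWTERMINATOR = '{row_term}')"
    )


cmd = build_bulk_cmd("table_name", r"C:\file.csv")

# repr() makes the embedded line feed visible as \n instead of breaking the line.
print(repr(cmd))
```

The point of the sketch is that the row terminator inside the final statement is a real line-feed character, not the two-character sequence backslash plus n.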

  • 2020-12-13 13:44

    One option would be to use bcp, and set up a control file with '\n' as the line break character.

    Although you've indicated that you would prefer not to, another option would be to use unix2dos to pre-process the file into one with '\r\n' line breaks.

    Finally, you can use the FORMATFILE option on BULK INSERT. This will use a bcp control file to specify the import format.
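
If pre-processing is acceptable but unix2dos itself is not available, the conversion it performs is small enough to sketch. The Python below is an assumption-laden stand-in, not the unix2dos tool: lf_to_crlf is a hypothetical name, and it works on the whole file contents in memory.

```python
def lf_to_crlf(data: bytes) -> bytes:
    """Convert bare LF line breaks to CR/LF.

    Existing CR/LF pairs are normalized to LF first so that they are
    not turned into CR CR LF by the second replacement.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# Hypothetical sample mixing both conventions.
mixed = b"1,alice\n2,bob\r\n"
print(lf_to_crlf(mixed))  # b'1,alice\r\n2,bob\r\n'
```

Files must be opened in binary mode ("rb" and "wb") when applying this, otherwise Python's own newline translation interferes with the result.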

  • 2020-12-13 13:47

    I felt compelled to contribute because I was having the same issue: I need to read two UNIX files from SAP at least a couple of times a day. Instead of using unix2dos, I needed something automated that requires less manual intervention.

    As noted, Char(10) works within the SQL string. I didn't want to use a SQL string, so I tried ''''+Char(10)+'''', but for some reason this didn't compile.

    What worked very slickly was: with (ROWTERMINATOR = '0x0a')

    Problem solved with Hex!

    Hope this helps someone.
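
For readers unsure why that hex value is the right one: 0x0a is simply 10 in hexadecimal, the code of the line-feed character. A tiny Python sketch of that arithmetic follows; decode_hex_terminator is a hypothetical helper and says nothing about how SQL Server parses the option internally.

```python
def decode_hex_terminator(spec: str) -> str:
    """Turn a hex string such as '0x0a' into the character it encodes."""
    return chr(int(spec, 16))  # int() accepts the 0x prefix when base is 16


terminator = decode_hex_terminator("0x0a")
print(repr(terminator))        # '\n'
print(terminator == chr(10))   # True
```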

  • 2020-12-13 13:47

    I confirm that the syntax

    ROWTERMINATOR = '''+CHAR(10)+'''
    

    works when used with an EXEC command.

    If you have multiple ROWTERMINATOR characters (e.g. a pipe and a unix linefeed) then the syntax for this is:

    ROWTERMINATOR = '''+CHAR(124)+''+CHAR(10)+'''
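
As a quick check of what those two codes stand for, the same pair can be assembled and applied to sample data in Python. This is only an illustration under assumed data; terminator_from_codes is a hypothetical helper, and the splitting shown here is not how BULK INSERT is implemented.

```python
def terminator_from_codes(*codes: int) -> str:
    """Join character codes into one terminator string, like CHAR(a)+CHAR(b)."""
    return "".join(chr(c) for c in codes)


row_terminator = terminator_from_codes(124, 10)  # pipe followed by LF
print(repr(row_terminator))  # '|\n'

# Hypothetical file contents whose rows end in a pipe plus a Unix line feed.
sample = "1,alice|\n2,bob|\n"
rows = [r for r in sample.split(row_terminator) if r]
print(rows)  # ['1,alice', '2,bob']
```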
    
  • 2020-12-13 13:48

    It's a bit more complicated than that! When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown above. I have just spent the best part of an hour figuring out why \n doesn't really mean \n when used with BULK INSERT!
