SSIS reading LF as terminator when it's set as CRLF

走了就别回头了 2020-11-30 14:28

Using SSIS 2012. In my flat file connection manager I have a delimited file where the row delimiter is set to CRLF, but when it processes the file, I have a text c

5 answers
  •  南方客 (original poster)
    2020-11-30 14:55

    Before answering: I don't think the column contains only LF, because if the row delimiter is set to CRLF, a lone LF will not be treated as a delimiter. So it is probably CRLF, but I will give a solution that covers both cases (CRLF or LF).
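
    To check which of the two cases applies to a given file, the line endings can simply be counted. The sketch below is a standalone VB.NET console program, not part of the SSIS package; the helper name CountLineEndings and the sample text are made up for illustration. It counts CRLF pairs separately from LF characters that are not preceded by CR.

    ```vbnet
    Imports System
    Imports Microsoft.VisualBasic

    Module LineEndingCountDemo

        ' Hypothetical helper: counts CRLF pairs and LF characters
        ' that are not preceded by CR ("bare" LF).
        Sub CountLineEndings(ByVal content As String,
                             ByRef crlfCount As Integer,
                             ByRef bareLfCount As Integer)
            crlfCount = 0
            bareLfCount = 0
            For i As Integer = 0 To content.Length - 1
                If content(i) = ControlChars.Lf Then
                    If i > 0 AndAlso content(i - 1) = ControlChars.Cr Then
                        crlfCount += 1
                    Else
                        bareLfCount += 1
                    End If
                End If
            Next
        End Sub

        Sub Main()
            ' Illustrative sample: two CRLF row endings and one LF inside a text column
            Dim sample As String = "1;John;2001-01-01;;1" & vbCrLf &
                "2;Moh;2002-01-01;Very cool" & vbLf &
                "Genius;2" & vbCrLf

            Dim crlfCount As Integer
            Dim bareLfCount As Integer
            CountLineEndings(sample, crlfCount, bareLfCount)

            Console.WriteLine("CRLF: " & crlfCount.ToString() &
                              ", bare LF: " & bareLfCount.ToString())
        End Sub

    End Module
    ```

    For the sample text this reports 2 CRLF endings and 1 bare LF. To inspect a real file, the sample string could be replaced by the result of System.IO.File.ReadAllText for that file.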

    Solution

    You can fix this situation with the following steps:

    1. First, in the Flat File connection manager, add only one column (of type DT_STR and length 4000), so that each row is read as a single column.
    2. In the Data Flow Task, add a Script Component that fixes the file structure and splits each row into columns.

    Simple Test

    I will consider a flat file with the following content:

    ID;name;DOB;Notes;ClassID{CRLF}
    1;John;2001-01-01;;1{CRLF}
    2;Moh;2002-01-01;Very cool{LF}
    Genius;2{CRLF}
    3;Ali;2000-01-01;Calm;2{CRLF}
    
    1. First, I will add a Flat File connection manager with the following options:
      • Row Delimiter = {CRLF}
      • Header Row Delimiter = {CRLF}

    2. In the Data Flow Task I will add a Flat File Source, 2 x Script Component and an OLE DB Destination.

    3. In the first Script Component I will mark Column0 as input and set the output's Synchronous Input to None. Its script writes to a single output column, NewRow; the 5 output columns ID, Name, DOB, Notes, ClassID are filled by the second Script Component, so that is where they need to be added.

    4. In the first Script Component I will write a script that stores each line in a memory variable and assigns it to an output row once the row is complete and another row is present.

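      ' Buffer that accumulates the pieces of a row until it is complete
      Dim strLine As String = String.Empty

      ' Column delimiter used in the flat file
      Dim strDelimiter As String = ";"

      ' Clear the buffered line
      Public Sub EmptyMemoryVariables()
          strLine = String.Empty
      End Sub

      ' Write the buffered line to the output as a new row
      Public Sub AssignMemoryVariablesToOutput()
          With Output0Buffer
              .AddRow()
              .NewRow = strLine
          End With
      End Sub

      Public Function AreVariablesEmpty() As Boolean
          Return strLine = ""
      End Function

      Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

          Dim strColumns As String() = Row.Column0.Split(CChar(strDelimiter))

          If strColumns.Length = 5 Then
              ' The incoming line is already a complete row:
              ' flush whatever is still buffered, then emit this line
              If Not AreVariablesEmpty() Then
                  AssignMemoryVariablesToOutput()
                  EmptyMemoryVariables()
              End If

              strLine = Row.Column0
              AssignMemoryVariablesToOutput()
              EmptyMemoryVariables()
          Else
              ' The incoming line is a fragment: if the buffer already holds
              ' a complete row, emit it first, then append the fragment
              If strLine.Split(CChar(strDelimiter)).Length = 5 Then
                  AssignMemoryVariablesToOutput()
                  EmptyMemoryVariables()
              End If

              ' Note: a row still buffered when the input ends
              ' is not emitted by this code
              strLine &= Row.Column0
          End If

      End Sub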
      
    5. In the second Script Component I will split each row into columns:

        ' Column delimiter used in the flat file
        Dim strDelimiter As String = ";"

        Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

            ' NewRow holds one repaired line produced by the first Script Component
            Dim strColumns As String() = Row.NewRow.Split(CChar(strDelimiter))

            Row.ID = strColumns(0)
            Row.NAME = strColumns(1)
            Row.DOB = strColumns(2)
            Row.NOTES = strColumns(3)
            Row.CLASSID = strColumns(4)

        End Sub
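
    The merging rule of the first Script Component can be tried outside SSIS. The following is only a sketch: a standalone VB.NET console program in which the function name MergeBrokenRows and its parameters are hypothetical, and a plain list stands in for Output0Buffer. It applies the same rule to the sample file, as the Flat File Source would hand it over if the stray LF is treated as a row break, and additionally emits a row that is still buffered when the input ends.

    ```vbnet
    Imports System
    Imports System.Collections.Generic

    Module MergeBrokenRowsDemo

        ' Hypothetical helper: rebuilds complete rows from physical lines,
        ' using the same rule as the first Script Component.
        Function MergeBrokenRows(ByVal lines As IEnumerable(Of String),
                                 ByVal delimiter As Char,
                                 ByVal expectedColumns As Integer) As List(Of String)
            Dim result As New List(Of String)()
            Dim pending As String = String.Empty

            For Each line As String In lines
                If line.Split(delimiter).Length = expectedColumns Then
                    ' Complete row: flush the buffer first, then emit the line
                    If pending <> String.Empty Then
                        result.Add(pending)
                        pending = String.Empty
                    End If
                    result.Add(line)
                Else
                    ' Fragment: emit the buffer if it is already complete,
                    ' then append the fragment to the buffer
                    If pending.Split(delimiter).Length = expectedColumns Then
                        result.Add(pending)
                        pending = String.Empty
                    End If
                    pending &= line
                End If
            Next

            ' Assumption beyond the original script: emit what is left at end of input
            If pending <> String.Empty Then
                result.Add(pending)
            End If

            Return result
        End Function

        Sub Main()
            Dim physicalLines As String() = {
                "ID;name;DOB;Notes;ClassID",
                "1;John;2001-01-01;;1",
                "2;Moh;2002-01-01;Very cool",
                "Genius;2",
                "3;Ali;2000-01-01;Calm;2"
            }

            For Each row As String In MergeBrokenRows(physicalLines, ";"c, 5)
                Console.WriteLine(row)
            Next
        End Sub

    End Module
    ```

    The two fragments come out as one row, 2;Moh;2002-01-01;Very coolGenius;2. Note that the line break inside the Notes value is dropped, so the two parts are joined without a separator. Inside the Script Component, a comparable end-of-input flush would probably belong in an override of Input0_ProcessInput that checks Buffer.EndOfRowset(); that part is not covered by this sketch.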
    

    Important Note: the provided code is not optimal. It may need more validation, or it could be made simpler and better, but I am trying to show you a way to think about solving this issue.
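
    As one example of the extra validation mentioned above, the split step could check the column count before indexing into the array, since strColumns(4) throws if a row has fewer than 5 columns. This is a minimal standalone sketch; the helper TrySplitRow is hypothetical and not part of the original scripts.

    ```vbnet
    Imports System

    Module SplitRowDemo

        ' Hypothetical helper: splits a line and reports whether it has
        ' exactly the expected number of columns.
        Function TrySplitRow(ByVal line As String,
                             ByVal delimiter As Char,
                             ByVal expectedColumns As Integer,
                             ByRef columns As String()) As Boolean
            ' Treat a missing (Nothing) line as an empty string
            columns = If(line, String.Empty).Split(delimiter)
            Return columns.Length = expectedColumns
        End Function

        Sub Main()
            Dim columns As String() = Nothing

            If TrySplitRow("3;Ali;2000-01-01;Calm;2", ";"c, 5, columns) Then
                Console.WriteLine("ClassID = " & columns(4))
            End If

            If Not TrySplitRow("Genius;2", ";"c, 5, columns) Then
                Console.WriteLine("Skipped row with " & columns.Length.ToString() & " column(s)")
            End If
        End Sub

    End Module
    ```

    In the second Script Component the same check could guard the assignments to Row.ID through Row.CLASSID, with rejected rows logged or redirected as the package requires.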
