How can I load in a pipe (|) delimited text file that has columns that sometimes contain line breaks?

Submitted by 落爺英雄遲暮 on 2020-01-06 06:24:27

Question


I have built an SSIS package that loads several delimited text files into a SQL database. One of the files often contains line breaks inside a field, which breaks the standard data flow of a Flat File Source mapped to an ADO.NET destination, because the source treats every line break as the start of a new row. The vendor sending the files does not want to make any edits to them and can't do XML at this time. Is there any way to fix this?

I was thinking of writing a small VB.NET program that would correct the files so they work in the SSIS package, but I'm not sure how to write that logic. The file has 5 columns: the first 2 are big integers and always contain a long integer ID, then there is a small text column that contains one short word, then a date, and then a long comments field that is causing the problem. The comments field is sometimes blank (which is OK); the problem is the rows whose comments contain line breaks. I never know how many line breaks a comment contains: some have none, some have several, even multiple line breaks in a row. So I was wondering if this is even possible.

5787626|6547599|Approved|1/10/2017|Applicant request for fee waiver approved
5443221|7742812|Active|11/5/2013|
3430962|7643957|Re-Scheduled|5/25/2016|REVISED TERMS AND CONDITIONS REJECTED Applicant has 30 DAYS To submit paperwork for extension.
34433624|7673715|Denied|1/24/2017|
34113575|7653748|Active|1/8/2014|New terms have been granted.

Sample File Format.


Answer 1:


As long as there is a rule you can program for (some predictable way to tell where a record starts), this is possible.

I would do it using a Script Component as a source, which means you don't need to rewrite the file before processing it. It also provides a lot of flexibility, e.g., you can store values in variables while iterating over multiple lines in the file, etc.

I posted another answer recently that gives a lot of detail on how to go about this: SSIS import a Flat File to SQL with the first row as header and last row as a total.

Here is an example of holding the values in variables until the row is ready to be written.

For this example I am writing three columns, ID1, ID2 and Comments. The file looks like this:

1|2|Comment1
Comment2
4|5|Comment3
Comment4
Comment5
6|7|Comment6

The Script Component contains the following method.

public override void CreateNewOutputRows()
{
    System.IO.StreamReader reader = null;

    try
    {
        bool readFirstLine = false;
        int id1 = 0;
        int id2 = 0;
        string comments = null;

        reader = new System.IO.StreamReader(Variables.FilePath); // this refers to a package variable that contains the file path

        while (!reader.EndOfStream)
        {
            string line = reader.ReadLine();

            if (line.Contains("|"))
            {
                if (readFirstLine)
                {
                    Output0Buffer.AddRow();

                    Output0Buffer.ID1 = id1;
                    Output0Buffer.ID2 = id2;
                    Output0Buffer.Comments = comments;
                }
                else
                {
                    readFirstLine = true;
                }

                string[] fields = line.Split('|');

                id1 = Convert.ToInt32(fields[0]);
                id2 = Convert.ToInt32(fields[1]);
                comments = fields[2];
            }
            else
            {
                comments += " " + line;
            }

            if (reader.EndOfStream)
            {
                Output0Buffer.AddRow();

                Output0Buffer.ID1 = id1;
                Output0Buffer.ID2 = id2;
                Output0Buffer.Comments = comments;
            }
        }
    }
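    finally
    {
        // Always release the file handle, on success as well as on failure.
        // Any exception still propagates to SSIS because nothing catches it here.
        if (reader != null)
        {
            reader.Dispose();
        }
    }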
}

The result set is:

ID1    ID2    Comments
===    ===    ========
1      2      Comment1 Comment2
4      5      Comment3 Comment4 Comment5
6      7      Comment6
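
The same idea can be adapted to the five-column file from the question. The following is only a sketch under stated assumptions, not a tested drop-in: `PipeRecordReader` and `ReadRecords` are hypothetical names, and it assumes that a new record always starts with two integer IDs each followed by a pipe, and that no continuation line of a comment starts that way. Instead of checking whether the line contains a pipe, it uses that pattern to detect the start of a record, so a pipe inside a comment does not start a new row.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class PipeRecordReader
{
    // Assumption: every record begins with "<digits>|<digits>|".
    private static readonly Regex RecordStart = new Regex(@"^\d+\|\d+\|");

    // Yields one string[5] per logical record:
    // ID1, ID2, status word, date, comments (continuation lines joined by a space).
    public static IEnumerable<string[]> ReadRecords(TextReader reader)
    {
        string[] current = null;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            if (RecordStart.IsMatch(line))
            {
                // A new record begins, so the previous one is complete.
                if (current != null)
                {
                    yield return current;
                }

                // Split into at most 5 parts so pipes inside the comment are kept.
                string[] parts = line.Split(new[] { '|' }, 5);
                current = new[] { "", "", "", "", "" };
                Array.Copy(parts, current, parts.Length);
            }
            else if (current != null)
            {
                // Continuation of the comments field; blank lines are skipped.
                string extra = line.Trim();
                if (extra.Length > 0)
                {
                    current[4] = current[4].Length == 0 ? extra : current[4] + " " + extra;
                }
            }
        }

        // Emit the final record once the input is exhausted.
        if (current != null)
        {
            yield return current;
        }
    }
}
```

Inside `CreateNewOutputRows()` this could be driven by a `StreamReader` opened in a `using` block, with one `Output0Buffer.AddRow()` per yielded array. Since the question describes the ID columns as big integers, `Convert.ToInt64` is probably a safer conversion than `Convert.ToInt32` for the first two fields.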


Source: https://stackoverflow.com/questions/46779816/how-can-i-load-in-a-pipe-delimited-text-file-that-has-columns-that-sometimes
