How to split text lines by fixed width in C#

落爺英雄遲暮 submitted on 2019-12-11 21:26:17

Question


Does anyone know how to split this file?

1 TESTAAA      SERNUM    A DESCRIPTION
2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION
3 TESTXXX      BLAHBL

Each column has a fixed width, and I'm planning to do it with a regex, but I don't know exactly how.

The layout is:

{id} {firsttext} {serialhere} {description}
 4    22          6            30+

Someone recommended a pattern like (.{4})(.{22})(.{6})(.+)? followed by Split(' '), but the same user said this won't work when a column has no value, and didn't give an example.

I've also heard about TextFieldParser, but it reportedly has some performance issues.

Can anyone tell me how to split by fixed width?

Thanks.


Answer 1:


Without seeing any reason not to, I would probably just use Substring.

Having said that, the Regex should work too.

The following example works on the input shown (rather than the widths you've given). It assumes the serial number is a required field that may not take up its entire width, and that the description is optional. Adjust it along the same lines if these assumptions are incorrect.

string input = @"1 TESTAAA      SERNUM    A DESCRIPTION
2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION
3 TESTXXX      BLAHBL";

var split = input.Split('\n').Select(s => new {
        Id = s.Substring(0, 2),
        FirstText = s.Substring(2, 13),
        Serial = s.Substring(15, Math.Min(s.Length-15, 10)),
        Description = s.Length > 25 ? s.Substring(25) : String.Empty
 });

Or as an explanatory example with more obvious naming and a slightly clearer example for serial length:

int idStart = 0;
int idLength = 2;
int firstTextStart = idStart + idLength;
int firstTextLength = 13;
int serialStart = firstTextStart + firstTextLength;
int serialLength = 10;
int descriptionStart = serialStart + serialLength;

var verboseSplit = input.Split('\n').Select(s => new {
    Id = s.Substring(idStart, idLength),
    FirstText = s.Substring(firstTextStart, firstTextLength),
    // Take the full serial width only when the line continues past it.
    Serial = s.Length > descriptionStart
               ? s.Substring(serialStart, serialLength)
               : s.Substring(serialStart),
    Description = s.Length > descriptionStart
                    ? s.Substring(descriptionStart)
                    : String.Empty
});

The output from either:

Id FirstText     Serial     Description
1  TESTAAA       SERNUM     A DESCRIPTION
2  TESTBBB       ANOTHR     ANOTHER DESCRIPTION
3  TESTXXX       BLAHBL
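The Substring calls above throw an ArgumentOutOfRangeException if a line is shorter than the start of a field. As a minimal sketch, assuming the same 2/13/10 widths, a small helper can clamp every slice to the line length. The names FixedWidthSketch and Slice are hypothetical and not part of the original answer.

```csharp
using System;

public static class FixedWidthSketch
{
    // Hypothetical helper: returns up to 'length' characters starting at 'start',
    // trimmed, or an empty string when the line ends before 'start'.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    public static void Main()
    {
        string[] lines =
        {
            "1 TESTAAA      SERNUM    A DESCRIPTION",
            "2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION",
            "3 TESTXXX      BLAHBL",
        };

        foreach (string line in lines)
        {
            string id = Slice(line, 0, 2);
            string firstText = Slice(line, 2, 13);
            string serial = Slice(line, 15, 10);
            // int.MaxValue here means "everything that is left on the line".
            string description = Slice(line, 25, int.MaxValue);
            Console.WriteLine($"[{id}] [{firstText}] [{serial}] [{description}]");
        }
    }
}
```

Because each slice is trimmed, a trailing '\r' left over from splitting Windows line endings on '\n' is dropped as well.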



Answer 2:


Based on your sample, try this. It assumes there is a single space between each item:

{id} {firsttext} {serialhere} {description}
 4    22          6            30+

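string target = "1    TESTAAA                SERNUM A DESCRIPTION";
// Regex.Split also returns the text captured by each group, so the list holds
// the fields, the one-character separators, and an empty string at each end.
List<string> result = new List<string>(Regex.Split(target, @"(.{4})(.{1})(.{22})(.{1})(.{6})(.{1})(.+)?", RegexOptions.Singleline));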
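If only the field values are wanted, Regex.Match with named groups may be easier to consume than Regex.Split. The following is a sketch under the same 4/22/6 layout with single-space separators; it additionally assumes the serial may be shorter than 6 characters and the description may be missing. The class, method, and group names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedWidthRegexSketch
{
    // Assumed layout: 4 + 22 + up to 6 characters, separated by single spaces,
    // followed by an optional description.
    private static readonly Regex Row = new Regex(
        @"^(?<id>.{4}) (?<first>.{22}) (?<serial>.{1,6})(?: (?<desc>.*))?$");

    // Returns the four trimmed fields, or null when the line does not fit the layout.
    public static string[] Parse(string line)
    {
        Match m = Row.Match(line.TrimEnd('\r', '\n'));
        if (!m.Success) return null;
        return new[]
        {
            m.Groups["id"].Value.Trim(),
            m.Groups["first"].Value.Trim(),
            m.Groups["serial"].Value.Trim(),
            m.Groups["desc"].Value.Trim(), // empty string when the group did not match
        };
    }

    public static void Main()
    {
        // PadRight builds the sample rows so the column widths are exact.
        string full = "1".PadRight(4) + " " + "TESTAAA".PadRight(22) + " SERNUM A DESCRIPTION";
        string noDescription = "3".PadRight(4) + " " + "TESTXXX".PadRight(22) + " BLAHBL";

        Console.WriteLine(string.Join("|", Parse(full)));
        Console.WriteLine(string.Join("|", Parse(noDescription)));
    }
}
```

The optional non-capturing group is what lets a row without a description still match, which is the case the question says the original pattern did not cover.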



Answer 3:


How about this functional approach?

Start with these arrays:

var lines = new []
{
    "1 TESTAAA      SERNUM    A DESCRIPTION",
    "2 TESTBBB      ANOTHR    ANOTHER DESCRIPTION",
    "3 TESTXXX      BLAHBL",
};

var splits = new [] { 2, 13, 10, };

The splits I've used differ from those in your question because the field lengths in the sample lines don't match your widths.

Now define a recursive function to do the splitting of each line:

Func<string, IEnumerable<int>, IEnumerable<string>> f = null;
f =
    (t, ns) =>
    {
        if (ns.Any())
        {
            var n = ns.First();
            var i = System.Math.Min(n, t.Length);
            var t0 = t.Substring(0, i);
            var t1 = t.Substring(i);
            return new [] { t0.Trim(), }.Concat(f(t1, ns.Skip(1)));
        }
        else
            return new [] { t.Trim(), };
    };
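For comparison, the recursive function above can also be written as a plain loop. This is only a sketch of an equivalent approach, not the answer's own code; the names FixedWidthLoopSketch and SplitByWidths are hypothetical. It assumes non-negative widths and, like f, returns one extra field holding whatever is left after the last width.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthLoopSketch
{
    // Cuts 'line' into consecutive fields of the given widths, clamping each
    // cut to the remaining length, then appends the remainder as a final field.
    public static List<string> SplitByWidths(string line, IReadOnlyList<int> widths)
    {
        var fields = new List<string>();
        int position = 0;
        foreach (int width in widths)
        {
            int take = Math.Min(width, line.Length - position);
            fields.Add(line.Substring(position, take).Trim());
            position += take;
        }
        fields.Add(line.Substring(position).Trim());
        return fields;
    }

    public static void Main()
    {
        var splits = new[] { 2, 13, 10 };
        foreach (string field in SplitByWidths("3 TESTXXX      BLAHBL", splits))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```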

Finally, we can write a fairly trivial LINQ query to pull it all together:

var query =
    from line in lines
    let fields = f(line, splits).ToArray()
    select new
    {
        id = fields[0],
        firsttext = fields[1],
        serialhere = fields[2],
        description = fields[3],
    };

The result I get is: (output image not preserved in this copy)



Source: https://stackoverflow.com/questions/19649617/how-to-split-a-text-lines-by-fixed-width-c-sharp
