Grouping Lists into specific groups

Question

I'm wondering whether it is possible to split the characters below into specific groups, so that I can place them in a table later on.

This is the output that I need to group. I converted it into a list so that I could divide it into table columns more easily.

f=open("sample1.txt", "r")
f.read()

Here's the output:

'0245984300999992018010100004+14650+121050FM-12+004699999V0203001N00101090001CN008000199+02141+01971101171ADDAY141021AY241021GA1021+006001081GA2061+090001021GE19MSL   +99999+99999GF106991021999006001999999KA1120N+02111MD1210141+9999MW1051REMSYN10498430 31558 63001 10214 20197 40117 52014 70544 82108 333 20211 55062 56999 59012 82820 86280 555 60973=\n'

Here's what I have done so far. I managed to turn the text into a list of characters:

with open('sample1.txt', 'r') as file:
    data = file.read().replace('\n', '')
print(list(data))

The output:

['0', '2', '4', '5', '9', '8', '4', '3', '0', '0', '9', '9', '9', '9', '9', '2', '0', '1', '8', '0', '1', '0', '1', '0', '0', '0', '0', '4', '+', '1', '4', '6', '5', '0', '+', '1', '2', '1', '0', '5', '0', 'F', 'M', '-', '1', '2', '+', '0', '0', '4', '6', '9', '9', '9', '9', '9', 'V', '0', '2', '0', '3', '0', '0', '1', 'N', '0', '0', '1', '0', '1', '0', '9', '0', '0', '0', '1', 'C', 'N', '0', '0', '8', '0', '0', '0', '1', '9', '9', '+', '0', '2', '1', '4', '1', '+', '0', '1', '9', '7', '1', '1', '0', '1', '1', '7', '1', 'A', 'D', 'D', 'A', 'Y', '1', '4', '1', '0', '2', '1', 'A', 'Y', '2', '4', '1', '0', '2', '1', 'G', 'A', '1', '0', '2', '1', '+', '0', '0', '6', '0', '0', '1', '0', '8', '1', 'G', 'A', '2', '0', '6', '1', '+', '0', '9', '0', '0', '0', '1', '0', '2', '1', 'G', 'E', '1', '9', 'M', 'S', 'L', ' ', ' ', ' ', '+', '9', '9', '9', '9', '9', '+', '9', '9', '9', '9', '9', 'G', 'F', '1', '0', '6', '9', '9', '1', '0', '2', '1', '9', '9', '9', '0', '0', '6', '0', '0', '1', '9', '9', '9', '9', '9', '9', 'K', 'A', '1', '1', '2', '0', 'N', '+', '0', '2', '1', '1', '1', 'M', 'D', '1', '2', '1', '0', '1', '4', '1', '+', '9', '9', '9', '9', 'M', 'W', '1', '0', '5', '1', 'R', 'E', 'M', 'S', 'Y', 'N', '1', '0', '4', '9', '8', '4', '3', '0', ' ', '3', '1', '5', '5', '8', ' ', '6', '3', '0', '0', '1', ' ', '1', '0', '2', '1', '4', ' ', '2', '0', '1', '9', '7', ' ', '4', '0', '1', '1', '7', ' ', '5', '2', '0', '1', '4', ' ', '7', '0', '5', '4', '4', ' ', '8', '2', '1', '0', '8', ' ', '3', '3', '3', ' ', '2', '0', '2', '1', '1', ' ', '5', '5', '0', '6', '2', ' ', '5', '6', '9', '9', '9', ' ', '5', '9', '0', '1', '2', ' ', '8', '2', '8', '2', '0', ' ', '8', '6', '2', '8', '0', ' ', '5', '5', '5', ' ', '6', '0', '9', '7', '3', '=']

My goal is to group them into something like this:

0245,984300,99999,2018,01,01,0000,4,+1....

The number of characters in each column is predetermined: for example, there are always 4 digits in the first column, 6 in the second, and so on.

I was thinking of concatenating them, but I'm not sure whether that is possible.

Answer 1:

You can use operator.itemgetter with slice objects:

from operator import itemgetter

g = itemgetter(slice(0, 4), slice(4, 10))
with open('sample1.txt') as file:
    for line in file:
        print(g(line))

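Or, even better, you can build the slices dynamically from the column widths using itertools.accumulate. Accumulating [0] + indexes gives the start offsets and accumulating indexes gives the end offsets:

from itertools import accumulate

indexes = [4, 6, ...]  # column widths; the remaining widths go here
g = itemgetter(*map(slice, *map(accumulate, ([0] + indexes, indexes))))

Then proceed as before.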
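
As a fuller sketch of the approach above: the widths used here (4, 6, 5, 4, 2, 2, 4, 1) are only inferred from the desired output shown in the question and cover just the first 28 characters. The names `make_splitter`, `widths`, and `sample` are illustrative and not part of any library.

```python
from itertools import accumulate
from operator import itemgetter

def make_splitter(widths):
    """Build a function that cuts a string into consecutive fixed-width fields.

    Needs at least two widths: with a single item, itemgetter returns
    the bare value instead of a tuple.
    """
    ends = list(accumulate(widths))   # running totals: where each field stops
    starts = [0] + ends[:-1]          # each field starts where the previous one stopped
    return itemgetter(*map(slice, starts, ends))

# Assumed widths, read off the question's example "0245,984300,99999,2018,01,01,0000,4,..."
widths = [4, 6, 5, 4, 2, 2, 4, 1]
split_fields = make_splitter(widths)

sample = "0245984300999992018010100004+14650+121050FM-12"
fields = split_fields(sample)
print(",".join(fields))  # 0245,984300,99999,2018,01,01,0000,4
```

Because the splitter is built once, it can be reused for every line of the file.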

Answer 2:

If you actually want to use this data, I would recommend naming every field and double-checking that all the lengths make sense. To start, you can do

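with open('sample1.txt', 'r') as file:
    # Strip the trailing newline and the closing '=' so the last group is clean
    data = file.read().rstrip('=\n')
    # Split on whitespace: two long blocks, then the short space-separated groups
    first, second, *rest = data.split()

    # Sanity-check the lengths (these values match the sample line shown above)
    if len(first) != 163:
        raise ValueError(f"The first part should be 163 characters long, but it's {len(first)}")
    if len(second) != 85:
        raise ValueError(f"The second part should be 85 characters long, but it's {len(second)}")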

So now you have three variables:

first is "0245984300999992018010100004+14650+121050FM-12+004699999V0203001N00101090001CN008000199+02141+01971101171ADDAY141021AY241021GA1021+006001081GA2061+090001021GE19MSL"
second is "+99999+99999GF106991021999006001999999KA1120N+02111MD1210141+9999MW1051REMSYN10498430"
rest is ['31558', '63001', '10214', '20197', '40117', '52014', '70544', '82108', '333', '20211', '55062', '56999', '59012', '82820', '86280', '555', '60973']

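Then repeat that idea. Note that first contains seven '+' characters, so split('+') returns eight pieces; a starred name collects the ones you are not naming yet:

date, whatever, whatever2, *others = first.split('+')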
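
One caveat with splitting on '+' is that the '+' characters themselves are discarded, so the sign of each value is lost. If the sign matters, one option is to split just before each '+' using a zero-width lookahead. This is only a sketch, assuming Python 3.7 or later (where `re.split` accepts patterns that can match the empty string); `prefix` is a shortened copy of the sample line.

```python
import re

# Shortened copy of the sample line, for illustration only
prefix = "0245984300999992018010100004+14650+121050FM-12"

# (?=\+) matches the empty position right before each '+',
# so every piece after the first keeps its leading sign.
parts = re.split(r"(?=\+)", prefix)
print(parts)  # ['0245984300999992018010100004', '+14650', '+121050FM-12']
```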

and then, for parsing the first part, I would just write out the slices like

something = date[0:4]
something_else = date[4:10]
third_thing = date[10:15]
year = date[15:19]
month = date[19:21]
day = date[21:23]

and so on. You can then use all these variables in the code that analyzes them.

If this is some sort of standard format, you should look for a library that parses strings like that, or write one yourself.

Obviously, give the variables better names.
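
One way to name everything without a long run of hand-written slices is to keep the names and widths together in a single table. This is a minimal sketch: `FIELD_SPEC` and `parse_fixed_width` are made-up names, the field names other than year, month, and day are placeholders, and the widths are inferred from the question's example output.

```python
# Hypothetical (name, width) pairs; replace the placeholder names
# with the real column names from the format's documentation.
FIELD_SPEC = [
    ("something", 4),
    ("something_else", 6),
    ("third_thing", 5),
    ("year", 4),
    ("month", 2),
    ("day", 2),
]

def parse_fixed_width(text, spec):
    """Return a dict mapping each field name to its slice of `text`."""
    total = sum(width for _, width in spec)
    if len(text) < total:
        raise ValueError(f"expected at least {total} characters, got {len(text)}")
    record, start = {}, 0
    for name, width in spec:
        record[name] = text[start:start + width]
        start += width
    return record

record = parse_fixed_width("0245984300999992018010100004", FIELD_SPEC)
print(record["year"], record["month"], record["day"])  # 2018 01 01
```

A list of such dicts could then be handed to csv.DictWriter to produce the table layout the question asks for.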

Source: https://stackoverflow.com/questions/58835192/grouping-lists-into-specific-groups

Tags: python, python-3.x, grouping, listings