How to parse this text file format into CSV format?

Question

I have a text file laid out like this, with every field on its own line:

id = 606149
Category Name = Structural Columns
Family Name = Concrete-Square-Column
Type Name = EXIST RH C1 16 x 16
Document = 15050 Peavy Struct v2016_detached
Attachment Justification At Top = Minimum Intersection
Image = <None>
Offset From Attachment At Top = 0
id = 606151
Category Name = Structural Columns
Family Name = Concrete-Square-Column
Type Name = EXIST RH C2 16 x 16
Document = 15050 Peavy Struct v2016_detached
Attachment Justification At Top = Minimum Intersection
Image = <None>
Offset From Attachment At Top = 0

In my code I open the text file for reading and print out the first three lines for testing. When I try to append a comma to the end of the line I get the comma in the line below:

def main():
   count = 0
   filename = "test.txt"
   file = open(filename, "r")
   for line in file:
      if count == 3:
         break
      count = count + 1
      line += ','
      print line

With this code I get the result:

id = 606149
,
Category Name = Structural Columns
,
Family Name = Concrete-Square-Column
,

When I add a line strip to strip new lines before I concatenate the comma:

line = line.strip('\n')

I get this result:

,id = 606149
,ategory Name = Structural Columns
,amily Name = Concrete-Square-Column

I'm having trouble parsing this file into a CSV format.

Answer 1:

You can do it like this to get the desired output, but you still have to add the line count yourself:

with open('j.txt', 'r') as f:
    d = f.readlines()
for i in d:
    # Remove the trailing newline before appending the comma
    i = i.rstrip('\n')
    i += ','
    print(i)

I have used rstrip here, so this prints every line. To print only the first three lines, add a loop counter or a condition. The output looks something like this:

id = 606149, Category Name = Structural Columns, Family Name = Concrete-Square-Column, Type Name = EXIST RH C1 16 x 16, Document = 15050 Peavy Struct v2016_detached,
Attachment Justification, At Top = Minimum Intersection, Image = <None>, Offset From Attachment At Top = 0,
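
To limit the output to the first three lines, as in the question, one option is itertools.islice. The following is only a sketch: the helper name first_lines_with_commas and the sample list are made up for illustration, and it strips "\r" as well as "\n" on the assumption that the file may have Windows-style line endings.

```python
from itertools import islice


def first_lines_with_commas(lines, limit=3):
    """Return the first `limit` lines, without line endings, each followed by a comma."""
    return [line.rstrip("\r\n") + "," for line in islice(lines, limit)]


# Hypothetical sample standing in for the lines of the text file.
sample = [
    "id = 606149\n",
    "Category Name = Structural Columns\n",
    "Family Name = Concrete-Square-Column\n",
    "Type Name = EXIST RH C1 16 x 16\n",
]

for text in first_lines_with_commas(sample):
    print(text)
```

Because islice accepts any iterable, an open file object can be passed in place of the sample list.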

Answer 2:

You can read the whole file and split it into lines:

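filename = "text.txt"
# The with block closes the file once it has been read
with open(filename, "r") as file:
    # splitlines() returns the lines without their line endings
    f = file.read().splitlines()

for line in f:
    print(line)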

Answer 3:

If your data file is structured as shown above, you could split each line on ' = ' to separate each key/value pair and store the pairs in a dictionary, one dictionary per row. Once you have read a record completely (i.e. found the 'Offset...' key and its value), start another row.

Once you have all the data, use the csv module to write your CSV file.

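import csv

data = []
with open('test.txt') as fin:
    row = {}
    for line in fin:
        # Split on the first ' = ' only, in case a value contains ' = '
        key, val = line.strip().split(' = ', 1)
        row[key] = val
        # 'Offset From Attachment At Top' is the last field of each record
        if key == 'Offset From Attachment At Top':
            data.append(row)
            row = {}

fieldnames = list(data[0].keys())
# Open for writing; newline='' is what the csv module expects in Python 3
with open('test.csv', 'w', newline='') as fout:
    cw = csv.DictWriter(fout, fieldnames)
    cw.writeheader()
    cw.writerows(data)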

You probably want to add some error checking, and you may want to fix the order of the field names passed to DictWriter. I suggest you make each row an OrderedDict.
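
As a rough illustration of the error checking and fixed column order mentioned above, here is one possible sketch. The names FIELDNAMES, parse_records and to_csv are hypothetical, the column order is simply copied from the sample record in the question, and starting a new record at each 'id' line is an assumption about the file layout, not something the format guarantees.

```python
import csv
import io

# Assumed column order, taken from the sample record in the question.
FIELDNAMES = [
    "id", "Category Name", "Family Name", "Type Name", "Document",
    "Attachment Justification At Top", "Image", "Offset From Attachment At Top",
]


def parse_records(lines):
    """Group 'key = value' lines into dicts; a new record starts at each 'id' line."""
    records = []
    row = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, val = line.partition(" = ")
        if not sep:
            raise ValueError("expected 'key = value', got: %r" % line)
        if key == "id":
            row = {}
            records.append(row)
        elif row is None:
            raise ValueError("field before first 'id' line: %r" % line)
        row[key] = val
    return records


def to_csv(records):
    """Render records as CSV text with a header row in FIELDNAMES order."""
    out = io.StringIO()
    # Missing fields become empty cells; unknown keys raise ValueError.
    writer = csv.DictWriter(out, FIELDNAMES, restval="",
                            extrasaction="raise", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


sample = """id = 606149
Category Name = Structural Columns
Image = <None>
id = 606151
Category Name = Structural Columns
"""
print(to_csv(parse_records(sample.splitlines())))
```

Writing to an io.StringIO keeps the sketch free of file access; to produce a file, the same DictWriter calls can be pointed at a file opened with open('test.csv', 'w', newline='').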

Answer 4:

This should work:

line.rstrip("\n") + ","
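
The output in the question (",ategory Name = ...") looks like what happens when each line ends in "\r\n" and only the "\n" is removed: the remaining carriage return sends the cursor back to the start of the line, so the comma is printed over the first character. That is a guess about the input file, not something the question confirms. Under that assumption, stripping both characters avoids the problem, as this small sketch with a made-up line shows:

```python
# Hypothetical line with a Windows-style "\r\n" ending.
raw = "Category Name = Structural Columns\r\n"

# Removing only "\n" leaves the "\r" sitting just before the comma.
only_newline = raw.strip("\n") + ","

# Removing both "\r" and "\n" puts the comma where it was intended.
both_stripped = raw.rstrip("\r\n") + ","

# repr() makes the otherwise invisible "\r" visible.
print(repr(only_newline))
print(repr(both_stripped))
```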

Source: https://stackoverflow.com/questions/48537209/how-to-parse-this-text-file-format-into-csv-format

Tags

python

csv

parsing

formatting