Creating nested list from string data with two delimiters in Python

女生的网名这么多〃 submitted on 2019-12-12 04:47:22

Question


I am trying to take a text file that looks like this:

1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|3~Lithium~Li~6.94~1~2|4~Beryllium~ Be~9.0122~2~2|

and turn it into a nested list that looks like this:

[[1, Hydrogen, H, 1.008, 1, 1], [2, Helium, He, 4.002, 18, 1], [3, Lithium, Li, 6.94, 1, 2], [4, Beryllium, Be, 9.0122, 2, 2]]

The code I have looks like:

class Parser:
    def __init__(self, path):
        self.file = open(path, "r")
        self.unparsed_info = self.file.read()
        self.parsed_by_element = []
        self.parsed_info = []
        self.parse_list('|', '~')

    def parse_list(self, delimiter1, delimiter2):
        for elements in self.unparsed_info.split(delimiter1):
            e = elements.strip(delimiter1)
            if e != '':
                self.parsed_by_element.append(e)
            for properties in e.split(delimiter2):
                p = properties.strip(delimiter2)
                if p != '':
                    self.parsed_by_element.insert("something that represents location of current element being manipulated", p)

but I can't figure out how to fill in the blank for the insertion on the last line. Does anybody have any suggestions? Or a better way to do this?


Answer 1:


You can try this:

s= "1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|3~Lithium~Li~6.94~1~2|4~Beryllium~ Be~9.0122~2~2|"
final_data = [b for b in [i.split('~') for i in s.split('|')] if b[0]]

Output:

[['1', 'Hydrogen', 'H', '1.008', '1', '1'], ['2', 'Helium', 'He', '4.002', '18', '1'], ['3', 'Lithium', 'Li', '6.94', '1', '2'], ['4', 'Beryllium', ' Be', '9.0122', '2', '2']]
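In the output above every field is still a string, and ' Be' keeps the leading space from the input. A minimal sketch of the same comprehension with whitespace stripped from each field follows; the name `clean_data` is only illustrative and not part of the original answer.

```python
s = "1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|3~Lithium~Li~6.94~1~2|4~Beryllium~ Be~9.0122~2~2|"

# Split records on '|', skip the empty piece left by the trailing '|',
# then split each record on '~' and strip whitespace around every field.
clean_data = [
    [field.strip() for field in record.split('~')]
    for record in s.split('|')
    if record.strip()
]

print(clean_data[3])
```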



Answer 2:


s = '1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|3~Lithium~Li~6.94~1~2|4~Beryllium~ Be~9.0122~2~2|'
[i.split('~') for i in s.split('|') if i]
#Output
[['1', 'Hydrogen', 'H', '1.008', '1', '1'],
 ['2', 'Helium', 'He', '4.002', '18', '1'],
 ['3', 'Lithium', 'Li', '6.94', '1', '2'],
 ['4', 'Beryllium', ' Be', '9.0122', '2', '2']]



Answer 3:


Try this, supposing the variable data holds the string:

data = "1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|3~Lithium~Li~6.94~1~2|4~Beryllium~ Be~9.0122~2~2|"
parsed_data = [x.split('~') for x in data.split('|') if x]

Your code, modified:

class Parser:
    def __init__(self, path):
        # the with block closes the file once it has been read;
        # strip() drops a trailing newline so it does not become an extra element
        with open(path, "r") as file:
            self.unparsed_info = file.read().strip()
        self.parsed_by_element = []
        self.parsed_info = []
        self.parse_list('|', '~')

    def parse_list(self, delimiter1, delimiter2):
        # split into elements first, then split each element into its properties
        for elements in self.unparsed_info.split(delimiter1):
            if elements:
                self.parsed_by_element.append(elements.split(delimiter2))

        print(self.parsed_by_element)

        # OR simply (this rebuilds the same list in one expression)

        self.parsed_by_element = [elements.split(delimiter2) for elements in self.unparsed_info.split(delimiter1) if elements]

        print(self.parsed_by_element)
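
As for the blank in the question's `insert` call: one way to avoid needing an index at all is to build a fresh inner list for each element and append it to the outer list once it is complete. The following is a minimal sketch of that idea which keeps the question's two nested loops; the function name `parse_nested` and the `sample` string are hypothetical and chosen only for illustration.

```python
def parse_nested(text, delimiter1='|', delimiter2='~'):
    parsed = []
    for element in text.split(delimiter1):
        if element.strip() == '':
            continue  # skip the empty piece after the trailing delimiter
        properties = []  # inner list for the element currently being processed
        for prop in element.split(delimiter2):
            properties.append(prop.strip())
        parsed.append(properties)  # append the finished inner list, no index needed
    return parsed


sample = "1~Hydrogen~H~1.008~1~1|2~Helium~He~4.002~18~1|"
print(parse_nested(sample))
```

Appending to `parsed[-1]` inside the inner loop would also work, but building the inner list first keeps the outer list free of half-filled entries.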



Answer 4:


You can do this in a much simpler way. I'm also assuming you need int and float conversions, because your desired output looks that way.

def parse(path):
    list_of_lists = []
    with open(path) as file_handle:
        for line in file_handle:
            # strip the newline so it does not turn into an empty record
            for string in line.strip().split("|"):
                if string:
                    # strip each field so ' Be' becomes 'Be'
                    fields = [e.strip() for e in string.split("~")]
                    list_of_lists.append([int(e) if e.isdigit() else float(e) if "." in e else e for e in fields])
    return list_of_lists

my_filepath = "mytxt.txt"
my_list_of_lists = parse(my_filepath)

Results:

for sublist in my_list_of_lists:
    print(sublist)

[1, 'Hydrogen', 'H', 1.008, 1, 1]
[2, 'Helium', 'He', 4.002, 18, 1]
[3, 'Lithium', 'Li', 6.94, 1, 2]
[4, 'Beryllium', 'Be', 9.0122, 2, 2]
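
One caveat with the `"." in e` test above: a text field that happens to contain a period would be passed to `float()` and raise `ValueError`. A more defensive variant, sketched here under the assumption that any field which parses as a number should become one, tries `int`, then `float`, and otherwise keeps the stripped string. The helper names `convert` and `parse_text` are hypothetical. Note that `float()` also accepts strings such as "nan" and "inf", so this sketch would convert those as well.

```python
def convert(field):
    """Return the field as int or float if it parses as one, else as a stripped string."""
    text = field.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue  # not valid for this type, try the next one
    return text


def parse_text(text):
    # Records are separated by '|', fields by '~'; empty records are skipped.
    return [
        [convert(field) for field in record.split("~")]
        for record in text.split("|")
        if record.strip()
    ]


sample = "4~Beryllium~ Be~9.0122~2~2|"
print(parse_text(sample))
```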


Source: https://stackoverflow.com/questions/47001912/creating-nested-list-from-string-data-with-two-delimiters-in-python
