Question
I have a text file that contains a small dataset (taken from a CSV file), like so:
2020-05-24T10:44:37.613168#[ 0. 0. -0.06210425 0. ]
2020-05-24T10:44:37.302214#[1. 1. 0. 0.]
2020-05-24T10:44:36.192222#[0. 0. 0. 0.]
Then I read from it using:
data = f.readlines()
for row in data:
    img_id, label = row.strip("\n").split("#")
where label is a string list which looks like:
[ 0. 0. -0.24604772 0. ]
[ 0. 0. -0.24604772 0. ]
[1. 1. 0. 0.]
I'd like to convert each string element to a float. However, the square brackets [] and the decimal point . are preventing me from converting.
Tried so far:
- Removing the [] with label = label[1:-1], but I would need them as an array later. Then doing print([list(map(float, i.split())) for i in label]) resulted in the error ValueError: could not convert string to float: '.'
- Using ast.literal_eval: label = ast.literal_eval(row.strip("\n").split("#")). Getting ValueError: malformed node or string: ['2020-05-24T10:57:52.882241 [0. 0. 0. 0.]']
Questions I have already referred to:
- Need to read string into a float array
- Cannot convert list of strings to list of floats in python using float()
- How do you convert a list of strings to a list of floats using Python?
- Convert list of strings to numpy array of floats
- When to use ast.literal_eval
So,
- What else should I try in order to convert them to an iterable float array? Or what am I doing wrong? Do I have to remove the square brackets?
- If it would make things easier, how should I store the data in the txt file? Is CSV better than txt in this case?
- I need to extend this logic to 110,000 entries. Will any of these steps cause problems then?
Thank you. Any help will be greatly appreciated.
Answer 1:
For each line, trim off the first and last char with line[1:-1], split by whitespace with .split(), and parse each float with float().
line = "[ 0. 0. -0.24604772 0. ]"
floats = [float(item) for item in line[1:-1].split()]
print(floats)
>>> [0.0, 0.0, -0.24604772, 0.0]
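The same idea can be applied to whole rows of the file. The following is only a minimal sketch: the helper name parse_row is made up for illustration, and io.StringIO stands in for a real file object so that the snippet runs on its own. The rows are the three samples from the question.

```python
import io

def parse_row(row):
    """Split 'timestamp#[f1 f2 ...]' into (img_id, list of floats)."""
    img_id, label = row.strip().split("#", 1)
    # Drop the surrounding brackets, then split on any run of whitespace.
    values = [float(item) for item in label.strip()[1:-1].split()]
    return img_id, values

# Stand-in for a real file opened with open(...); rows copied from the question.
sample_file = io.StringIO(
    "2020-05-24T10:44:37.613168#[ 0. 0. -0.06210425 0. ]\n"
    "2020-05-24T10:44:37.302214#[1. 1. 0. 0.]\n"
    "2020-05-24T10:44:36.192222#[0. 0. 0. 0.]\n"
)

# Iterating over the file object reads one line at a time; blank lines are skipped.
parsed = [parse_row(line) for line in sample_file if line.strip()]
for img_id, values in parsed:
    print(img_id, values)
```

Because the file object is iterated line by line rather than loaded with readlines(), the same loop should work for a file with 110,000 rows without changes.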
Answer 2:
for row in data:
    img_id, label = row.strip("\n").split("#")
    # >>> [ 0. 0. -0.24604772 0. ]
    label = label[1:-1]  # Cut off the first and last character (the brackets)
    # >>>  0. 0. -0.24604772 0.
    label = label.strip()  # Remove all whitespace before and after the numbers
    # >>> 0. 0. -0.24604772 0.
    labelElements = label.split()  # Split the string on every run of whitespace
    # >>> ["0.", "0.", "-0.24604772", "0."]
    labelFloats = []
    for L in labelElements:
        labelFloats.append(float(L))  # for example: "1." -> 1.0
By the way:
The variable label does not hold a list of lines (you called it a "string list"); it is a single string:
# label = "[ 0. 0. -0.24604772 0. ]"
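This is also most likely why the list comprehension in the question failed. Iterating over a string yields one character at a time, so float() eventually receives a lone '.'. A small sketch that reproduces this, using one of the label strings from the question (the variable names here are only for illustration):

```python
label = "[ 0. 0. -0.24604772 0. ]"
inner = label[1:-1]

# Iterating over a string yields single characters, not numbers.
print(list(inner)[:4])  # [' ', '0', '.', ' ']

try:
    # Same shape as the attempt in the question: loop over the string itself.
    [list(map(float, ch.split())) for ch in inner]
except ValueError as err:
    error_message = str(err)
    print(error_message)  # could not convert string to float: '.'

# Splitting the whole string first gives the intended tokens.
tokens = inner.split()
print(tokens)  # ['0.', '0.', '-0.24604772', '0.']
```

So the brackets and the decimal points are not the real obstacle: the string has to be split into tokens before float() is applied to each one.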
Answer 3:
Given your case, I would go with regular expressions to extract the desired numbers. I would do something like the following:
import re
with open('your_file.txt') as f:
    lines = f.read().splitlines()
floats = []
for line in lines:
    img_id, label = line.split("#")
    # Raw string so that \d and \. reach the regex engine unchanged
    floats.append([*map(float, re.findall(r'-?\d+\.?\d*', label))])
Printing floats outputs:
[[0.0, 0.0, -0.06210425, 0.0], [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
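One caveat with this pattern: it has no notion of exponents, so a value written in scientific notation (for example 1.e-05) would be split into two numbers. If such values can occur in the file, a slightly wider pattern may help. The pattern below, the helper name extract_floats, and the second input string are all assumptions made for illustration, not part of the original data:

```python
import re

# Hypothetical pattern: optional sign, digits with an optional fraction
# (or a bare fraction such as .5), then an optional exponent.
FLOAT_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def extract_floats(label):
    """Return every float-like token in the label string as a float."""
    return [float(token) for token in FLOAT_RE.findall(label)]

# Label taken from the question.
simple = extract_floats("[ 0. 0. -0.06210425 0. ]")
# Made-up label containing scientific notation.
scientific = extract_floats("[ 1.e-05 -2.5e+03 0. ]")

print(simple)      # [0.0, 0.0, -0.06210425, 0.0]
print(scientific)  # [1e-05, -2500.0, 0.0]
```

If the labels only ever contain plain decimals, the simpler pattern above is enough.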
Source: https://stackoverflow.com/questions/62250600/python-convert-list-of-string-to-float-square-braces-and-decimal-point-causi