ValueError: could not convert string to float: id

Anonymous (unverified), submitted 2019-12-03 02:08:02

Question:

I'm running the following Python script:

    #!/usr/bin/python

    import os,sys
    from scipy import stats
    import numpy as np

    f=open('data2.txt', 'r').readlines()
    N=len(f)-1
    for i in range(0,N):
        w=f[i].split()
        l1=w[1:8]
        l2=w[8:15]
        list1=[float(x) for x in l1]
        list2=[float(x) for x in l2]
        result=stats.ttest_ind(list1,list2)
        print result[1]

However, I get an error like this:

ValueError: could not convert string to float: id 

I'm confused by this. When I try the same steps for a single line in an interactive session, instead of running the for loop in the script:

    >>> from scipy import stats
    >>> import numpy as np
    >>> f=open('data2.txt','r').readlines()
    >>> w=f[1].split()
    >>> l1=w[1:8]
    >>> l2=w[8:15]
    >>> list1=[float(x) for x in l1]
    >>> list1
    [5.3209183842, 4.6422726719, 4.3788135547, 5.9299061614, 5.9331108706, 5.0287087832, 4.57...]

It works well.

Can anyone explain what is going on here? Thanks.

Answer 1:

Evidently some of your lines don't contain valid float data. Specifically, at least one line contains the text id, which can't be converted to a float.

When you try it at the interactive prompt you are converting only a single line, so you never reach the bad one. The best way to find it is to print the number of the line where the error occurs, e.g.

    #!/usr/bin/python

    import os,sys
    from scipy import stats
    import numpy as np

    f=open('data2.txt', 'r').readlines()
    N=len(f)-1
    for i in range(0,N):
        w=f[i].split()
        l1=w[1:8]
        l2=w[8:15]
        try:
            list1=[float(x) for x in l1]
            list2=[float(x) for x in l2]
        except ValueError as e:
            # report the offending line, then skip it so the t-test
            # is not run on stale or undefined lists
            print("error %s on line %d" % (e, i))
            continue
        result=stats.ttest_ind(list1,list2)
        print(result[1])


Answer 2:

My error was very simple: the text file containing the data had a stray space character (and therefore an invisible one) on the last line.
In the grep output I had "45 " (with a trailing space) instead of just "45".

The classic stupid thing that makes you waste hours. :-)
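
One way to make such invisible characters show up is to print each line with repr(), which displays trailing spaces, tabs and newline escapes explicitly. This is only a minimal sketch: the helper name find_suspicious_lines and the sample lines are made up for illustration and are not from the original data file.

```python
def find_suspicious_lines(lines):
    """Return (index, repr) pairs for lines that are blank or carry stray whitespace."""
    suspicious = []
    for index, line in enumerate(lines):
        body = line.rstrip("\n")  # drop only the line terminator
        if body == "" or body != body.strip():
            suspicious.append((index, repr(line)))
    return suspicious


# Hypothetical sample data: the last line consists of a single space.
sample = ["45\n", "46 \n", " "]
for index, shown in find_suspicious_lines(sample):
    print("line", index, "->", shown)
```

Running the same helper over open('data2.txt').readlines() should point at any line that only looks empty or clean.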



Answer 3:

This error is pretty descriptive:

ValueError: could not convert string to float: id 

Somewhere in your text file, a line has the word id in it, which can't really be converted to a number.

Your test code works because the word id isn't present in line 2.
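
A likely explanation, assuming the first line of data2.txt is a header row (the question never shows the file, so this is a guess): the loop starts at index 0, while the interactive test used f[1]. The sketch below uses invented sample lines and a hypothetical helper, unparsable_lines, to show which line indices fail to parse.

```python
def unparsable_lines(lines, start=1, stop=15):
    """Return (index, bad_token) for each line with a non-float token in columns start..stop-1."""
    failures = []
    for index, line in enumerate(lines):
        for token in line.split()[start:stop]:
            try:
                float(token)
            except ValueError:
                failures.append((index, token))
                break  # one report per line is enough
    return failures


# Hypothetical file contents: a header row followed by numeric rows.
sample = [
    "name id value\n",
    "a 5.32 4.64\n",
    "b 4.37 5.92\n",
]
print(unparsable_lines(sample))  # [(0, 'id')]
```

If the header turns out to be the only offender, iterating over f[1:] instead of f would be enough.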


If you want to catch that line, try this code. I cleaned your code up a tad:

    #!/usr/bin/python

    import os, sys
    from scipy import stats
    import numpy as np

    for index, line in enumerate(open('data2.txt', 'r').readlines()):
        w = line.split()  # split on any whitespace, as in the original script
        l1 = w[1:8]
        l2 = w[8:15]

        try:
            # list(...) forces the conversion here, so a bad value raises inside the try
            list1 = list(map(float, l1))
            list2 = list(map(float, l2))
        except ValueError:
            print('Line {i} is corrupt!'.format(i=index))
            break

        result = stats.ttest_ind(list1, list2)
        print(result[1])


Answer 4:

Your data may not be what you expect -- it seems you're expecting, but not getting, floats.

A simple solution to figuring out where this occurs would be to add a try/except to the for-loop:

    for i in range(0,N):
        w=f[i].split()
        l1=w[1:8]
        l2=w[8:15]
        try:
            list1=[float(x) for x in l1]
            list2=[float(x) for x in l2]
        except ValueError as e:
            # report the error in some way that is helpful; here, print out i
            print("line %d: %s" % (i, e))
            continue  # skip the t-test for this line
        result=stats.ttest_ind(list1,list2)
        print(result[1])


Answer 5:

Perhaps your numbers aren't actually numbers, but letters masquerading as numbers?

In my case, the font I was using meant that "l" and "1" looked very similar. I had a string like 'l1919' which I thought was '11919' and that messed things up.
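
A rough way to spot such look-alike characters is to list every character in a token that cannot appear in a plain decimal literal. This is a sketch under a simplifying assumption: it deliberately ignores other spellings that float() accepts, such as "nan", "inf" or underscores, and the helper name non_numeric_chars is invented for illustration.

```python
def non_numeric_chars(token):
    """Return (position, character) pairs that cannot appear in a plain decimal float literal."""
    allowed = set("0123456789+-.eE")
    return [(pos, ch) for pos, ch in enumerate(token) if ch not in allowed]


# 'l1919' starts with a lowercase L; '4.5O' ends with a capital letter O.
for token in ["11919", "l1919", "4.5O"]:
    print(repr(token), non_numeric_chars(token))
```

An empty list means the token contains only characters that can appear in a number, which is a necessary but not sufficient condition for float() to accept it.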



Answer 6:

I got the same error while working with a .csv file with 69190 rows scraped from Amazon. I was trying to implement an RNN.

When I looked carefully, a column that was supposed to contain integers also had non-numeric values in some rows. I replaced those with numeric values and everything worked fine afterwards.

So check your dataset for these errors first.
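
A check like this can be sketched with the standard csv module. The column name "rating" and the sample rows below are invented stand-ins for the scraped file, and non_numeric_rows is a hypothetical helper, not part of any library.

```python
import csv
import io


def non_numeric_rows(csv_file, column):
    """Return (row_number, value) for every row whose `column` is not a valid float."""
    bad = []
    reader = csv.DictReader(csv_file)
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        value = row[column]
        try:
            float(value)
        except (TypeError, ValueError):  # TypeError covers a missing (None) cell
            bad.append((row_number, value))
    return bad


# Hypothetical data standing in for the real file.
sample = io.StringIO("title,rating\nA,5\nB,N/A\nC,3\nD,\n")
print(non_numeric_rows(sample, "rating"))  # [(3, 'N/A'), (5, '')]
```

For a real file, pass an open file object instead of the io.StringIO sample, then decide per row whether to fix, fill or drop the reported values.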


