reading tab-delimited data without header in pandas

Submitted by 假装没事ソ on 2019-12-10 13:27:08

Question


I'm having trouble using pandas to open tab-delimited data without headers.

My test data (actually contains 200 lines, of which I am showing the first 10):

Tag19184    CTAAC   hffef   1   a   36  -   chr1    10006   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10012   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10018   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10024   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10030   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10036   0   36M 36
Tag19184    CTAAC   hffef   1   a   36  -   chr1    10042   0   36M 36
Tag20198    CTAAC   hffef   1   a   36  -   chr1    10048   0   36M 36
Tag20198    CTAAC   hffef   1   a   36  -   chr1    10054   0   36M 36
Tag45093    CTAAC   hffef   1   a   36  -   chr1    10060   0   36M 36

My code:

import pandas as pd
df = pd.read_csv('in_test.txt', sep='\t', header=None)
print(df)

However, I get the following output, which I don't think I can use to further process data (?):

<class 'pandas.core.frame.DataFrame'>
Int64Index: 200 entries, 0 to 199
Data columns:
X.1     200  non-null values
X.2     200  non-null values
X.3     200  non-null values
X.4     200  non-null values
X.5     200  non-null values
X.6     200  non-null values
X.7     200  non-null values
X.8     200  non-null values
X.9     200  non-null values
X.10    200  non-null values
X.11    200  non-null values
X.12    200  non-null values
dtypes: int64(5), object(7)

The tutorial I am following suggests that printing df should simply display the corresponding data frame. What am I doing wrong?


Answer 1:


I think the file is being read correctly, but:

  1. See "change pandas 0.13.0 'print dataframe' to print dataframe like in earlier versions": this summary view is what older versions of pandas print, so updating pandas will solve it.
  2. You can use the IPython notebook, where DataFrames show up as HTML tables.
  3. You can use df.head(5) (similar to R's head) to get the first few rows, just to make sure your DataFrame is correct.
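
To make point 3 concrete, here is a minimal sketch. It assumes a reasonably recent pandas and builds the input in memory from the first three rows of the question's sample (joined with real tab characters) instead of reading in_test.txt, so it runs anywhere. The names `rows` and `sample` are only for illustration.

```python
import io

import pandas as pd

# First three rows of the sample data from the question.
rows = [
    ["Tag19184", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10006", "0", "36M", "36"],
    ["Tag19184", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10012", "0", "36M", "36"],
    ["Tag19184", "CTAAC", "hffef", "1", "a", "36", "-", "chr1", "10018", "0", "36M", "36"],
]
# Join fields with tabs and rows with newlines to mimic the file contents.
sample = "\n".join("\t".join(fields) for fields in rows) + "\n"

# header=None tells pandas that the first line is data, not column names.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

print(df.shape)        # (3, 12): three rows, twelve columns
print(df.head(2))      # first two rows, shown as an actual table
print(df.to_string())  # renders every row and column as plain text
```

If the shape and the first rows look right, the file was parsed correctly and the summary in the question is purely a display matter. With the real file, replace io.StringIO(sample) with 'in_test.txt'.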


Source: https://stackoverflow.com/questions/24582329/reading-tab-delimited-data-without-header-in-pandas
