I have a very large CSV file with 100 columns. To illustrate my problem I will use a very basic example.
Let's suppose that we have a CSV file.
As Wai Yip Tung said, you can select just the columns you need right after reading, by indexing the DataFrame with a list of column names, for example:
import pandas as pd
data = pd.read_csv("ThisFile.csv")[['value','d']]
This solved my problem.
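Note that this approach parses the entire file and only then discards the unwanted columns, which matters for a very wide file. A self-contained sketch of the same pattern, using an in-memory CSV whose contents are an assumption for illustration:

```python
import pandas as pd
from io import StringIO

# In-memory stand-in for "ThisFile.csv" (column names and values are assumptions)
csv_text = "value,d,f\n975,2,5\n976,3,4\n"

# read_csv parses every column first; the [['value', 'd']] selection then drops the rest
data = pd.read_csv(StringIO(csv_text))[["value", "d"]]
print(data.columns.tolist())  # ['value', 'd']
```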
This selects the second and fourth columns (since Python uses 0-based indexing):

In [272]: df.iloc[:, [1, 3]]
Out[272]:
   value  f
0    975  5
1    976  4
2    977  1
3    978  0
4    979  0

[5 rows x 2 columns]
df.ix can select by location or label, while df.iloc always selects by location. When indexing by location, use df.iloc to signal your intention more explicitly; it is also a bit faster, since pandas does not have to check whether your index is using labels. (Note that df.ix has since been deprecated and removed in recent pandas versions; use df.loc for label-based selection instead.)
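To make the distinction concrete, here is a minimal sketch contrasting positional and label-based column selection; the DataFrame contents are assumptions chosen to mirror the example output above:

```python
import pandas as pd

# Small illustrative DataFrame (column names/values are assumptions for this sketch)
df = pd.DataFrame(
    {"a": [0, 1, 2], "value": [975, 976, 977], "b": [7, 8, 9], "f": [5, 4, 1]}
)

by_position = df.iloc[:, [1, 3]]       # select by integer location
by_label = df.loc[:, ["value", "f"]]   # select by column label

# Both routes pick out the same two columns here
assert by_position.equals(by_label)
```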
Another possibility is to use the usecols parameter:

import pandas
data = pandas.read_csv("ThisFile.csv", usecols=[1, 3])

This loads only the second and fourth columns into the data DataFrame.
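usecols also accepts column names, which avoids counting positions in a 100-column file. A minimal sketch using an in-memory CSV; the file contents are an assumption standing in for "ThisFile.csv":

```python
import pandas as pd
from io import StringIO

# In-memory stand-in for "ThisFile.csv" (contents are an assumption for this sketch)
csv_text = "a,value,b,f\n0,975,7,5\n1,976,8,4\n2,977,9,1\n"

# Select columns by name instead of by position
data = pd.read_csv(StringIO(csv_text), usecols=["value", "f"])
print(list(data.columns))  # ['value', 'f']
```

One detail worth knowing: usecols keeps the columns in the order they appear in the file, regardless of the order in the list you pass.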
If you'd rather select columns by name, you can use
data[['value','f']]
   value  f
0    975  5
1    976  4
2    977  1
3    978  0
4    979  0