Removing duplicate rows from a csv file using a python script

Backend · Unresolved · 6 answers · 1328 views

离开以前 2020-12-02 10:06

Goal

I have downloaded a CSV file from Hotmail, but it has a lot of duplicates in it. These duplicates are complete copies and I don't know why my

6 Answers
  •  暖寄归人
    2020-12-02 10:43

    You can do this with the pandas library in a Jupyter notebook or any suitable IDE. I'm importing pandas into a Jupyter notebook and reading the CSV file.

    Then sort the values by whichever columns determine the duplicates; since I pass two columns, it sorts by time first, then by latitude.

    Then drop duplicates based on the time column, or whichever column is relevant for you.

    Finally, I store the sorted, de-duplicated data as gps_sorted.

    import pandas as pd

    # read the exported CSV
    stock = pd.read_csv("C:/Users/Donuts/GPS Trajectory/go_track_trackspoints.csv")
    # sort by time first, then by latitude
    stock2 = stock.sort_values(["time", "latitude"], ascending=True)
    # drop rows whose 'time' repeats, keeping the first occurrence;
    # reassign, because drop_duplicates returns a new DataFrame
    stock2 = stock2.drop_duplicates(subset=["time"])
    # write the cleaned file without the pandas index column
    stock2.to_csv("C:/Users/Donuts/gps_sorted.csv", index=False)
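
    One thing to watch: `drop_duplicates` returns a new DataFrame rather than modifying the original in place, so the result must be assigned to a variable. A tiny sketch with made-up data (not the GPS file) showing this:

    ```python
    import pandas as pd

    # hypothetical sample standing in for the real CSV contents
    df = pd.DataFrame({"time": [1, 1, 2], "latitude": [10.0, 10.0, 11.5]})

    deduped = df.drop_duplicates(subset=["time"])  # keeps the first row per time value

    # the original DataFrame is untouched; only 'deduped' has the duplicates removed
    ```
    
    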
    

    Hope this helps
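
    Since the question says the duplicates are complete copies of whole rows, the standard library alone is also enough; a minimal sketch using only the `csv` module (the sample data and file names here are hypothetical):

    ```python
    import csv
    import io

    def dedupe_rows(rows):
        """Remove exact duplicate rows, keeping the first occurrence and the original order."""
        seen = set()
        out = []
        for row in rows:
            key = tuple(row)  # lists aren't hashable, so convert each row to a tuple
            if key not in seen:
                seen.add(key)
                out.append(row)
        return out

    # in-memory stand-in for reading the downloaded CSV file
    raw = "time,latitude\n1,10.0\n1,10.0\n2,11.5\n"
    clean = dedupe_rows(list(csv.reader(io.StringIO(raw))))
    # write the de-duplicated rows back out with csv.writer(open(..., "w", newline=""))
    ```

    Unlike the pandas version, this keeps rows in their original file order and needs no third-party install, but it only catches rows that are byte-for-byte identical.
    
    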
