How can I split a column into 2 in the correct way in Python?

Submitted by 久未见 on 2019-12-11 06:08:02

Question


I am scraping tables from a website and writing them to an Excel file. My goal is to split one column into two columns correctly.

The column I want to split: "STATUS"

I want this result:

First example: Estimated 3:17 PM --> Estimated and 3:17 PM

Second example: Delayed 3:00 PM --> Delayed and 3:00 PM

Third example: Canceled --> Canceled and (empty cell)

So I need to separate the first word (into the first column) from the remaining characters (into the second column).

How can I do this?

Here is my relevant code, which already contains some formatting:

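import os
from datetime import datetime

import numpy as np
import pandas as pd

# datatable and cols come from the web-scraping step (not shown)
df2 = pd.DataFrame(datatable, columns=cols)
# Split FLIGHT into its first two characters and the rest, zero-padded to width 4
df2['a'] = df2['FLIGHT'].str[:2]
df2['b'] = df2['FLIGHT'].str[2:].str.zfill(4)
df2["UPLOAD_TIME"] = datetime.now()
# Drop every row in which any column contains "Scheduled"
mask = np.column_stack([df2[col].astype(str).str.contains(r"Scheduled", na=True) for col in df2])
df3 = df2.loc[~mask.any(axis=1)]

if os.path.isfile("output.csv"):
    # Append the new rows to the rows already in the file
    df1 = pd.read_csv("output.csv", sep=";")
    df4 = pd.concat([df1, df3])
    df4.to_csv("output.csv", index=False, sep=";")
else:
    df3.to_csv("output.csv", index=False, sep=";")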

[Screenshot of the table in Excel]


Answer 1:


You can use str.split: n=1 splits on the first whitespace only, and expand=True returns a DataFrame, which can be assigned to new columns:

df2[['c','d']] = df2['STATUS'].str.split(n=1, expand=True)

Sample:

import pandas as pd

df2 = pd.DataFrame({'STATUS':['Estimated 3:17 PM','Delayed 3:00 PM']})

df2[['c','d']] = df2['STATUS'].str.split(n=1, expand=True)
print(df2)
              STATUS          c        d
0  Estimated 3:17 PM  Estimated  3:17 PM
1    Delayed 3:00 PM    Delayed  3:00 PM
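
To see why n=1 matters here, it may help to compare the result with and without it. The following is only a small sketch on the same sample values; the variable names status, all_parts and two_parts are made up for illustration and do not come from the original code:

```python
import pandas as pd

status = pd.Series(['Estimated 3:17 PM', 'Delayed 3:00 PM'])

# Without n, every run of whitespace is a split point,
# so the time is broken apart and three columns come back.
all_parts = status.str.split(expand=True)

# With n=1, only the first whitespace is a split point,
# so the time stays together and two columns come back.
two_parts = status.str.split(n=1, expand=True)

print(all_parts.shape)
print(two_parts.shape)
```

Without n=1 the time and the AM/PM marker would land in separate columns, which is not what the question asks for.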

If a value contains no whitespace, the second column is None:

df2 = pd.DataFrame({'STATUS':['Estimated 3:17 PM','Delayed 3:00 PM', 'Canceled']})


df2[['c','d']] = df2['STATUS'].str.split(n=1, expand=True)
print (df2)
              STATUS          c        d
0  Estimated 3:17 PM  Estimated  3:17 PM
1    Delayed 3:00 PM    Delayed  3:00 PM
2           Canceled   Canceled     None

If you need to replace None with an empty string, use fillna:

df2[['c','d']] = df2['STATUS'].str.split(n=1, expand=True)
df2['d'] = df2['d'].fillna('')
print (df2)
              STATUS          c        d
0  Estimated 3:17 PM  Estimated  3:17 PM
1    Delayed 3:00 PM    Delayed  3:00 PM
2           Canceled   Canceled         
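
As a possible alternative (not part of the original answer), str.partition might also fit this case. It splits at the first occurrence of a separator and always returns three columns: the text before the separator, the separator itself, and the text after it. The sketch below assumes the first word and the time are separated by a single space character; the name parts is made up for illustration:

```python
import pandas as pd

df2 = pd.DataFrame({'STATUS': ['Estimated 3:17 PM', 'Delayed 3:00 PM', 'Canceled']})

# partition(' ') returns three columns: before the first space,
# the space itself, and everything after it.
parts = df2['STATUS'].str.partition(' ')

df2['c'] = parts[0]  # first word
df2['d'] = parts[2]  # remainder; an empty string when there is no space
print(df2)
```

Because the missing remainder is already an empty string, no fillna step is needed, and the number of returned columns does not depend on whether any row contains a space. Unlike str.split without a pattern, partition matches only the exact separator given, so values with leading spaces, tabs or other whitespace would likely need cleaning first (for example with str.strip).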


Source: https://stackoverflow.com/questions/46524461/how-can-i-split-a-column-into-2-in-the-correct-way-in-python
