While cleaning the values of a mixed-type data frame in Python/pandas, I want to trim the strings. I am currently doing it in two instructions:
import pandas as pd
You can use the apply function of the Series object:
>>> df = pd.DataFrame([[' a ', 10], [' c ', 5]])
>>> df[0][0]
' a '
>>> df[0] = df[0].apply(lambda x: x.strip())
>>> df[0][0]
'a'
Note the use of strip() rather than a regex; strip() is much faster.
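For comparison, pandas also provides a vectorized string accessor, Series.str.strip, which trims without a Python-level lambda. The following is only a minimal sketch on the same toy frame, assuming column 0 holds nothing but strings; in a mixed object column, non-string values come back as NaN from the .str accessor, so it is not a drop-in replacement there.

```python
import pandas as pd

df = pd.DataFrame([[' a ', 10], [' c ', 5]])

# Series.str.strip removes leading and trailing whitespace element-wise.
# Assumption: every value in column 0 is a string.
df[0] = df[0].str.strip()

trimmed = df[0].tolist()
```

Column 1 is left untouched because only column 0 is reassigned.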
Another option - use the apply function of the DataFrame object:
>>> df = pd.DataFrame([[' a ', 10], [' c ', 5]])
>>> df.apply(lambda x: x.apply(lambda y: y.strip() if isinstance(y, str) else y), axis=0)
   0   1
0  a  10
1  c   5
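Note that DataFrame.apply returns a new frame and does not modify df, so the result has to be assigned if you want to keep it. One way to package this is sketched below; the helper name strip_strings is hypothetical (not part of pandas), and it simply wraps the per-cell check shown above.

```python
import pandas as pd

def strip_strings(frame):
    """Return a new DataFrame with whitespace trimmed from str cells only.

    Hypothetical helper for illustration; non-string cells pass through as-is.
    """
    return frame.apply(
        lambda col: col.map(lambda v: v.strip() if isinstance(v, str) else v)
    )

df = pd.DataFrame([[' a ', 10], [' c ', 5]])
cleaned = strip_strings(df)  # df itself is left unchanged
```

Here cleaned holds the trimmed values while df still contains the padded strings, which makes it easy to compare before and after.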