I frequently deal with data which is poorly formatted (i.e. number fields are not consistent, etc.).
There may be other ways which I am not aware of, but the way I form
You can do df[['Col1', 'Col2', 'Col3']].applymap(format_number). Note, though, that this returns a new DataFrame; it won't modify the existing one. If you want to put the values back in the original, you'll have to do df[['Col1', 'Col2', 'Col3']] = df[['Col1', 'Col2', 'Col3']].applymap(format_number).
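As a rough sketch of the above: the format_number below is a hypothetical cleaner (it just strips thousands separators and whitespace), and the sample data is made up for illustration. Recent pandas versions (2.1 and later) rename DataFrame.applymap to DataFrame.map, so the sketch picks whichever one is available.

```python
import pandas as pd

def format_number(value):
    """Hypothetical cleaner: drop thousands separators and whitespace, return a float."""
    return float(str(value).replace(",", "").strip())

# Made-up sample data with inconsistently formatted number fields.
df = pd.DataFrame({
    "Col1": ["1,000", " 2.5"],
    "Col2": ["3", "4,200.75"],
    "Col3": [5, "6 "],
    "Name": ["a", "b"],
})
cols = ["Col1", "Col2", "Col3"]

subset = df[cols]
# DataFrame.map is the newer name (pandas >= 2.1); fall back to applymap otherwise.
elementwise = subset.map if hasattr(subset, "map") else subset.applymap
cleaned = elementwise(format_number)

# The element-wise call returned a new DataFrame; df itself is still untouched here.
before = df["Col1"].tolist()

# Assign back to put the cleaned values into the original DataFrame.
df[cols] = cleaned
```

Columns not listed in cols (here, Name) are left alone.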
You could use apply like this:
df.apply(lambda row: format_number(row), axis=1)
You would need to specify the columns in your format_number function, though:
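def format_number(row):
    row = row.copy()  # don't rely on mutating the row that pandas passes in
    row['Col1'] = doSomething(row['Col1'])
    row['Col2'] = doSomething(row['Col2'])
    row['Col3'] = doSomething(row['Col3'])
    return row
This is not as elegant as @BrenBarn's answer. Note that apply with axis=1 also returns a new DataFrame built from the rows the function returns, and pandas does not support relying on the function mutating the row it is given, so you still need to assign the result back: df = df.apply(format_number, axis=1)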
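Here is a minimal runnable sketch of the row-wise approach. do_something is a hypothetical stand-in for doSomething, and the sample data is invented; the point is only to show the function returning the row and the result of apply being captured.

```python
import pandas as pd

def do_something(value):
    """Hypothetical stand-in for doSomething: parse a number with thousands separators."""
    return float(str(value).replace(",", ""))

def format_number(row):
    # Work on a copy rather than mutating the row handed in by pandas.
    row = row.copy()
    for col in ("Col1", "Col2", "Col3"):
        row[col] = do_something(row[col])
    return row

# Made-up sample data.
df = pd.DataFrame({
    "Col1": ["1,000", "2"],
    "Col2": ["3.5", "4"],
    "Col3": ["5", "6,000"],
    "Name": ["a", "b"],
})

# apply(axis=1) calls format_number once per row and builds a new DataFrame.
result = df.apply(format_number, axis=1)
```

To keep the cleaned values, rebind the name (df = result) or assign only the affected columns back.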