Pandas: Modify a particular level of Multiindex

Posted by 泄露秘密 on 2019-11-28 07:27:32

Thanks to @cxrodgers's comment, I think the fastest way to do this is:

df.index = df.index.set_levels(df.index.levels[0].str.replace(' ', ''), level=0)
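
For anyone who wants to try the one-liner in isolation, here is a minimal sketch. The dataframe, its column name and the index names are made up for illustration, and I pass regex=False explicitly so the space is treated as a literal string (an assumption about what is wanted here, not something the one-liner above requires).

```python
import pandas as pd

# Hypothetical two-level index whose first level contains spaces.
index = pd.MultiIndex.from_tuples(
    [("a b", 1), ("a b", 2), ("c d", 1)],
    names=["key", "num"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# Rewrite only the distinct labels of level 0; the per-row codes
# stored in the MultiIndex are reused unchanged.
new_level = df.index.levels[0].str.replace(" ", "", regex=False)
df.index = df.index.set_levels(new_level, level=0)

first_level_values = list(df.index.get_level_values(0))
```

The index names survive because set_levels returns a copy of the existing MultiIndex with only the labels of that level swapped.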

Old, longer answer:

I found that the list comprehension suggested by @Shovalt works but felt slow on my machine (using a dataframe with >10,000 rows).

Instead, I was able to use the .set_levels method, which was roughly three times faster for me (134 ms versus 394 ms):

%timeit pd.MultiIndex.from_tuples([(x[0].replace(' ',''), x[1]) for x in df.index])
1 loop, best of 3: 394 ms per loop

%timeit df.index.set_levels(df.index.get_level_values(0).str.replace(' ',''), level=0)
10 loops, best of 3: 134 ms per loop

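In my case, I actually only needed to prepend some text. This was also faster with .set_levels, and fastest when operating on df.index.levels[0] (the distinct labels) rather than on get_level_values(0) (one value per row):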

%timeit pd.MultiIndex.from_tuples([('00'+x[0], x[1]) for x in df.index])
100 loops, best of 3: 5.18 ms per loop

%timeit df.index.set_levels('00'+df.index.get_level_values(0), level=0)
1000 loops, best of 3: 1.38 ms per loop

%timeit df.index.set_levels('00'+df.index.levels[0], level=0)
1000 loops, best of 3: 331 µs per loop
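
My guess at why the levels[0] variant wins is that it only touches each distinct label once. The sketch below uses a made-up 6-row frame to show the size difference and the prepend itself. As far as I understand set_levels, it expects the distinct labels of a level, so the levels[0] form is probably the one to copy; the get_level_values form may be rejected by newer pandas versions when labels repeat.

```python
import pandas as pd

# Hypothetical frame: 2 distinct outer labels, 3 inner labels, 6 rows.
index = pd.MultiIndex.from_product([["1", "2"], ["x", "y", "z"]])
df = pd.DataFrame({"value": range(6)}, index=index)

# levels[0] holds each distinct label once, while
# get_level_values(0) returns one entry per row.
n_unique = len(df.index.levels[0])
n_rows = len(df.index.get_level_values(0))

# Prepend text to the distinct labels only.
df.index = df.index.set_levels("00" + df.index.levels[0], level=0)
prefixed = list(df.index.get_level_values(0))
```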

This solution is based on the answer in the link from the comment by @denfromufa ...

python - Multiindex and timezone - Frozen list error - Stack Overflow

As mentioned in the comments, indexes are immutable and must be rebuilt when modified. You do not have to use reset_index for that, though; you can create a new MultiIndex directly:

df.index = pd.MultiIndex.from_tuples([(x[0], x[1].replace(' ', ''), x[2]) for x in df.index])

This example is for a 3-level index, where you want to modify the middle level. For an index with a different number of levels, change the length of the tuple accordingly.
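
If you would rather not hand-edit the tuple for each index shape, the same idea can be wrapped so it works for any number of levels. This is only a sketch: replace_in_level is a hypothetical helper name, not a pandas function, and it assumes the targeted level holds strings. It also passes names= to from_tuples, since otherwise the rebuilt index comes back without level names.

```python
import pandas as pd


def replace_in_level(index, level, old, new):
    """Hypothetical helper: rebuild a MultiIndex, applying str.replace to one level."""
    tuples = [
        tuple(v.replace(old, new) if i == level else v for i, v in enumerate(row))
        for row in index  # iterating a MultiIndex yields one tuple per row
    ]
    return pd.MultiIndex.from_tuples(tuples, names=index.names)


# Made-up 3-level index with spaces in the middle level.
index = pd.MultiIndex.from_tuples(
    [("a", "x 1", 0), ("a", "y 2", 1), ("b", "x 1", 0)],
    names=["outer", "middle", "inner"],
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=index)

df.index = replace_in_level(df.index, level=1, old=" ", new="")
middle_values = list(df.index.get_level_values("middle"))
```

This still walks every row in Python, so on large frames it should behave like the slower from_tuples timings rather than the set_levels ones.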
