How do I reorder multi-index columns with Pandas?
Question:
Code:
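import numpy as np
import pandas as pd

dff = pd.DataFrame({'Country': ['France']*4 + ['China']*4,
                    'Progress': ['Develop', 'Middle', 'Operate', 'Start']*2,
                    'NumTrans': np.random.randint(100, 900, 8),
                    'TransValue': np.random.randint(10000, 9999999, 8)})
# Move Country and Progress into the index, then transpose so they become column levels
dff = dff.set_index(['Country', 'Progress']).T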
Data and code are shown above.
Is there any way to reorder the "Progress" level as Start, Develop, Middle, Operate using Python?
I tried using map to assign a number to each stage, but I could not extract "Progress" from the multi-index.
Thanks!
Answer
reindex
You can specify a level to reindex on.
cats = ['Start', 'Develop', 'Middle', 'Operate']
dff.reindex(cats, axis=1, level=1)
Country      France                               China
Progress      Start  Develop   Middle  Operate    Start  Develop   Middle  Operate
NumTrans        772      832      494      793      750      722      818      684
TransValue  7363187  2578816  9764430  4863178   159777   840700   978816  9674337
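As a self-contained check of the reindex approach, here is a minimal sketch. The seeded generator `rng` and the name `reordered` are assumptions added for illustration (the original uses unseeded `np.random.randint`); the numbers will differ from the output above, but the column order should match.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed so repeated runs give the same values.
rng = np.random.default_rng(0)

dff = pd.DataFrame({'Country': ['France']*4 + ['China']*4,
                    'Progress': ['Develop', 'Middle', 'Operate', 'Start']*2,
                    'NumTrans': rng.integers(100, 900, 8),
                    'TransValue': rng.integers(10000, 9999999, 8)})
dff = dff.set_index(['Country', 'Progress']).T

cats = ['Start', 'Develop', 'Middle', 'Operate']

# Reorder only the second column level; the Country order is left as it was.
reordered = dff.reindex(cats, axis=1, level=1)

print(reordered.columns.tolist())
```

Note that `reindex` returns a new DataFrame, so the result has to be assigned if it is needed later.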
set_levels with CategoricalIndex
You can define the order of the second level and then sort.
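lvl1 = dff.columns.levels[1]
cats = ['Start', 'Develop', 'Middle', 'Operate']
cati = pd.CategoricalIndex(
    lvl1,
    categories=cats,
    ordered=True
)
# set_levels no longer accepts inplace=True in pandas 2.0+, so assign the result back
dff.columns = dff.columns.set_levels(cati, level=1)
# axis must be passed by keyword in pandas 2.0+
dff.sort_index(axis=1)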
Country       China                              France
Progress      Start  Develop   Middle  Operate    Start  Develop   Middle  Operate
NumTrans        750      722      818      684      772      832      494      793
TransValue   159777   840700   978816  9674337  7363187  2578816  9764430  4863178
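The categorical approach can be sketched end to end as follows. This is a hedged example rather than the answer's exact code: it assigns the result of `set_levels` back to `dff.columns` instead of using `inplace=True`, and the seeded `rng` and the name `sorted_dff` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed so repeated runs give the same values.
rng = np.random.default_rng(0)

dff = pd.DataFrame({'Country': ['France']*4 + ['China']*4,
                    'Progress': ['Develop', 'Middle', 'Operate', 'Start']*2,
                    'NumTrans': rng.integers(100, 900, 8),
                    'TransValue': rng.integers(10000, 9999999, 8)})
dff = dff.set_index(['Country', 'Progress']).T

cats = ['Start', 'Develop', 'Middle', 'Operate']

# Wrap the existing second level in an ordered categorical with the desired order.
lvl1 = dff.columns.levels[1]
cati = pd.CategoricalIndex(lvl1, categories=cats, ordered=True)
dff.columns = dff.columns.set_levels(cati, level=1)

# Sorting the columns now follows the category order on the second level.
sorted_dff = dff.sort_index(axis=1)

print(sorted_dff.columns.tolist())
```

Unlike the reindex approach, sorting also orders the first level, so China comes before France in the result.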