Python – pandas subtotals on groupby

Here is a sample of the data I'm using:

    SCENARIO  DATE        POD          AREA  IDOC    STATUS  TYPE
    AAA       02.06.2015  JKJKJKJKJKK  4210  713375  51      1
    AAA       02.06.2015  JWERWERE     4210  713375  51      1
    AAA       02.06.2015  JAFDFDFDFD   4210  713375  51      9
    BBB       02.06.2015  AAAAAAAA     5400  713504  51      43
    CCC       05.06.2015  BBBBBBBBBB   4100  756443  51      187
    AAA       05.06.2015  EEEEEEEE     4100  756457  53      228
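For anyone who wants to reproduce this without the Excel file, the sample rows above can be built directly in pandas. This is only a convenience sketch: the name `sample_df` is hypothetical, and the column dtypes are assumed (integers for AREA, IDOC, STATUS and TYPE, strings for the rest), since the original workbook is not available.

```python
import pandas as pd

# Sample rows copied from the table above; in the question they are read from MRD.xlsx.
# Dtypes are an assumption: numeric-looking columns are stored as integers.
rows = [
    ("AAA", "02.06.2015", "JKJKJKJKJKK", 4210, 713375, 51, 1),
    ("AAA", "02.06.2015", "JWERWERE", 4210, 713375, 51, 1),
    ("AAA", "02.06.2015", "JAFDFDFDFD", 4210, 713375, 51, 9),
    ("BBB", "02.06.2015", "AAAAAAAA", 5400, 713504, 51, 43),
    ("CCC", "05.06.2015", "BBBBBBBBBB", 4100, 756443, 51, 187),
    ("AAA", "05.06.2015", "EEEEEEEE", 4100, 756457, 53, 228),
]
columns = ["SCENARIO", "DATE", "POD", "AREA", "IDOC", "STATUS", "TYPE"]

# Hypothetical stand-in for the frame returned by xl.parse("Sheet3").
sample_df = pd.DataFrame(rows, columns=columns)
```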

I have written the following pandas code to group it:

    import pandas as pd
    import numpy as np

    xl = pd.ExcelFile("MRD.xlsx")
    df = xl.parse("Sheet3")
    # print(df.columns.values)

    # The following gave ValueError: Cannot label index with a null key
    # dfi = df.pivot('SCENARIO')

    # Here I do not actually need it to count every column, just a specific one
    table = df.groupby(["SCENARIO", "STATUS", "TYPE"]).agg(['count'])
    writer = pd.ExcelWriter('pandas.out.xlsx', engine='xlsxwriter')
    table.to_excel(writer, sheet_name='Sheet1')
    writer.close()  # originally writer.save(), which was removed in pandas 2.0

    table2 = pd.DataFrame(df.groupby(["SCENARIO", "STATUS", "TYPE"])['TYPE'].count())
    print(table2)
    writer2 = pd.ExcelWriter('pandas2.out.xlsx', engine='xlsxwriter')
    table2.to_excel(writer2, sheet_name='Sheet1')
    writer2.close()  # originally writer2.save()

This produces the following result:

                          TYPE
    SCENARIO STATUS TYPE
    AAA      51     1        2
                    9        1
             53     228      1
    BBB      51     43       1
    CCC      51     187      1
    Name: TYPE, dtype: int64

How can I add subtotals per group? Ideally, I would like to get something like:

                          TYPE
    SCENARIO STATUS TYPE
    AAA      51     1        2
                    9        1
                    Total    3
             53     228      1
                    Total    1
    BBB      51     43       1
                    Total    1
    CCC      51     187      1
                    Total    1
    Name: TYPE, dtype: int64

Is this possible?

Use:

    # if necessary, convert the TYPE column to string
    df['TYPE'] = df['TYPE'].astype(str)
    df = df.groupby(["SCENARIO", "STATUS", "TYPE"])['TYPE'].count()

    # aggregate the sum over the first 2 levels
    df1 = df.groupby(["SCENARIO", "STATUS"]).sum()

    # add a 3rd level to the MultiIndex
    df1.index = [df1.index.get_level_values(0),
                 df1.index.get_level_values(1),
                 ['Total'] * len(df1)]

    # thanks MaxU for improving
    # df1 = df1.set_index(np.array(['Total'] * len(df1)), append=True)

    print(df1)
    SCENARIO  STATUS
    AAA       51      Total    3
              53      Total    1
    BBB       51      Total    1
    CCC       51      Total    1
    Name: TYPE, dtype: int64
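A note on the commented-out `set_index` variant: `set_index` is a DataFrame method and a Series does not have it, so that line only applies if the totals are held in a DataFrame. The sketch below shows one way it could be used, assuming a hypothetical `totals` Series shaped like `df1` before its index is replaced; the names `totals`, `totals_frame` and `labelled_totals` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical per-group totals, shaped like df1 before its index is replaced.
totals = pd.Series(
    [3, 1, 1, 1],
    index=pd.MultiIndex.from_tuples(
        [("AAA", 51), ("AAA", 53), ("BBB", 51), ("CCC", 51)],
        names=["SCENARIO", "STATUS"],
    ),
    name="TYPE",
)

# Series has no set_index, so go through a one-column DataFrame.
totals_frame = totals.to_frame()

# Append a constant 'Total' label as a third (unnamed) index level.
totals_frame = totals_frame.set_index(
    np.array(["Total"] * len(totals_frame)), append=True
)

# Back to a Series with a three-level index.
labelled_totals = totals_frame["TYPE"]
```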

    # join together and sort
    df = pd.concat([df, df1]).sort_index(level=[0,1])
    print(df)
    SCENARIO  STATUS  TYPE
    AAA       51      1        2
                      9        1
                      Total    3
              53      228      1
                      Total    1
    BBB       51      43       1
                      Total    1
    CCC       51      187      1
                      Total    1
    Name: TYPE, dtype: int64
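The steps above can be wrapped into a single self-contained sketch that runs without the Excel file. A few things here are assumptions rather than part of the answer: the function name `count_with_subtotals` and the frame `sample_df` are hypothetical, only the three grouping columns of the sample data are reproduced, `size()` is swapped in for `['TYPE'].count()`, and the third index level is built explicitly with `pd.MultiIndex.from_arrays` instead of assigning a list to `.index`.

```python
import pandas as pd


def count_with_subtotals(frame):
    """Count rows per (SCENARIO, STATUS, TYPE) and add a 'Total' row per (SCENARIO, STATUS)."""
    # Cast TYPE to str on a copy so its values can be sorted next to the 'Total' label.
    data = frame.assign(TYPE=frame["TYPE"].astype(str))

    # Row count per full group (size() is used here in place of ['TYPE'].count()).
    counts = data.groupby(["SCENARIO", "STATUS", "TYPE"]).size()

    # Subtotal over the first two index levels.
    totals = counts.groupby(level=["SCENARIO", "STATUS"]).sum()

    # Give the subtotals a third level labelled 'Total' so they line up with counts.
    totals.index = pd.MultiIndex.from_arrays(
        [
            totals.index.get_level_values("SCENARIO"),
            totals.index.get_level_values("STATUS"),
            ["Total"] * len(totals),
        ],
        names=["SCENARIO", "STATUS", "TYPE"],
    )

    # Stack detail rows and subtotals, then sort so each 'Total' follows its group.
    return pd.concat([counts, totals]).sort_index()


# Hypothetical stand-in for the relevant columns of the question's sample data.
sample_df = pd.DataFrame(
    {
        "SCENARIO": ["AAA", "AAA", "AAA", "BBB", "CCC", "AAA"],
        "STATUS": [51, 51, 51, 51, 51, 53],
        "TYPE": [1, 1, 9, 43, 187, 228],
    }
)

result = count_with_subtotals(sample_df)
print(result)
```

The 'Total' rows end up last within each group only because digit strings sort before the letter `T`; if TYPE could contain other text, the sort key would need more care.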