I have the following data frame:
ID_number  Name   Date     S_code  S_description
1          Dani   01/2017  G1      PROCEDURE ON SINGLE VESSEL
1          Dani   01/2017  R56     INSERTION OF THREE VASCULAR STENTS
1          Dani   06/2017  L34     CHOLECYSTECTOMY
2          Alice  03/2015  L12     OTHER CYSTOSCOPY
3          Elle   04/2015  L34     CHOLECYSTECTOMY
3          Elle   04/2015  H6      EXCISION OR DESTRUCTION OF PERITONEAL TISSUE
I want to merge rows whose "ID_number", "Name" and "Date" columns are identical, concatenating their "S_code" and "S_description" values, so that the data looks like this:
ID_number  Name   Date     S_code  S_description
1          Dani   01/2017  G1,R56  PROCEDURE ON SINGLE VESSEL,INSERTION OF THREE VASCULAR STENTS
1          Dani   06/2017  L34     CHOLECYSTECTOMY
2          Alice  03/2015  L12     OTHER CYSTOSCOPY
3          Elle   04/2015  L34,H6  CHOLECYSTECTOMY,EXCISION OR DESTRUCTION OF PERITONEAL TISSUE
The "ID_number" column is already sorted.
I am new to Python, so any help with this problem would be greatly appreciated!
Use pandas groupby on the three columns, and pass the pandas string method cat as a function through agg:
df.groupby(['ID_number','Name','Date']).agg(lambda x: x.str.cat(sep=','))
                           S_code  S_description
ID_number  Name   Date
1          Dani   01/2017  G1,R56  PROCEDURE ON SINGLE VESSEL,INSERTION OF THREE ...
                  06/2017  L34     CHOLECYSTECTOMY
2          Alice  03/2015  L12     OTHER CYSTOSCOPY
3          Elle   04/2015  L34,H6  CHOLECYSTECTOMY,EXCISION OR DESTRUCTION OF PER...
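
The result above keeps the three grouping columns as a MultiIndex, whereas the layout asked for in the question has them as ordinary columns. A minimal sketch of one way to get that flat layout, assuming the data frame is rebuilt by hand from the rows in the question: as_index=False keeps the group keys as columns, sort=False keeps groups in order of first appearance, and ",".join is used here as a stand-in for str.cat (a substitution, not the call from the answer).

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "ID_number": [1, 1, 1, 2, 3, 3],
    "Name": ["Dani", "Dani", "Dani", "Alice", "Elle", "Elle"],
    "Date": ["01/2017", "01/2017", "06/2017", "03/2015", "04/2015", "04/2015"],
    "S_code": ["G1", "R56", "L34", "L12", "L34", "H6"],
    "S_description": [
        "PROCEDURE ON SINGLE VESSEL",
        "INSERTION OF THREE VASCULAR STENTS",
        "CHOLECYSTECTOMY",
        "OTHER CYSTOSCOPY",
        "CHOLECYSTECTOMY",
        "EXCISION OR DESTRUCTION OF PERITONEAL TISSUE",
    ],
})

# Group on the three key columns and join the remaining string columns.
# as_index=False leaves the keys as regular columns instead of a MultiIndex;
# sort=False preserves the order in which the groups first appear.
merged = (
    df.groupby(["ID_number", "Name", "Date"], as_index=False, sort=False)
      .agg({"S_code": ",".join, "S_description": ",".join})
)

print(merged.to_string(index=False))
```

Calling .reset_index() on the result of the original one-liner should give the same flat layout. Note that ",".join requires every value to be a string, so missing values would need to be filled or dropped first.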