Grouping data using unique combinations

Question  Votes: -3  Answers: 1

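In my dataset below, I need to find the unique sequences (combinations of age and maritalstatus) and assign a sequence number to each.

Dataset: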

user    age maritalstatus   product
A   Young   married 111
B   young   married 222
C   young   Single  111
D   old single  222
E   old married 111
F   teen    married 222
G   teen    married 555
H   adult   single  444
I   adult   single  333

Unique sequences:

young   married     0
young   single      1
old     single      2
old     married     3
teen    married     4
adult   single      5
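
A minimal sketch of how the numbering above could be produced with pandas. It assumes matching is case-insensitive (the data mixes "Young"/"young" and "Single"/"single"), and the column name `seq` is a name chosen for illustration, not something from the question.

```python
import pandas as pd

# The dataset from the question.
df = pd.DataFrame({
    "user": list("ABCDEFGHI"),
    "age": ["Young", "young", "young", "old", "old",
            "teen", "teen", "adult", "adult"],
    "maritalstatus": ["married", "married", "Single", "single", "married",
                      "married", "married", "single", "single"],
    "product": [111, 222, 111, 222, 111, 222, 555, 444, 333],
})

# Assumption: "Young" and "young" are the same value, so lower-case the keys.
keys = df[["age", "maritalstatus"]].apply(lambda col: col.str.lower())

# sort=False numbers the groups in order of first appearance.
df["seq"] = keys.groupby(["age", "maritalstatus"], sort=False).ngroup()

# One row per unique combination with its sequence number.
unique_sequences = keys.assign(seq=df["seq"]).drop_duplicates().reset_index(drop=True)
print(unique_sequences)
```

With `sort=False` the groups are numbered in the order they first occur, which is the order shown in the list above.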

After finding the unique values as shown above, if I pass in a DataFrame like the following (call it newdataframe):

user    age maritalstatus  
A      Young   married 
X      young   Single  
D      old     single  
Z      old     married

it should return the products to me as a list.

A: [222] - user A has already purchased 111 and the matching sequence also contains 222, so it returns 222.
X: [111, 222]
D: [] - there is only one user with this sequence, and D has already purchased product 222, so it returns an empty list.
Z: [111] - matches the sequence of user E, so it returns 111.

If there is no matching sequence, as in the following:

user     age     maritalstatus  
    Y     adult  married

it should give me an empty list:

 Y : []
python pandas
1 Answer
0
votes

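You can use sets: the sets module provides classes for constructing and manipulating unordered collections of unique elements. (The module was deprecated in Python 2.6 and removed in Python 3; the built-in set type offers the same operations.)

See: https://docs.python.org/2/library/sets.html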
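
The set-based idea could be sketched roughly as follows, using the built-in `set` type. This is only one reading of the question: the helper names (`key`, `products_by_group`, `products_by_user`, `recommend`) are hypothetical, matching is assumed to be case-insensitive, and a recommendation is taken to mean "products bought by anyone with the same age and maritalstatus, minus what the user already owns". Under that reading the stated data gives [111] for user X rather than the [111, 222] listed in the question.

```python
# Rows from the question's dataset: (user, age, maritalstatus, product).
# With pandas, the same tuples could come from DataFrame.itertuples(index=False).
rows = [
    ("A", "Young", "married", 111), ("B", "young", "married", 222),
    ("C", "young", "Single", 111),  ("D", "old", "single", 222),
    ("E", "old", "married", 111),   ("F", "teen", "married", 222),
    ("G", "teen", "married", 555),  ("H", "adult", "single", 444),
    ("I", "adult", "single", 333),
]

def key(age, status):
    """Assumed case-insensitive (age, maritalstatus) key."""
    return (age.strip().lower(), status.strip().lower())

products_by_group = {}  # combination -> set of products bought in that group
products_by_user = {}   # user -> set of products that user already bought
for user, age, status, product in rows:
    products_by_group.setdefault(key(age, status), set()).add(product)
    products_by_user.setdefault(user, set()).add(product)

def recommend(user, age, status):
    """Products seen in the matching group that the user has not bought."""
    group = products_by_group.get(key(age, status), set())
    owned = products_by_user.get(user, set())
    return sorted(group - owned)  # set difference, returned as a sorted list

for user, age, status in [("A", "Young", "married"), ("X", "young", "Single"),
                          ("D", "old", "single"), ("Z", "old", "married"),
                          ("Y", "adult", "married")]:
    print(user, recommend(user, age, status))
```

An unknown combination such as (adult, married) falls back to an empty set, so user Y gets an empty list without any special-casing.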
