Find the city with the most amenities

Question

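I'm currently trying to crack a programming puzzle that involves a very simple DataFrame `host` with two columns named `city` and `amenities` (both of `object` dtype). Entries in both columns can be repeated multiple times. Below are the first few entries of `host`: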

City    Amenities                                            Price($)
NYC    {TV,"Wireless Internet", "Air conditioning","Smoke      8
        detector",Essentials,"Lock on bedroom door"} 
LA     {"Wireless Internet",Kitchen,Washer,Dryer,"First aid    
        kit",Essentials,"Hair dryer","translation missing: 
         en.hosting_amenity_49","translation missing: 
         en.hosting_amenity_50"}                                10
SF     {TV,"Cable TV",Internet,"Wireless Internet",Kitchen,"Free 
        parking on premises","Pets live on this 
        property",Dog(s),"Indoor fireplace","Buzzer/wireless 
        intercom",Heating,Washer,Dryer,"Smoke detector","Carbon 
        monoxide detector","First aid kit","Safety card","Fire 
        extinguisher",Essentials,Shampoo,"24-hour check- 
        in",Hangers,"Hair dryer",Iron,"Laptop friendly 
        workspace","translation missing: 
        en.hosting_amenity_49","translation missing: 
        en.hosting_amenity_50","Self Check-In",Lockbox}        15
NYC    {"Wireless Internet","Air 
        conditioning",Kitchen,Heating,"Suitable for events","Smoke 
        detector","Carbon monoxide detector","First aid kit","Fire 
        extinguisher",Essentials,Shampoo,"Lock on bedroom 
        door",Hangers,"translation missing: 
        en.hosting_amenity_49","translation missing: 
        en.hosting_amenity_50"}                                20
LA     {TV,Internet,"Wireless Internet","Air 
        conditioning",Kitchen,"Free parking on 
        premises",Essentials,Shampoo,"translation missing: 
        en.hosting_amenity_49","translation missing: 
        en.hosting_amenity_50"}
LA    {TV,"Cable TV",Internet,"Wireless Internet",Pool,Kitchen,"Free 
       parking on premises",Gym,Breakfast,"Hot tub","Indoor 
       fireplace",Heating,"Family/kid friendly",Washer,Dryer,"Smoke 
       detector","Carbon monoxide detector",Essentials,Shampoo,"Lock 
       on bedroom door",Hangers,"Private entrance"}           28

.....

Problem. Output the city with the largest number of amenities.

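My attempt. I tried using the `groupby()` function to group the data by the `city` column with `host.groupby('city')`. Next, I need to count the number of elements in each group's amenities. Because the data type is not what I expected, `len()` does not work: there is a `\` between the elements of each set. For example, `host['amenities'][0]` outputs

"{TV,\"Wireless Internet\",\"Air conditioning\",\"Smoke detector\",\"Carbon monoxide detector\",Essentials,\"Lock on bedroom door\",Hangers,Iron}"

and applying `len()` to this gives 134, which is clearly not the number of amenities. I tried `host['amenities'][0].strip('\n')` to remove the `\`, but `len()` still gives 134.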

Can anyone help me solve this?
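A note on the 134: the displayed value, with the backslashes left out, is exactly 134 characters long, which suggests the backslashes are only escape characters in the printed representation and `len()` is simply counting the characters of a plain string. The sketch below assumes the stored value is exactly the string shown above; `raw` and `items` are illustrative names, not part of the original frame.

```python
# The value as it is presumably stored: no backslashes, just quotes and commas.
raw = '{TV,"Wireless Internet","Air conditioning","Smoke detector","Carbon monoxide detector",Essentials,"Lock on bedroom door",Hangers,Iron}'

print(len(raw))      # 134: the number of characters, not amenities
print("\\" in raw)   # False: there is no backslash to strip

# Drop the surrounding braces and the quotes, then split on commas.
items = raw.strip("{}").replace('"', "").split(",")
print(len(items))    # 9: the number of amenities in this row
print(items[:3])
```

Under that assumption, `strip('\n')` has nothing to remove, which would explain why the length stayed at 134.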

My solution, inspired by ddejohn's solution:

### Transform each "string-type" entry in column "amenities" to "list" type
host["amenities"] = host["amenities"].str.replace('["{}]', "", regex=True).str.split(",")

## Create a new column that counts all the amenities for each row entry
host["am_count"] = [len(data) for data in host["amenities"]]

## Output the index label (the city) with the largest total of `am_count` after grouping by `city`
## (idxmax returns the city name; argmax would return only its integer position)
host.groupby("city")["am_count"].agg("sum").idxmax()
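For reference, here is a minimal self-contained sketch of this count-and-sum approach on a toy frame. The data is made up for illustration and is much smaller than the real `host`; `parsed` and `totals` are hypothetical helper names. It also shows that `Series.idxmax()` returns the index label (the city), while `Series.argmax()` returns the integer position.

```python
import pandas as pd

# Toy data (hypothetical), same column names as in the question.
host = pd.DataFrame({
    "city": ["NYC", "LA", "NYC"],
    "amenities": ['{TV,"Wireless Internet"}', '{Kitchen,Washer,Dryer,Pool}', '{Heating}'],
})

# Parse each string into a list of amenity names.
parsed = host["amenities"].str.replace('["{}]', "", regex=True).str.split(",")

# Count amenities per row, then sum the counts per city.
host["am_count"] = parsed.str.len()
totals = host.groupby("city")["am_count"].sum()

print(totals.idxmax())  # index label, i.e. the city name
print(totals.argmax())  # integer position of that city within `totals`
```

Note that this counts every listing's amenities, so an amenity offered by several listings in the same city is counted once per listing.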

Answer 1

Solution

import functools

# Process the Amenities strings into sets of strings
host["amenities"] = host["amenities"].str.replace('["{}]', "", regex=True).str.split(",").apply(set)

# Groupby city, perform the set union to remove duplicates, and get count of unique amenities
amenities_by_city = host.groupby("city")["amenities"].apply(lambda x: len(functools.reduce(set.union, x))).reset_index()

Output:

  city  amenities
0   LA         27
1  NYC         17
2   SF         29

The city with the most amenities can then be obtained with:

city_with_most_amenities = amenities_by_city.query("amenities == amenities.max()")

Output:

  city  amenities
2   SF         29
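To make the "unique amenities per city" idea concrete, here is a small sketch on made-up data. The first computation mirrors the set-union step above. The second is an alternative that is not part of this answer: it uses `DataFrame.explode` and `nunique` and should give the same counts. The names `parsed`, `by_union`, `exploded` and `by_nunique` are illustrative.

```python
import functools
import pandas as pd

# Toy data (hypothetical): "TV" appears in both NYC listings.
host = pd.DataFrame({
    "city": ["NYC", "NYC", "LA"],
    "amenities": ['{TV,"Wireless Internet"}', '{TV,Heating}', '{Kitchen,Washer}'],
})
parsed = host["amenities"].str.replace('["{}]', "", regex=True).str.split(",")

# Set union per city, so duplicates across listings are counted once.
by_union = parsed.apply(set).groupby(host["city"]).apply(
    lambda s: len(functools.reduce(set.union, s))
)

# Alternative: one row per amenity, then count distinct names per city.
exploded = host.assign(amenity=parsed).explode("amenity")
by_nunique = exploded.groupby("city")["amenity"].nunique()

print(by_union.to_dict())
print(by_nunique.to_dict())
```

In this toy frame NYC has three unique amenities, whereas summing per-listing counts would give four, so the two readings of "most amenities" can differ.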

Answer 2

Count the commas and add 1!

def func(s):
    # Number of comma-separated items = number of commas + 1
    return s.count(',') + 1

df["Amenities_count"] = df.amenities.apply(func)
df1 = df.groupby("city").Amenities_count.sum().reset_index()
# City whose summed count is the largest
df1[df1.Amenities_count == max(df1.Amenities_count)].city.values[0]
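One caveat with comma counting, sketched below on plain strings: an empty entry such as `{}` still yields 1. The function `count_by_parsing` is a hypothetical stricter variant, not part of this answer, and the sample strings are made up.

```python
def count_by_commas(s: str) -> int:
    """Comma-counting shortcut: number of commas plus one."""
    return s.count(",") + 1


def count_by_parsing(s: str) -> int:
    """Hypothetical stricter variant: split and ignore empty entries."""
    inner = s.strip("{}")
    return len([item for item in inner.split(",") if item.strip()])


print(count_by_commas('{TV,"Wireless Internet",Kitchen}'))   # 3
print(count_by_commas("{}"))                                 # 1, although the set is empty
print(count_by_parsing("{}"))                                # 0
```

Neither variant handles an amenity name that itself contains a comma; such a name would be counted as two items.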
