I have a DataFrame:
df.head(9)
+------------+-------------+-------------+------------+---------+-------------+-------+--------+------------------+
| session_id | Enter | Exit | Difference | User_id | date | Buyer | Seller | Non_buyer_seller |
+------------+-------------+-------------+------------+---------+-------------+-------+--------+------------------+
| a | 43770 | 43770 | 0:00:00 | 1 | 01/Nov/2019 | 1 | 0 | 0 |
| b | 43770.79991 | 43770.79994 | 0:00:02 | 2 | 01/Nov/2019 | 1 | 0 | 0 |
| c | 43770.5634 | 43770.56351 | 0:00:09 | 3 | 01/Nov/2019 | 0 | 0 | 1 |
| d | 43770.5525 | 43770.5528 | 0:00:25 | 4 | 01/Nov/2019 | 1 | 0 | 0 |
| e | 43770.33724 | 43770.33726 | 0:00:01 | 4 | 01/Nov/2019 | 1 | 0 | 0 |
| f | 43770.65617 | 43770.65623 | 0:00:05 | 5 | 01/Nov/2019 | 0 | 0 | 1 |
| g | 43770.54055 | 43770.54093 | 0:00:32 | 6 | 01/Nov/2019 | 0 | 0 | 1 |
| h | 43770.54203 | 43770.54281 | 0:01:07 | 7 | 01/Nov/2019 | 0 | 0 | 1 |
| i | 43770.64442 | 43770.64478 | 0:00:31 | 8 | 01/Nov/2019 | 0 | 1 | 0 |
+------------+-------------+-------------+------------+---------+-------------+-------+--------+------------------+
I used to use COUNTIFS in Excel to count the number of sessions, for example like this:
Buyers_0-to-1 min : =COUNTIFS($G:$G,1, $D$2:$D$671746,">=00:00:00",$D$2:$D$671746,"<=00:01:00")
Buyers_1.1-to-5 min : =COUNTIFS($G:$G,1, $D$2:$D$671746,">=00:01:01",$D$2:$D$671746,"<=00:05:00")
Sellers_0-to-1 min : =COUNTIFS($H:$H,1, $D$2:$D$671746,">=00:00:00",$D$2:$D$671746,"<=00:01:00")
Non_buyer_sellers_0to-1 min : =COUNTIFS($I:$I,1, $D$2:$D$671746,">=00:00:00",$D$2:$D$671746,"<=00:01:00")
How can I do the same thing for a DataFrame in Python?
Thanks in advance.
To reproduce the duration condition of the first COUNTIFS you listed, you could try the following.
import pandas as pd
d = {'Difference': ["00:00:00", "00:00:02", "00:00:09", "00:00:25", "00:00:01", "00:00:05", "00:00:32", "00:01:07", "00:00:31"]}
df = pd.DataFrame(data=d)
# 'Difference' column to timedelta
df['Difference'] = pd.to_timedelta(df['Difference'])
# Conditional sum: add up the durations of the rows between 0 and 1 minute (inclusive)
df[(df.Difference >= "00:00:00") & (df.Difference <= "00:01:00")].sum()
Difference 00:01:45
dtype: timedelta64[ns]
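I have assumed that the times are stored as strings, and converted them to timedelta before performing the sum.
Note that the filter inside the brackets holds the conditions, much like the criteria in Excel's COUNTIFS. Be aware that .sum() here adds up the matching durations (00:01:45) rather than counting rows; COUNTIFS returns the number of matching rows, so to get a count, take len() of the filtered frame or sum the boolean mask instead. You can add more conditions with &, for example df.Buyer == 1, or adjust the ones listed to get the output you need.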
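For an actual count that also includes the flag-column criterion from the Excel formulas, something along these lines might work. This is only a sketch: the rows are copied from the df.head(9) output in the question, the helper name count_sessions is made up for illustration, and Difference is again assumed to be stored as strings.

```python
import pandas as pd

# Sample rows copied from the df.head(9) output in the question
# (assumption: Difference is stored as "HH:MM:SS" strings).
df = pd.DataFrame({
    "Difference": ["00:00:00", "00:00:02", "00:00:09", "00:00:25", "00:00:01",
                   "00:00:05", "00:00:32", "00:01:07", "00:00:31"],
    "Buyer":            [1, 1, 0, 1, 1, 0, 0, 0, 0],
    "Seller":           [0, 0, 0, 0, 0, 0, 0, 0, 1],
    "Non_buyer_seller": [0, 0, 1, 0, 0, 1, 1, 1, 0],
})
df["Difference"] = pd.to_timedelta(df["Difference"])


def count_sessions(frame, flag_col, low, high):
    """Hypothetical helper mimicking COUNTIFS(flag_col = 1, low <= Difference <= high)."""
    # between() is inclusive on both ends by default, like ">=" and "<=" in Excel.
    in_range = frame["Difference"].between(pd.Timedelta(low), pd.Timedelta(high))
    # Summing a boolean mask counts the rows where every condition holds.
    return int(((frame[flag_col] == 1) & in_range).sum())


buyers_0_to_1 = count_sessions(df, "Buyer", "00:00:00", "00:01:00")
buyers_1_to_5 = count_sessions(df, "Buyer", "00:01:01", "00:05:00")
sellers_0_to_1 = count_sessions(df, "Seller", "00:00:00", "00:01:00")
non_buyer_sellers_0_to_1 = count_sessions(df, "Non_buyer_seller", "00:00:00", "00:01:00")

print(buyers_0_to_1, buyers_1_to_5, sellers_0_to_1, non_buyer_sellers_0_to_1)
```

On the nine sample rows this gives 4 buyers and 0 buyers for the two buyer buckets, 1 seller, and 3 non-buyer-sellers (session h, at 0:01:07, falls outside the 0-to-1-minute range).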