I have the following DataFrame:
import pandas as pd
import random

data = {
    'Column1': [random.randint(1, 100) for _ in range(9)],
    'Column2': [random.uniform(0, 1) for _ in range(9)],
    'Column3': [chr(random.randint(65, 90)) for _ in range(9)],
    'Column4': [random.choice(['A', 'B', 'C']) for _ in range(9)]
}
df = pd.DataFrame(data)
   Column1   Column2  Column3  Column4
0       87  0.208179        M        C
1       85  0.049071        Q        C
2        4  0.474926        X        C
3       35  0.966357        L        B
4       58  0.295134        C        B
5       23  0.633367        R        B
6       87  0.069583        V        B
7       83  0.427594        N        A
8       16  0.592413        R        C
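I want to add a new DATE column of consecutive dates, starting from a chosen start_date (= '2022-01-01'), in which each date is repeated N (= 2) times across the whole dataset. I also want to add another column named SHIFT that cycles through the n (= 2) alternating values in selected_values = ['Day', 'Night'].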
## desired output for N=2 and start_date = '2022-01-01', and n=2 with selected_values = ['Day','Night']
   Column1   Column2  Column3  Column4        DATE  SHIFT
0       87  0.208179        M        C  2022-01-01    Day
1       85  0.049071        Q        C  2022-01-01  Night
2        4  0.474926        X        C  2022-01-02    Day
3       35  0.966357        L        B  2022-01-02  Night
4       58  0.295134        C        B  2022-01-03    Day
5       23  0.633367        R        B  2022-01-03  Night
6       87  0.069583        V        B  2022-01-04    Day
7       83  0.427594        N        A  2022-01-04  Night
8       16  0.592413        R        C  2022-01-05    Day
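- N, n, selected_values, and start_date are all dynamic, where n is the number of elements in selected_values.
To make this clearer, here is the desired output for a different set of parameters: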
## desired output for N=2 and start_date = '2022-01-01', and n=3 with selected_values = ['Day','Night','Afternoon']
   Column1   Column2  Column3  Column4        DATE      SHIFT
0       87  0.208179        M        C  2022-01-01        Day
1       85  0.049071        Q        C  2022-01-01      Night
2        4  0.474926        X        C  2022-01-02  Afternoon
3       35  0.966357        L        B  2022-01-02        Day
4       58  0.295134        C        B  2022-01-03      Night
5       23  0.633367        R        B  2022-01-03  Afternoon
6       87  0.069583        V        B  2022-01-04        Day
7       83  0.427594        N        A  2022-01-04      Night
8       16  0.592413        R        C  2022-01-05  Afternoon
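Both desired outputs seem to follow one rule per row position: the day offset is the position integer-divided by N, and the shift is the position modulo n. Here is a minimal pure-Python sketch of that rule, just to pin down the pattern. The helper name date_and_shift is made up for illustration, and it assumes rows are counted from 0.

```python
from datetime import date, timedelta

def date_and_shift(row, start_date, N, selected_values):
    """Return the (DATE, SHIFT) pair for a zero-based row position (hypothetical helper)."""
    # Each date covers N consecutive rows, so the day offset is row // N.
    day = date.fromisoformat(start_date) + timedelta(days=row // N)
    # Shifts cycle independently of the dates, so the label is row % n.
    shift = selected_values[row % len(selected_values)]
    return day.isoformat(), shift

# Reproduce the DATE/SHIFT columns of the N=2, n=3 example for 9 rows.
pairs = [date_and_shift(r, "2022-01-01", 2, ["Day", "Night", "Afternoon"]) for r in range(9)]
for r, (d, s) in enumerate(pairs):
    print(r, d, s)
```

Since n is just len(selected_values), it does not need to be passed separately.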
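from datetime import datetime, timedelta

def add_dates_shifts(df, start_date, N, shifts, n):
    # Repeat each consecutive date N times, then trim to the row count (positions, not index labels).
    dates = [datetime.fromisoformat(start_date) + timedelta(days=i) for i in range(len(df)) for _ in range(N)][:len(df)]
    df["DATE"] = dates
    # Cycle through the first n shift labels, then trim to the row count.
    df["SHIFT"] = [shifts[i] for _ in range(len(df)) for i in range(n)][:len(df)]

add_dates_shifts(df, "2022-01-01", 2, ["Day", "Night"], 2)  # modifies df in place
print(df)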
   Column1   Column2  Column3  Column4        DATE  SHIFT
0       48  0.283085        J        C  2022-01-01    Day
1       76  0.896891        J        B  2022-01-01  Night
2       58  0.867856        Z        B  2022-01-02    Day
3       60  0.175652        H        C  2022-01-02  Night
4       50  0.222947        S        A  2022-01-03    Day
5       36  0.753228        A        A  2022-01-03  Night
6       48  0.032319        W        B  2022-01-04    Day
7       10  0.412104        T        A  2022-01-04  Night
8       61  0.266018        Q        A  2022-01-05    Day
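If the list comprehensions get slow on a large frame, a vectorized variant might be worth trying. This is only a sketch, not a drop-in replacement: the name with_dates_shifts is made up, it returns a copy instead of modifying df, and it assumes n is always len(selected_values), so n is not a separate argument. It works on row positions, so the frame does not need a default 0..len-1 index.

```python
import numpy as np
import pandas as pd

def with_dates_shifts(df, start_date, N, selected_values):
    """Return a copy of df with DATE and SHIFT columns appended (hypothetical helper)."""
    pos = np.arange(len(df))  # row positions, independent of df.index
    out = df.copy()
    # Day offset for each row is its position integer-divided by N.
    out["DATE"] = pd.Timestamp(start_date) + pd.to_timedelta(pos // N, unit="D")
    # Shift label for each row is picked by its position modulo n.
    out["SHIFT"] = np.array(selected_values)[pos % len(selected_values)]
    return out

# Small demo frame with a non-default index to show that positions are used.
demo = pd.DataFrame({"Column1": range(9)}, index=list("abcdefghi"))
result = with_dates_shifts(demo, "2022-01-01", 2, ["Day", "Night", "Afternoon"])
print(result)
```

Here DATE comes out as a datetime64 column, so the usual .dt accessors are available on it.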