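I have not been able to find a definitive fix for a Python kernel crash that happens when I run one particular piece of code. I am trying to generate a random sales table for a project. My code and output are below. I have tried this in both Jupyter and VSCode, and I have also run it as a plain Python file. When I run the last block of code, it either crashes or never executes the last part.
Generate random customers: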
import numpy as np
import pandas as pd
import names
data = np.random.randint(1, 200000, size=50000)
df_customers = pd.DataFrame(data, columns=['CustomerID'])
def customer_generator_first(cell_val):
    # cell_val is only a placeholder; it is overwritten with a random first name
    cell_val = names.get_first_name()
    return cell_val
# instantiate FirstName col with nan
df_customers['FirstName'] = np.nan
# apply the function to FirstName col
df_customers['FirstName'] = df_customers['FirstName'].apply(customer_generator_first)
def customer_generator_last(cell_val):
    # cell_val is only a placeholder; it is overwritten with a random last name
    cell_val = names.get_last_name()
    return cell_val
# instantiate LastName col with nan
df_customers['LastName'] = np.nan
# apply the function to LastName col
df_customers['LastName'] = df_customers['LastName'].apply(customer_generator_last)
Output:
+------------+-----------+----------+
| CustomerID | FirstName | LastName |
+------------+-----------+----------+
| 157863 | Kimberly | Archey |
| 148101 | Tony | Roberson |
| 113579 | Mandy | Kridel |
| 23000 | Russell | Cornett |
| 160104 | Craig | Sterling |
+------------+-----------+----------+
Generate the products table from a CSV file I downloaded:
import os
import numpy as np
import pandas as pd
import string
import random
# assign directory
directory = '[MYPATH]'
# myFilePath = os.listdir(directory)
f = 'Amazon-Products.csv'
myFileName = os.path.join(directory, f)
# print(myFilePath)
df = pd.read_csv(myFileName)
df['discount_price'] = df['discount_price'].str.replace(',','')
df['discount_price'] = df['discount_price'].str.replace('₹','')
df['actual_price'] = df['actual_price'].str.replace(',','')
df['actual_price'] = df['actual_price'].str.replace('₹','')
df2 = df.drop(df.columns[[0, 4, 5]],axis = 1)
df2['no_of_ratings'] = df2['no_of_ratings'].str.replace(',','')
df2['discount_price'] = df2['discount_price'].fillna(0)
df2['actual_price'] = df2['actual_price'].fillna(0)
df2['discount_price_USD'] = df2['discount_price'].astype(str).astype(float) * 0.0122
df2['actual_price_USD'] = df2['actual_price'].astype(str).astype(float) * 0.0122
df3 = df2.drop(df2.columns[[5, 6]],axis = 1)
df3['main_category'] = df3['main_category'].str.title()
# Just added cell_val as part of the arguments
def id_generator(cell_val, size=12, chars=string.ascii_uppercase + string.digits):
    cell_val = ''.join(random.choice(chars) for _ in range(size))
    return cell_val
# instantiate product id col with nan
df3['ProductID'] = np.nan
# apply your function to product id col
df3['ProductID'] = df3['ProductID'].apply(id_generator)
Output:
+---------------------------------------------------+---------------+------------------+---------+---------------+--------------------+------------------+--------------+
| name | main_category | sub_category | ratings | no_of_ratings | discount_price_USD | actual_price_USD | ProductID |
+---------------------------------------------------+---------------+------------------+---------+---------------+--------------------+------------------+--------------+
| Lloyd 1.5 Ton 3 Star Inverter Split Ac (5 In 1... | Appliances | Air Conditioners | 4.2 | 2255 | 402.5878 | 719.678 | D5QPATUY7NQ4 |
| LG 1.5 Ton 5 Star AI DUAL Inverter Split AC (C... | Appliances | Air Conditioners | 4.2 | 2948 | 567.1780 | 927.078 | WDF3BP4HJXTV |
| LG 1 Ton 4 Star Ai Dual Inverter Split Ac (Cop... | Appliances | Air Conditioners | 4.2 | 1206 | 420.7780 | 756.278 | Z5SESQAXVVWW |
| LG 1.5 Ton 3 Star AI DUAL Inverter Split AC (C... | Appliances | Air Conditioners | 4.0 | 69 | 463.4780 | 841.678 | B7NPXS4E4IUQ |
| Carrier 1.5 Ton 3 Star Inverter Split AC (Copp... | Appliances | Air Conditioners | 4.1 | 630 | 420.7780 | 827.038 | BAAGUH73J8VF |
+---------------------------------------------------+---------------+------------------+---------+---------------+--------------------+------------------+--------------+
Create the store codes and the date range:
store_codes = np.arange(1,3)
date_range_2022 = pd.date_range(start = '2021-01-01', end = '2022-12-31', freq="D")
Output:
[1 2]
DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
'2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
'2021-01-09', '2021-01-10',
...
'2022-12-22', '2022-12-23', '2022-12-24', '2022-12-25',
'2022-12-26', '2022-12-27', '2022-12-28', '2022-12-29',
'2022-12-30', '2022-12-31'],
dtype='datetime64[ns]', length=730, freq='D')
Create the sales table from everything above. This is the part that keeps crashing the kernel, no matter how much I limit the data above:
index = pd.MultiIndex.from_product(
    [date_range_2022, store_codes, df3['ProductID'], df_customers['CustomerID']],
    names=['Date', 'StoreCode', 'ProductID', 'CustomerID'])
sales = pd.DataFrame(index=index)
Output from VSCode:
Canceled future for execute_request message before replies were done
The Kernel crashed while executing code in the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of failure.
When I click the "more info" link, the GitHub page it opens has no useful information. I am on a MacBook Pro running macOS Monterey 12.6.5. The Python version is 3.9.
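
For scale, here is a rough sketch of how many rows the `MultiIndex.from_product` call asks for, since it builds one row for every combination of its inputs. The date, store, and customer counts come from the code and output in this post. The number of products is not shown anywhere in the post, so `N_PRODUCTS_ASSUMED` is a placeholder I made up for illustration, not the real size of `Amazon-Products.csv`, and `product_index_rows` is just a helper name for this sketch.

```python
from math import prod


def product_index_rows(*level_sizes: int) -> int:
    """Row count of a Cartesian product over inputs of the given lengths."""
    return prod(level_sizes)


# Sizes taken from the post: 730 dates, 2 store codes, 50,000 customer IDs.
N_DATES, N_STORES, N_CUSTOMERS = 730, 2, 50_000

# Assumption: the product count is not stated in the post; 1,000 is a placeholder.
N_PRODUCTS_ASSUMED = 1_000

# Rows contributed by every single product in df3.
rows_per_product = product_index_rows(N_DATES, N_STORES, N_CUSTOMERS)

# Rows for the full index under the assumed product count.
total_rows = product_index_rows(N_DATES, N_STORES, N_PRODUCTS_ASSUMED, N_CUSTOMERS)

print(f"rows per single product: {rows_per_product:,}")
print(f"rows for {N_PRODUCTS_ASSUMED:,} products: {total_rows:,}")
```

Each product alone contributes 73,000,000 rows, so the total grows very quickly with the number of rows in `df3`. This could explain the crash if the kernel is running out of memory, but I have not confirmed that.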