KeyError: "['Building Age', 'Floor', 'Number of Floors'] not in index"

Problem description
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import category_encoders as ce

# Read the data
transactions_master_df = pd.read_csv('my_data.csv')

# Calculate the average house price for each district
avg_price_per_district = transactions_master_df.groupby('District')['Price'].mean().reset_index()
avg_price_per_district.rename(columns={'Price': 'AvgPrice'}, inplace=True)

# Print the average price for each district, with the district column next to it
print(avg_price_per_district)

# Merge the average price information with the original DataFrame
transactions_master_df = pd.merge(transactions_master_df, avg_price_per_district, on='District', how='left')

# Binary encode the 'District' feature
encoder = ce.BinaryEncoder(cols=['District'], base=6)
transactions_encoded = encoder.fit_transform(transactions_master_df)

# Additional features to concatenate to the encoded DataFrame
additional_features = ['Building Age', 'Floor', 'Number of Floors', 'Elevator', 
                      'number of bathrooms', 'Otopark', 'steeped alley', 
                      'material used and luxuriness', 'view', 
                      'prestige of that district and its vicinity']

# Check if additional features are present in the transactions_encoded DataFrame
for feature in additional_features:
    if feature not in transactions_encoded.columns:
        print(f"Warning: {feature} column not found in transactions_encoded DataFrame.")

# Concatenate additional features to the encoded DataFrame
final_features = pd.concat([transactions_encoded[['District_0', 'District_1', 'District_2', 'SquareMeter']], 
                            transactions_encoded[additional_features]], axis=1)

# Ensure 'final_features' contains the necessary columns for training
print(final_features.head())


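Hello, in this code I am building a model for my house-price dataset. First I encode some non-numeric features. Then, when I concatenate the remaining features to build the final_features variable, I get the following error: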

final_features = pd.concat([transactions_encoded[['District_0', 'District_1', 'District_2', 'SquareMeter']], 
---> 38                             transactions_encoded[additional_features]], axis=1)
KeyError: "['Building Age', 'Floor', 'Number of Floors'] not in index"

The strange thing is that these features do exist in my dataset, so I don't understand why I get this error.

python pandas dataframe machine-learning encoding
1 Answer

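The names of the 'Building Age', 'Floor', and 'Number of Floors' columns in your DataFrame appear to contain extra spaces. Column selection requires an exact string match, so a name that differs only by leading or trailing whitespace is treated as missing, and that raises the KeyError.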
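If stray whitespace is indeed the cause, one way to confirm and fix it is to print the column names with repr() and then strip them. The sketch below is only an illustration: the small DataFrame is a hypothetical stand-in for my_data.csv, and the exact placement of the spaces is an assumption, since the real headers are not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for my_data.csv: three headers carry stray whitespace.
# (Assumption: the real file has similar leading/trailing spaces.)
df = pd.DataFrame({
    "District": ["A", "B"],
    " Building Age": [10, 25],
    "Floor ": [2, 5],
    "Number of Floors ": [5, 12],
})

wanted = ["Building Age", "Floor", "Number of Floors"]

# repr() makes leading/trailing spaces visible in the header names.
print([repr(col) for col in df.columns])

# Before cleaning, none of the wanted names match exactly.
missing_before = [col for col in wanted if col not in df.columns]

# Strip surrounding whitespace from every column name.
df.columns = df.columns.str.strip()

# After cleaning, every wanted name is found.
missing_after = [col for col in wanted if col not in df.columns]

# Selecting by the clean names no longer raises a KeyError.
selected = df[wanted]
print(selected.head())
```

In the question's code, the same `df.columns.str.strip()` step would go right after `pd.read_csv(...)`, before the encoder is fitted, so that every later lookup uses the cleaned names.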
