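I want to predict a disease type with a function that takes a string argument. The argument can contain several symptom names. However, it always fails with: KeyError: 'skin_rash'
This is the code example I found on GeeksforGeeks: https://www.geeksforgeeks.org/disease-prediction-using-machine-learning/
The dataset for the function is available on Kaggle: https://www.kaggle.com/datasets/kaushil268/disease-prediction-using-machine-learning/data
Here is my code: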
symptoms = X.columns.values
Here are the symptoms:
['itching' 'skin_rash' 'nodal_skin_eruptions' 'continuous_sneezing'
'shivering' 'chills' 'joint_pain' 'stomach_pain' 'acidity'
'ulcers_on_tongue' 'muscle_wasting' 'vomiting' 'burning_micturition'
'spotting_ urination' 'fatigue' 'weight_gain' 'anxiety'
'cold_hands_and_feets' 'mood_swings' 'weight_loss' 'restlessness'
'lethargy' 'patches_in_throat' 'irregular_sugar_level' 'cough'
'high_fever' 'sunken_eyes' 'breathlessness' 'sweating' 'dehydration'
'indigestion' 'headache' 'yellowish_skin' 'dark_urine' 'nausea'
'loss_of_appetite' 'pain_behind_the_eyes' 'back_pain' 'constipation'
'abdominal_pain' 'diarrhoea' 'mild_fever' 'yellow_urine'
'yellowing_of_eyes' 'acute_liver_failure' 'fluid_overload'
'swelling_of_stomach' 'swelled_lymph_nodes' 'malaise'
'blurred_and_distorted_vision' 'phlegm' 'throat_irritation'
'redness_of_eyes' 'sinus_pressure' 'runny_nose' 'congestion' 'chest_pain'
'weakness_in_limbs' 'fast_heart_rate' 'pain_during_bowel_movements'
'pain_in_anal_region' 'bloody_stool' 'irritation_in_anus' 'neck_pain'
'dizziness' 'cramps' 'bruising' 'obesity' 'swollen_legs'
'swollen_blood_vessels' 'puffy_face_and_eyes' 'enlarged_thyroid'
'brittle_nails' 'swollen_extremeties' 'excessive_hunger'
'extra_marital_contacts' 'drying_and_tingling_lips' 'slurred_speech'
'knee_pain' 'hip_joint_pain' 'muscle_weakness' 'stiff_neck'
'swelling_joints' 'movement_stiffness' 'spinning_movements'
'loss_of_balance' 'unsteadiness' 'weakness_of_one_body_side'
'loss_of_smell' 'bladder_discomfort' 'foul_smell_of urine'
'continuous_feel_of_urine' 'passage_of_gases' 'internal_itching'
'toxic_look_(typhos)' 'depression' 'irritability' 'muscle_pain'
'altered_sensorium' 'red_spots_over_body' 'belly_pain'
'abnormal_menstruation' 'dischromic _patches' 'watering_from_eyes'
'increased_appetite' 'polyuria' 'family_history' 'mucoid_sputum'
'rusty_sputum' 'lack_of_concentration' 'visual_disturbances'
'receiving_blood_transfusion' 'receiving_unsterile_injections' 'coma'
'stomach_bleeding' 'distention_of_abdomen'
'history_of_alcohol_consumption' 'fluid_overload.1' 'blood_in_sputum'
'prominent_veins_on_calf' 'palpitations' 'painful_walking'
'pus_filled_pimples' 'blackheads' 'scurring' 'skin_peeling'
'silver_like_dusting' 'small_dents_in_nails' 'inflammatory_nails'
'blister' 'red_sore_around_nose' 'yellow_crust_ooze']
# Creating a symptom index dictionary to encode the
# input symptoms into numerical form
symptom_index = {}
for index, value in enumerate(symptoms):
    symptom = " ".join([i for i in value.split("_")])
    symptom_index[symptom] = index
data_dict = {
    "symptom_index":symptom_index,
    "predictions_classes":encoder.classes_
}
# Defining the Function
# Input: string containing symptoms separated by commas
# Output: Generated predictions by models
def predictDisease(symptoms):
    symptoms = symptoms.split(",")
    # creating input data for the models
    input_data = [0] * len(data_dict["symptom_index"])
    for symptom in symptoms:
        index = data_dict["symptom_index"][symptom]
        input_data[index] = 1
    # reshaping the input data and converting it
    # into suitable format for model predictions
    input_data = np.array(input_data).reshape(1,-1)
    # generating individual outputs
    rf_prediction = data_dict["predictions_classes"][final_rf_model.predict(input_data)[0]]
    nb_prediction = data_dict["predictions_classes"][final_nb_model.predict(input_data)[0]]
    svm_prediction = data_dict["predictions_classes"][final_svm_model.predict(input_data)[0]]
    # making final prediction by taking mode of all predictions
    final_prediction = mode([rf_prediction, nb_prediction, svm_prediction])[0][0]
    predictions = {
        "rf_model_prediction":rf_prediction,
        "naive_bayes_prediction":nb_prediction,
        "svm_model_prediction":svm_prediction,
        "final_prediction":final_prediction
    }
    return predictions
# Testing the function
print(predictDisease("itching,skin_rash,nodal_skin_eruptions"))
But I get this error:
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_28380/3227896746.py in <module>
45
46 # Testing the function
---> 47 print(predictDisease("itching,skin_rash,nodal_skin_eruptions"))
~\AppData\Local\Temp/ipykernel_28380/3227896746.py in predictDisease(symptoms)
22 input_data = [0] * len(data_dict["symptom_index"])
23 for symptom in symptoms:
---> 24 index = data_dict["symptom_index"][symptom]
25 input_data[index] = 1
26
KeyError: 'skin_rash'
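I tried passing the argument in different forms (set, array, list), but none of them works and I always get KeyError: 'skin_rash'
How can I solve this? I would really appreciate any help. Thanks!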
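data_dict['symptom_index'] is a dictionary whose keys are symptom names and whose values are their index positions in the symptom list, but the key names differ from the names in the symptom list. The loop that builds the dictionary replaces every underscore with a space, so the stored key is 'skin rash', not 'skin_rash'. That is why the key is not found.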
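To see the mismatch concretely, here is a minimal sketch that rebuilds the dictionary the same way as in the question, using only three of the column names from the symptom list. It also shows why the error names 'skin_rash' and not 'itching': 'itching' has no underscore, so its key is unchanged and the first lookup succeeds.

```python
# Three column names copied from the symptom list in the question.
symptoms = ["itching", "skin_rash", "nodal_skin_eruptions"]

# Same construction as in the question: underscores become spaces in the keys.
symptom_index = {}
for index, value in enumerate(symptoms):
    symptom = " ".join(value.split("_"))
    symptom_index[symptom] = index

print(list(symptom_index))           # ['itching', 'skin rash', 'nodal skin eruptions']
print("skin_rash" in symptom_index)  # False: the key the function looks up
print("skin rash" in symptom_index)  # True: the key that is actually stored
```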
Try:
index = data_dict["symptom_index"][symptom.replace("_", " ")]
I have not checked the rest of the code logic, so I don't know whether this code has other problems.
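If you would rather not depend on the exact spelling of the input, one option is to normalize both the column names and the user input the same way before the lookup. The following is only a sketch under assumptions: `normalize` and `encode_symptoms` are hypothetical helper names, not part of the original code, and the short `columns` list stands in for `X.columns.values`. Unknown symptoms are reported together instead of failing with a bare KeyError.

```python
def normalize(name):
    """Hypothetical helper: underscores to spaces, collapse whitespace, lowercase."""
    return " ".join(name.replace("_", " ").split()).lower()


def encode_symptoms(symptom_string, columns):
    """Turn a comma-separated symptom string into a 0/1 list aligned with columns."""
    index_by_name = {normalize(col): i for i, col in enumerate(columns)}
    input_data = [0] * len(columns)
    unknown = []
    for raw in symptom_string.split(","):
        key = normalize(raw)
        if key in index_by_name:
            input_data[index_by_name[key]] = 1
        else:
            unknown.append(raw.strip())
    if unknown:
        # Report every unrecognized name at once.
        raise ValueError(f"Unknown symptoms: {unknown}")
    return input_data


# Stand-in for X.columns.values; note the stray space in the last name,
# which also appears in the dataset's column list.
columns = ["itching", "skin_rash", "nodal_skin_eruptions", "spotting_ urination"]
print(encode_symptoms("itching, skin_rash,spotting_ urination", columns))  # [1, 1, 0, 1]
```

The resulting list could then be passed to `np.array(...).reshape(1, -1)` as in the question's function.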