Is the argument wrong, or does this method not work?


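I want to predict a disease type with a function that takes a string argument. The argument can contain several symptom names separated by commas. However, calling it always raises this error: KeyError: 'skin_rash'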

This is the code example I found on GeeksforGeeks: https://www.geeksforgeeks.org/disease-prediction-using-machine-learning/

The dataset used by this function is available on Kaggle: https://www.kaggle.com/datasets/kaushil268/disease-prediction-using-machine-learning/data

Here is my code:

symptoms = X.columns.values

Here are the symptoms:
['itching' 'skin_rash' 'nodal_skin_eruptions' 'continuous_sneezing'
 'shivering' 'chills' 'joint_pain' 'stomach_pain' 'acidity'
 'ulcers_on_tongue' 'muscle_wasting' 'vomiting' 'burning_micturition'
 'spotting_ urination' 'fatigue' 'weight_gain' 'anxiety'
 'cold_hands_and_feets' 'mood_swings' 'weight_loss' 'restlessness'
 'lethargy' 'patches_in_throat' 'irregular_sugar_level' 'cough'
 'high_fever' 'sunken_eyes' 'breathlessness' 'sweating' 'dehydration'
 'indigestion' 'headache' 'yellowish_skin' 'dark_urine' 'nausea'
 'loss_of_appetite' 'pain_behind_the_eyes' 'back_pain' 'constipation'
 'abdominal_pain' 'diarrhoea' 'mild_fever' 'yellow_urine'
 'yellowing_of_eyes' 'acute_liver_failure' 'fluid_overload'
 'swelling_of_stomach' 'swelled_lymph_nodes' 'malaise'
 'blurred_and_distorted_vision' 'phlegm' 'throat_irritation'
 'redness_of_eyes' 'sinus_pressure' 'runny_nose' 'congestion' 'chest_pain'
 'weakness_in_limbs' 'fast_heart_rate' 'pain_during_bowel_movements'
 'pain_in_anal_region' 'bloody_stool' 'irritation_in_anus' 'neck_pain'
 'dizziness' 'cramps' 'bruising' 'obesity' 'swollen_legs'
 'swollen_blood_vessels' 'puffy_face_and_eyes' 'enlarged_thyroid'
 'brittle_nails' 'swollen_extremeties' 'excessive_hunger'
 'extra_marital_contacts' 'drying_and_tingling_lips' 'slurred_speech'
 'knee_pain' 'hip_joint_pain' 'muscle_weakness' 'stiff_neck'
 'swelling_joints' 'movement_stiffness' 'spinning_movements'
 'loss_of_balance' 'unsteadiness' 'weakness_of_one_body_side'
 'loss_of_smell' 'bladder_discomfort' 'foul_smell_of urine'
 'continuous_feel_of_urine' 'passage_of_gases' 'internal_itching'
 'toxic_look_(typhos)' 'depression' 'irritability' 'muscle_pain'
 'altered_sensorium' 'red_spots_over_body' 'belly_pain'
 'abnormal_menstruation' 'dischromic _patches' 'watering_from_eyes'
 'increased_appetite' 'polyuria' 'family_history' 'mucoid_sputum'
 'rusty_sputum' 'lack_of_concentration' 'visual_disturbances'
 'receiving_blood_transfusion' 'receiving_unsterile_injections' 'coma'
 'stomach_bleeding' 'distention_of_abdomen'
 'history_of_alcohol_consumption' 'fluid_overload.1' 'blood_in_sputum'
 'prominent_veins_on_calf' 'palpitations' 'painful_walking'
 'pus_filled_pimples' 'blackheads' 'scurring' 'skin_peeling'
 'silver_like_dusting' 'small_dents_in_nails' 'inflammatory_nails'
 'blister' 'red_sore_around_nose' 'yellow_crust_ooze']



 
# Creating a symptom index dictionary to encode the
# input symptoms into numerical form
symptom_index = {}
for index, value in enumerate(symptoms):
    symptom = " ".join([i for i in value.split("_")])
    symptom_index[symptom] = index
 
data_dict = {
    "symptom_index":symptom_index,
    "predictions_classes":encoder.classes_
}
 
# Defining the Function
# Input: string containing symptoms separated by commas
# Output: Generated predictions by models
def predictDisease(symptoms):
    symptoms = symptoms.split(",")
     
    # creating input data for the models
    input_data = [0] * len(data_dict["symptom_index"])
    for symptom in symptoms:
        index = data_dict["symptom_index"][symptom]
        input_data[index] = 1
         
    # reshaping the input data and converting it
    # into suitable format for model predictions
    input_data = np.array(input_data).reshape(1,-1)
     
    # generating individual outputs
    rf_prediction = data_dict["predictions_classes"][final_rf_model.predict(input_data)[0]]
    nb_prediction = data_dict["predictions_classes"][final_nb_model.predict(input_data)[0]]
    svm_prediction = data_dict["predictions_classes"][final_svm_model.predict(input_data)[0]]
     
    # making final prediction by taking mode of all predictions
    final_prediction = mode([rf_prediction, nb_prediction, svm_prediction])[0][0]
    predictions = {
        "rf_model_prediction":rf_prediction,
        "naive_bayes_prediction":nb_prediction,
        "svm_model_prediction":svm_prediction,
        "final_prediction":final_prediction
    }
    return predictions
 
# Testing the function
print(predictDisease("itching,skin_rash,nodal_skin_eruptions"))

But I get this error:

KeyError                                  Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_28380/3227896746.py in <module>
     45 
     46 # Testing the function
---> 47 print(predictDisease("itching,skin_rash,nodal_skin_eruptions"))

~\AppData\Local\Temp/ipykernel_28380/3227896746.py in predictDisease(symptoms)
     22     input_data = [0] * len(data_dict["symptom_index"])
     23     for symptom in symptoms:
---> 24         index = data_dict["symptom_index"][symptom]
     25         input_data[index] = 1
     26 

KeyError: 'skin_rash'

I tried passing the argument in different formats (set, array, list), but none of them work and I always get KeyError: 'skin_rash'.

How can I solve this problem? I would really appreciate any help. Thanks!

python pandas dictionary jupyter-notebook keyerror
1 Answer

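data_dict['symptom_index'] is a dictionary that maps each symptom name to its index position in the symptoms list, but its keys are not the same as the names in that list. The loop that builds the dictionary splits every column name on "_" and joins the parts with " ", so the keys contain spaces instead of underscores: the column 'skin_rash' is stored under the key 'skin rash'. That is why the key is not found. 'itching' contains no underscore, so its key is unchanged and the lookup only fails once it reaches 'skin_rash'.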
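The mismatch can be reproduced in isolation with a minimal sketch. The three-item list below is a hypothetical stand-in for X.columns.values; the loop is the same construction used in the question.

```python
# Hypothetical three-column subset standing in for X.columns.values.
symptoms = ["itching", "skin_rash", "nodal_skin_eruptions"]

# Same construction as in the question: underscores become spaces in the keys.
symptom_index = {}
for index, value in enumerate(symptoms):
    symptom = " ".join(value.split("_"))
    symptom_index[symptom] = index

print(symptom_index)
# {'itching': 0, 'skin rash': 1, 'nodal skin eruptions': 2}

print("itching" in symptom_index)    # True: no underscore, so the key is unchanged
print("skin_rash" in symptom_index)  # False: the stored key is 'skin rash'
```

Printing the dictionary keys like this is usually the quickest way to see what a KeyError lookup is actually being compared against.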

Try:

index = data_dict["symptom_index"][symptom.replace("_", " ")]

I have not checked the rest of the code logic, so I do not know whether this code has other problems.
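If the input may arrive with either underscores or spaces, one possible approach is to normalize both the column names and the user input the same way before the lookup. The sketch below is only an illustration: normalize and encode_symptoms are hypothetical helper names, not part of the original tutorial, and the column list is a small made-up subset. Collapsing repeated whitespace also covers irregular column names in the dataset such as 'spotting_ urination'.

```python
def normalize(name):
    """Lowercase and treat underscores and runs of whitespace as single spaces."""
    return " ".join(name.replace("_", " ").lower().split())


def encode_symptoms(symptom_string, columns):
    """Return a 0/1 list aligned with `columns` for a comma-separated symptom string."""
    # Build the lookup from normalized column name to column position.
    index_of = {normalize(col): i for i, col in enumerate(columns)}
    vector = [0] * len(columns)
    unknown = []
    for raw in symptom_string.split(","):
        key = normalize(raw)
        if not key:
            continue  # skip empty entries such as a trailing comma
        if key in index_of:
            vector[index_of[key]] = 1
        else:
            unknown.append(raw.strip())
    if unknown:
        # Fail with a readable message instead of a bare KeyError.
        raise ValueError(f"Unknown symptoms: {unknown}")
    return vector


# Hypothetical subset of the dataset's columns, for illustration only.
columns = ["itching", "skin_rash", "spotting_ urination"]
print(encode_symptoms("itching, skin_rash", columns))  # [1, 1, 0]
```

The resulting list could then be passed to np.array(...).reshape(1, -1) as in the original predictDisease function.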
