How can I convert blood pressure values stored as strings into usable floats?


I am working with a dataset that looks like the one below, and I am interested in how blood pressure affects certain patients.

import pandas as pd

# Sample rows; the full dataset also has a 'Sleep Duration' column (see the output below)
df = pd.DataFrame({'Person ID': [1,2,3,4,5,6,7,8],
                   'BMI': ['Overweight','Normal','Normal','Obese','Obese','Underweight','Normal','Obese'],
                   'Sleep Disorder': ['Insomnia',float('nan'),float('nan'),'Sleep Apnea','Sleep Apnea',float('nan'),float('nan'),'Insomnia'],
                   'Illness': ['Ill','Healthy','Healthy','Ill','Ill','Healthy','Healthy','Ill'],
                   'Blood Pressure': ['125/82','132/87','128/85','126/83','126/83','115/78','139/91','142/92']})

Output:

   Person ID          BMI Sleep Disorder  Illness Blood Pressure   Sleep Duration
0          1   Overweight       Insomnia      Ill         125/82              6.1
1          2       Normal            NaN  Healthy         132/87              6.2
2          3       Normal            NaN  Healthy         128/85              6.2
3          4        Obese    Sleep Apnea      Ill         126/83              5.9
4          5        Obese    Sleep Apnea      Ill         126/83              5.9
5          6  Underweight            NaN  Healthy         115/78              8.1
6          7       Normal            NaN  Healthy         139/91              8.1
7          8        Obese       Insomnia      Ill         142/92              8.1

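The main problem is that Blood Pressure is neither an int nor a float, so how can I measure the correlation, or which kind of correlation can I use to compare "Ill" and "Healthy" people?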

The idea is to replicate this:

import seaborn as sns

# Does not work yet: "Blood Pressure" still holds strings such as "125/82"
graph_bp = sns.scatterplot(data = sleep_dataset, x = "Blood Pressure", y = "Sleep Duration", hue = "Health State")
graph_bp.set_title("Relation between blood pressure and sleep duration")
graph_bp.set_xlabel("Blood Pressure")
graph_bp.set_ylabel("Sleep Duration")
cor_bp = sleep_dataset["Blood Pressure"].corr(sleep_dataset["Sleep Duration"])
print("Correlation with blood pressure: " + str(cor_bp))

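I want to get some information about blood pressure, but there is no int to correlate against. One idea was to split the high and low values and take their difference, but that does not seem like a good idea, because it would treat a person with 140/100 the same as a person with 100/60. How would you make a plot like this?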

pandas types seaborn correlation units-of-measurement
1 Answer

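The correlation could be with either one of the two numbers, or with both. From a scientific point of view, you should compute the two correlations independently, or find a meaningful metric derived from combining the two.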
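To illustrate why a plain difference loses information while another combination need not, here is a minimal sketch. The mean arterial pressure formula (diastolic plus one third of the difference) is a common clinical approximation and is my assumption for a "combined metric"; it is not something the question or this answer prescribes, and the function names are hypothetical.

```python
def pulse_pressure(systolic, diastolic):
    """Plain difference between the two readings (the idea from the question)."""
    return systolic - diastolic


def mean_arterial_pressure(systolic, diastolic):
    """Assumed combined metric: diastolic plus one third of the pulse pressure."""
    return diastolic + (systolic - diastolic) / 3


# The two readings mentioned in the question.
for sys_bp, dia_bp in [(140, 100), (100, 60)]:
    pp = pulse_pressure(sys_bp, dia_bp)
    map_bp = round(mean_arterial_pressure(sys_bp, dia_bp), 1)
    print(f"{sys_bp}/{dia_bp}: pulse pressure = {pp}, MAP = {map_bp}")
```

Both readings give a pulse pressure of 40, but their mean arterial pressures differ (about 113.3 versus 73.3), so this combination still tells the two patients apart.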

Either way, you should convert the string into two numbers, which can be done like this:

df[['systolic', 'diastolic']] = (df['Blood Pressure']
                                 .str.split('/', expand=True)
                                 .astype(int)
                                )

Output:

   Person ID          BMI Sleep Disorder  Illness Blood Pressure  systolic  diastolic
0          1   Overweight       Insomnia      Ill         125/82       125         82
1          2       Normal            NaN  Healthy         132/87       132         87
2          3       Normal            NaN  Healthy         128/85       128         85
3          4        Obese    Sleep Apnea      Ill         126/83       126         83
4          5        Obese    Sleep Apnea      Ill         126/83       126         83
5          6  Underweight            NaN  Healthy         115/78       115         78
6          7       Normal            NaN  Healthy         139/91       139         91
7          8        Obese       Insomnia      Ill         142/92       142         92
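Once the two columns exist, the correlations can be computed independently. The following is only a sketch using the sample rows from the question. It casts to float, as the title asks, and encodes Ill/Healthy as 1/0 so that the ordinary Pearson correlation becomes a point-biserial correlation. That encoding and the variable names (bp_vs_sleep, is_ill, bp_vs_illness, group_means) are my assumptions, not part of the original code.

```python
import pandas as pd

# Values copied from the question's sample table.
df = pd.DataFrame({
    "Illness": ["Ill", "Healthy", "Healthy", "Ill", "Ill", "Healthy", "Healthy", "Ill"],
    "Blood Pressure": ["125/82", "132/87", "128/85", "126/83", "126/83", "115/78", "139/91", "142/92"],
    "Sleep Duration": [6.1, 6.2, 6.2, 5.9, 5.9, 8.1, 8.1, 8.1],
})

# Same split as above, but cast to float instead of int.
df[["systolic", "diastolic"]] = (
    df["Blood Pressure"].str.split("/", expand=True).astype(float)
)

# Pearson correlation of each reading with sleep duration, computed independently.
bp_vs_sleep = df[["systolic", "diastolic"]].corrwith(df["Sleep Duration"])

# Encode Ill/Healthy as 1/0; Pearson correlation against a binary
# variable is the point-biserial correlation.
is_ill = df["Illness"].eq("Ill").astype(int)
bp_vs_illness = df[["systolic", "diastolic"]].corrwith(is_ill)

# Group means give a quick descriptive comparison of the two groups.
group_means = df.groupby("Illness")[["systolic", "diastolic"]].mean()

print(bp_vs_sleep, bp_vs_illness, group_means, sep="\n\n")
```

With only eight rows the coefficients are not meaningful in themselves; the same calls should work on the full dataset. For plotting, either numeric column can be passed as x to sns.scatterplot in place of the original string column.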