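I have a dictionary, disease_dict, whose values are lists. I want to get the key and value for a specific key, then check whether any of its values occurs (as a substring) in the values of other keys, and get all the matching key --> value pairs.
For example, this is the dictionary. I want to check whether 'Stroke' or 'stroke' exists in the dictionary, and then check whether any value of that key is a substring of another key's values (e.g. 'C10.228.140.300.775' occurs in 'C10.228.140.300.775.600').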
'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'], 'Stroke, Lacunar': ['C10.228.140.300.275.800', 'C10.228.140.300.775.600', 'C14.907.253.329.800', 'C14.907.253.855.600']
I have the following lines of code to get the key and value for a specific term.
# extract all child terms
for k, v in dis_dict.items():
    if (k in ['Glaucoma', 'Stroke']) or (k in ['glaucoma', 'stroke']):
        disease = k
        tree_id = v
        print(disease, tree_id)
    else:
        disease = ''
        tree_id = ''
        continue
Any help is much appreciated!
The code below should do what you are trying to achieve:
dis_dict = {
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': ['C10.228.140.300.275.800', 'C10.228.140.300.775.600', 'C14.907.253.329.800', 'C14.907.253.855']
}
dict_already_printed = {}
for k, v in dis_dict.items():
    if k.lower() in ['glaucoma', 'stroke']:
        disease = k
        tree_id = v
        output = None
        for c_code_1 in tree_id:
            for key, value in dis_dict.items():
                for c_code_2 in value:
                    if c_code_1 in c_code_2:
                        if f'{disease} {tree_id}' != f'{key} {value}':
                            tmp_output = f'{disease} {tree_id}, other: {key} {value}'
                            if tmp_output not in dict_already_printed:
                                output = tmp_output
                                print(output)
                                dict_already_printed[output] = None
        if output is None:
            output = f'{disease} {tree_id}'
            print(output)
    else:
        disease = ''
        tree_id = ''
        continue
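Test it with other data in the dictionary to see whether it works as expected. When one of the term's codes is contained in a code of another entry, it prints: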
Stroke ['C10.228.140.300.775', 'C14.907.253.855'], other: Stroke, Lacunar ['C10.228.140.300.275.800', 'C10.228.140.300.775.600', 'C14.907.253.329.800', 'C14.907.253.855']
Or, if no other disease matches (change the dictionary values to avoid a match), it prints only the disease that was found:
Stroke ['C10.228.140.300.775', 'C14.907.253.855']
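One caveat with the plain `in` test is that it matches anywhere inside the string, so a code such as 'C10.2' would also be found in 'C10.22'. A minimal sketch of a stricter variant follows. It assumes the codes are hierarchical, dot-separated identifiers in which a child term extends its parent's code; the helper names `is_descendant` and `find_child_terms` are hypothetical and not part of the answer above.

```python
def is_descendant(parent_code, child_code):
    """Return True if child_code lies strictly below parent_code.

    Assumes dot-separated hierarchical codes, so a child must start
    with the parent's code followed by a dot.
    """
    return child_code.startswith(parent_code + ".")


def find_child_terms(dis_dict, term):
    """Map every other key to its codes that fall under one of term's codes."""
    wanted = term.lower()
    children = {}
    for name, codes in dis_dict.items():
        if name.lower() != wanted:
            continue  # only start from the requested term
        for other, other_codes in dis_dict.items():
            if other == name:
                continue  # do not compare the term with itself
            hits = [c for c in other_codes
                    if any(is_descendant(p, c) for p in codes)]
            if hits:
                children[other] = hits
    return children


dis_dict = {
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': ['C10.228.140.300.275.800', 'C10.228.140.300.775.600',
                        'C14.907.253.329.800', 'C14.907.253.855.600'],
}

print(find_child_terms(dis_dict, 'stroke'))
# {'Stroke, Lacunar': ['C10.228.140.300.775.600', 'C14.907.253.855.600']}
```

Returning a dictionary instead of printing makes the result easier to reuse. If an identical code should also count as a match, the prefix test would need an extra equality check.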
You have a good starting point, and as you may already know, you need to look into splitting the keys. Here is how you can do it:
disease_dict = {
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': ['C10.228.140.300.275.800', 'C10.228.140.300.775.600', 'C14.907.253.329.800', 'C14.907.253.855.600'],
    'Flue': ['C10.228.140.300.780']
}
for k, v in disease_dict.items():
    tmp = ''.join(x for x in k if x.isalpha() or x == '-' or x == ' ')
    tmpKey = tmp.split(' ')
    for tk in tmpKey:
        if tk.capitalize() in ['Stroke', 'Glaucoma']:
            print(k, v, end=' ')  # To remove the new line ending
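First, we remove unnecessary characters with this line:
tmp = ''.join(x for x in k if x.isalpha() or x == ' ' or x == '-')
It keeps only alphabetic characters, spaces and dashes. Since I don't know what your disease names look like, I kept only these characters (the space is needed for the next line). Once this newly formatted key is created, we split it on spaces so that each word can be compared separately.
tmpKey = tmp.split(' ')
Once tmpKey is built, we loop over it to check whether the disease you want is part of the original key.
for tk in tmpKey:
    if tk.capitalize() in ['Stroke', 'Glaucoma']:
        print(k, v, end=' ')  # To remove the new line ending
tk.capitalize() upper-cases the first letter (and lower-cases the rest), so you don't have to check both forms of the word.
Finally, after running the script above, we get: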
Stroke ['C10.228.140.300.775', 'C14.907.253.855'] Stroke, Lacunar ['C10.228.140.300.275.800', 'C10.228.140.300.775.600', 'C14.907.253.329.800', 'C14.907.253.855.600']
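The same word-by-word key check can be sketched with a regular expression instead of filtering characters and splitting on spaces. This is only an illustration under some assumptions: the helper name `key_mentions` is hypothetical, and the pattern keeps ASCII letters and dashes only, whereas `str.isalpha()` also accepts non-ASCII letters.

```python
import re


def key_mentions(key, wanted):
    """Return True if any word in key equals one of the wanted terms, ignoring case."""
    # Words are runs of ASCII letters and dashes; commas, spaces, etc. separate them.
    words = re.findall(r"[A-Za-z-]+", key)
    wanted_lower = {w.lower() for w in wanted}
    return any(word.lower() in wanted_lower for word in words)


disease_dict = {
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': ['C10.228.140.300.275.800', 'C10.228.140.300.775.600',
                        'C14.907.253.329.800', 'C14.907.253.855.600'],
    'Flue': ['C10.228.140.300.780'],
}

# Keep every entry whose key contains one of the wanted words.
matches = {k: v for k, v in disease_dict.items()
           if key_mentions(k, ['Stroke', 'Glaucoma'])}

for k, v in matches.items():
    print(k, v)
```

Two behaviours differ from the loop above: each key is reported at most once even if it contains several wanted words, and a key written without a space after the comma (for example 'Stroke,Lacunar') is still split into two words.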
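You don't need much code for this.
The main thing to know is that you can use in to find a substring. For example, "abc" in "abcdef" evaluates to True.
First, find the keys that match, using if k1.lower() in k2.lower() (I am using .lower() here for a case-insensitive comparison; I'm not sure whether you need it).
Then, for each pair of matching keys, use in (if search_string in find_string) to find the values that match. That is what the function print_match does.
dis_dict = {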
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': [
        'C10.228.140.300.275.800',
        'C10.228.140.300.775.600',
        'C14.907.253.329.800',
        'C14.907.253.855.600'
    ]
}

def print_match(k1, v1, k2, v2):
    # Report the first code of k1 that is a substring of a code of k2.
    for search_string in v1:
        for find_string in v2:
            if search_string in find_string:
                print(f"{k1}: {v1} found in {k2}: {v2}")
                return

for k1, v1 in dis_dict.items():
    for k2, v2 in dis_dict.items():
        if k1 == k2:
            continue  # do not compare an entry with itself
        if k1.lower() in k2.lower():
            print_match(k1, v1, k2, v2)
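If you need the matches as data rather than printed text, the same two-step idea can be wrapped in a function that collects them. This is a sketch only; `find_matches` is a hypothetical name, and unlike print_match it records every matching pair of codes instead of stopping at the first one.

```python
def find_matches(dis_dict):
    """Collect (key, other_key, code, other_code) tuples for every substring match."""
    matches = []
    for k1, v1 in dis_dict.items():
        for k2, v2 in dis_dict.items():
            # Step 1: skip identical keys and keys that do not contain k1.
            if k1 == k2 or k1.lower() not in k2.lower():
                continue
            # Step 2: compare every code of k1 with every code of k2.
            for search_string in v1:
                for find_string in v2:
                    if search_string in find_string:
                        matches.append((k1, k2, search_string, find_string))
    return matches


dis_dict = {
    'Stroke': ['C10.228.140.300.775', 'C14.907.253.855'],
    'Stroke, Lacunar': [
        'C10.228.140.300.275.800',
        'C10.228.140.300.775.600',
        'C14.907.253.329.800',
        'C14.907.253.855.600',
    ],
}

for match in find_matches(dis_dict):
    print(match)
# ('Stroke', 'Stroke, Lacunar', 'C10.228.140.300.775', 'C10.228.140.300.775.600')
# ('Stroke', 'Stroke, Lacunar', 'C14.907.253.855', 'C14.907.253.855.600')
```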