Sort data between two lists into a new list, and format a list of strings with the data saved in the new list

Question (votes: 0, answers: 2)

Sorry if this isn't very clear; this is my first time asking a question here, so I hope I explain my problem properly.

I have the following lists with different values:

A_list = ['A', 'A', 'B', ['C', 'D'] ]
B_list = ['A1', 'W5', 'X6', 'A2', 'A3', 'T5', 'B0', 'Z9', 'C1', 'W3', 'D1']
C_list = []
string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee", "{0} in Cheese and {1} in Dice"]

I need to find the elements of A_list in B_list, append them to C_list, and produce as output the strings of string_list formatted with the elements of C_list.

So after looking up each A_list[i] in B_list, C_list would look like this:

C_list = ['A1', 'A2', 'A3', 'B0', ['C1', 'D1'] ]

The output would be:

A1 in Alpha,
A1 in Apple,
A2 in Alpha,
A2 in Apple,
A3 in Alpha,
A3 in Apple,
B0 in Bee,
C1 in Cheese and D1 in Dice

I've been racking my brain over nested lists, trying to keep them in an order similar to A_list so that I can format the output with something like:

output = string_list[i].format(*C_list[i])  # just an example

I've been trying to solve this with a combination of for loops and if statements. I can search B_list for the elements of A_list with a simple for loop:

for a in A_list:
    for b in B_list:
        if a in b:
            print(str(a) + " found in " + str(b))

What's defeating me is how to add the B_list elements I find in a structure that mirrors A_list, so that I end up with

C_list = ['A1', 'A2', 'A3', 'B0', ['C1', 'D1']] 

rather than this:

C_list = ['A1', 'A2', 'A3', 'B0', 'C1', 'D1'] 

python sorting nested-loops string-formatting nested-lists
2 Answers
0 votes

The problem is easier to manage if you normalize each element of A_list as you process it, so that it is always a list of strings:

for a in A_list:
    # Normalize a to a list[str]
    a = a if isinstance(a, list) else [a]
    # Pop all matches from B_list into C_list.
    while True:
        c = []
        for i in a:
            for b in B_list.copy():
                if b.startswith(i):
                    c.append(b)
                    B_list.remove(b)
                    break
            if len(c) == len(a):
                break  # append this c and scan B again
        else:
            break  # no more matches, continue to next a
        # Convert c back to a str|list[str]
        C_list.append(c[0] if len(c) == 1 else c)

print(C_list)
# ['A1', 'A2', 'A3', 'B0', ['C1', 'D1']]

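I would suggest keeping c as a list of strings in all cases, since that would probably make your formatting step easier, but hopefully the above helps you over the initial hurdle of handling data in this tricky nested format (while still leaving the option of converting it back to the original tricky format if needed).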
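To illustrate why uniform lists may make formatting easier, here is a minimal sketch. The helper name normalize is hypothetical and not part of the answer above. It shows that unpacking a bare string with * passes its characters as separate arguments, while a one-element list unpacks to the whole string.

```python
def normalize(group):
    """Wrap a bare string in a list so every group is a list of strings.

    Hypothetical helper, used here only for illustration.
    """
    return group if isinstance(group, list) else [group]


# Unpacking a bare string passes 'A' and '1' as two separate arguments,
# so {0} is filled with 'A' only (the extra argument is ignored).
wrong = "{0} in Alpha".format(*"A1")

# Unpacking a one-element list passes the whole string as {0}.
right = "{0} in Alpha".format(*normalize("A1"))

# A nested group fills {0} and {1} in order.
nested = "{0} in Cheese and {1} in Dice".format(*normalize(["C1", "D1"]))

print(wrong)   # A in Alpha
print(right)   # A1 in Alpha
print(nested)  # C1 in Cheese and D1 in Dice
```

With every group stored as a list, the same .format(*group) call works for both the single-letter and the nested cases.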


0 votes

Part 1: Getting C_list

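You have to build the nested list yourself before appending it to C_list. Since an item of A_list can be either a string or a list of strings, there are two cases.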

def get_A_in_B(a_list:"list[str|list[str]]",b_list:"list[str]"):
    c_list = [] # global within this function     
    
    # for neatness   
    def process_base_item(a_str:"str",out_list:"list"):
        matches = sorted([b_str for b_str in b_list if b_str.startswith(a_str)])
        out_list.extend(matches)
    
    for a_item in a_list: # case 1 - is list, extend nested
        if type(a_item) is list:
            sublist = a_item
            nested_list = []
            for sub_item in sublist:
                process_base_item(sub_item,nested_list)
            if nested_list:
                c_list.append(nested_list)
        else: # case 2 - is string, extend c list
            process_base_item(a_item,c_list)
    return c_list

Usage:

A_list = ['A', 'B', ['C', 'D'] ]
B_list = ['A1', 'W5', 'X6', 'A2', 'A3', 'T5', 'B0', 'Z9', 'C1', 'W3', 'D1']
C_list = get_A_in_B(A_list, B_list)
print(C_list)

Output:

['A1', 'A2', 'A3', 'B0', ['C1', 'D1']]

Part 2: Formatting

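This works if two assumptions hold:

  1. Each letter appears only once in a given format string.
  2. When the nesting is uneven, you want to loop over every possible combination, e.g. ["C1", "C2", "D1"] => "C1"+"D1", "C2"+"D1"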

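This is the really tricky part. I match letters to the format strings (the code below does this with a plain substring test for " in <letter>" rather than a full regular expression).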

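For the nested lists in C_list, I split them into further sublists by letter, then feed their Cartesian product to the format string as its multiple arguments.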
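The grouping and Cartesian product step can be sketched in isolation as follows. The function name combinations_by_letter is hypothetical, and the sketch assumes each item's first character is the letter it belongs to.

```python
import itertools


def combinations_by_letter(items):
    """Group items by first letter, then return every combination that
    takes exactly one item from each group.

    Hypothetical helper, assuming the first character identifies the group.
    """
    groups = {}
    for item in sorted(items):
        # dicts preserve insertion order, so groups stay in letter order
        groups.setdefault(item[0], []).append(item)
    return list(itertools.product(*groups.values()))


combos = combinations_by_letter(["C1", "C2", "D1"])
print(combos)  # [('C1', 'D1'), ('C2', 'D1')]

for combo in combos:
    print("{0} in Cheese and {1} in Dice".format(*combo))
```

Each tuple holds one item per letter, so it can be unpacked directly into a format string with one placeholder per letter.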

As before, there are two cases.

import itertools

def format_string_list(c_list,string_list):
    formatted_string_list = []
    for c_item in c_list:
        for fmt_str in string_list:
            if type(c_item) is list: # case 1 - is list, match multiple
                c_sublist = c_item
                # assumption 1: letters are unique
                first_letters = sorted(set([c_str[0] for c_str in c_sublist]))
                matched_letters = []
                for letter in first_letters:
                    pat = f" in {letter}"
                    if pat in fmt_str:
                        matched_letters.append(letter)
                        
                if first_letters==matched_letters: 
                    # get dictionary of lists, indexed by first letter
                    c_str_d = {}
                    for letter in first_letters:
                        c_str_d[letter] = [c_str for c_str in c_sublist if c_str.startswith(letter)]
                    
                    # assumption 2: get all combinations
                    for c_str_list in itertools.product(*c_str_d.values()):
                        c_fmtted = fmt_str.format(*c_str_list)
                        formatted_string_list.append(c_fmtted) 
            else: # case 2
                c_str = c_item
                first_letter = c_str[0]
                pat = f" in {first_letter}"

                if pat in fmt_str:
                    c_fmtted = fmt_str.format(c_str)
                    formatted_string_list.append(c_fmtted)
    
    return formatted_string_list

Usage:

C_list = ['A1', 'A2', 'A3', 'B0', ['C1', 'D1'] ]
string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee", "{0} in Cheese and {1} in Dice"]
formatted_string_list = format_string_list(C_list,string_list)
# print output
print("\n".join(formatted_string_list))

Output:

A1 in Alpha
A1 in Apple
A2 in Alpha
A2 in Apple
A3 in Alpha
A3 in Apple
B0 in Bee
C1 in Cheese and D1 in Dice

It also works for more complex cases

It does not go beyond one level of nesting; I don't think your case needs it.

A_list = ['A', 'B', ['C', 'D', 'E']]
B_list = ['A1', 'W5', 'X6', 'D2', 'E1', 'A2', 'A3', 'T5', 'E2', 'B0', 'Z9', 'C1', 'W3', 'D1']
string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee", "{0} in Cheese and {1} in Dice {2} in Egg"]

Output:

['A1', 'A2', 'A3', 'B0', ['C1', 'D1', 'D2', 'E1', 'E2']]
A1 in Alpha
A1 in Apple
A2 in Alpha
A2 in Apple
A3 in Alpha
A3 in Apple
B0 in Bee
C1 in Cheese and D1 in Dice E1 in Egg
C1 in Cheese and D1 in Dice E2 in Egg
C1 in Cheese and D2 in Dice E1 in Egg
C1 in Cheese and D2 in Dice E2 in Egg