Retrieving only the lists using a regular expression

Problem description

I am trying to retrieve the lists from the text below.

import re

# Read the content from the file (here we assume the content is stored in a string for demonstration)
content = """
Variation 1:
Based on the provided examples and the input sequence, the next anticipated actions for the input sequence [['Start'], ['Pick', 'mixing_bowl_green']] could be:

1. Pick the sponge (since wiping actions typically require the sponge).
2. Wipe the mixing bowl.
3. Place the mixing bowl.
4. Place the sponge.
5. Pick the broom.
6. Sweep.
7. Place the broom.
8. End the sequence.

Here is the completed sequence:

[['Start'], ['Pick', 'mixing_bowl_green'], ['Pick', 'sponge_small'], ['Wipe mixing_bowl_green'], ['Place', 'mixing_bowl_green'], ['Place', 'sponge_small'], ['Pick', 'broom'], ['Sweep'], ['Place', 'broom'], ['End']]

Variation 2:
Based on the provided examples, the anticipated sequence of actions for the given input can be completed as follows:

Input = [['Start'], ['Pick', 'mixing_bowl_green']]

1. The next logical step is to pick the sponge, as it is required for the wiping actions.
2. Then, proceed to wipe the mixing bowl.
3. Place the mixing bowl back.
4. Pick the cutting board and wipe it.
5. Place the cutting board back.
6. Pick the plate and wipe it.
7. Place the plate back.
8. Place the sponge back.
9. Pick the broom and sweep.
10. Place the broom back.
11. End the sequence.

Here is the completed sequence:

[['Start'],
 ['Pick', 'mixing_bowl_green'],
 ['Pick', 'sponge_small'],
 ['Wipe mixing_bowl_green'],
 ['Place', 'mixing_bowl_green'],
 ['Pick', 'cutting_board_small'],
 ['Wipe cutting_board_small'],
 ['Place', 'cutting_board_small'],
 ['Pick', 'plate_dish'],
 ['Wipe plate_dish'],
 ['Place', 'plate_dish'],
 ['Place', 'sponge_small'],
 ['Pick', 'broom'],
 ['Sweep'],
 ['Place', 'broom'],
 ['End']]
"""

# Define the regular expression pattern
pattern = re.compile(r"\[\['Start'\].*?\['End'\]\]", re.DOTALL)

# Find all matches in the content
matches = pattern.findall(content)

# Print the matches
for match in matches:
    print("#####")
    print(match)
    print("#####")
    input()

However, the code returns

#####
[['Start'], ['Pick', 'mixing_bowl_green']] could be:

1. Pick the sponge (since wiping actions typically require the sponge).
2. Wipe the mixing bowl.
3. Place the mixing bowl.
4. Place the sponge.
5. Pick the broom.
6. Sweep.
7. Place the broom.
8. End the sequence.

Here is the completed sequence:

[['Start'], ['Pick', 'mixing_bowl_green'], ['Pick', 'sponge_small'], ['Wipe mixing_bowl_green'], ['Place', 'mixing_bowl_green'], ['Place', 'sponge_small'], ['Pick', 'broom'], ['Sweep'], ['Place', 'broom'], ['End']]
#####

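as the first match, which is incorrect. How can I write the regex so that it matches only the lists? The text between ['Start'] and ['End'] must look like a list, i.e. a comma followed by a square bracket. The output should be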

list1 = [['Start'], ['Pick', 'mixing_bowl_green'], ['Pick', 'sponge_small'], ['Wipe mixing_bowl_green'], ['Place', 'mixing_bowl_green'], ['Place', 'sponge_small'], ['Pick', 'broom'], ['Sweep'], ['Place', 'broom'], ['End']]
list2 = [['Start'],
 ['Pick', 'mixing_bowl_green'],
 ['Pick', 'sponge_small'],
 ['Wipe mixing_bowl_green'],
 ['Place', 'mixing_bowl_green'],
 ['Pick', 'cutting_board_small'],
 ['Wipe cutting_board_small'],
 ['Place', 'cutting_board_small'],
 ['Pick', 'plate_dish'],
 ['Wipe plate_dish'],
 ['Place', 'plate_dish'],
 ['Place', 'sponge_small'],
 ['Pick', 'broom'],
 ['Sweep'],
 ['Place', 'broom'],
 ['End']]
python regex
1 Answer

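The problem with your regex is that .*? can match every character from the first Start to the first End (and likewise from the third Start to the second End), so it matches more than you want. A lazy quantifier only keeps the match as short as possible from the position where it began; it does not move the starting point forward to a later Start. You can fix this by making that part of the regex more specific, so that it matches only a comma, optional whitespace, and data enclosed in []: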

\[\['Start'\](?:,\s*\[[^]]+\])*,\s*\['End'\]\]

Regex demo on regex101
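To see the difference between the two patterns on something small, here is a minimal sketch. The `sample` string is a made-up, shortened stand-in for the text in the question, and the names `lazy` and `strict` are only illustrative. The stricter pattern is the same one as above, written with `re.VERBOSE` so each part can carry a comment.

```python
import re

# Hypothetical shortened sample: a ['Start'] appears in prose first,
# then the real list follows a few lines later.
sample = (
    "for the input [['Start'], ['Pick', 'bowl']] could be:\n"
    "1. Sweep.\n"
    "[['Start'], ['Pick', 'bowl'], ['Sweep'], ['End']]\n"
)

# Original pattern: .*? may cross any text, including prose and newlines.
lazy = re.compile(r"\[\['Start'\].*?\['End'\]\]", re.DOTALL)

# Stricter pattern: only comma-separated bracketed items are allowed
# between the Start item and the End item.
strict = re.compile(
    r"""
    \[\['Start'\]         # outer opening bracket and the Start item
    (?:,\s*\[[^]]+\])*    # zero or more bracketed items, each after a comma
    ,\s*\['End'\]\]       # the End item and the outer closing bracket
    """,
    re.VERBOSE,
)

lazy_matches = lazy.findall(sample)
strict_matches = strict.findall(sample)

print(lazy_matches)    # one match that begins in the prose sentence
print(strict_matches)  # one match containing only the list
```

With the stricter pattern, the attempt that begins at the Start in the prose fails as soon as the text stops looking like a list, so the regex engine moves on and matches only at the real list.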

In Python:

pattern = re.compile(r"\[\['Start'\](?:,\s*\[[^]]+\])*,\s*\['End'\]\]")
pattern.findall(content)
Output:

[ "[['Start'], ['Pick', 'mixing_bowl_green'], ['Pick', 'sponge_small'], ['Wipe mixing_bowl_green'], ['Place', 'mixing_bowl_green'], ['Place', 'sponge_small'], ['Pick', 'broom'], ['Sweep'], ['Place', 'broom'], ['End']]", "[['Start'],\n ['Pick', 'mixing_bowl_green'],\n ['Pick', 'sponge_small'],\n ['Wipe mixing_bowl_green'],\n ['Place', 'mixing_bowl_green'],\n ['Pick', 'cutting_board_small'],\n ['Wipe cutting_board_small'],\n ['Place', 'cutting_board_small'],\n ['Pick', 'plate_dish'],\n ['Wipe plate_dish'],\n ['Place', 'plate_dish'],\n ['Place', 'sponge_small'],\n ['Pick', 'broom'],\n ['Sweep'],\n ['Place', 'broom'],\n ['End']]" ]
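Note that `findall` returns strings, not lists. If actual Python lists are wanted (as `list1` and `list2` in the question suggest), one possible approach is to pass each match to `ast.literal_eval` from the standard library. The sketch below assumes every match is a valid Python literal; `text` is a hypothetical shortened input standing in for `content`.

```python
import ast
import re

pattern = re.compile(r"\[\['Start'\](?:,\s*\[[^]]+\])*,\s*\['End'\]\]")

# Hypothetical shortened input standing in for `content` from the question.
text = """Here is the completed sequence:

[['Start'],
 ['Pick', 'broom'],
 ['Sweep'],
 ['End']]
"""

# literal_eval parses Python literals only; it does not execute code.
parsed = [ast.literal_eval(m) for m in pattern.findall(text)]
print(parsed)
```

Applied to the full `content` from the question, which yields two matches, the result could be unpacked as `list1, list2 = parsed`.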
    