这是一个列表,每个元素由两个字符串和中间的“/t”组成。我们可以将左侧的字符串称为“标签”,右侧的部分称为“文本”。
continued In the film, "Girl Interrupted," Winona Ryder plays an 18-year-old
continued who enters a mental institution for
continued what
anecdote is diagnosed as borderline personality disorder
anecdote The year is 1967
anecdote the country is in turmoil over Vietnam and civil rights
continued While
continued lying on her bed one night
continued and
continued watching TV
continued ,
anecdote she sees a news report about a demonstration
continued The narrator says something
continued that might apply to today's turmoil
continued :
continued "We live in a time of doubt
continued .
continued The institutions
continued we once trusted no longer
anecdote seem reliable."
continued As 2014 ends Modd-NU
statistics the stock market is at record highs
assumption our traditional institutions and self-confidence are in decline
continued A Pew Research Center study confirms one trend
testimony that has been obvious over several years
assumption The "typical" American family is no longer typical
statistics Just 46 percent of American children now live in homes with their married, heterosexual parents
statistics Five percent have no parents at home
continued They most likely are living with grandparents
continued ,
testimony says the study
assumption These startling figures about the decline of the American family contrast with the year 1960
continued when Modd-NU
statistics 73 percent of American children lived in traditional families
assumption A major contributor to this trend has been the assault on marriage and other institutions by the Baby Boom generation
我的问题是我想将标记为“继续”的元素与其后面的第一个未标记为“继续”的元素合并。例如,前三个元素标记为“继续”,所以我想将它们合并与第四个元素,并为第四个元素使用标签“轶事”。我是初学者,对迭代运算不太熟悉,非常感谢!
Chatgpt 给出了一段代码:
result = []
iterator = iter(lines)
current_item = next(iterator, None)
while current_item:
label, text = current_item.split('\t', 1)
if label == 'continued':
text_list = [text]
while label == 'continued':
next_item = next(iterator, None)
if next_item:
label, next_text = next_item.split('\t', 1)
text_list.append(next_text)
else:
break
label = label
text = ''.join(text_list)
else:
next_item = next(iterator, None)
result.append(f'{label}\t{text}')
current_item = next_item
for item in result:
print(item)
但是这段代码最终会生成一些重复的元素。如果您可以微调这段代码,我也将不胜感激!
尝试在迭代器中“预读”并不是一个好计划。相反,一次只执行一行,然后收集“连续”行,直到得到一个不“连续”的行。
result = []
build = []
for line in open('x.txt'):
line = line.strip()
if not line:
continue
label, text = line.split('\t', 1)
if label == 'continued':
build.append( text )
else:
if build:
build = " ".join(build)
result.append(f"continued\t{build}")
build = []
result.append(f'{label}\t{text}')
if build:
build = " ".join(build)
result.append(f"continued\t{build}")
for item in result:
print(item)
输出:
continued In the film, "Girl Interrupted," Winona Ryder plays an 18-year-old who enters a mental institution for what
anecdote is diagnosed as borderline personality disorder
anecdote The year is 1967
anecdote the country is in turmoil over Vietnam and civil rights
continued While lying on her bed one night and watching TV ,
anecdote she sees a news report about a demonstration
continued The narrator says something that might apply to today's turmoil : "We live in a time of doubt . The institutions we once trusted no longer
anecdote seem reliable."
continued As 2014 ends Modd-NU
statistics the stock market is at record highs
assumption our traditional institutions and self-confidence are in decline
continued A Pew Research Center study confirms one trend
testimony that has been obvious over several years
assumption The "typical" American family is no longer typical
statistics Just 46 percent of American children now live in homes with their married, heterosexual parents
statistics Five percent have no parents at home
continued They most likely are living with grandparents ,
testimony says the study
assumption These startling figures about the decline of the American family contrast with the year 1960
continued when Modd-NU
statistics 73 percent of American children lived in traditional families
assumption A major contributor to this trend has been the assault on marriage and other institutions by the Baby Boom generation