How to search for sentences by keyword until the end of a string in Python

Question  Votes: 0  Answers: 2

I am stuck on a small piece of logic.

Here is my code:

def find_between( string, first, last ):
    list1 = []
    try:
        start = string.index( first ) + len( first )
        end = string.index( last, start )
        list1.append(string[start:end])
        print(list1)
    except ValueError:
        return ""

with open("sample.txt") as f:
    data = f.read()
    print(data)

    find_between( data, "*CHI:  " , "%mor:  " )

My sample.txt contains:

*CHI:   I saw a giraffe and a elephant .
%mor:   pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a
    n|elephant .
%gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|7|DET 7|5|COORD 8|2|PUNCT
*CHI:   <that> [/] (.) that (i)s it . [+ bch]
%mor:   pro:dem|that cop|be&3S pro:per|it .
%gra:   1|2|SUBJ 2|0|ROOT 3|2|PRED 4|2|PUNCT
*CHI:   I saw an elephant go swimming .
%mor:   pro:sub|I v|see&PAST det:art|a n|elephant v|go part|swim-PRESP .
%gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|5|SUBJ 5|2|COMP 6|5|OBJ 7|2|PUNCT
*CHI:   <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>
    [//] drop ball in the pool .
%mor:   pro:sub|I v|see&PAST det:art|the n|giraffe coord|and det:art|the
    n|elephant n|drop n|ball prep|in det:art|the n|pool .
%gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|9|DET 7|9|MOD 8|9|MOD
    9|5|COORD 10|9|NJCT 11|12|DET 12|10|POBJ 13|2|PUNCT
*CHI:   I saw giraffe swimming in the pool to get that ball .
%mor:   pro:sub|I v|see&PAST n|giraffe part|swim-PRESP prep|in det:art|the
    n|pool inf|to v|get pro:dem|that n|ball .

It should return every sentence between "*CHI:" and "%mor:", but my code only returns the first one:

I saw a giraffe and a elephant

Please help me iterate until the end of the string, so that I can print all the sentences between "*CHI:" and "%mor:".

python file
2 Answers
1
Votes

I used a regular expression and a plain string instead of a file, but the principle is the same. Here is the working code:

import re

s = """
    *CHI:   I saw a giraffe and a elephant .
    %mor:   pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a 
        n|elephant .
    %gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|7|DET 7|5|COORD 8|2|PUNCT
    *CHI:   <that> [/] (.) that (i)s it . [+ bch]
    %mor:   pro:dem|that cop|be&3S pro:per|it .
    %gra:   1|2|SUBJ 2|0|ROOT 3|2|PRED 4|2|PUNCT
    *CHI:   I saw an elephant go swimming .
    %mor:   pro:sub|I v|see&PAST det:art|a n|elephant v|go part|swim-PRESP .
    %gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|5|SUBJ 5|2|COMP 6|5|OBJ 7|2|PUNCT
    *CHI:   <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>
        [//] drop ball in the pool .
    %mor:   pro:sub|I v|see&PAST det:art|the n|giraffe coord|and det:art|the
        n|elephant n|drop n|ball prep|in det:art|the n|pool .
    %gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|9|DET 7|9|MOD 8|9|MOD
        9|5|COORD 10|9|NJCT 11|12|DET 12|10|POBJ 13|2|PUNCT
    *CHI:   I saw giraffe swimming in the pool to get that ball .
    %mor:   pro:sub|I v|see&PAST n|giraffe part|swim-PRESP prep|in det:art|the
        n|pool inf|to v|get pro:dem|that n|ball .
    """

result = re.findall('(?<=CHI:)(.*?)(?=%mor)', s, flags=re.S)
print(result)
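
The pattern above relies on a lookbehind `(?<=CHI:)` and a lookahead `(?=%mor)` so the markers themselves are left out, on the lazy `.*?` so each match stops at the nearest `%mor`, and on `re.S` so that `.` also matches the line breaks inside wrapped utterances. The raw matches still carry leading spaces and embedded newlines. Below is a minimal sketch of tidying them, assuming that collapsing whitespace is acceptable for the output. It swaps the lookarounds for a plain capturing group, and the helper name `extract_utterances` is hypothetical, not part of the answer.

```python
import re

def extract_utterances(text, first="*CHI:", last="%mor:"):
    """Return every span between `first` and `last`, whitespace-normalized."""
    # re.escape keeps '*' literal; re.S lets '.' cross line breaks.
    pattern = re.escape(first) + r"(.*?)" + re.escape(last)
    # " ".join(m.split()) collapses runs of spaces and newlines into one space.
    return [" ".join(m.split()) for m in re.findall(pattern, text, flags=re.S)]

# Excerpt of the sample data from the question.
sample = (
    "*CHI:   I saw a giraffe and a elephant .\n"
    "%mor:   pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a\n"
    "    n|elephant .\n"
    "%gra:   1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|7|DET 7|5|COORD 8|2|PUNCT\n"
    "*CHI:   <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>\n"
    "    [//] drop ball in the pool .\n"
    "%mor:   pro:sub|I v|see&PAST det:art|the n|giraffe coord|and det:art|the\n"
    "    n|elephant n|drop n|ball prep|in det:art|the n|pool .\n"
)

for utterance in extract_utterances(sample):
    print(utterance)
```

With this excerpt the wrapped utterance comes back as a single line, because the newline and indentation in the middle of it are collapsed along with the leading spaces.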

0
Votes

I tried it without regular expressions, though I am still not sure whether the %gra lines are allowed to be printed.

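def find_between( string, first, last ):
    # `string` is the list of lines returned by readlines()
    flag = False    # True while collecting the lines of one utterance
    buffer = ""     # lines of the utterance currently being collected
    op_string = ""  # all completed utterances
    for line in string:
        if first in line:
            # a line with the opening marker starts a new utterance
            buffer += line
            flag = True

        elif last in line:
            # the closing marker ends the utterance: keep what was collected
            op_string += buffer
            buffer = "" # flush buffer
            flag = False

        elif flag:
            # continuation line of a wrapped utterance
            buffer += line

    print(op_string)

with open("sample.txt") as f:
    data = f.readlines()
    #print(data)

    find_between( data, "*CHI:  " , "%mor:  " )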
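
For comparison, the index-based `find_between` from the question can be kept close to its original form by repeating the search from the point where the previous match ended. This is a minimal sketch, assuming the markers are passed without trailing spaces so that a tab or a varying number of spaces after the colon does not matter. The name `find_all_between` is hypothetical.

```python
def find_all_between(text, first, last):
    """Collect every substring of `text` lying between `first` and `last`."""
    results = []
    pos = 0
    while True:
        start = text.find(first, pos)   # -1 when no further opening marker exists
        if start == -1:
            break
        start += len(first)
        end = text.find(last, start)
        if end == -1:                   # opening marker without a closing one
            break
        results.append(text[start:end].strip())
        pos = end + len(last)           # resume the search after this match
    return results

# Excerpt of the sample data from the question.
text = (
    "*CHI:   <that> [/] (.) that (i)s it . [+ bch]\n"
    "%mor:   pro:dem|that cop|be&3S pro:per|it .\n"
    "%gra:   1|2|SUBJ 2|0|ROOT 3|2|PRED 4|2|PUNCT\n"
    "*CHI:   I saw an elephant go swimming .\n"
    "%mor:   pro:sub|I v|see&PAST det:art|a n|elephant v|go part|swim-PRESP .\n"
)

print(find_all_between(text, "*CHI:", "%mor:"))
```

Unlike `str.index`, `str.find` returns -1 instead of raising `ValueError`, so the loop needs no exception handling. `strip()` only trims the ends of each match, so an utterance wrapped over two lines keeps its internal line break.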