Hi, trying to extract names + emails from a text file, but don't know how to put the output next to each other

Question

I am trying to output my data, names and emails, side by side. Right now it just prints all the emails and then all the names.

Here is my code:

import re
import nltk
from nltk.corpus import stopwords
stop = stopwords.words('english')

# Read the whole file; the with-block closes it automatically
with open('/Users/jchome/Downloads/StockXRF/untitled.txt', 'r') as inputfile:
    string = inputfile.read()



def extract_email_addresses(string):
    r = re.compile(r'[\w\.-]+@[\w\.-]+')
    return r.findall(string)

def ie_preprocess(document):
    document = ' '.join([i for i in document.split() if i not in stop])
    sentences = nltk.sent_tokenize(document)
    sentences = [nltk.word_tokenize(sent) for sent in sentences]
    sentences = [nltk.pos_tag(sent) for sent in sentences]
    return sentences

def extract_names(document):
    names = []
    sentences = ie_preprocess(document)
    for tagged_sentence in sentences:
        for chunk in nltk.ne_chunk(tagged_sentence):
            if type(chunk) == nltk.tree.Tree:
                if chunk.label() == 'PERSON':
                    names.append(' '.join([c[0] for c in chunk]))
    return names

if __name__ == '__main__':
    emails = extract_email_addresses(string)
    names = extract_names(string)
    # Concatenating the lists prints every email first, then every name
    print(emails + names)

Output:

['[email protected]', [email protected], 'Lawrence', 'George']

How can I put the outputs next to each other and write them to a text file?

Answer 1

You can do the following:

import pandas as pd

# zip pairs items by position and stops at the shorter of the two lists
zipped = list(zip(emails, names))
df = pd.DataFrame(zipped, columns=['emails', 'names'])

After that, you can print the DataFrame, and you can save the output to a file using, for example, the to_csv method.
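
If you would rather not pull in pandas, the same pairing can be done with the standard library. The following is only a minimal sketch: the helper names `pair_rows` and `write_pairs`, the tab separator, and the sample addresses are assumptions made for illustration and are not from the question. It also assumes that the nth name belongs to the nth email, which the extraction code does not guarantee. `zip_longest` is used instead of `zip` so that an unmatched name or email is kept with an empty partner rather than silently dropped.

```python
import csv
from itertools import zip_longest


def pair_rows(names, emails, fill=""):
    """Pair names and emails by position, padding the shorter list with `fill`."""
    return list(zip_longest(names, emails, fillvalue=fill))


def write_pairs(path, rows):
    """Write (name, email) rows to a tab-separated text file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["name", "email"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder data standing in for the lists extracted in the question.
    names = ["Lawrence", "George"]
    emails = ["lawrence@example.com", "george@example.com"]
    for name, email in pair_rows(names, emails):
        print(name, email)
```

Calling `write_pairs("output.txt", pair_rows(names, emails))` would then write one name and one email per line, separated by a tab.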
