Unexpected behavior of the append() method. Why does appending a dictionary to a list overwrite the earlier elements of the list?

Question (votes: 1, answers: 1)

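The code below checks whether a company name (from tickersList), or a fragment of it, occurs in the text of a news item (from newsList). When a company is found in a news item, print shows the expected ticker for that company, but after the news item is appended to the list something strange happens: it looks as though appending the dictionary (news) to the list (tickersNews) overwrites the earlier elements of the list. Why?

Note that when news is converted to a string before being appended, everything works as it should.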

import re

tickersList = [('ATI', 'Allegheny rporated', 'Allegheny Technologies Incorporated'), ('ATIS', 'Attis', 'Attis Industries, Inc.'), ('ATKR', 'Atkore International Group', 'Atkore International Group Inc.'), ('ATMP', 'Barclays + Select M', 'Barclays ETN+ Select MLP'), ('ATNM', 'Actinium', 'Actinium Pharmaceuticals, Inc.'), ('ATNX', 'Athenex', 'Athenex, Inc.'), ('ATOS', 'Atossa Genetics', 'Atossa Genetics Inc.'), ('ATRA', 'Atara Biotherapeutics', 'Atara Biotherapeutics, Inc.'), ('ATRC', 'AtriCure', 'AtriCure, Inc.'), ('ATRO', 'Astronics', 'Astronics Corporation'), ('ATRS', 'Antares Pharma', 'Antares Pharma, Inc.'), ('ATSG', 'Air Transport Services Group', 'Air Transport Services Group, Inc.'),  ('CJ', 'C&J Energy', 'C&J Energy Services, Inc.'), ('CJJD', 'China Jo-Jo Drugstores', 'China Jo-Jo Drugstores, Inc.'), ('CLAR', 'Clarus', 'Clarus Corporation'), ('CLD', 'Cloud Peak Energy', 'Cloud Peak Energy Inc.'), ('CLDC', 'China Lending', 'China Lending Corporation'), ('CLDR', 'Cloudera', 'Cloudera, Inc.')]

newsList = [
    {'title':'Atara Biotherapeutics Announces Planned Chief Executive Officer Transition'},
    {'title':'Chongqing Jingdong Pharmaceutical and Athenex Announce a Strategic Partnership and Licensing Agreement to Develop and Commercialize KX2-391 in China'}
           ]

tickersNews = []

for news in newsList:
    # pass through the list of companies looking for their mention in the news
    for ticker, company, company_full in tickersList:
        # clear the full name of the company from brackets, spaces, articles,
        # points and commas and save fragments of the full name to the list
        companyFullFragments = company_full.replace(',', '')\
            .replace('.', '').replace('The ', ' ')\
            .replace('(', ' ').replace(')', ' ')\
            .replace('  ', ' ').strip().split()
        # looking for a company in the news every time cutting off
        # the last fragment from the full company name
        for i in range(len(companyFullFragments), 0, -1):
            companyFullFragmentsString = ' '.join(companyFullFragments[:i]).strip()
            lookFor_company = r'(^|\s){0}(\s|$)'.format(companyFullFragmentsString)
            results_company = re.findall(lookFor_company, news['title'])
            # if the title of the news contains the name of the company,
            # then we add the ticker, the found fragment and the full name
            # of the company to the news, print the news and add it to the list
            if results_company:
                news['ticker'] = ticker#, companyFullFragmentsString, company_full
                print(news['ticker'], 'found')
                #tickersNews.append(str(news))
                #-----------------------------Here is the problem!(?)
                tickersNews.append(news)
                # move on to the next company
                break

print(20*'-', 'appended:')
for news in tickersNews:
    print(news['ticker'])

Output (list of dictionaries):

ATRA found
ATNX found
CJJD found
CLDC found
-------------------- appended:
ATRA
CLDC
CLDC
CLDC

Output (list of strings):

ATRA found
ATNX found
CJJD found
CLDC found
-------------------- appended as a strings:
["{'title': 'Atara Biotherapeutics Announces Planned Chief Executive Officer Transition', 'ticker': 'ATRA'}", "{'title': 'Chongqing Jingdong Pharmaceutical and Athenex Announce a Strategic Partnership and Licensing Agreement to Develop and Commercialize KX2-391 in China', 'ticker': 'ATNX'}", "{'title': 'Chongqing Jingdong Pharmaceutical and Athenex Announce a Strategic Partnership and Licensing Agreement to Develop and Commercialize KX2-391 in China', 'ticker': 'CJJD'}", "{'title': 'Chongqing Jingdong Pharmaceutical and Athenex Announce a Strategic Partnership and Licensing Agreement to Develop and Commercialize KX2-391 in China', 'ticker': 'CLDC'}"]
python-3.x list dictionary append
1 Answer
1 vote

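The problem originates from two lines inside the inner for loop: news['ticker'] = ticker and tickersNews.append(news). The loop keeps modifying and appending the same news dictionary, so every list entry that refers to it shows the last ticker assigned. A much simpler version of your problem is: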

a = 10
a = 20
a = 30
print(a, a, a)

The output will be 30 30 30. I think that is obvious.
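To tie the simplified example back to lists and dictionaries, here is a minimal sketch of the same effect. The names item, collected and snapshots are hypothetical stand-ins, not part of the original code. It also suggests why the str(news) variant appears to work: a string is a snapshot taken at append time, while the list of dictionaries holds references to one object.

```python
# One dict object, appended to a list several times.
item = {'title': 'some headline'}    # hypothetical stand-in for one news dict
collected = []                       # plays the role of tickersNews
snapshots = []                       # plays the role of the str(news) variant

for ticker in ['ATNX', 'CJJD', 'CLDC']:
    item['ticker'] = ticker          # mutates the single dict in place
    collected.append(item)           # stores another reference to the same dict
    snapshots.append(str(item))      # stores an independent string snapshot

tickers_seen = [entry['ticker'] for entry in collected]
same_object = all(entry is item for entry in collected)

print(tickers_seen)   # ['CLDC', 'CLDC', 'CLDC']
print(same_object)    # True
print(snapshots[0])   # {'title': 'some headline', 'ticker': 'ATNX'}
```

All three list entries are the same object, so they all report the value assigned last, whereas each string keeps the value it had when it was created.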

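To solve this, you can use several approaches.

First possibility (easy). Replace tickersNews.append(news) with tickersNews.append(news.copy()).

Second possibility (preferred). Don't use tickersNews. For each news item, create an empty list news['ticker_list'] = list(). For each ticker found, append it to news['ticker_list'].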
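A minimal sketch of what the copy changes, using hypothetical names (item, collected, nested) rather than the original data. Note that dict.copy() is a shallow copy: it is enough here because the values are plain strings, but nested mutable values would still be shared, in which case copy.deepcopy from the standard library is one option.

```python
import copy

item = {'title': 'some headline'}    # hypothetical stand-in for one news dict
collected = []

for ticker in ['ATNX', 'CJJD', 'CLDC']:
    item['ticker'] = ticker
    collected.append(item.copy())    # each entry is now a separate dict

tickers_copied = [entry['ticker'] for entry in collected]
print(tickers_copied)                # ['ATNX', 'CJJD', 'CLDC']

# Shallow vs. deep copy when a value is itself mutable.
nested = {'tags': ['a']}
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested['tags'].append('b')
print(shallow['tags'])               # ['a', 'b']  (inner list is shared)
print(deep['tags'])                  # ['a']       (inner list is independent)
```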

import re

tickersList = [('ATI', 'Allegheny rporated', 'Allegheny Technologies Incorporated'), ('ATIS', 'Attis', 'Attis Industries, Inc.'), ('ATKR', 'Atkore International Group', 'Atkore International Group Inc.'), ('ATMP', 'Barclays + Select M', 'Barclays ETN+ Select MLP'), ('ATNM', 'Actinium', 'Actinium Pharmaceuticals, Inc.'), ('ATNX', 'Athenex', 'Athenex, Inc.'), ('ATOS', 'Atossa Genetics', 'Atossa Genetics Inc.'), ('ATRA', 'Atara Biotherapeutics', 'Atara Biotherapeutics, Inc.'), ('ATRC', 'AtriCure', 'AtriCure, Inc.'), ('ATRO', 'Astronics', 'Astronics Corporation'), ('ATRS', 'Antares Pharma', 'Antares Pharma, Inc.'), ('ATSG', 'Air Transport Services Group', 'Air Transport Services Group, Inc.'),  ('CJ', 'C&J Energy', 'C&J Energy Services, Inc.'), ('CJJD', 'China Jo-Jo Drugstores', 'China Jo-Jo Drugstores, Inc.'), ('CLAR', 'Clarus', 'Clarus Corporation'), ('CLD', 'Cloud Peak Energy', 'Cloud Peak Energy Inc.'), ('CLDC', 'China Lending', 'China Lending Corporation'), ('CLDR', 'Cloudera', 'Cloudera, Inc.')]

newsList = [
    {'title':'Atara Biotherapeutics Announces Planned Chief Executive Officer Transition'},
    {'title':'Chongqing Jingdong Pharmaceutical and Athenex Announce a Strategic Partnership and Licensing Agreement to Develop and Commercialize KX2-391 in China'}
           ]

for news in newsList:
    news['ticker_list'] = list()
    for ticker, company, company_full in tickersList:
        companyFullFragments = company_full.replace(',', '')\
            .replace('.', '').replace('The ', ' ')\
            .replace('(', ' ').replace(')', ' ')\
            .replace('  ', ' ').strip().split()
        for i in range(len(companyFullFragments), 0, -1):
            companyFullFragmentsString = ' '.join(companyFullFragments[:i]).strip()
            lookFor_company = r'(^|\s){0}(\s|$)'.format(companyFullFragmentsString)
            results_company = re.findall(lookFor_company, news['title'])
            if results_company:
                news['ticker_list'].append(ticker)
                # print(ticker, 'found')
                break

print('tickers for news:')

for news in newsList:
    print(news['ticker_list'])

The output will be:

tickers for news:
['ATRA']
['ATNX', 'CJJD', 'CLDC']