TooManyRequests: 429 Too Many Requests when running Tweepy


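With a basic academic research developer account, I am using the Tweepy API to collect tweets that contain specified keywords or hashtags. The account lets me collect 10,000,000 tweets per month. Using full-archive search, I am trying to collect the tweets for an entire calendar date at a time. At first I got a rate limit error (even though the wait_on_rate_limit flag is set to True); now I get a request limit error instead.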

Here is the code:

import pandas as pd
import tweepy


# Print the fields of a single tweet row
# (the order matches the DataFrame columns built in scrape())
def printtweetdata(n, ith_tweet):
    print()
    print(f"Tweet {n}:")
    print(f"Username:{ith_tweet[0]}")
    print(f"tweet_ID:{ith_tweet[1]}")
    print(f"userID:{ith_tweet[2]}")
    print(f"creation:{ith_tweet[3]}")
    print(f"location:{ith_tweet[4]}")
    print(f"text:{ith_tweet[5]}")  # index 5 holds the tweet text, not a tweet count
    print(f"likes:{ith_tweet[6]}")
    print(f"retweets:{ith_tweet[7]}")
    print(f"hashtag:{ith_tweet[8]}")


# function to perform data extraction
def scrape(words, numtweet, since_date, until_date):
    
    # Creating DataFrame using pandas
    db = pd.DataFrame(columns=['username', 'tweet_ID', 'userID',
                            'creation', 'location', 'text','likes','retweets', 'hashtags'])
    
    # We are using .Cursor() to search through twitter for the required tweets.
    # The number of tweets can be restricted using .items(number of tweets)
    tweets = tweepy.Cursor(api.search_full_archive,'research',query=words,
                        fromDate=since_date, toDate=until_date).items(numtweet)
    
    # .Cursor() returns an iterable object. Each item in
    # the iterator has various attributes that you can access to
    # get information about each tweet
    list_tweets = [tweet for tweet in tweets]
    
    # Counter to maintain Tweet Count
    i = 1
    
    # Iterate over the collected tweets and extract the fields we need
    for tweet in list_tweets:
        username = tweet.user.screen_name
        tweet_ID = tweet.id
        userID = tweet.author.id
        creation = tweet.created_at
        location = tweet.user.location
        likes = tweet.favorite_count
        retweets = tweet.retweet_count
        hashtags = tweet.entities['hashtags']

        # Retweets carry a retweeted_status attribute; if it is missing,
        # fall back to the tweet's own text
        try:
            text = tweet.retweeted_status.full_text
        except AttributeError:
            text = tweet.text
        hashtext = [tag['text'] for tag in hashtags]

        # Append the extracted fields as a new row of the DataFrame
        ith_tweet = [username, tweet_ID, userID,
                     creation, location, text, likes, retweets, hashtext]
        db.loc[len(db)] = ith_tweet

        # Print the tweet data on screen
        printtweetdata(i, ith_tweet)
        i = i + 1
    filename = 'C:/Users/USER/Desktop/الجامعة الالمانية/output/twitter.csv'
    
    # we will save our database as a CSV file.
    db.to_csv(filename)


if __name__ == '__main__':
    consumer_key = "####"
    consumer_secret = "###"
    access_token = "###"
    access_token_secret = "###"
    
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth,wait_on_rate_limit=True)
    
    since_date = '200701010000'
    until_date = '202101012359'
    
    words = "#USA"
    
    
    # number of tweets you want to extract in one run
    numtweet = 1000
    scrape(words, numtweet, since_date, until_date)
    print('Scraping has completed!')

I get this error:

TooManyRequests: 429 Too Many Requests
Request exceeds account’s current package request limits. Please upgrade your package and retry or contact Twitter about enterprise access.
Tags: python, tweepy

1 Answer

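Unfortunately, I believe this is caused by the sandbox quota. The message says the request exceeds the account's package request limit, which is a cap on the package itself rather than the short-term rate limit that wait_on_rate_limit waits out. The quota is higher for premium accounts. See the Tweepy API documentation.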
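
If the quota is indeed the cause, the exception is raised partway through the cursor, and everything fetched up to that point is lost because the list comprehension never finishes. Below is a minimal sketch of keeping the partial results. The helper `collect_partial`, the stand-in exception `FakeTooManyRequests`, and the generator `fake_cursor` are hypothetical names used only so the example runs without credentials; none of them are part of Tweepy.

```python
def collect_partial(items, stop_on=(Exception,)):
    """Drain an iterator, keeping what was fetched if it raises.

    Returns (collected_items, error), where error is None when the
    iterator finished normally.
    """
    collected = []
    error = None
    try:
        for item in items:
            collected.append(item)
    except stop_on as exc:
        # Keep the items fetched before the failure instead of losing them
        error = exc
    return collected, error


# --- stand-ins for the demo (hypothetical, not Tweepy objects) ---
class FakeTooManyRequests(Exception):
    """Plays the role of the 429 error raised by the real client."""


def fake_cursor(n_ok):
    """Yield n_ok fake tweets, then fail the way an exhausted quota would."""
    for i in range(n_ok):
        yield {"id": i}
    raise FakeTooManyRequests("429 Too Many Requests")


tweets, err = collect_partial(fake_cursor(3), stop_on=(FakeTooManyRequests,))
print(len(tweets), type(err).__name__)  # 3 FakeTooManyRequests
```

With Tweepy 4.x the same helper could presumably be called with the cursor from the question, `tweepy.Cursor(...).items(numtweet)`, and `stop_on=(tweepy.errors.TooManyRequests,)`, so the rows collected before the 429 can still be written to the CSV.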

You can also look at this answer: Limit
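
Since the question aims to collect one calendar day at a time, it may help to build the `fromDate` / `toDate` strings per day so that each run uses as few requests as possible. This is only a sketch: `day_window` is a hypothetical helper, the `yyyyMMddHHmm` format is taken from the `since_date` and `until_date` values in the question, and it assumes that timestamps are in UTC and that `toDate` is treated as exclusive.

```python
from datetime import date, datetime, timedelta

# Same 12-digit format as since_date / until_date in the question
STAMP_FORMAT = "%Y%m%d%H%M"


def day_window(day):
    """Return (fromDate, toDate) strings covering one calendar day.

    Assumes UTC and an exclusive upper bound, so the window ends at
    00:00 of the following day.
    """
    start = datetime(day.year, day.month, day.day)
    end = start + timedelta(days=1)
    return start.strftime(STAMP_FORMAT), end.strftime(STAMP_FORMAT)


print(day_window(date(2021, 1, 1)))  # ('202101010000', '202101020000')
```

The two strings can then be passed as `since_date` and `until_date` to `scrape()` in place of the 14-year range used in the question.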
