How to scrape Reddit data with Python and save it to a database file to preserve progress

Question (Votes: 0, Answers: 1)

I wrote a script that fetches some posts from Reddit.

import praw
import pandas as pd

# Authenticate with Reddit (credentials masked).
reddit = praw.Reddit(client_id='*******',
                     client_secret='*******',
                     user_agent='**********',
                     username='********',
                     password='*******8')

# One list per column; each submission appends one value to every list.
topics_dict = {"title": [],
               "score": [],
               "id": [],
               "url": [],
               "comms_num": [],
               "created": [],
               "body": []}

# Collect the top 500 submissions from each subreddit.
for name in ("Tea", "Biophysics"):
    for submission in reddit.subreddit(name).top(limit=500):
        topics_dict["title"].append(submission.title)
        topics_dict["score"].append(submission.score)
        topics_dict["id"].append(submission.id)
        topics_dict["url"].append(submission.url)
        topics_dict["comms_num"].append(submission.num_comments)
        topics_dict["created"].append(submission.created)
        topics_dict["body"].append(submission.selftext)

topics_data = pd.DataFrame(topics_dict)
topics_data  # shown as the cell output in Jupyter

But the result only shows up in my Jupyter notebook. Now I want to save my progress to a database file. Any suggestions are appreciated.

python pandas reddit praw
1 Answer
0 votes
You have two options. I'll cover both; each has its pros and cons:
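One way to do this, sketched here as an assumption and not necessarily either of the options this answer has in mind, is to write the collected rows to a SQLite file with Python's built-in `sqlite3` module. The function name `save_topics`, the table name `posts`, and the file name `reddit_posts.db` are all hypothetical; the column names come from `topics_dict` in the question. Using the submission `id` as the primary key with `INSERT OR IGNORE` means the script can be re-run without storing the same post twice.

```python
import sqlite3


def save_topics(topics_dict, db_path="reddit_posts.db"):
    """Store rows from a dict of equal-length lists in a SQLite file.

    Rows whose id is already in the table are skipped, so re-running
    the scraper only adds posts that were not saved before.
    Returns the total number of rows in the table afterwards.
    """
    # Column order must match the CREATE TABLE statement below.
    columns = ["id", "title", "score", "url", "comms_num", "created", "body"]
    rows = list(zip(*(topics_dict[c] for c in columns)))

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS posts ("
                "id TEXT PRIMARY KEY, title TEXT, score INTEGER, url TEXT, "
                "comms_num INTEGER, created REAL, body TEXT)"
            )
            conn.executemany(
                "INSERT OR IGNORE INTO posts VALUES (?, ?, ?, ?, ?, ?, ?)",
                rows,
            )
        return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    finally:
        conn.close()
```

Calling `save_topics(topics_dict)` after the scraping loop writes the file; in a later session, `pd.read_sql_query("SELECT * FROM posts", sqlite3.connect("reddit_posts.db"))` loads it back into a DataFrame.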