Unable to insert data into a table using Python and an Oracle database


My database table looks like this: [screenshot of the table]

I have a web crawler that fetches news from a website, and I am trying to store the results in this table. I am using the requests and BeautifulSoup libraries. The code below shows my crawler logic.

import requests
from bs4 import BeautifulSoup
import os
import datetime
import cx_Oracle

def scrappy(url):
    try:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'html.parser')
        title = soup.find('title').text.split('|')[0]
        time =soup.find('span', attrs={'class':'time_cptn'}).find_all('span')[2].contents[0]
        full_text =soup.find('div', attrs={'class':'article_content'}).text.replace('Download The Times of India News App for Latest India News','')
    except:
        return ('','','','')
    else:
        return (title,time,url,full_text)

def pathmaker(name):   
    path = "Desktop/Web_Crawler/CRAWLED_DATA/{}".format(name)

    try:  
        os.makedirs(path)
    except OSError:  
        pass
    else:  
        pass

def filemaker(folder,links_all):
    #k=1
    for link in links_all:
        scrapped=scrappy(link)    
        #textfile=open('Desktop/Web_Crawler/CRAWLED_DATA/{}/text{}.txt'.format(x,k),'w+')
        #k+=1
        Title = scrapped[0]
        Link = scrapped[2]
        Dates = scrapped[1]
        Text = scrapped[3]
        con = cx_Oracle.connect('shivams/[email protected]/XE')
        cursor = con.cursor()
        sql_query = "insert into newsdata values(:1,:2,:3,:4)"
        cursor.executemany(sql_query,[Title,Link,Dates,Text])
        con.commit()
        cursor.close()
        con.close()
        #textfile.write('Title\n{}\n\nLink\n{}\n\nDate & Time\n{}\n\nText\n{}'.format(scrapped[0],scrapped[2],scrapped[1],scrapped[3]))
        #textfile.close()



folders_links=[('India','https://timesofindia.indiatimes.com/india'),('World','https://timesofindia.indiatimes.com/world'),('Business','https://timesofindia.indiatimes.com/business'),('Homepage','https://timesofindia.indiatimes.com/')]

for x,y in folders_links:
    pathmaker(x)
    r = requests.get(y)
    soup = BeautifulSoup(r.text, 'html.parser')
    if x!='Homepage':
        links =soup.find('div', attrs={'class':'main-content'}).find_all('span', attrs={'class':'twtr'})
        links_all=['https://timesofindia.indiatimes.com'+links[x]['data-url'].split('?')[0] for x in range(len(links))]
    else:
        links =soup.find('div', attrs={'class':'wrapper clearfix'})
        total_links = links.find_all('a')
        links_all=[]
        for p in range(len(total_links)):
            if 'href' in str(total_links[p]) and '.cms'  in total_links[p]['href'] and  'http' not in total_links[p]['href'] and 'articleshow' in total_links[p]['href'] :
                links_all+=['https://timesofindia.indiatimes.com'+total_links[p]['href']]

    filemaker(x,links_all)

Earlier I was creating text files and storing the news in them, but now I want to store it in the database so that my web application can access it. My database logic is in the filemaker function. I tried to insert the values into the table, but it does not work and gives different kinds of errors. I followed other posts on this site, but they did not work in my case. Can anyone help me with this? Also, I am not sure whether this is the correct way to insert CLOB data, since I am using it for the first time. Need help.

python-3.x oracle11g beautifulsoup cx-oracle
1 Answer

You can do the following, since you are inserting a single row at a time:

cursor.execute(sql_query, [Title, Link, Dates, Text])

Or, if you build up a list of these rows first, then you can do the following:

allValues = []
allValues.append([Title, Link, Dates, Text])
cursor.executemany(sql_query, allValues)
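The difference between the two calls can be sketched with the stdlib sqlite3 module, which follows the same DB-API 2.0 contract as cx_Oracle (the table and sample values here are illustrative; sqlite3 uses ? placeholders where cx_Oracle uses :1, :2, ...):

```python
import sqlite3

# execute vs executemany, demonstrated on an in-memory SQLite database.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table newsdata (title text, link text, dates text, body text)")

# execute binds ONE row: a flat sequence of values.
row = ("Sample title", "https://example.com/a", "2019-01-01", "Body text")
cur.execute("insert into newsdata values (?,?,?,?)", row)

# executemany binds MANY rows: a list of such sequences.
all_values = [
    ("Title 2", "https://example.com/b", "2019-01-02", "More text"),
    ("Title 3", "https://example.com/c", "2019-01-03", "Even more"),
]
cur.executemany("insert into newsdata values (?,?,?,?)", all_values)

con.commit()
count = cur.execute("select count(*) from newsdata").fetchone()[0]
print(count)  # 3: one row from execute plus two from executemany
```

Passing a flat list of four values to executemany, as in the question, makes the driver treat each value as a separate "row", which is what triggers the binding errors.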

Hope that explains things!
