Unable to write all the values of a list to a CSV row. The values are scraped from the page and stored in variables, but are lost in the file

Problem description · Votes: 0 · Answers: 1

The values are scraped correctly from the web page, but when I try to write them out as CSV, the company name is not written to the file.

Here is the link to the web page: http://search.sunbiz.org/Inquiry/CorporationSearch/SearchResultDetail?inquirytype=EntityName&directionType=Initial&searchNameOrder=A%201421260&aggregateId=domp-142126-360258c3-c08b-4f9a-8866-ad3ecdedef02&searchTerm=A&listNameOrder=A%201421260

import requests
from bs4 import BeautifulSoup
from requests import get
from json import loads
import csv

def writeFile(data):
    with open("A.csv", "a") as file:
        writer = csv.writer(file)
        writer.writerow(data)

def company_page(url):
    docnum, fenum, fdate, edate, status, levent, companyName, year = '','','','','','','',''
    page = requests.get(url)
    soup = BeautifulSoup(page.text, "lxml")
    # print(soup.prettify())
    reqDiv = soup.find("div", class_ = "detailSection filingInformation")
    companyName = soup.find("div", class_ = "detailSection corporationName").text

    for lab in reqDiv.select("label"):

        if(lab.text == 'Document Number'):
            docnum = lab.find_next_sibling().text
        if(lab.text == 'FEI/EIN Number'):
            fenum = lab.find_next_sibling().text
        if(lab.text == 'Date Filed'):
            fdate = lab.find_next_sibling().text
        if(lab.text == 'Effective Date'):
            edate = lab.find_next_sibling().text
            year = lab.find_next_sibling().text.split('/')[2]
        if(lab.text == 'Status'):
            status = lab.find_next_sibling().text
        if(lab.text == 'Last Event'):
            levent = lab.find_next_sibling().text
        else:
            continue

        details = []
        details.append(companyName)
        details.append(docnum)
        details.append(fenum)
        details.append(fdate)
        details.append(edate)
        details.append(status)
        details.append(levent)
        writeFile(details)

I am getting the following values in the csv (screenshot). I also want the company name in the csv file. It is stored in the variable companyName, but it is not written to the csv.
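A side note on the loop's control flow (an observation about the code above, simulated here with plain strings instead of the live page): the trailing `else: continue` binds only to the final `if`, so the block that builds `details` and calls `writeFile` runs on just one iteration, the one where the label text is `'Last Event'`.

```python
# Simulate the question's loop without any scraping: the `else: continue`
# skips the append/write block for every label except 'Last Event'.
labels = ['Document Number', 'FEI/EIN Number', 'Date Filed',
          'Effective Date', 'Status', 'Last Event']
rows_written = 0
for text in labels:
    if text == 'Last Event':
        pass                  # stands in for: levent = lab.find_next_sibling().text
    else:
        continue              # every other label jumps straight to the next iteration
    rows_written += 1         # stands in for the details.append(...) / writeFile(details) block
print(rows_written)           # → 1: exactly one row is written, from inside the loop
```

The usual shape is an `if/elif` chain with the `details`/`writeFile` block moved after the loop, so the row is assembled once from all the collected variables.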

python csv web-scraping beautifulsoup
1 Answer

Score: 1
import requests
from bs4 import BeautifulSoup
import csv


def Main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    name = soup.find("div", class_="detailSection corporationName").get_text(
        strip=True, separator=" ")
    data = [item.string for item in soup.select(
        "div.detailSection.filingInformation span")]
    del data[5]  # remove the extra span at index 5 so the slice below lines up with the headers
    # print(data[2:-2])
    with open("data.csv", 'w', newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "Document Number",
                         "FEI/EIN Number", "Date Filed", "Status", "Last Event"])
        writer.writerow([name, *data[2:-2]])


Main("http://search.sunbiz.org/Inquiry/CorporationSearch/SearchResultDetail?inquirytype=EntityName&directionType=Initial&searchNameOrder=A%201421260&aggregateId=domp-142126-360258c3-c08b-4f9a-8866-ad3ecdedef02&searchTerm=A&listNameOrder=A%201421260")
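Why the company name seemed to vanish in the question's version: `.text` on that `div` includes the surrounding newlines, and `csv.writer` quotes any field containing a newline, so the name ends up as a quoted cell spanning several physical lines of the file. `get_text(strip=True, separator=" ")` collapses that whitespace before writing. A minimal stdlib sketch (the sample string is hypothetical, standing in for the div's raw `.text`):

```python
import csv
import io

raw_name = "\nA 1421260 INC.\n"          # hypothetical stand-in for the div's raw .text
clean_name = " ".join(raw_name.split())   # collapses whitespace, like get_text(strip=True, separator=" ")

messy = io.StringIO()
csv.writer(messy).writerow([raw_name, "P00000001"])
print(repr(messy.getvalue()))   # field is quoted and spans several physical lines

clean = io.StringIO()
csv.writer(clean).writerow([clean_name, "P00000001"])
print(repr(clean.getvalue()))   # 'A 1421260 INC.,P00000001\r\n' — one tidy row
```

The same applies when appending to a file: opening it with `newline=""` (as the answer does) keeps `csv.writer` from adding extra blank lines on Windows.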

Output: (screenshot of the resulting CSV)
