Why does my word counter produce different output the second time I run it than the first?

Question  Votes: 1  Answers: 1

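I'm working on a basic practice project. I take a simple Wikipedia page and write all of its text to a text file using Beautiful Soup. Then I count the number of times a word appears in the newly written text file.

For some reason, the first time I run the code I get a different number than the second time I run it.

I believe that "anime.txt" is different the first time I run the code than the second time.

The problem must be with the way I'm collecting all my text data with Beautiful Soup.

Please help.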

from urllib.request import urlopen
from bs4 import BeautifulSoup

f = open("anime.txt", "w", encoding="utf-8")
f.write("")
f.close() 

my_url ="https://en.wikipedia.org/wiki/Anime"

uClient = urlopen(my_url)
page_html = uClient.read()
uClient.close()
page_soup = BeautifulSoup(page_html, "html.parser")
p=page_soup.findAll("p")


f = open("anime.txt", "a", encoding="utf-8")

for i in p:

    f.write(i.text)
    f.write("\n\n")

data= open("anime.txt", encoding="utf-8").read()
anime_count = data.count("anime")
Anime_count = data.count("Anime")

print(anime_count,"\n")
print(Anime_count, "\n")

count= anime_count+Anime_count

print("The total number of times the word Anime appears within <p> in the wikipedia page is : ", count)

First output:

anime_count = 14

Anime_count = 97

count = 111

Second output:

anime_count = 23

Anime_count = 139

count = 162

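Edit:

I edited my code based on the second comment and, of course, it works now :P. Does this look better in terms of opening and closing the file the right way and the right number of times?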

from urllib.request import urlopen
from bs4 import BeautifulSoup

my_url ="https://en.wikipedia.org/wiki/Anime"

uClient = urlopen(my_url)
page_html = uClient.read()
uClient.close()
page_soup = BeautifulSoup(page_html, "html.parser")
p=page_soup.findAll("p")


f = open("anime.txt", "w", encoding="utf-8")

for i in p:

    f.write(i.text)
    f.write("\n\n")

f.close()

data= open("anime.txt", encoding="utf-8").read()
anime_count = data.count("anime")
Anime_count = data.count("Anime")

print(anime_count,"\n")
print(Anime_count, "\n")

count= anime_count+Anime_count

print("The total number of times the word Anime appears within <p> in the wikipedia page is : ", count)
python
1 Answer
0
votes

Don't juggle opening and closing the file by hand. Put all of the writing and reading parts inside with statements.

from urllib.request import urlopen
from bs4 import BeautifulSoup

with open("anime.txt", "w", encoding="utf-8") as outfile:

    my_url ="https://en.wikipedia.org/wiki/Anime"

    uClient = urlopen(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = BeautifulSoup(page_html, "html.parser")
    p=page_soup.findAll("p")


    for i in p:
        outfile.write(i.text)
        outfile.write("\n\n")

with open("anime.txt", "r", encoding="utf-8") as infile:
    data = infile.read()
    anime_count = data.count("anime")
    Anime_count = data.count("Anime")

    print(anime_count,"\n")
    print(Anime_count, "\n")

    count= anime_count+Anime_count

    print("The total number of times the word Anime appears within <p> in the wikipedia page is : ", count)
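
A likely reason the original version gave inconsistent counts is that the handle opened in append mode was never closed before the file was read back, so part of the text could still be sitting in the write buffer. The sketch below illustrates this with a small scratch file; the path and sample text are made up for the demonstration and are not part of the question's code.

```python
import os
import tempfile

# Sample text, assumed only for this demonstration.
text = "Anime is popular. I watch anime.\n"

# Hypothetical scratch file, so the demo does not touch anime.txt.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Pattern from the question: write through a handle that is not closed.
f = open(path, "a", encoding="utf-8")
f.write(text)

# The write may still be buffered in memory, so a second handle
# is not guaranteed to see it yet.
with open(path, encoding="utf-8") as infile:
    seen_before_close = infile.read()

f.close()  # closing flushes any buffered text to disk

with open(path, encoding="utf-8") as infile:
    seen_after_close = infile.read()

print("characters visible before close:", len(seen_before_close))
print("characters visible after close:", len(seen_after_close))
```

A with block closes the file on exit, which is why reading the file only after the writing block has finished gives a stable count.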