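WindowsError: [Error 3] The system cannot find the path specified
I am trying to assign the file directory to basepath so that the rest of the code can do its work.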
Also, let me know if you think there should be a : at the end of this line of code:
for file in sorted(os.listdir(path))
I think it should be:
for file in sorted(os.listdir(path)):
The book doesn't have the : at the end.
import pyprind  # installed from the Anaconda terminal
import pandas as pd
import os

# change the 'basepath' to the directory of the unzipped movie dataset
# tried:
# basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb_v1.tar.gz'
# basepath = 'C://Users//zacka//Downloads//aclImdb_v1.tar.gz'
# basepath = 'C:/Users/zacka/Downloads/aclImdb_v1.tar.gz'
# basepath = 'C:\Users\zacka\Downloads\aclImdb_v1.tar.gz'
# not sure if I'm using the backslash or forward slash incorrectly, or if I
# need to double up....

labels = {'pos': 1, 'neg': 0}
pbar = pyprind.ProgBar(50000)
df = pd.DataFrame()
for s in ('test', 'train'):
    for l in ('pos', 'neg'):
        path = os.path.join(basepath, s, l)
        # the for statement needs the trailing colon
        for file in sorted(os.listdir(path)):
            with open(os.path.join(path, file),
                      'r', encoding='utf-8') as infile:
                txt = infile.read()
            # index labels with the loop variable l ('pos' or 'neg')
            df = df.append([[txt, labels[l]]],
                           ignore_index=True)
            pbar.update()
df.columns = ['review', 'sentiment']
basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb'
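The double backslashes are necessary, especially in '...Downloads\\aclImdb'. When I tried print(basepath) without the double backslashes, it produced 0x7 in place of the character a in aclImdb, because \a inside a normal string literal is the escape sequence for the bell character.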
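To see the escape problem in isolation, here is a minimal sketch. The variable names are only illustrative, and the path fragment is shortened to the part that triggers the issue. It compares a plain literal, a doubled-backslash literal, a raw string, and forward slashes.

```python
# '\a' in a normal string literal is the BEL control character (0x07),
# so the 'a' of aclImdb is swallowed by the escape sequence.
plain = 'Downloads\aclImdb'

# Three spellings that keep the path intact:
escaped = 'Downloads\\aclImdb'   # doubled backslash
raw = r'Downloads\aclImdb'       # raw string, backslashes taken literally
forward = 'Downloads/aclImdb'    # forward slashes, which Windows also accepts

print(repr(plain))       # shows the control character instead of '\a'
print(escaped == raw)    # both spell the same path
print(len(plain), len(escaped))
```

A raw string or forward slashes avoid having to remember which letters form an escape sequence after a backslash.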
I had also set basepath to the compressed archive instead of the extracted folder.
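os.listdir only works on a directory, so the archive has to be unpacked first. Below is a rough sketch of doing that with the standard tarfile module. The helper name extract_archive and the tiny stand-in archive are made up for the demonstration; with the real download you would pass the path of aclImdb_v1.tar.gz instead.

```python
import os
import tarfile
import tempfile


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return dest_dir."""
    with tarfile.open(archive_path, 'r:gz') as tar:
        tar.extractall(dest_dir)
    return dest_dir


# Build a tiny stand-in archive so the sketch runs without the real dataset.
work = tempfile.mkdtemp()
src = os.path.join(work, 'aclImdb', 'train', 'pos')
os.makedirs(src)
with open(os.path.join(src, '0_9.txt'), 'w') as f:
    f.write('great movie')
archive = os.path.join(work, 'demo.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:
    tar.add(os.path.join(work, 'aclImdb'), arcname='aclImdb')

# basepath must point at the extracted folder, not at the .tar.gz file.
out = extract_archive(archive, os.path.join(work, 'unzipped'))
basepath = os.path.join(out, 'aclImdb')
print(os.path.isdir(os.path.join(basepath, 'train', 'pos')))
print(os.path.isdir(archive))
```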
Now I need to figure out: TypeError: 'encoding' is an invalid keyword argument for this function, raised by encoding='utf-8'.
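That message is what the built-in open() raises on Python 2, where it has no encoding parameter, so the following assumes the script is running under Python 2. One possible workaround is io.open, which accepts encoding on Python 2.6+ and is the same function as open on Python 3. The temporary file below is only a stand-in for one of the review files.

```python
import io
import os
import tempfile

# Stand-in for one review file from the dataset.
tmp_dir = tempfile.mkdtemp()
sample = os.path.join(tmp_dir, 'review.txt')

with io.open(sample, 'w', encoding='utf-8') as outfile:
    outfile.write(u'caf\u00e9 scene was great')

# io.open takes the encoding keyword on both Python 2 and Python 3.
with io.open(sample, 'r', encoding='utf-8') as infile:
    txt = infile.read()

print(len(txt))
```

In the loop from the question, this would mean adding import io and replacing open(...) with io.open(...), leaving the other arguments unchanged. Running the script with Python 3 should also make the original open call work as written.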