Skipping blank lines when reading a file in Python

Question

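I'm working on a very long project in which I read from a file saved on my campus network system. Reading the file works if I delete the blank lines at the bottom of the list, but when I leave them in (as the professor wants) I get an error: "invalid literal for int() with base 10: 'Date'". I've tried several different ways to ignore the blank lines, but none of them worked. Here is what I've tried: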

 with open("C:\\Users\\Brayd\OneDrive\\Documents\\2015HomicideLog_FINAL.txt") as f_in:
     lines = (line.rstrip() for line in f_in) 
     lines = list(line for line in lines if line)

for line in file: if not line.strip(): print("it is empty line")

with open("fname.txt") as file:
    for line in file:
      if not line.strip():
         file.close()


with open file as f_in: lines = list(line for line in (l.strip() for l in f_in) if line)

Nothing worked. This is what I use when I delete the blank lines from the file, and it works perfectly:

file = open("C:\\Users\\Brayd\OneDrive\\Documents\\2015HomicideLog_FINAL.txt" , "r")
lines=file.readlines()[1:]
file.close()

I've been trying to get it to work with the blank lines in place for 12 hours now, without any luck... any ideas, anyone?

This is what the text file looks like:

Date   Event #  TIME    Victim Name     V R/G   V Age
150101 0685 2:03    Anderson, Kedral    BM  26
150103 0816 5:57    Shines, Kathryn     WF  54
150106 4417 22:06   Norton, Noella      HF  46
150107 4655 23:27   Speidel, Steven     WM  41
150110 1100 8:35    Orozco, Jose        HM  53
*blank spaces here*
     *blank spaces here*
*BSH^*

For a better picture of my program, here is the full code:

def dayofmurder(date): #function to find day of the murder
    date = date%10000 #takes 10000 out leaving 2 digits for year
    month = date//100 #takes 100 out leaving 1-2 digits for month
    date= date %100 # mod 100 to figure out date
    day=date #day=date
    monthlist = [0,31,59,90,120,151,181,212,243,273,304,334] #possible months through date ranges
    daysofweek = ["Sunday","Monday","Tuesday","Wednesday", #list of days of the week
    "Thursday","Friday","Saturday"]
    startonday = 4 #start on 4th day (thursday) per txt file
    startonday = monthlist[month-1]+(day-1)+startonday # start on day w/ days
    startonday %= 7 #mod 7 to find day of week
    return daysofweek[startonday] #return the day of the week homicide was on

daysoftheweek = ["Sunday","Monday","Tuesday","Wednesday",
"Thursday","Friday","Saturday"] #list of days of the week for printing in order


file = open("C:\\Users\\Brayd\OneDrive\\Documents\\2015HomicideLog_FINAL.txt" , "r")
lines=file.readlines()[1:]
file.close()

print("Days Homicides Happened on:")
dayOfmurders = {"Sunday": 0 ,"Monday": 0,"Tuesday": 0,"Wednesday": 0,
"Thursday":0, "Friday": 0,"Saturday": 0} #list of days and start vaule of 0 
#murders
for line in lines: #reads all lines
    value=line.split() #splits each value in line
    listdays=(dayofmurder(int(value[0]))) #for every value in the row
    dayOfmurders[listdays] = dayOfmurders[listdays] + 1 #every time there is an
    #occurance, add 1 to total value in dayOfmurders

for v in daysoftheweek: #in order of value (S-M-T-W-TH-F-S (from daysoftheweek 
    print(dayOfmurders[v],"homicides happen on a", v)   #prints [v](value) of 
    #daysOfmurders with string " " and prints v (value) in daysoftheweek)
print("----------------------------------",'\n', "Number of Homicides\
in hour block:")
time = {"0:" : 0, "1:" : 0, "2:" : 0, "3:" : 0, "4:" : 0, "5:" : 0, "6:" : 0,
        ##list of possible time's
        "7:" : 0,"8:" : 0, "9:" : 0, "10" : 0, "11" : 0,"12" : 0, "13" : 0,   
        # " " is the hour possible
        "14" : 0, "15" : 0,"16" : 0,  "17" : 0, "18" : 0, "19" : 0,"20" : 0,  
        # 0 value is the number of occurances
        "21" : 0, "22" : 0, "23" : 0}
for line in lines:      #reads each line of the file
    value=line.split()  #splits up each value in the line
    listdays=(value[2][0:2])  #moves the index of the line and grabs only 
    #first 2 variables
    time[listdays] = time[listdays] + 1

for k,v in time.items():  #uses key and value in time dict
    print(v,"Homicides happened in",k,"hour block")  #
print("----------------------------------",'\n', "Races and Occurances of Hom\
idices")
races = {"HF": 0 ,"HM": 0,"WF": 0,"WM": 0,"AF":0, "BM": 0,"BF": 0, "AM": 0} 
#list of races and start value of 0
for line in lines: #function to find all races in Homicide File
    value=line.split()
    listdays=(value[5])
    if listdays == "Chunng": #if statement for the people who have more than2 
    #names
        listdays = (value[6]) #if they do have more than 2 names, move to the 
        #next index slot and to register race
    elif listdays == "Terrance": #same as above
        listdays = (value[6]) #same as above
    elif listdays == "Lasunda": #same as above
        listdays = (value[6]) #same as above
    else:
        listdays = (value[5]) #same as above
    races[listdays] = races[listdays] + 1 #for every occurance add's 1 to the
    #value

for k,v in races.items(): #uses key and value in dictionary races
    print(k,"=",v) #prints key and value in race dictionary

ages = { "0" : 0, "1" : 0, "2" : 0, "3" : 0, #list of all possible ages and
# their
        "4" : 0, "5" : 0, "6" : 0,"7" : 0,  #occurances
        "8" : 0,"9" : 0}

for line in lines:   #function to find all ages in Homicide File
    value = line.split()
    listdays = (value[6][0])
    if listdays == "A": #for people w/ 3 names, if index 6 = a/b/t(see race's)
        listdays = (value[7][0]) # skip to next index and use index 7
    elif listdays == "B":
        listdays = (value[7][0])
    elif listdays == "T":
        listdays = (value[7][0])
    else:
        listdays = (value[6][0])
    ages[listdays] = ages[listdays]+ 1 #adds all occurances

for k,v in ages.items(): #uses key and value in dictionary ages
    print(k, "=", v) #prints key and value in age dictionary

print("----------------------------------",'\n', "Here are the Graphs from\
data found above")

import pylab #importing pylab for graphs
bar_width = .75
x_values = [1,2,3,4,5,6,7] #range 1-7
y_values = [13,25,17,26,20,14,19] # data from murder occurances, see above
tlabel = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
pylab.title("Homicide Occurenece by Day of Week Per Homicides File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align = 
'center' , color = 'b')
pylab.show()

pylab.axes(aspect = 1) #used pylab example from sheet
values = [39, 11, 31, 6, 1, 2, 29, 15] #data from race/gender see above
pie_labels = ["BM", "BF", "HM", "HF", "AM", "AF", "WM", "WF"]
color_list = ['purple', 'green', 'blue', 'cyan', 'yellow', 'maroon', 'red',
              'white']
pylab.pie(values,autopct = '%1.f%%', labels = pie_labels, colors=color_list)
pylab.title("Pie Chart Showing Racial and Gender Breakdown in Homicides File")
pylab.show()    


bar_width = .5 #used pylab examples from sheet (sets bar width)
x_values = [0,1,2,3,4,5,6,7,8,9] #range 0-9 (0-9,10-19,20-29... ect)
y_values = [4,7,27,41,4,15,7,6,2,5] # number of occurances per age
tlabel = ["0-10", "11-20", "21-30", "31-40", "41-50", "51-60",
          "61-70", "71-80", "81-90", "90+"]
pylab.title("Homicides per Age Categories in Homocide File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align = 
'center' , color = 'b')
pylab.show()

bar_width = .3 #pylab example from sheet(sets bar width)
x_values = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]
#^number of hours possible for murders
y_values = [3,3,7,1,4,6,4,4,4,5,5,3,8,4,6,2,5,13,10,6,7,5,13,6] #occurances
#of deaths per hour
tlabel = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
          "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"]
pylab.title("Homicides Per Hour of the Clock in Homicide File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align =
 'center' , color = 'b')
pylab.show()

Answer 1

To skip empty lines and strip whitespace from the ends of lines, you can do this:

lines = []
with open("fname.txt") as f:
    for line in f:
        line = line.strip()
        if line:
            lines.append(line)

# do something with lines         
print(lines[1:])

print("Days Homicides Happened on:")

Or, shorter:

with open("fname.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

# do something with lines         
print(lines[1:])         

print("Days Homicides Happened on:")

Or read the file normally and check each line in your code before doing anything with it:

file = open("C:\\Users\\Brayd\\OneDrive\\Documents\\2015HomicideLog_FINAL.txt" , "r")
lines = file.readlines()[1:]
file.close()

print("Days Homicides Happened on:") 

for line in lines:
    # check if line is not empty
    if line.strip():
        # do something with the non-empty line,
        # e.g. convert the first field (the date) to an integer
        value = line.split()
        print(int(value[0]))
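
The error message in the question names 'Date', which suggests that the header row reached int() in one of the attempts that filtered out blank lines but no longer skipped the first line. Here is a minimal sketch of the filter-then-skip order, using io.StringIO as a stand-in for the real file. The sample text and the names `sample`, `rows` and `dates` are assumptions for illustration.

```python
import io

# Hypothetical stand-in for the homicide log:
# a header, two data rows, then trailing blank / whitespace-only lines
sample = (
    "Date   Event #  TIME    Victim Name     V R/G   V Age\n"
    "150101 0685 2:03    Anderson, Kedral    BM  26\n"
    "150103 0816 5:57    Shines, Kathryn     WF  54\n"
    "\n"
    "     \n"
)

with io.StringIO(sample) as f:
    # keep only lines that contain something other than whitespace
    lines = [line.strip() for line in f if line.strip()]

rows = lines[1:]  # drop the header only after the blank lines are gone

# the first field of every remaining row is now safe to convert
dates = [int(row.split()[0]) for row in rows]
print(dates)
```

With a real file, only the `io.StringIO(sample)` call would change to `open(...)`; the filtering and slicing stay the same.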

Edit: the full code, which works for me:

def dayofmurder(date): #function to find day of the murder
    date = date%10000 #takes 10000 out leaving 2 digits for year
    month = date//100 #takes 100 out leaving 1-2 digits for month
    date= date %100 # mod 100 to figure out date
    day=date #day=date
    monthlist = [0,31,59,90,120,151,181,212,243,273,304,334] #possible months through date ranges
    daysofweek = ["Sunday","Monday","Tuesday","Wednesday", #list of days of the week
    "Thursday","Friday","Saturday"]
    startonday = 4 #start on 4th day (thursday) per txt file
    startonday = monthlist[month-1]+(day-1)+startonday # start on day w/ days
    startonday %= 7 #mod 7 to find day of week
    return daysofweek[startonday] #return the day of the week homicide was on

daysoftheweek = ["Sunday","Monday","Tuesday","Wednesday",
"Thursday","Friday","Saturday"] #list of days of the week for printing in order

#-------------------------

# OPEN FUNCTION THAT WORKS WITH ORIGINAL FILE

with open("fname.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

# skip headers         
lines = lines[1:]

#-------------------------

print("Days Homicides Happened on:")
dayOfmurders = {"Sunday": 0 ,"Monday": 0,"Tuesday": 0,"Wednesday": 0,
"Thursday":0, "Friday": 0,"Saturday": 0} #list of days and start vaule of 0 
#murders
for line in lines: #reads all lines
    value=line.split() #splits each value in line
    listdays=(dayofmurder(int(value[0]))) #for every value in the row
    dayOfmurders[listdays] = dayOfmurders[listdays] + 1 #every time there is an
    #occurance, add 1 to total value in dayOfmurders

for v in daysoftheweek: #in order of value (S-M-T-W-TH-F-S (from daysoftheweek 
    print(dayOfmurders[v],"homicides happen on a", v)   #prints [v](value) of 
    #daysOfmurders with string " " and prints v (value) in daysoftheweek)
print("----------------------------------",'\n', "Number of Homicides\
in hour block:")
time = {"0:" : 0, "1:" : 0, "2:" : 0, "3:" : 0, "4:" : 0, "5:" : 0, "6:" : 0,
        ##list of possible time's
        "7:" : 0,"8:" : 0, "9:" : 0, "10" : 0, "11" : 0,"12" : 0, "13" : 0,   
        # " " is the hour possible
        "14" : 0, "15" : 0,"16" : 0,  "17" : 0, "18" : 0, "19" : 0,"20" : 0,  
        # 0 value is the number of occurances
        "21" : 0, "22" : 0, "23" : 0}
for line in lines:      #reads each line of the file
    value=line.split()  #splits up each value in the line
    listdays=(value[2][0:2])  #moves the index of the line and grabs only 
    #first 2 variables
    time[listdays] = time[listdays] + 1

for k,v in time.items():  #uses key and value in time dict
    print(v,"Homicides happened in",k,"hour block")  #
print("----------------------------------",'\n', "Races and Occurances of Hom\
idices")
races = {"HF": 0 ,"HM": 0,"WF": 0,"WM": 0,"AF":0, "BM": 0,"BF": 0, "AM": 0} 
#list of races and start value of 0
for line in lines: #function to find all races in Homicide File
    value=line.split()
    listdays=(value[5])
    if listdays == "Chunng": #if statement for the people who have more than2 
    #names
        listdays = (value[6]) #if they do have more than 2 names, move to the 
        #next index slot and to register race
    elif listdays == "Terrance": #same as above
        listdays = (value[6]) #same as above
    elif listdays == "Lasunda": #same as above
        listdays = (value[6]) #same as above
    else:
        listdays = (value[5]) #same as above
    races[listdays] = races[listdays] + 1 #for every occurance add's 1 to the
    #value

for k,v in races.items(): #uses key and value in dictionary races
    print(k,"=",v) #prints key and value in race dictionary

ages = { "0" : 0, "1" : 0, "2" : 0, "3" : 0, #list of all possible ages and
# their
        "4" : 0, "5" : 0, "6" : 0,"7" : 0,  #occurances
        "8" : 0,"9" : 0}

for line in lines:   #function to find all ages in Homicide File
    value = line.split()
    listdays = (value[6][0])
    if listdays == "A": #for people w/ 3 names, if index 6 = a/b/t(see race's)
        listdays = (value[7][0]) # skip to next index and use index 7
    elif listdays == "B":
        listdays = (value[7][0])
    elif listdays == "T":
        listdays = (value[7][0])
    else:
        listdays = (value[6][0])
    ages[listdays] = ages[listdays]+ 1 #adds all occurances

for k,v in ages.items(): #uses key and value in dictionary ages
    print(k, "=", v) #prints key and value in age dictionary

print("----------------------------------",'\n', "Here are the Graphs from\
data found above")

import pylab #importing pylab for graphs
bar_width = .75
x_values = [1,2,3,4,5,6,7] #range 1-7
y_values = [13,25,17,26,20,14,19] # data from murder occurances, see above
tlabel = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
pylab.title("Homicide Occurenece by Day of Week Per Homicides File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align = 
'center' , color = 'b')
pylab.show()

pylab.axes(aspect = 1) #used pylab example from sheet
values = [39, 11, 31, 6, 1, 2, 29, 15] #data from race/gender see above
pie_labels = ["BM", "BF", "HM", "HF", "AM", "AF", "WM", "WF"]
color_list = ['purple', 'green', 'blue', 'cyan', 'yellow', 'maroon', 'red',
              'white']
pylab.pie(values,autopct = '%1.f%%', labels = pie_labels, colors=color_list)
pylab.title("Pie Chart Showing Racial and Gender Breakdown in Homicides File")
pylab.show()    


bar_width = .5 #used pylab examples from sheet (sets bar width)
x_values = [0,1,2,3,4,5,6,7,8,9] #range 0-9 (0-9,10-19,20-29... ect)
y_values = [4,7,27,41,4,15,7,6,2,5] # number of occurances per age
tlabel = ["0-10", "11-20", "21-30", "31-40", "41-50", "51-60",
          "61-70", "71-80", "81-90", "90+"]
pylab.title("Homicides per Age Categories in Homocide File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align = 
'center' , color = 'b')
pylab.show()

bar_width = .3 #pylab example from sheet(sets bar width)
x_values = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]
#^number of hours possible for murders
y_values = [3,3,7,1,4,6,4,4,4,5,5,3,8,4,6,2,5,13,10,6,7,5,13,6] #occurances
#of deaths per hour
tlabel = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
          "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"]
pylab.title("Homicides Per Hour of the Clock in Homicide File")
pylab.bar(x_values, y_values, width=bar_width, tick_label = tlabel, align =
 'center' , color = 'b')
pylab.show()
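
The full code above special-cases three victims by name to cope with names that split into more than two fields. One possible alternative is to index from the end of the split line, sketched here under the assumption that race/gender and age are always the last two fields of a row. `parse_row` and the third sample row are hypothetical, for illustration only.

```python
from collections import Counter

def parse_row(line):
    """Split one data row into (date, hour, race_gender, age)."""
    fields = line.split()
    date = int(fields[0])
    hour = int(fields[2].split(":")[0])  # "22:06" -> 22
    race_gender = fields[-2]             # second-to-last field
    age = int(fields[-1])                # last field
    return date, hour, race_gender, age

rows = [
    "150101 0685 2:03    Anderson, Kedral    BM  26",
    "150106 4417 22:06   Norton, Noella      HF  46",
    "150110 1100 8:35    Doe, John Q     HM  53",  # made-up three-part name
]

parsed = [parse_row(row) for row in rows]

# Counter tallies occurrences without pre-filling a dict with zeros
races = Counter(p[2] for p in parsed)
hours = Counter(p[1] for p in parsed)
print(races)
print(hours)
```

Because the name is never indexed directly, the number of fields it splits into no longer matters.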

Answer 2

From what I gathered from the comments, you probably have a few options. Here is what I would do:

Case where it truly is just whitespace

with open("file.txt") as f:
    data = f.readlines()

headersRaw = data[0].split()
headersFinal = [headersRaw[0],                       # Date
                headersRaw[1] + " " + headersRaw[2], # Event #
                headersRaw[3],                       # TIME
                headersRaw[4] + " " + headersRaw[5], # Victim Name
                headersRaw[6] + " " + headersRaw[7], # V R/G
                headersRaw[8] + " " + headersRaw[9], # V Age
               ]
i = 1
computedData = []
# stop at the end of the file, or at the first line that is blank
# or does not start with a number
while i < len(data) and data[i].split() and data[i].split()[0].isdigit():
    rawData = data[i].split()
    computedData.append([rawData[0],              # Date
                         rawData[1],              # Event #
                         rawData[2],              # TIME
                         " ".join(rawData[3:-2]), # Victim Name (any number of parts)
                         rawData[-2],             # V R/G
                         rawData[-1],             # V Age
                        ])
    i += 1

So we just check whether the next line starts with a number.

Case where next line might be something different but not whitespace

with open("file.txt") as f:
    data = f.readlines()

headersRaw = data[0].split()
headersFinal = [headersRaw[0],                       # Date
                headersRaw[1] + " " + headersRaw[2], # Event #
                headersRaw[3],                       # TIME
                headersRaw[4] + " " + headersRaw[5], # Victim Name
                headersRaw[6] + " " + headersRaw[7], # V R/G
                headersRaw[8] + " " + headersRaw[9], # V Age
               ]
i = 1
computedData = []
# stop at the end of the file, or at the first line that is blank
# or whose first field is not 6 characters long
while i < len(data) and data[i].split() and len(data[i].split()[0]) == 6:
    rawData = data[i].split()
    computedData.append([rawData[0],              # Date
                         rawData[1],              # Event #
                         rawData[2],              # TIME
                         " ".join(rawData[3:-2]), # Victim Name (any number of parts)
                         rawData[-2],             # V R/G
                         rawData[-1],             # V Age
                        ])
    i += 1

So here we check whether the next row of the table contains a date value of length 6.

BUT

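Whatever follows your table could be something different, even an error, and it could also happen to be 6 characters long or an integer, so make sure the check you perform in the while loop suits the data that comes after the table.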
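
A stricter check than "is a number" or "has length 6" is to test whether the first field actually parses as a date. The sketch below assumes the dates are in YYMMDD form, which is how the question's dayofmurder function treats them. `starts_with_date` is a hypothetical helper name.

```python
from datetime import datetime

def starts_with_date(line):
    """Return True if the first field of line is a 6-character YYMMDD date."""
    fields = line.split()
    if not fields or len(fields[0]) != 6:
        return False  # blank line, or first field has the wrong length
    try:
        datetime.strptime(fields[0], "%y%m%d")
    except ValueError:
        return False  # six characters, but not a valid date
    return True

print(starts_with_date("150101 0685 2:03    Anderson, Kedral    BM  26"))
print(starts_with_date("Date   Event #  TIME"))
print(starts_with_date("   "))
print(starts_with_date("999999 not a date"))
```

A function like this could replace the condition of either while loop above, or be used as a filter when building the list of rows.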

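It may not be the best solution, but without more information about what can come after the table, it is hard to come up with a perfect check. If you can, I'd suggest adding a line of dashes, or something else you can easily check for, at the end of the table in the txt file.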
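
If a marker line can be added to the file, reading up to it might look like the sketch below. The dashed marker, the sample text and the use of io.StringIO in place of the real file are all assumptions for illustration.

```python
import io
from itertools import takewhile

# Hypothetical file contents with a dashed line marking the end of the table
sample = (
    "Date   Event #  TIME    Victim Name     V R/G   V Age\n"
    "150101 0685 2:03    Anderson, Kedral    BM  26\n"
    "150103 0816 5:57    Shines, Kathryn     WF  54\n"
    "----------\n"
    "notes that are not part of the table\n"
)

with io.StringIO(sample) as f:
    header = next(f)  # consume the header line
    # read rows until the first line that starts with dashes,
    # dropping any blank lines along the way
    rows = [line.strip()
            for line in takewhile(lambda l: not l.startswith("---"), f)
            if line.strip()]

print(rows)
```

Everything after the marker is ignored, so the check no longer depends on what the trailing content looks like.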
