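I'm working on a program that checks the size of a folder and then prints it as a percentage of the maximum allowed usage, which is 50 GB. The problem is that if the data is only 1 MB, or some other small amount well below a gigabyte, I don't get an accurate percentage. How can I improve my code to fix this?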
import math, os

def get(fold):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(fold):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            size = os.path.getsize(fp)
            total_size += size
    size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(total_size, 1024)))
    p = math.pow(1024, i)
    s = round(total_size / p, 2)
    return "%s %s" % (s, size_name[i])

per = 100*float(get(fold))/float(5e+10)
print(per)
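One place you may be undercounting is that you add up file sizes without considering the block size. For example, on my system the allocation block size is 4096 bytes. So if I run 'echo 1 > test.txt', the resulting file holds only a couple of bytes but occupies 4096 bytes on disk. We can rework your code to try to account for blocks: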
import math
import os

SIZE_NAMES = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")

def get(fold):
    total_size = 0
    for dirpath, _, filenames in os.walk(fold):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            stat = os.stat(fp)
            # round each file up to a whole number of blocks
            size = stat.st_blksize * int(math.ceil(stat.st_size / float(stat.st_blksize)))
            total_size += size
    if total_size == 0:
        return "0 B"  # math.log(0) would raise ValueError
    i = int(math.floor(math.log(total_size, 1024)))
    p = math.pow(1024, i)
    s = round(total_size / p, 2)
    return "%s %s" % (s, SIZE_NAMES[i])
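The rounding step can also be pulled out into a small helper so it can be checked on its own. This is only a sketch: the names `round_up_to_block` and `allocated_size` and the 4096-byte default are assumptions for illustration. On POSIX systems `os.stat()` also reports `st_blocks`, counted in 512-byte units, which may be closer to what the filesystem really allocated; that field is not available on Windows, hence the fallback.

```python
import os


def round_up_to_block(size, block_size=4096):
    """Round size up to the next multiple of block_size (hypothetical helper)."""
    # Integer arithmetic avoids float rounding on very large files.
    return -(-size // block_size) * block_size


def allocated_size(path):
    """Best-effort on-disk size of one file, in bytes."""
    st = os.stat(path)
    if hasattr(st, "st_blocks"):
        # POSIX: st_blocks is the number of 512-byte units allocated.
        return st.st_blocks * 512
    # Fallback (e.g. Windows): assume a 4096-byte allocation unit.
    return round_up_to_block(st.st_size)


print(round_up_to_block(1))     # 4096
print(round_up_to_block(4097))  # 8192
```

Summing `allocated_size(fp)` instead of the rounded `st_size` inside the loop would be one way to use it, assuming the per-file figure reported by the OS is what you want to measure.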
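Although the getsize() undercount affects all files, percentage-wise it hits smaller files hardest. And of course, directory nodes take up space as well. Beyond that, there are several problems with this calculation: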
per = 100*float(get(fold))/float(5e+10)
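First, it fails because get() returns a string like '122.23 MB', which float() doesn't like. Second, it doesn't take into account the unit of the number, which was adjusted in the get() code but is not unadjusted here. Finally, it doesn't address the gigabyte vs. gibibyte issue (in a comment, if nowhere else). I.e., in the get() code the size is reduced by powers of 1024, but here it is divided by a power of 1000. My rework: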
number, unit = get(fold).split()  # "2.34 MB" -> ["2.34", "MB"]
number = float(number) * 1024 ** SIZE_NAMES.index(unit)  # 2.34 * 1024 ** 2
print("{0:%}".format(number / 50e9))  # percentage of 50 GB (decimal gigabytes)
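To make the gigabyte vs. gibibyte point concrete, here is a minimal sketch that keeps everything in raw bytes and only chooses the capacity unit explicitly. The names `GB`, `GIB`, and `percent_used` are made up for this illustration and are not part of the code above.

```python
GB = 1000 ** 3   # gigabyte (decimal), the unit that 5e+10 implies
GIB = 1024 ** 3  # gibibyte (binary), the unit that powers of 1024 imply


def percent_used(size_in_bytes, capacity_in_bytes):
    """Percentage of capacity taken up by size_in_bytes (both in raw bytes)."""
    return 100.0 * size_in_bytes / capacity_in_bytes


one_mib = 1024 ** 2
print(50 * GB)    # 50000000000
print(50 * GIB)   # 53687091200
print("{:.6f} %".format(percent_used(one_mib, 50 * GB)))   # 0.002097 %
print("{:.6f} %".format(percent_used(one_mib, 50 * GIB)))  # 0.001953 %
```

Working from the byte count directly also sidesteps the rounding that happens when the size is first formatted to two decimals and then parsed back.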
You are mixing a few things up in your code; for example, your function get() returns a string, but you later try to cast it to float.
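A quick illustration of why that cast fails, using the kind of string get() produces. The sample value "122.23 MB" is only an example.

```python
formatted = "122.23 MB"  # example of what get() returns

try:
    float(formatted)
except ValueError as exc:
    print("float() rejected it:", exc)

# Splitting off the unit first makes the numeric part convertible.
number_text, unit = formatted.split()
value = float(number_text)
print(value, unit)  # 122.23 MB
```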
I suggest separating things a bit. First, a function to format sizes (I took some ideas from other Stack Overflow questions):
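import math

SIZE_UNITS = ['', 'K', 'M', 'G', 'T']

def format_size(size_in_bytes):
    # log() is undefined for 0, so handle it separately
    if size_in_bytes == 0:
        return '0.0 B'
    # int() so the exponent is usable as a list index on Python 2 as well
    exp = int(math.floor(math.log(size_in_bytes, 1024)))
    size = size_in_bytes / math.pow(1024, exp)
    return '{:.1f} {}B'.format(size, SIZE_UNITS[exp])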
Then you have one function that calculates the size of a directory and another that prints the information nicely:
import os

def get_size_of_dir(dir_path):
    total_size = 0
    # 'root' keeps the walked directory separate from the dir_path argument
    for root, dir_list, file_list in os.walk(dir_path):
        for filename in file_list:
            f = os.path.join(root, filename)
            size = os.path.getsize(f)
            total_size += size
    return total_size  # raw byte count, not a formatted string

def print_info(dir_path, capacity):
    total_size = get_size_of_dir(dir_path)
    percent = total_size * 100.0 / capacity
    print()
    print('Directory: "{}"'.format(dir_path))
    print('capacity {:>10s}'.format(format_size(capacity)))
    print('total_size {:>10s}'.format(format_size(total_size)))
    print('percent used {:8.1f} %'.format(percent))
On my machine it looks like this:
# 1024**1 ==> 1 KB
# 1024**2 ==> 1 MB
# 1024**3 ==> 1 GB
>>> capacity = 5 * 1024**3
>>> for folder in ('/home/ralf/Documents/', '/home/ralf/Downloads/'):
...     print_info(folder, capacity)
...

Directory: "/home/ralf/Documents/"
capacity     5.0 GB
total_size   721.7 MB
percent used     14.1 %

Directory: "/home/ralf/Downloads/"
capacity     5.0 GB
total_size     1.3 GB
percent used     25.7 %