我需要获得大小超过2 GB的文件的文件大小。 (在4.6 GB文件上测试)。没有外部程序有没有办法做到这一点?
当前状态:
filesize()
,stat()
和fseek()
失败了fread()
和feof()
的作品通过读取文件内容可以获得文件大小(非常慢!)。
$size = (float) 0;
$chunksize = 1024 * 1024;
while (!feof($fp)) {
fread($fp, $chunksize);
$size += (float) $chunksize;
}
return $size;
我知道如何在64位平台上使用它(使用fseek($fp, 0, SEEK_END)
和ftell()
),但我需要32位平台的解决方案。
解决方案:我已经为此启动了开源项目。
Big File Tools是在PHP中操作超过2 GB的文件所需的hacks集合(即使在32位系统上)。
这是一种可能的方法:
它首先尝试使用适合平台的shell命令(Windows shell替换修饰符或* nix / Mac stat
命令)。如果失败,它会尝试COM(如果在Windows上),最后回到filesize()
。
/*
* This software may be modified and distributed under the terms
* of the MIT license.
*/
function filesize64($file)
{
static $iswin;
if (!isset($iswin)) {
$iswin = (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN');
}
static $exec_works;
if (!isset($exec_works)) {
$exec_works = (function_exists('exec') && !ini_get('safe_mode') && @exec('echo EXEC') == 'EXEC');
}
// try a shell command
if ($exec_works) {
$cmd = ($iswin) ? "for %F in (\"$file\") do @echo %~zF" : "stat -c%s \"$file\"";
@exec($cmd, $output);
if (is_array($output) && ctype_digit($size = trim(implode("\n", $output)))) {
return $size;
}
}
// try the Windows COM interface
if ($iswin && class_exists("COM")) {
try {
$fsobj = new COM('Scripting.FileSystemObject');
$f = $fsobj->GetFile( realpath($file) );
$size = $f->Size;
} catch (Exception $e) {
$size = null;
}
if (ctype_digit($size)) {
return $size;
}
}
// if all else fails
return filesize($file);
}
通过检查filesize()是否返回负数,您无法可靠地获取32位系统上文件的大小,正如一些答案所示。这是因为如果文件在32位系统上介于4到6演出之间,则文件大小将报告正数,然后从6到8为负,然后从8到10为正,依此类推。它以一种说话的方式循环。
因此,您在使用可在32位系统上可靠运行的外部命令时遇到困难。
但是,一个非常有用的工具是能够检查文件大小是否大于特定大小,并且即使是非常大的文件也可以可靠地执行此操作。
以下寻求50兆,并尝试读取一个字节。它在我的低规格测试机器上非常快,即使尺寸远大于2演出也能可靠地工作。
您可以使用它来检查文件是否大于2147483647字节(214位系统上的2147483648是max int),然后以不同方式处理文件或让您的应用程序发出警告。
function isTooBig($file){
$fh = @fopen($file, 'r');
if(! $fh){ return false; }
$offset = 50 * 1024 * 1024; //50 megs
$tooBig = false;
if(fseek($fh, $offset, SEEK_SET) === 0){
if(strlen(fread($fh, 1)) === 1){
$tooBig = true;
}
} //Otherwise we couldn't seek there so it must be smaller
fclose($fh);
return $tooBig;
}
下面的代码适用于任何版本的PHP / OS / Webserver / Platform上的任何文件大小。
// http head request to local file to get file size
$opts = array('http'=>array('method'=>'HEAD'));
$context = stream_context_create($opts);
// change the URL below to the URL of your file. DO NOT change it to a file path.
// you MUST use a http:// URL for your file for a http request to work
// SECURITY - you must add a .htaccess rule which denies all requests for this database file except those coming from local ip 127.0.0.1.
// $tmp will contain 0 bytes, since its a HEAD request only, so no data actually downloaded, we only want file size
$tmp= file_get_contents('http://127.0.0.1/pages-articles.xml.bz2', false, $context);
$tmp=$http_response_header;
foreach($tmp as $rcd) if( stripos(trim($rcd),"Content-Length:")===0 ) $size= floatval(trim(str_ireplace("Content-Length:","",$rcd)));
echo "File size = $size bytes";
// example output
File size = 10082006833 bytes
我迭代了BigFileTools类/答案: -option禁用curl方法,因为某些平台(例如Synology NAS)不支持Curl的FTP协议 -extra非posix,但更精确,执行sizeExec,而不是磁盘上的大小,使用stat而不是du返回实际的文件大小 - 大文件(> 4GB)的大小结果和sizeNativeSeek几乎一样快 -debug消息选项
<?php
/**
* Class for manipulating files bigger than 2GB
* (currently supports only getting filesize)
*
* @author Honza Kuchař
* @license New BSD
* @encoding UTF-8
* @copyright Copyright (c) 2013, Jan Kuchař
*/
class BigFileTools {
/**
* Absolute file path
* @var string
*/
protected $path;
/**
* Use in BigFileTools::$mathLib if you want to use BCMath for mathematical operations
*/
const MATH_BCMATH = "BCMath";
/**
* Use in BigFileTools::$mathLib if you want to use GMP for mathematical operations
*/
const MATH_GMP = "GMP";
/**
* Which mathematical library use for mathematical operations
* @var string (on of constants BigFileTools::MATH_*)
*/
public static $mathLib;
/**
* If none of fast modes is available to compute filesize, BigFileTools uses to compute size very slow
* method - reading file from 0 byte to end. If you want to enable this behavior,
* switch fastMode to false (default is true)
* @var bool
*/
public static $fastMode = true;
//on some platforms like Synology NAS DS214+ DSM 5.1 FTP Protocol for curl is not working or disabled
// you will get an error like "Protocol file not supported or disabled in libcurl"
public static $FTPProtocolCurlEnabled = false;
public static $debug=false; //shows some debug/error messages
public static $posix=true; //more portable but it shows size on disk not actual filesize so it's less accurate: 0..clustersize in bytes inaccuracy
/**
* Initialization of class
* Do not call directly.
*/
static function init() {
if (function_exists("bcadd")) {
self::$mathLib = self::MATH_BCMATH;
} elseif (function_exists("gmp_add")) {
self::$mathLib = self::MATH_GMP;
} else {
throw new BigFileToolsException("You have to install BCMath or GMP. There mathematical libraries are used for size computation.");
}
}
/**
* Create BigFileTools from $path
* @param string $path
* @return BigFileTools
*/
static function fromPath($path) {
return new self($path);
}
static function debug($msg) {
if (self::$debug) echo $msg;
}
/**
* Gets basename of file (example: for file.txt will return "file")
* @return string
*/
public function getBaseName() {
return pathinfo($this->path, PATHINFO_BASENAME);
}
/**
* Gets extension of file (example: for file.txt will return "txt")
* @return string
*/
public function getExtension() {
return pathinfo($this->path, PATHINFO_EXTENSION);
}
/**
* Gets extension of file (example: for file.txt will return "file.txt")
* @return string
*/
public function getFilename() {
return pathinfo($this->path, PATHINFO_FILENAME);
}
/**
* Gets path to file of file (example: for file.txt will return path to file.txt, e.g. /home/test/)
* ! This will call absolute path!
* @return string
*/
public function getDirname() {
return pathinfo($this->path, PATHINFO_DIRNAME);
}
/**
* Gets md5 checksum of file content
* @return string
*/
public function getMd5() {
return md5_file($this->path);
}
/**
* Gets sha1 checksum of file content
* @return string
*/
public function getSha1() {
return sha1_file($this->path);
}
/**
* Constructor - do not call directly
* @param string $path
*/
function __construct($path, $absolutizePath = true) {
if (!static::isReadableFile($path)) {
throw new BigFileToolsException("File not found at $path");
}
if($absolutizePath) {
$this->setPath($path);
}else{
$this->setAbsolutePath($path);
}
}
/**
* Tries to absolutize path and than updates instance state
* @param string $path
*/
function setPath($path) {
$this->setAbsolutePath(static::absolutizePath($path));
}
/**
* Setts absolute path
* @param string $path
*/
function setAbsolutePath($path) {
$this->path = $path;
}
/**
* Gets current filepath
* @return string
*/
function getPath($a = "") {
if(a != "") {
trigger_error("getPath with absolutizing argument is deprecated!", E_USER_DEPRECATED);
}
return $this->path;
}
/**
* Converts relative path to absolute
*/
static function absolutizePath($path) {
$path = realpath($path);
if(!$path) {
// TODO: use hack like http://stackoverflow.com/questions/4049856/replace-phps-realpath or http://www.php.net/manual/en/function.realpath.php#84012
// probaly as optinal feature that can be turned on when you know, what are you doing
throw new BigFileToolsException("Not possible to resolve absolute path.");
}
return $path;
}
static function isReadableFile($file) {
// Do not use is_file
// @link https://bugs.php.net/bug.php?id=27792
// $readable = is_readable($file); // does not always return correct value for directories
$fp = @fopen($file, "r"); // must be file and must be readable
if($fp) {
fclose($fp);
return true;
}
return false;
}
/**
* Moves file to new location / rename
* @param string $dest
*/
function move($dest) {
if (move_uploaded_file($this->path, $dest)) {
$this->setPath($dest);
return TRUE;
} else {
@unlink($dest); // needed in PHP < 5.3 & Windows; intentionally @
if (rename($this->path, $dest)) {
$this->setPath($dest);
return TRUE;
} else {
if (copy($this->path, $dest)) {
unlink($this->path); // delete file
$this->setPath($dest);
return TRUE;
}
return FALSE;
}
}
}
/**
* Changes path of this file object
* @param string $dest
*/
function relocate($dest) {
trigger_error("Relocate is deprecated!", E_USER_DEPRECATED);
$this->setPath($dest);
}
/**
* Size of file
*
* Profiling results:
* sizeCurl 0.00045299530029297
* sizeNativeSeek 0.00052094459533691
* sizeCom 0.0031449794769287
* sizeExec 0.042937040328979
* sizeNativeRead 2.7670161724091
*
* @return string | float
* @throws BigFileToolsException
*/
public function getSize($float = false) {
if ($float == true) {
return (float) $this->getSize(false);
}
$return = $this->sizeCurl();
if ($return) {
$this->debug("sizeCurl succeeded");
return $return;
}
$this->debug("sizeCurl failed");
$return = $this->sizeNativeSeek();
if ($return) {
$this->debug("sizeNativeSeek succeeded");
return $return;
}
$this->debug("sizeNativeSeek failed");
$return = $this->sizeCom();
if ($return) {
$this->debug("sizeCom succeeded");
return $return;
}
$this->debug("sizeCom failed");
$return = $this->sizeExec();
if ($return) {
$this->debug("sizeExec succeeded");
return $return;
}
$this->debug("sizeExec failed");
if (!self::$fastMode) {
$return = $this->sizeNativeRead();
if ($return) {
$this->debug("sizeNativeRead succeeded");
return $return;
}
$this->debug("sizeNativeRead failed");
}
throw new BigFileToolsException("Can not size of file $this->path !");
}
// <editor-fold defaultstate="collapsed" desc="size* implementations">
/**
* Returns file size by using native fseek function
* @see http://www.php.net/manual/en/function.filesize.php#79023
* @see http://www.php.net/manual/en/function.filesize.php#102135
* @return string | bool (false when fail)
*/
protected function sizeNativeSeek() {
$fp = fopen($this->path, "rb");
if (!$fp) {
return false;
}
flock($fp, LOCK_SH);
$result= fseek($fp, 0, SEEK_END);
if ($result===0) {
if (PHP_INT_SIZE < 8) {
// 32bit
$return = 0.0;
$step = 0x7FFFFFFF;
while ($step > 0) {
if (0 === fseek($fp, - $step, SEEK_CUR)) {
$return += floatval($step);
} else {
$step >>= 1;
}
}
}
else { //64bit
$return = ftell($fp);
}
}
else $return = false;
flock($fp, LOCK_UN);
fclose($fp);
return $return;
}
/**
* Returns file size by using native fread function
* @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5504829#5504829
* @return string | bool (false when fail)
*/
protected function sizeNativeRead() {
$fp = fopen($this->path, "rb");
if (!$fp) {
return false;
}
flock($fp, LOCK_SH);
rewind($fp);
$offset = PHP_INT_MAX - 1;
$size = (string) $offset;
if (fseek($fp, $offset) !== 0) {
flock($fp, LOCK_UN);
fclose($fp);
return false;
}
$chunksize = 1024 * 1024;
while (!feof($fp)) {
$read = strlen(fread($fp, $chunksize));
if (self::$mathLib == self::MATH_BCMATH) {
$size = bcadd($size, $read);
} elseif (self::$mathLib == self::MATH_GMP) {
$size = gmp_add($size, $read);
} else {
throw new BigFileToolsException("No mathematical library available");
}
}
if (self::$mathLib == self::MATH_GMP) {
$size = gmp_strval($size);
}
flock($fp, LOCK_UN);
fclose($fp);
return $size;
}
/**
* Returns file size using curl module
* @see http://www.php.net/manual/en/function.filesize.php#100434
* @return string | bool (false when fail or cUrl module not available)
*/
protected function sizeCurl() {
// curl solution - cross platform and really cool :)
if (self::$FTPProtocolCurlEnabled && function_exists("curl_init")) {
$ch = curl_init("file://" . $this->path);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
if ($data=="" || empty($data)) $this->debug(stripslashes(curl_error($ch)));
curl_close($ch);
if ($data !== false && preg_match('/Content-Length: (\d+)/', $data, $matches)) {
return (string) $matches[1];
}
} else {
return false;
}
}
/**
* Returns file size by using external program (exec needed)
* @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5502328#5502328
* @return string | bool (false when fail or exec is disabled)
*/
protected function sizeExec() {
// filesize using exec
if (function_exists("exec")) {
if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') { // Windows
// Try using the NT substition modifier %~z
$escapedPath = escapeshellarg($this->path);
$size = trim(exec("for %F in ($escapedPath) do @echo %~zF"));
}else{ // other OS
// If the platform is not Windows, use the stat command (should work for *nix and MacOS)
if (self::$posix) {
$tmpsize=trim(exec("du \"".$this->path."\" | cut -f1"));
//du returns blocks/KB
$size=(int)$tmpsize*1024; //make it bytes
}
else $size=trim(exec('stat "'.$this->path.'" | grep -i -o -E "Size: ([0-9]+)" | cut -d" " -f2'));
if (self::$debug) var_dump($size);
return $size;
}
}
return false;
}
/**
* Returns file size by using Windows COM interface
* @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5502328#5502328
* @return string | bool (false when fail or COM not available)
*/
protected function sizeCom() {
if (class_exists("COM")) {
// Use the Windows COM interface
$fsobj = new COM('Scripting.FileSystemObject');
if (dirname($this->path) == '.')
$this->path = ((substr(getcwd(), -1) == DIRECTORY_SEPARATOR) ? getcwd() . basename($this->path) : getcwd() . DIRECTORY_SEPARATOR . basename($this->path));
$f = $fsobj->GetFile($this->path);
return (string) $f->Size;
}
}
// </editor-fold>
}
BigFileTools::init();
class BigFileToolsException extends Exception{}
最简单的方法是简单地为您的号码添加最大值。这意味着在x86平台上长号加2 ^ 32:
if($size < 0) $size = pow(2,32) + $size;
示例:Big_File.exe - 3,30Gb(3.554.287.616 b)您的函数返回-740679680,因此您添加2 ^ 32(4294967296)并获得3554287616。
你得到负数,因为你的系统为负号保留了一位内存,所以你留下2 ^ 31(2.147.483.648 = 2G)的最大值为负数或正数。当系统达到此最大值时,它不会停止,而只是覆盖最后一个保留位,您的号码现在被强制为负数。简单来说,当您超过最大正数时,您将被强制为最大负数,因此2147483648 + 1 = -2147483648。进一步增加朝向零并且再次朝向最大数量。
正如您所看到的那样,它就像一个关闭循环的最高和最低数字的圆圈。
x86架构可以在一个刻度中“消化”的总最大数量是2 ^ 32 = 4294967296 = 4G,因此只要您的数量低于该数量,这个简单的技巧将始终有效。在更高的数字中,您必须知道已经通过循环点的次数,并将其乘以2 ^ 32并将其添加到结果中:
$size = pow(2,32) * $loops_count + $size;
在基本的PHP函数中,这很难做到,因为没有函数会告诉你它已经通过循环点的次数,所以这对于超过4Gigs的文件不起作用。
我写了一个函数,它准确地返回文件大小并且非常快:
function file_get_size($file) {
//open file
$fh = fopen($file, "r");
//declare some variables
$size = "0";
$char = "";
//set file pointer to 0; I'm a little bit paranoid, you can remove this
fseek($fh, 0, SEEK_SET);
//set multiplicator to zero
$count = 0;
while (true) {
//jump 1 MB forward in file
fseek($fh, 1048576, SEEK_CUR);
//check if we actually left the file
if (($char = fgetc($fh)) !== false) {
//if not, go on
$count ++;
} else {
//else jump back where we were before leaving and exit loop
fseek($fh, -1048576, SEEK_CUR);
break;
}
}
//we could make $count jumps, so the file is at least $count * 1.000001 MB large
//1048577 because we jump 1 MB and fgetc goes 1 B forward too
$size = bcmul("1048577", $count);
//now count the last few bytes; they're always less than 1048576 so it's quite fast
$fine = 0;
while(false !== ($char = fgetc($fh))) {
$fine ++;
}
//and add them
$size = bcadd($size, $fine);
fclose($fh);
return $size;
}
我已经开始了名为Big File Tools的项目。 proven适用于Linux,Mac和Windows(甚至是32位变体)。即使对于大文件(> 4GB),它也能提供字节精确的结果。在内部,它使用brick/math - 任意精度算术库。
使用composer安装它。
composer install jkuchar/BigFileTools
并使用它:
<?php
$file = BigFileTools\BigFileTools::createDefault()->getFile(__FILE__);
echo $file->getSize() . " bytes\n";
结果是BigInteger,因此您可以使用结果进行计算
$sizeInBytes = $file->getSize();
$sizeInMegabytes = $sizeInBytes->toBigDecimal()->dividedBy(1024*1024, 2, \Brick\Math\RoundingMode::HALF_DOWN);
echo "Size is $sizeInMegabytes megabytes\n";
Big File Tools内部使用驱动程序可靠地确定所有平台上的确切文件大小。以下是可用驱动程序列表(更新2016-02-05)
| Driver | Time (s) ↓ | Runtime requirements | Platform
| --------------- | ------------------- | -------------- | ---------
| CurlDriver | 0.00045299530029297 | CURL extension | -
| NativeSeekDriver | 0.00052094459533691 | - | -
| ComDriver | 0.0031449794769287 | COM+.NET extension | Windows only
| ExecDriver | 0.042937040328979 | exec() enabled | Windows, Linux, OS X
| NativeRead | 2.7670161724091 | - | -
你可以使用BigFileTools默认选择任何这些或最快的(BigFileTools::createDefault()
)
use BigFileTools\BigFileTools;
use BigFileTools\Driver;
$bigFileTools = new BigFileTools(new Driver\CurlDriver());
<?php
######################################################################
# Human size for files smaller or bigger than 2 GB on 32 bit Systems #
# size.php - 1.1 - 17.01.2012 - Alessandro Marinuzzi - www.alecos.it #
######################################################################
function showsize($file) {
if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
if (class_exists("COM")) {
$fsobj = new COM('Scripting.FileSystemObject');
$f = $fsobj->GetFile(realpath($file));
$file = $f->Size;
} else {
$file = trim(exec("for %F in (\"" . $file . "\") do @echo %~zF"));
}
} elseif (PHP_OS == 'Darwin') {
$file = trim(shell_exec("stat -f %z " . escapeshellarg($file)));
} elseif ((PHP_OS == 'Linux') || (PHP_OS == 'FreeBSD') || (PHP_OS == 'Unix') || (PHP_OS == 'SunOS')) {
$file = trim(shell_exec("stat -c%s " . escapeshellarg($file)));
} else {
$file = filesize($file);
}
if ($file < 1024) {
echo $file . ' Byte';
} elseif ($file < 1048576) {
echo round($file / 1024, 2) . ' KB';
} elseif ($file < 1073741824) {
echo round($file / 1048576, 2) . ' MB';
} elseif ($file < 1099511627776) {
echo round($file / 1073741824, 2) . ' GB';
} elseif ($file < 1125899906842624) {
echo round($file / 1099511627776, 2) . ' TB';
} elseif ($file < 1152921504606846976) {
echo round($file / 1125899906842624, 2) . ' PB';
} elseif ($file < 1180591620717411303424) {
echo round($file / 1152921504606846976, 2) . ' EB';
} elseif ($file < 1208925819614629174706176) {
echo round($file / 1180591620717411303424, 2) . ' ZB';
} else {
echo round($file / 1208925819614629174706176, 2) . ' YB';
}
}
?>
使用方法如下:
<?php include("php/size.php"); ?>
你想要的地方:
<?php showsize("files/VeryBigFile.rar"); ?>
如果你想改进它,欢迎你!
我找到了一个很好的Linux / Unix解决方案,只能用32位php获取大文件的文件大小。
$file = "/path/to/my/file.tar.gz";
$filesize = exec("stat -c %s ".$file);
您应该将$filesize
作为字符串处理。如果filesize大于PHP_INT_MAX,则尝试转换为int会导致filesize = PHP_INT_MAX。
但是,虽然作为字符串处理,但以下人类可读算法的工作原
formatBytes($filesize);
public function formatBytes($size, $precision = 2) {
$base = log($size) / log(1024);
$suffixes = array('', 'k', 'M', 'G', 'T');
return round(pow(1024, $base - floor($base)), $precision) . $suffixes[floor($base)];
}
所以我对大于4 Gb的文件的输出是:
4.46G
$file_size=sprintf("%u",filesize($working_dir."\\".$file));
这适用于Windows Box。
我在这里查看错误日志:https://bugs.php.net/bug.php?id=63618并找到了这个解决方案。
如果您有FTP服务器,可以使用fsockopen:
$socket = fsockopen($hostName, 21);
$t = fgets($socket, 128);
fwrite($socket, "USER $myLogin\r\n");
$t = fgets($socket, 128);
fwrite($socket, "PASS $myPass\r\n");
$t = fgets($socket, 128);
fwrite($socket, "SIZE $fileName\r\n");
$t = fgets($socket, 128);
$fileSize=floatval(str_replace("213 ","",$t));
echo $fileSize;
fwrite($socket, "QUIT\r\n");
fclose($socket);
(在ftp_size页面上发表评论)
您可能希望为您使用的函数添加一些替代方法,例如调用系统函数(如“dir”/“ls”)并从中获取信息。它们当然是安全的主题,你可以检查的东西,最终只能作为最后的手段恢复到慢速方法。
一种选择是寻求2gb标记,然后从那里读取长度......
function getTrueFileSize($filename) {
$size = filesize($filename);
if ($size === false) {
$fp = fopen($filename, 'r');
if (!$fp) {
return false;
}
$offset = PHP_INT_MAX - 1;
$size = (float) $offset;
if (!fseek($fp, $offset)) {
return false;
}
$chunksize = 8192;
while (!feof($fp)) {
$size += strlen(fread($fp, $chunksize));
}
} elseif ($size < 0) {
// Handle overflowed integer...
$size = sprintf("%u", $size);
}
return $size;
}
所以基本上,它寻求PHP中可表示的最大正有符号整数(对于32位系统为2gb),然后使用8kb块从那时开始读取(这应该是最佳内存效率与磁盘传输效率的公平权衡)。
另请注意,我没有添加$chunksize
大小。原因是fread
实际上可能会返回比$chunksize
更多或更少的字节,具体取决于许多可能性。因此,请使用strlen
来确定已解析字符串的长度。
当使用IEEE double(大多数系统)时,低于~4EB(etabytes = 10 ^ 18字节)的文件大小确实适合作为精确数字的两倍(并且在使用标准算术运算时应该没有精度损失)。