My project uses Spring Boot 2.3.3.RELEASE.
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.3.3.RELEASE</version>
    <relativePath/>
</parent>
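I am trying to use RestTemplate to retrieve a file from a server.
My program handles small files well, but it sometimes fails to fetch larger files correctly. One example file is about 12.9 MB (13572529 bytes, as shown by ls -l). The size of the file I download varies from run to run.
I can retrieve the file correctly with curl.
The code I use: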
@Override
public String downloadFile(String logId, String urlStr, String fName, String localPath) {
    long begin = System.currentTimeMillis();
    String filePath = null;
    try {
        log.info("logId={}-Start to download, fName={}, data={}, filePath={}", logId, fName, urlStr, localPath);
        File dir = new File(localPath);
        if (!dir.exists()) {// Check whether the folder exists
            dir.mkdir();
        }
        filePath = localPath + fName;
        File file = new File(filePath);
        // Do not download the target file if it exists
        if (file.exists()){
            return filePath;
        }
        RequestCallback requestCallback = request -> request.getHeaders()
                .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));
        // Fetch data as stream instead loading all data into memory
        String finalFilePath = filePath;
        RestTemplate restTemplate = new RestTemplate();
        restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, clientHttpResponse -> {
            long contentLength = clientHttpResponse.getHeaders().getContentLength();
            log.info("logId={}-Content-Length in Response Header: {}", logId, contentLength);
            StreamUtils.copy(clientHttpResponse.getBody(), new FileOutputStream(finalFilePath));
            log.info("logId={}-Content-Length in Response Header: {}- file length: {}", logId, contentLength ,file.length());
            return null;
        });
        log.info("logId={}-download success, fName={}, ursStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
    } catch (Exception e) {
        log.error("logId={}-download exception, fName={}, urlStr={}, e=", logId, fName, urlStr, e);
        return null;
    }
    log.info("logId={}-download success, fName={}, urlStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
    return filePath;
}
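In the log, the Content-Length in the response header is always correct, i.e. 13572529, but the file length keeps changing.
I copied the code of StreamUtils.copy() so that I could log the stream size.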
@Override
public String downloadFile(String logId, String urlStr, String fName, String localPath) {
    long begin = System.currentTimeMillis();
    String filePath = null;
    try {
        log.info("logId={}-Start to download, fName={}, data={}, filePath={}", logId, fName, urlStr, localPath);
        File dir = new File(localPath);
        if (!dir.exists()) {// Check whether the folder exists
            dir.mkdir();
        }
        filePath = localPath + fName;
        File file = new File(filePath);
        // Do not download the target file if it exists
        if (file.exists()){
            return filePath;
        }
        RequestCallback requestCallback = request -> request.getHeaders()
                .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));
        // Fetch data as stream instead loading all data into memory
        String finalFilePath = filePath;
        RestTemplate restTemplate = new RestTemplate();
        restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, clientHttpResponse -> {
            long contentLength = clientHttpResponse.getHeaders().getContentLength();
            log.info("logId={}-Content-Length in Response Header: {}", logId, contentLength);
            copy(clientHttpResponse.getBody(), new FileOutputStream(finalFilePath));
            log.info("logId={}-Content-Length in Response Header: {}- file length: {}", logId, contentLength ,file.length());
            return null;
        });
        log.info("logId={}-download success, fName={}, ursStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
    } catch (Exception e) {
        log.error("logId={}-download exception, fName={}, urlStr={}, e=", logId, fName, urlStr, e);
        return null;
    }
    log.info("logId={}-download success, fName={}, urlStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
    return filePath;
}

public static int copy(InputStream in, OutputStream out) throws IOException {
    Assert.notNull(in, "No InputStream specified");
    Assert.notNull(out, "No OutputStream specified");
    log.info("inputStream.................|{}", in.available()); // Always changes and less than content length. and less than byteCount
    int byteCount = 0;
    byte[] buffer = new byte[4096]; // 4096 is the value of BUFFER_SIZE in StreamUtils.copy()
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
        byteCount += bytesRead;
    }
    out.flush();
    log.info("inputStream.................|{}-{}", in.available(), byteCount); // in.available() is always 0. byteCount sometimes can get the correct value, `13572529`.
    return byteCount;
}
The log output from this part over several attempts:
inputStream.................|157672
inputStream.................|0-314056
inputStream.................|14320
inputStream.................|0-2206592
inputStream.................|32615
inputStream.................|0-13572529
inputStream.................|14320
inputStream.................|0-546655
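in.available() shows the size of the input stream, i.e. clientHttpResponse.getBody(), so why is it always smaller than both byteCount and the Content-Length?
I have read the posts spring-resttemplate-download-large-file, download-large-file-through-spring-rest-template, spring-resttemplate-large-files-contentlength-auto-changed, and download-large-file-from-server-using-rest-template-java-spring-mvc. I still have not found a solution, and the log confuses me.
How can I get the correct size for larger files? Thanks for your consideration.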
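You could start with a few best practices (assuming you cannot upgrade Spring Boot from the old August 2020 v2.3.3 to a v3.0.0+ release, v3.2.1 at the time of writing):
Use a BufferedOutputStream. It can reduce the number of disk write operations, which can become a bottleneck. I understand that Eugen Baeldung's article "Download a Large File Through a Spring RestTemplate" does use StreamUtils.copy, but BufferedOutputStream remains good practice for file I/O in Java, especially with large files.
Add some error handling: make sure any exception raised during the download is handled properly, and consider implementing a retry mechanism on failure.
Use SimpleClientHttpRequestFactory and set bufferRequestBody to false: this prevents the entire request body from being loaded into memory, which can cause out-of-memory errors with large files. This comes from one of the articles you mentioned, "Spring Rest Template allows efficient download of large files", by Charlotte Dennis. Yes, setBufferRequestBody has been deprecated since Spring Framework 6.1, but it is still the right approach for your current Spring Boot version.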
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpMethod;
import org.springframework.http.MediaType;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.web.client.RequestCallback;
import org.springframework.web.client.ResponseExtractor;
import org.springframework.web.client.RestTemplate;
import org.springframework.util.StreamUtils;

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class FileDownloader {

    private final RestTemplate restTemplate;

    @Autowired
    public FileDownloader(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    public String downloadFile(String urlStr, String localPath, String fileName) {
        long begin = System.currentTimeMillis();
        String filePath = localPath + fileName;
        Path path = Paths.get(filePath);

        // Stream the request body and fail fast on a stalled connection.
        SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
        requestFactory.setBufferRequestBody(false);
        requestFactory.setConnectTimeout(10000); // 10 seconds
        requestFactory.setReadTimeout(60000); // 60 seconds
        restTemplate.setRequestFactory(requestFactory);

        RequestCallback requestCallback = request -> request.getHeaders()
                .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));

        ResponseExtractor<Void> responseExtractor = new ResponseExtractor<Void>() {
            @Override
            public Void extractData(ClientHttpResponse response) throws IOException {
                // try-with-resources flushes and closes the stream before the extractor returns.
                // TRUNCATE_EXISTING keeps stale bytes of an older, longer file from surviving.
                try (BufferedOutputStream outputStream = new BufferedOutputStream(Files.newOutputStream(path,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING))) {
                    StreamUtils.copy(response.getBody(), outputStream);
                }
                return null;
            }
        };

        try {
            restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, responseExtractor);
            long end = System.currentTimeMillis();
            System.out.println("Download completed. Time taken: " + (end - begin) + " ms");
        } catch (Exception e) {
            System.err.println("Error occurred during file download: " + e.getMessage());
            return null;
        }
        return filePath;
    }
}
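The retry mechanism suggested above is not part of that class. A minimal sketch of one way to add it, using only the JDK, might look like the following. The Retry class, its withRetries method, and the fixed pause between attempts are hypothetical names and choices made for illustration, not anything provided by Spring.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: runs a task up to maxAttempts times, pausing between failed attempts. */
final class Retry {
    private Retry() {}

    static <T> T withRetries(int maxAttempts, long pauseMillis, Callable<T> task) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last; // every attempt failed: surface the last error
    }
}
```

Since downloadFile() above returns null on failure instead of throwing, the task passed to withRetries would have to turn a null result into an exception, and it should probably delete any partial file before the next attempt.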
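Also consider verifying that the server can handle requests for large files, and make sure no intermediary (such as a proxy or load balancer) is interfering with or truncating the download.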
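One way to detect such truncation on the client side is to count the bytes actually copied and compare them with the Content-Length header. The sketch below assumes the expected length is passed in by the caller (for example from getHeaders().getContentLength(), which returns -1 when the header is absent). LengthCheckedCopy and copyAndVerify are hypothetical names. Note that InputStream.available() only estimates how many bytes can be read without blocking, so it cannot stand in for this count.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies a stream and fails if the byte count differs from the expected length. */
final class LengthCheckedCopy {
    private LengthCheckedCopy() {}

    /**
     * Copies everything from in to out and returns the number of bytes copied.
     * A negative expectedLength means "unknown" and skips the check.
     */
    static long copyAndVerify(InputStream in, OutputStream out, long expectedLength) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0; // long rather than int, so files over 2 GB do not overflow
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        if (expectedLength >= 0 && total != expectedLength) {
            throw new IOException("Truncated download: expected " + expectedLength
                    + " bytes but copied " + total);
        }
        return total;
    }
}
```

Throwing from inside the ResponseExtractor turns a short read into a visible failure instead of a silently incomplete file.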
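You could also improve the downloadFile() function so that it validates the file after downloading (for example, with a checksum) to make sure it is complete and not corrupted.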
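A checksum check along those lines could be sketched as follows, assuming the server publishes a SHA-256 digest of the file somewhere you can compare against (an assumption; nothing in the question says it does). FileChecksum and sha256Hex are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: computes the SHA-256 digest of a file as lowercase hex. */
final class FileChecksum {
    private FileChecksum() {}

    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        // Read the file in chunks so that large downloads are never held in memory at once.
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

After a download, comparing sha256Hex(Paths.get(filePath)) with the published digest (or with the output of sha256sum on a copy fetched via curl) would confirm whether the file arrived intact.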