RestTemplate fails to retrieve the complete file data from the server

Question

My project uses Spring Boot 2.3.3.RELEASE.

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.3.3.RELEASE</version>
    <relativePath/>
</parent>

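I am trying to use `RestTemplate` to retrieve a file from a server.

My program works well with small files, but it sometimes fails to fetch larger files correctly. One example file is about 12.9 MB (`13572529` bytes, as shown by `ls -l`). The size of the file I download changes on every attempt.

I can retrieve the file correctly with `curl`.

The code I use: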

    @Override
    public String downloadFile(String logId, String urlStr, String fName, String localPath) {
        long begin = System.currentTimeMillis();
        String filePath = null;
        try {
            log.info("logId={}-Start to download, fName={}, data={}, filePath={}", logId, fName, urlStr, localPath);

            
            File dir = new File(localPath);
            if (!dir.exists()) {// Check whether the folder exists
                dir.mkdir();
            }

            filePath = localPath + fName;

            File file = new File(filePath);
            // Do not download the target file if it exists
            if (file.exists()){
                return filePath;
            }

            RequestCallback requestCallback = request -> request.getHeaders()
                    .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));

            // Fetch data as stream instead loading all data into memory
            String finalFilePath = filePath;
            RestTemplate restTemplate = new RestTemplate();
            restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, clientHttpResponse -> {
                
                long contentLength = clientHttpResponse.getHeaders().getContentLength();
                log.info("logId={}-Content-Length in Response Header: {}", logId, contentLength);
                StreamUtils.copy(clientHttpResponse.getBody(), new FileOutputStream(finalFilePath));
                log.info("logId={}-Content-Length in Response Header: {}- file length: {}", logId, contentLength ,file.length());
                return null;
            });

            log.info("logId={}-download success, fName={}, ursStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);

        } catch (Exception e) {
            log.error("logId={}-download exception, fName={}, urlStr={}, e=", logId, fName, urlStr, e);
            return null;
        }

        log.info("logId={}-download success, fName={}, urlStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
        return filePath;
    }

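In the log, the Content-Length in the response header is always correct, namely `13572529`, but the length of the downloaded file keeps changing.

I copied the code of `StreamUtils.copy()` so that I could log the stream size.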

    @Override
    public String downloadFile(String logId, String urlStr, String fName, String localPath) {
        long begin = System.currentTimeMillis();
        String filePath = null;
        try {
            log.info("logId={}-Start to download, fName={}, data={}, filePath={}", logId, fName, urlStr, localPath);

            
            File dir = new File(localPath);
            if (!dir.exists()) {// Check whether the folder exists
                dir.mkdir();
            }

            filePath = localPath + fName;

            File file = new File(filePath);
            // Do not download the target file if it exists
            if (file.exists()){
                return filePath;
            }

            RequestCallback requestCallback = request -> request.getHeaders()
                    .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));

            // Fetch data as stream instead loading all data into memory
            String finalFilePath = filePath;
            RestTemplate restTemplate = new RestTemplate();
            restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, clientHttpResponse -> {
                
                long contentLength = clientHttpResponse.getHeaders().getContentLength();
                log.info("logId={}-Content-Length in Response Header: {}", logId, contentLength);
                copy(clientHttpResponse.getBody(), new FileOutputStream(finalFilePath));
                log.info("logId={}-Content-Length in Response Header: {}- file length: {}", logId, contentLength ,file.length());
                return null;
            });

            log.info("logId={}-download success, fName={}, ursStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);

        } catch (Exception e) {
            log.error("logId={}-download exception, fName={}, urlStr={}, e=", logId, fName, urlStr, e);
            return null;
        }

        log.info("logId={}-download success, fName={}, urlStr={}, filePath={}, time cost={}ms", logId, fName, urlStr, filePath, System.currentTimeMillis() - begin);
        return filePath;
    }


    public static int copy(InputStream in, OutputStream out) throws IOException {
        Assert.notNull(in, "No InputStream specified");
        Assert.notNull(out, "No OutputStream specified");
        log.info("inputStream.................|{}", in.available()); // Always changes and less than content length. and less than byteCount
        int byteCount = 0;
        byte[] buffer = new byte[4096]; // 4096 is the value of BUFFER_SIZE in StreamUtils.copy()
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
            byteCount += bytesRead;
        }
        out.flush();
        log.info("inputStream.................|{}-{}", in.available(), byteCount); // in.available() is always 0. byteCount sometimes can get the correct value, `13572529`.
        return byteCount;
    }

The log from this part after several attempts:

inputStream.................|157672
inputStream.................|0-314056
inputStream.................|14320
inputStream.................|0-2206592
inputStream.................|32615
inputStream.................|0-13572529
inputStream.................|14320
inputStream.................|0-546655

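`in.available()` shows the size of the input stream, that is, of `clientHttpResponse.getBody()`, but why is that size always smaller than both `byteCount` and the Content-Length?

I have read the articles spring-resttemplate-download-large-file, download-large-file-through-spring-rest-template, spring-resttemplate-large-files-contentlength-auto-changed, and download-large-file-from-server-using-rest-template-java-spring-mvc. I still have not found a solution, and the logs confuse me.

How can I get the correct size for larger files? Thank you for your consideration.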

Answer 1

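You can start with a few best practices (assuming you cannot upgrade Spring Boot from the old August 2020 v2.3.3 release to a v3.0.0+ release; v3.2.1 is the current one):

Use a `BufferedOutputStream` when writing large files to disk. It reduces the number of disk write operations, which can become a bottleneck. I understand that the Baeldung article "Download a Large File Through a Spring RestTemplate" does use `StreamUtils.copy`, but `BufferedOutputStream` is still good practice for file I/O in Java, especially for large files.

Add some error handling: make sure any exception raised during the download is handled properly, and consider implementing a retry mechanism on failure.

Use `SimpleClientHttpRequestFactory` with `bufferRequestBody` set to `false`: it prevents the entire request body from being loaded into memory, which can cause out-of-memory errors with large files. This comes from one of the articles you mentioned, "Spring Rest Template allows efficient large file download" by Charlotte Dennis. Yes, `setBufferRequestBody` has been deprecated since Spring Framework 6.1, but it is still the right approach for your current Spring Boot version.

Set appropriate timeout values for connect and read operations to keep the process from hanging indefinitely.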

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.http.HttpMethod;
    import org.springframework.http.MediaType;
    import org.springframework.http.client.SimpleClientHttpRequestFactory;
    import org.springframework.http.client.ClientHttpResponse;
    import org.springframework.web.client.RequestCallback;
    import org.springframework.web.client.ResponseExtractor;
    import org.springframework.web.client.RestTemplate;
    import org.springframework.util.StreamUtils;

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.Arrays;

    public class FileDownloader {

        private final RestTemplate restTemplate;

        @Autowired
        public FileDownloader(RestTemplate restTemplate) {
            this.restTemplate = restTemplate;
        }

        public String downloadFile(String urlStr, String localPath, String fileName) {
            long begin = System.currentTimeMillis();
            String filePath = localPath + fileName;
            Path path = Paths.get(filePath);

            // Stream the request body instead of buffering it, and bound connect/read waits
            SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
            requestFactory.setBufferRequestBody(false);
            requestFactory.setConnectTimeout(10000); // 10 seconds
            requestFactory.setReadTimeout(60000); // 60 seconds
            restTemplate.setRequestFactory(requestFactory);

            RequestCallback requestCallback = request -> request.getHeaders()
                    .setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL));

            ResponseExtractor<Void> responseExtractor = new ResponseExtractor<Void>() {
                @Override
                public Void extractData(ClientHttpResponse response) throws IOException {
                    // TRUNCATE_EXISTING keeps stale bytes of an earlier, longer file from surviving;
                    // try-with-resources flushes and closes the stream even if the copy fails
                    try (BufferedOutputStream outputStream = new BufferedOutputStream(Files.newOutputStream(path,
                            StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE))) {
                        StreamUtils.copy(response.getBody(), outputStream);
                    }
                    return null;
                }
            };

            try {
                restTemplate.execute(urlStr, HttpMethod.GET, requestCallback, responseExtractor);
                long end = System.currentTimeMillis();
                System.out.println("Download completed. Time taken: " + (end - begin) + " ms");
            } catch (Exception e) {
                System.err.println("Error occurred during file download: " + e.getMessage());
                return null;
            }

            return filePath;
        }
    }
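
The retry mechanism suggested among the practices above is not part of that code. A minimal sketch of one way to add it, using only the JDK: `RetryingDownload`, `withRetries`, `maxAttempts` and `delayMillis` are hypothetical names chosen for illustration, not Spring APIs, and the fixed delay between attempts is an assumption rather than a recommendation.

```java
import java.util.concurrent.Callable;

public final class RetryingDownload {

    private RetryingDownload() {
    }

    /**
     * Runs the task until it succeeds or maxAttempts is reached,
     * sleeping delayMillis between failed attempts.
     * Rethrows the last failure when every attempt fails.
     */
    public static <T> T withRetries(Callable<T> task, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread was asked to stop
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

For this to help, the wrapped task has to throw on failure (for example when the number of bytes written differs from the Content-Length) instead of returning `null`, and it should delete any partial file first, because the code in the question skips the download when the target file already exists.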

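Also consider verifying that the server can handle requests for large files, and make sure no intermediary (such as a proxy or load balancer) interferes with or truncates the download.

You can improve the `downloadFile()` method to validate the file after the download (for example, with a checksum) to make sure it is complete and not corrupted.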
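
As a rough sketch of such a check, the helper below compares the size on disk with an expected length and computes a SHA-256 digest using only the JDK. `DownloadVerifier`, `hasExpectedLength` and `sha256Hex` are hypothetical names, and it assumes the server publishes a reference checksum somewhere, which the question does not state.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class DownloadVerifier {

    private DownloadVerifier() {
    }

    /** Returns true when the file on disk has exactly the expected number of bytes. */
    public static boolean hasExpectedLength(Path file, long expectedLength) throws IOException {
        return Files.size(file) == expectedLength;
    }

    /** Streams the file through SHA-256 and returns the digest as lowercase hex. */
    public static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

The expected length could come from `response.getHeaders().getContentLength()`, which returns -1 when the header is absent, so the length check only makes sense for a non-negative value. A length match alone does not prove the content is intact, which is why the digest comparison is the stronger test when a reference checksum is available.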
