File downloads from a URL in the browser, but fails in Java (I am sending the same headers)


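I am trying to download a file from a URL with an HTTPS request in Java. The download works perfectly in a web browser. However, when I make the request from a Java connection with the same headers set, I get a 403 (Forbidden) response. Below is the code I use to download the file. The error occurs on the following line: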

InputStream in = connection.getInputStream();

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public void DownloadFile(String year, String month, String day) throws IOException {
    String fileName = day + "_" + month + "_" + year + ".zip";
    try {
        //String tempURL = defaultUrl + "/" + year + "/" + month + "/" + "cm" + day + month + year + "bhav.csv.zip";
        String tempURL = "https://www.nseindia.com/content/historical/EQUITIES/1995/JAN/cm04JAN1995bhav.csv.zip";
        URL url = new URL(tempURL);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        // Headers copied from the request captured in Chrome.
        connection.setRequestProperty("Host", "www.nseindia.com:443");
        connection.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3");
        connection.setRequestProperty("Sec-Fetch-Mode", "navigate");
        connection.setRequestProperty("Sec-Fetch-Site", "same-origin");
        connection.setRequestProperty("Sec-Fetch-User", "?1");
        connection.setRequestProperty("Upgrade-Insecure-Requests", "1");
        connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36");
        File file = new File(destFolder, fileName);
        // The 403 surfaces here: getInputStream() throws an IOException for 4xx/5xx responses.
        // try-with-resources closes both streams even when the copy fails.
        try (InputStream in = connection.getInputStream();
             FileOutputStream out = new FileOutputStream(file)) {
            copy(in, out, 1024);
        } finally {
            connection.disconnect();
        }
        System.out.println("Downloaded ......... " + fileName);
    } catch (Exception ex) {
        ex.printStackTrace();
        System.out.println("Not Found ......... " + fileName);
    }
}



// Copies everything from input to output using a buffer of bufferSize bytes.
public static void copy(InputStream input, OutputStream output, int bufferSize) throws IOException {
    byte[] buf = new byte[bufferSize];
    int n = input.read(buf);
    while (n >= 0) {
        output.write(buf, 0, n);
        n = input.read(buf);
    }
    output.flush();
}
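
To see why the server refuses the request, it can help to read the status code and the error body rather than letting getInputStream() throw. The following is only a diagnostic sketch, not a fix for the 403. The class name ErrorBodyProbe and the helpers readAll and describeFailure are made up for illustration, and UTF-8 is assumed for the error body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

public class ErrorBodyProbe {

    // Reads a stream fully into a String (UTF-8 is an assumption here).
    // A null stream yields an empty string.
    public static String readAll(InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns a description of a non-2xx response (status line plus error body),
    // or null when the response status is in the 2xx range.
    public static String describeFailure(HttpURLConnection connection) throws IOException {
        int status = connection.getResponseCode();
        if (status >= 200 && status < 300) {
            return null;
        }
        // getErrorStream() may return null; try-with-resources skips a null resource.
        try (InputStream err = connection.getErrorStream()) {
            return "HTTP " + status + " " + connection.getResponseMessage() + "\n" + readAll(err);
        }
    }
}
```

Calling ErrorBodyProbe.describeFailure(connection) before connection.getInputStream() would print whatever explanation the server sends along with the 403, if any.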

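I captured the request headers with "Live HTTP Headers"; they were generated while downloading the file through Google Chrome.

The request and response headers follow: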

GET /content/historical/EQUITIES/1995/JAN/cm02JAN1995bhav.csv.zip HTTP/1.1
Host: www.nseindia.com:443
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: same-origin
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36

HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 4177
Content-Type: application/zip
Date: Tue, 10 Dec 2019 08:21:48 GMT
ETag: "1051-47ca323fae000"
Last-Modified: Fri, 08 Jan 2010 08:40:32 GMT
Server: Apache
X-FRAME-OPTIONS: SAMEORIGIN

java http-status-code-403
1 Answer

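I ran into the same problem. It appears to be a server-side protection that rejects requests from Java clients, but I don't understand how it works. I also reproduced exactly the same request as my web browser: headers, cookies, parameters, and so on.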
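
One thing worth checking when reproducing the browser's headers: in the standard OpenJDK implementation, HttpURLConnection.setRequestProperty silently ignores certain restricted header names such as Host unless the JVM is started with -Dsun.net.http.allowRestrictedHeaders=true, so the request on the wire may not match the browser's. The sketch below lists the headers the connection actually retained, without sending anything. HeaderCheck and retainedHeaders are hypothetical names, the User-Agent value is shortened for brevity, and whether this difference explains the 403 is not confirmed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Map;

public class HeaderCheck {

    // Sets a few of the headers from the question and returns what the
    // connection reports back. Nothing is sent over the network:
    // getRequestProperties() has to be called before the connection is opened.
    public static Map<String, List<String>> retainedHeaders(String address) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestProperty("Host", "www.nseindia.com:443");
        connection.setRequestProperty("Sec-Fetch-Mode", "navigate");
        connection.setRequestProperty("Upgrade-Insecure-Requests", "1");
        connection.setRequestProperty("User-Agent", "Mozilla/5.0");
        return connection.getRequestProperties();
    }

    public static void main(String[] args) throws IOException {
        String address = "https://www.nseindia.com/content/historical/EQUITIES/1995/JAN/cm04JAN1995bhav.csv.zip";
        // Any header missing from this listing was dropped by the client library.
        retainedHeaders(address).forEach((name, values) -> System.out.println(name + ": " + values));
    }
}
```

Comparing this listing with the headers captured in the browser shows which ones the Java client is really able to send.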
