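The code below used to work, but it has suddenly started failing with this error:
HTTP error fetching URL. Status=403, URL=[https://www.valueresearchonline.com/funds/26123/motilal-oswal-flexi-cap-fund-regular-plan]
I got the cookie value below by logging in to https://www.valueresearchonline.com/ with ( [email protected] / 0987654321 ).
I don't understand what suddenly broke.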
Jsoup.connect("https://www.valueresearchonline.com/funds/26123/motilal-oswal-flexi-cap-fund-regular-plan")
.timeout(15000)
.userAgent("Mozilla")
.header("Cookie", "PHPSESSID=6d9v48p1i5lpgm7pi75ag0okvq; currency=INR; magnitude=LC; ad=ee5ceff9a39de83a4dfcd9cc96efd7aa04912966; ad=ee5ceff9a39de83a4dfcd9cc96efd7aa04912966; wec=296642702; nobtlgn=714443298; ac=68156306%7C526294102%7C424761678; ac=68156306%7C526294102%7C424761678; _gcl_au=1.1.443352322.1663577511; _gid=GA1.2.307799405.1663577512; _fbp=fb.1.1663577512204.684634655; _clck=fgstbv|1|f50|0; __gads=ID=5ed6c786ebad7ee2-2263761e9bd600b6:T=1663577512:S=ALNI_Mb0p_Gif2EChNCORy7JOdTU7x4kjA; __gpi=UID=000009ce9ba54907:T=1663577512:RT=1663577512:S=ALNI_MbPL5xoHxgY76gU9mzDzJltULL80Q; __cf_bm=iIVI9aabT4vAdAmvQQzQTDDs9z4MPaMB1gv602Vn2rI-1663577514-0-AQ9ZKhXneLwVKm6CKEzLoY2EKcrIlNB82wgEPDw7taV6k/fnqTzp0L5zrpAl0fnkF1dn7Ac1DyNdfOnsgCTjBZx5Y6ia4Pvj2ceyIBfyXcIYpR8JkYTYGHfqPlrncv7k6Q==; alp=VROL; aa=364476%7C230264168%7C651860858; aa=364476%7C230264168%7C651860858; arl=604590238; arl=604590238; PERMA-ALERT=0; pgv=6; _ga=GA1.1.1410692956.1663577512; _ga_N9R425YFBJ=GS1.1.1663577511.1.1.1663577540.31.0.0; _clsk=1prm412|1663577540567|4|1|l.clarity.ms/collect")
.method(Connection.Method.GET)
.execute();
Example 2: also fails
Connection.Response initial1 = Jsoup.connect("https://www.valueresearchonline.com/login/?target=%2fmyaccount")
.timeout(60000)
.userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0")
.data("username", "[email protected]")
.data("password", "0987654321")
.method(Connection.Method.POST)
.execute();
System.out.println("Call 1.1 " + initial1.statusCode());
System.out.println(initial1.cookies().values());
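Added the login request from the Chrome debug console.
Update
I can get this working with a hard-coded cookie value copied from the Chrome console, but I'm not sure how to obtain the cookie value dynamically.
Please check:
Step 1 (hard-coded cookie value)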
String apiUrl = "https://www.valueresearchonline.com/api/check-user/";
// Connect to the API URL
Connection.Response response1 = Jsoup.connect(apiUrl)
.method(Connection.Method.GET)
.timeout(60000)
.header("User-Agent", "Mozilla/5.0")
.header("Accept", "application/json")
// Hard-coded; copied from Chrome after a valid email is entered.
// Can this cookie value be obtained dynamically at runtime?
.header("Cookie", "HARD_CODED_COOKIE_VALUE_FROM_CHROME")
.data("q", "[email protected]")
.data("password", "1")
.ignoreContentType(true) // Ignore content type to parse non-HTML response
.execute();
// Get the cookies from the response
Map<String, String> cookies1 = response1.cookies();
Step 2 (uses the cookie values returned by step 1)
Connection.Response response2 = Jsoup.connect("https://www.valueresearchonline.com/login/?target=%2f%3f&utm_source=home&utm_medium=vro&utm_campaign=desktop-profile-menu")
.timeout(60000)
.userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0")
.header("Cookie", getCookiesString(cookies1))
.header("Host", "www.valueresearchonline.com")
.data("username", "[email protected]")
.data("password", "0987654321")
.method(Connection.Method.POST)
.ignoreContentType(true)
.execute();
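Step 2 calls getCookiesString, which is not shown in the post. As a minimal sketch, assuming the helper simply joins the map into "name=value" pairs separated by "; " (the class name CookieHeaderSketch and the method body are hypothetical, not the poster's actual code), it might look like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class CookieHeaderSketch {

    // Hypothetical stand-in for the getCookiesString helper used in step 2.
    // Joins each cookie as name=value, separated by "; ".
    static String getCookiesString(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("currency", "INR");
        cookies.put("magnitude", "LC");
        System.out.println(getCookiesString(cookies)); // prints "currency=INR; magnitude=LC"
    }
}
```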
How can I get the step 1 cookie value without hard-coding it?
The Jsoup Connection interface has methods that let you declare cookies directly instead of adding a Cookie header yourself:
/**
* Set a cookie to be sent in the request.
* @param name name of cookie
* @param value value of cookie
* @return this Connection, for chaining
*/
Connection cookie(String name, String value);
/**
* Adds each of the supplied cookies to the request.
* @param cookies map of cookie name {@literal ->} value pairs
* @return this Connection, for chaining
*/
Connection cookies(Map<String, String> cookies);
Doing it this way ensures that only one header named Cookie is generated and that it is sent with valid data.
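The hard-coded header in the question repeats several names (ad, ac, aa and arl each appear twice). If you start from a raw string copied from the browser, one option is to parse it into a map first, which collapses the repeats. The sketch below uses only the standard library; CookieParserSketch and parseCookieHeader are hypothetical names for illustration, not part of Jsoup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CookieParserSketch {

    // Hypothetical helper: turns a raw "a=1; b=2; a=1" string into a map.
    // A repeated name keeps a single entry (the last value wins).
    static Map<String, String> parseCookieHeader(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : header.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty segments and segments without a name
            }
            // Split on the first '=' only, since values may contain '=' themselves.
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        String raw = "currency=INR; arl=604590238; arl=604590238; PERMA-ALERT=0";
        System.out.println(parseCookieHeader(raw)); // prints "{currency=INR, arl=604590238, PERMA-ALERT=0}"
    }
}
```

The resulting map can then be passed to Connection.cookies(Map<String, String>) in place of the .header("Cookie", ...) call.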