Why is a0: always prepended to every tag in the Selenium page source?

Question · Votes: 2 · Answers: 1

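I am using Selenium WebDriver in Java to fetch the page source of https://www.kapanlagi.com/ so that I can automate some actions on the page. Unfortunately, when I call driver.getPageSource(); I do get the source, but every tag has a0: prepended to it. A sample of the returned source: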

<a0:meta charset="utf-8" />
<a0:meta content="no-cache" http-equiv="Cache-Control" />
<a0:meta content="no-cache" http-equiv="Pragma" />
<a0:meta content="Tue, 22 Jan 2013 02:30:01 GMT" http-equiv="Expires" />
<a0:meta content="900" http-equiv="Refresh" />
<a0:meta content="KapanLagi.com, situs entertainment terbesar di Indonesia. Berita, gosip, resensi film &amp; musik, foto, game, kartu ucapan, dan banyak lagi. Kalau bukan sekarang, Kapan Lagi?" name="description" />
<a0:meta content="berita, infotainment, gossip, gosip, artis, artis indonesia, indonesia, game, entertainment, film, bioskop, resensi, musik, zodiac, kartu ucapan, kartu, kartu lebaran" name="keywords" />
<a0:meta content="1048538409" property="fb:admins" />
<a0:meta content="166048096750307" property="fb:app_id" />

<a0:link href="/manifest.json" rel="manifest" />
<a0:link rel="shortcut icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/favicon.ico" />
<a0:link href="https://cdns.klimg.com/" rel="dns-prefetch" />
<a0:link href="/feed/entertainment.xml" title="KapanLagi.com Atom Feed" type="application/atom+xml" rel="alternate" />
<a0:link href="https://m.kapanlagi.com/" media="only screen and (max-width: 640px)" rel="alternate" />  
<a0:link href="https://www.kapanlagi.com/" rel="canonical" />
<a0:link href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon.png" rel="apple-touch-icon" />
<a0:link href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-precomposed.png" rel="apple-touch-icon" />
<a0:link href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-114x114-precomposed.png" rel="apple-touch-icon" />
<a0:link href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-120x120-precomposed.png" rel="apple-touch-icon" />    
<a0:link href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-152x152-precomposed.png" rel="apple-touch-icon" />
<a0:title>Kalau Bukan Sekarang, Kapan Lagi? - KapanLagi.com</a0:title>
java selenium selenium-webdriver
1 Answer
0 votes

You did not mention the versions of the binaries you are using, but with Selenium Java client v3.9.1, GeckoDriver v0.19.1, and Firefox Quantum v58.0.2 (64-bit) I get a correct page source without any a0: prefix, as shown below:

  • Code block:

    System.setProperty("webdriver.gecko.driver", "C:\\Utility\\BrowserDrivers\\geckodriver.exe");
    WebDriver driver = new FirefoxDriver();
    driver.get("https://www.kapanlagi.com/");
    System.out.println(driver.getPageSource());

  • Console output:

    1520062739574 geckodriver INFO geckodriver 0.19.1
    1520062739607 geckodriver INFO Listening on 127.0.0.1:12306
    1520062740588 mozrunner::runner INFO Running command: "C:\\Program Files\\Mozilla Firefox\\firefox.exe" "-marionette" "-profile" "C:\\Users\\ATECHM~1\\AppData\\Local\\Temp\\rust_mozprofile.R5Wv9lx9f5K5"
    1520062744680 Marionette INFO Enabled via --marionette
    1520062762429 Marionette INFO Listening on port 2481
    1520062763089 Marionette WARN TLS certificate errors will be ignored for this session
    Mar 03, 2018 1:09:23 PM org.openqa.selenium.remote.ProtocolHandshake createSession
    INFO: Detected dialect: W3C
    <html xmlns="https://www.w3.org/1999/xhtml" xml:lang="en" class="firefox" lang="en"><head>
    <meta charset="utf-8">
    <meta http-equiv="Cache-Control" content="no-cache">
    <meta http-equiv="Pragma" content="no-cache">
    <meta http-equiv="Expires" content="Tue, 22 Jan 2013 02:30:01 GMT">
    <meta http-equiv="Refresh" content="900">
    <meta name="description" content="KapanLagi.com, situs entertainment terbesar di Indonesia. Berita, gosip, resensi film &amp; musik, foto, game, kartu ucapan, dan banyak lagi. Kalau bukan sekarang, Kapan Lagi?">
    <meta name="keywords" content="berita, infotainment, gossip, gosip, artis, artis indonesia, indonesia, game, entertainment, film, bioskop, resensi, musik, zodiac, kartu ucapan, kartu, kartu lebaran">
    <meta property="fb:admins" content="1048538409">
    <meta property="fb:app_id" content="166048096750307">
    <link rel="manifest" href="/manifest.json">
    <link href="https://cdns.klimg.com/kapanlagi.com/v5/i/favicon.ico" rel="shortcut icon">
    <link rel="dns-prefetch" href="https://cdns.klimg.com/">
    <link rel="alternate" type="application/atom+xml" title="KapanLagi.com Atom Feed" href="/feed/entertainment.xml">
    <link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.kapanlagi.com/">
    <link rel="canonical" href="https://www.kapanlagi.com/">
    <link rel="apple-touch-icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon.png">
    <link rel="apple-touch-icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-precomposed.png">
    <link rel="apple-touch-icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-114x114-precomposed.png">
    <link rel="apple-touch-icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-120x120-precomposed.png">
    <link rel="apple-touch-icon" href="https://cdns.klimg.com/kapanlagi.com/v5/i/channel/apple-touch-icon-152x152-precomposed.png">
    <title>Kalau Bukan Sekarang, Kapan Lagi? - KapanLagi.com</title>
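
If moving to the versions listed in this answer is not an option, one possible stopgap (my own assumption, not something the answer proposes) is to post-process the string returned by driver.getPageSource() and strip the a0: prefix from tag names. The sketch below is plain Java with no Selenium dependency; the class name PageSourcePrefixStripper and the method stripTagPrefix are hypothetical helpers introduced only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium: removes a namespace-style
// prefix such as "a0:" from opening and closing tag names in a string.
class PageSourcePrefixStripper {

    static String stripTagPrefix(String pageSource, String prefix) {
        // Matches "<a0:" or "</a0:" and keeps only the "<" or "</" part.
        // Pattern.quote guards against regex metacharacters in the prefix.
        String regex = "(</?)" + Pattern.quote(prefix) + ":";
        return pageSource.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        // Sample lines taken from the question's output.
        String sample = "<a0:meta charset=\"utf-8\" />\n"
                + "<a0:title>Kalau Bukan Sekarang, Kapan Lagi? - KapanLagi.com</a0:title>";
        System.out.println(stripTagPrefix(sample, "a0"));
    }
}
```

In a WebDriver session this would be called as PageSourcePrefixStripper.stripTagPrefix(driver.getPageSource(), "a0"). It is a blunt text substitution, so it would also rewrite a literal "<a0:" that happens to appear inside script or text content; it hides the symptom rather than addressing why the driver emits the prefix.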