Web scraping a .js page

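I have a charge controller with an embedded web server that I can inspect. The other weekend, I realized that a circuit breaker had tripped and the battery had run flat. So my plan is to write a Python script that checks the battery voltage and alerts me when it is low.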
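
A minimal sketch of the low-voltage check described above, assuming the controller's reading arrives as text such as `12.45 V`. The display format, the helper names `parse_voltage` and `battery_is_low`, and the 12.0 V threshold are all hypothetical and would need adjusting to the real battery bank.

```python
import re

# Hypothetical alert threshold in volts; choose a value suited to the battery.
LOW_VOLTAGE_THRESHOLD = 12.0


def parse_voltage(text):
    """Return the first decimal number found in text, or None if there is none."""
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def battery_is_low(text, threshold=LOW_VOLTAGE_THRESHOLD):
    """True only when a voltage could be parsed and it is below the threshold."""
    voltage = parse_voltage(text)
    return voltage is not None and voltage < threshold


if __name__ == "__main__":
    for reading in ("12.6 V", "11.8 V", ""):
        print(repr(reading), battery_is_low(reading))
```

An empty reading is treated as "not low" here so that a failed scrape does not raise a false alarm; a real monitor would probably want to report that case separately.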

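When I open the page in a browser, it refreshes dynamically. In other words, when I inspect the HTML, I cannot find the "current value" anywhere in it.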

[image: web page inspection]

In the image above, I have highlighted

<input type="text" class="majval" name="lblcurrentValue">

In the <head> of the HTML, a JavaScript file is referenced ...

<script type="text/javascript" src="liveview.js"></script>

Current code attempt:

#!/usr/bin/env python3


import bs4 as bs
import sys
import urllib.request
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl

class Page(QWebEnginePage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.html = ''
        self.loadFinished.connect(self._on_load_finished)
        self.load(QUrl(url))
        self.app.exec_()

    def _on_load_finished(self):
        # toHtml() is asynchronous and returns None; the HTML arrives in Callable.
        self.toHtml(self.Callable)
        print('Load finished')

    def Callable(self, html_str):
        self.html = html_str
        self.app.quit()


def main():
    page = Page('http://192.168.86.24/')
    soup = bs.BeautifulSoup(page.html, 'html.parser')
    js_test = soup.find('input', class_='majval')
    print(js_test)

if __name__ == '__main__': main()

When I run the program above, I get

<input class="majval" name="lblcurrentValue" type="text"/>

...which puts me back where I started. (The code does work when I use it as originally provided.)
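
The output above is what any static parse of the markup yields: the `majval` input carries no `value` attribute at all, so a parser working on the serialized HTML has nothing to read. A small sketch shows this, using the standard library's `html.parser` in place of BeautifulSoup and one form line copied from the page's markup. The names `InputCollector` and `static_values` are illustrative only.

```python
from html.parser import HTMLParser

# One form line copied from the controller's page markup.
SAMPLE = (
    '<form action="" name="fD0"><div class="valdisp"> '
    '<input class="majlbl" name="lblDataName" type="text" value=""/>'
    '<input class="majval" name="lblcurrentValue" type="text"/> '
    '</div></form>'
)


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag in the fed markup."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> are routed here as well.
        if tag == "input":
            self.inputs.append(dict(attrs))


def static_values(html):
    """Return (name, value) pairs; value is None when the attribute is absent."""
    parser = InputCollector()
    parser.feed(html)
    return [(attrs.get("name"), attrs.get("value")) for attrs in parser.inputs]


if __name__ == "__main__":
    print(static_values(SAMPLE))
    # [('lblDataName', ''), ('lblcurrentValue', None)]
```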

So, what am I doing wrong? I think I need to get the liveview.js code to run so that the lblcurrentValue fields are populated.
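
One possible approach, offered as a sketch rather than a confirmed fix. It assumes liveview.js fills the fields by setting each input's `value` property some time after the page has loaded; the body's `onload="LVInit('TriStar')"` hints at this, but liveview.js itself is not shown. A value set through that property does not appear in the HTML returned by `toHtml`, so the sketch waits briefly and then asks the page for the live values with `runJavaScript`. The helper names `build_value_query` and `read_live_values`, the `input.majval` selector, and the 3-second delay are illustrative assumptions.

```python
import json
import sys


def build_value_query(selector):
    """Return JavaScript that collects the live .value of every matching element."""
    return (
        "Array.from(document.querySelectorAll(%s))"
        ".map(function (el) { return el.value; })" % json.dumps(selector)
    )


def read_live_values(url, selector="input.majval", delay_ms=3000):
    """Load url, wait delay_ms for the page's scripts, return the field values."""
    # PyQt5 is imported here so build_value_query stays usable without Qt.
    from PyQt5.QtCore import QTimer, QUrl
    from PyQt5.QtWebEngineWidgets import QWebEnginePage
    from PyQt5.QtWidgets import QApplication

    app = QApplication.instance() or QApplication(sys.argv)
    page = QWebEnginePage()
    values = []

    def on_values(result):
        # result is the JavaScript array converted to a Python list (or None).
        values.extend(result or [])
        app.quit()

    def on_loaded(ok):
        if not ok:
            app.quit()
            return
        # Give the page's own scripts time to fetch and fill in the readings.
        QTimer.singleShot(
            delay_ms,
            lambda: page.runJavaScript(build_value_query(selector), on_values),
        )

    page.loadFinished.connect(on_loaded)
    page.load(QUrl(url))
    app.exec_()
    return values


if __name__ == "__main__":
    print(read_live_values("http://192.168.86.24/"))
```

A fixed delay is the simplest option but also the most fragile; if the readings still come back empty, lengthening `delay_ms` or polling until a field is non-empty would be the next thing to try.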

Edit

Here is the HTML output of soup:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta content="text/html; charset=utf-8" http-equiv="content-type"/>
<meta content="" name="TriStarLive"/>
<title>TriStar - Live Data</title>
<link href="favicon.ico" rel="shortcut icon" type="image/x-icon"/>
<link href="ss.css" rel="stylesheet" type="text/css"/><!--[if lt IE 8]><link href="ssie7.css" rel="stylesheet" type="text/css"><![endif]--><!--[if gte IE 8]><link href="ssie.css" rel="stylesheet" type="text/css"><![endif]--><!-- Copyright 2009, Morningstar Corporation -->
<!--[if lt IE 8]>
<link href="ssie7.css" rel="stylesheet" type="text/css">
<![endif]-->
<!--[if gte IE 8]>
<link href="ssie.css" rel="stylesheet" type="text/css">
<![endif]-->
<!--  Copyright 2014, Morningstar Corporation    -->
<script src="MBID.js" type="text/javascript"></script>
<script src="utilities.js" type="text/javascript"></script>
<script src="product.js" type="text/javascript"></script>
<script src="liveview.js" type="text/javascript"></script>
</head>
<body onload="LVInit('TriStar')">
<div id="menuid"><div class="idTopBorder"></div><div id="idTopBar"><div class="content"><div class="right"><h3>TriStar</h3></div><a href="http://www.morningstarcorp.com/"><img alt="logo" class="image" src="MSLogo.png"/><h1><br/></h1><h2></h2></a></div></div><div class="PrimNav"><ul><li><a class="first" href="liveview.html">Live View</a></li><li><a class="" href="network.html">Network</a></li><li><a class="" href="datalog.html">Data Log</a></li><li><a class="" href="system.html">System</a></li></ul></div></div>
<div class="idPage">
<div class="lvcont_TS">
<h5>Live Data View</h5>
<div class="fsLVMajL">
<div class="fsLVtitle">Battery</div>
<form action="" name="fD0"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
<form action="" name="fD1"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
<form action="" name="fD2"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
</div>
<div class="fsLVMajR">
<div class="fsLVtitle">Array</div>
<form action="" name="fD3"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
<form action="" name="fD4"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
<form action="" name="fD5"><div class="valdisp"> <input class="majlbl" name="lblDataName" type="text" value=""/><input class="majval" name="lblcurrentValue" type="text"/> </div></form>
</div>
<div class="fsLVMajL">
<div class="fsLVtitle">Temperatures</div>
<form action="" name="fDBT"><div class="valdisp">
<input class="majlbl" name="lblDataName" type="text" value=""/>
<input class="majval" name="lblcurrentValue" type="text"/></div></form>
<form action="" name="fDHST"><div class="valdisp">
<input class="majlbl" name="lblDataName" type="text" value=""/>
<input class="majval" name="lblcurrentValue" type="text"/></div></form>
</div>
<div class="fsLVMajR">
<div class="fsLVtitle">Resettable Counters</div>
<form action="" name="fD6"><div class="valdisp">
<input class="majlbl" name="lblDataName" type="text" value=""/>
<input class="majval" name="lblcurrentValue" type="text"/></div></form>
<form action="" name="fD7"><div class="valdisp">
<input class="majlbl" name="lblDataName" type="text" value=""/>
<input class="majval" name="lblcurrentValue" type="text"/></div></form>
</div>
<div id="fsLVMajW">
<div class="fsLVtitle">Errors</div>
<form action="" name="fDAlarms"><div class="errvaldisp">
<input class="errmajlblL_TS" name="lblError" type="text" value=""/>
<input class="errmajlblR_TS" name="lblAlarm" type="text" value=""/>
<textarea class="majvaltextL" cols="1" name="lblvalError" rows="8"></textarea>
<textarea class="majvaltextR_TS" cols="1" name="lblvalAlarm" rows="8"></textarea>
</div></form>
</div>
<div id="errdiv"></div>
<br/>
<br/>
</div>
<form action="" name="flastU"><div class="clastU">
<input class="lastU" name="valLastU" type="text" value=""/>
</div></form>
</div><!-- idPage -->
<div id="footid"><div class="FootBar"></div><div id="idFooter"><div id="idFooterContent"><table style="width:100%"><tbody><tr><td width="120">TriStar</td><td width="110">Version v01.04.14</td><td width="130"></td><td width="120">EMC-1</td><td>Version v01.01.09  Build 1</td></tr><tr><td>Serial #17040126</td><td></td><td></td><td>Serial #17100061</td><td></td></tr></tbody></table></div> </div><br/><a href="http://www.morningstarcorp.com/"></a><p style="text-align:center"><a href="http://www.morningstarcorp.com/">© Copyright 2016 Morningstar Corporation</a><br/></p></div>
</body></html>

python web-scraping
1 Answer

Did you ever figure this out??? I'm currently stuck in the same pickle...