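I'm trying to write a simple script that goes to rpachallenge.com, switches to the Movie Search tab, searches for a movie, and then saves the title card image, title, and description of the three resulting movies to a JSON file. So far I can do everything up to and including the search, but I can't find a unique, working locator for the title cards.
When I inspect the page, each of the three generated movies has markup like this: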
<div _ngcontent-c3 class="cardItem">
  <div _ngcontent-c3 class="col s4 ">
    <!--bindings={"ng-reflect-ng-if":"/d5NXSklXo0qyIYkgV94XAgMIckC.j"}-->
    <div _ngcontent-c3 class="card col s12">
      <div _ngcontent-c3 class="card-image">
        <img _ngcontent-c3 src="http://image.tmdb.org/t/p/w500//d5NXSklXo0qyIYkgV94XAgMIckC.jpg">
        <!--bindings={"ng-reflect-ng-if": "true"}-->
        ...
      </div>
      <div _ngcontent-c3 class="card-content">
        <span _ngcontent-c3 class="card-title activator">Dune </span>
        ...
        <p _ngcontent-c3> Paul Atreides, a brilliant and gifted young man born into a great dest...</p>
      </div>
    </div>
  </div>
</div>
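All of the markup is identical across the three cards except for the binding value (d5NXSklXo0qyIYkgV94XAgMIckC.j), the image source (http://image.tmdb.org/t/p/w500//d5NXSklXo0qyIYkgV94XAgMIckC.jpg), the movie title (Dune), and the short movie description (Paul Atreides, a brilliant...).
When I try to locate the image by its class="card-image", I get this error: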
WebDriverException: ... {"code":-32000,"message":"Cannot take screenshot with 0 height."}
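After looking into it, I found that I get this error because multiple elements are tagged this way. However, I can't find a unique locator for the title card image, or for any of the other elements I'm after, because all three displayed movies appear to have identical markup.
I think this should be doable with keywords from the Selenium library, but I'm brand new to this and can't find what I need. Here is my code, including some things I tried and commented out. As you can see, I don't really know what I'm doing... any help would be much appreciated.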
*** Settings ***
Documentation     Searches for given movie on RPA Challenge site
...               and saves resulting data in JSON.
Library           OperatingSystem
Library           RPA.Browser.Selenium
Suite Teardown    Close All Browsers

*** Variables ***
${RPA_Challenge_URL}    https://rpachallenge.com
${img_URL}              http://image.tmdb.org
${movie}                Dune

*** Keywords ***
Open RPA Challenge Page
    ${use_chrome} =    Get Environment Variable    USE_CHROME    ${EMPTY}
    IF    "${use_chrome}" != ""
        Open Available Browser    ${RPA_Challenge_URL}    browser_selection=Chrome
        ...    download=${True}    # forces Chrome and matching webdriver download
    ELSE
        Open Available Browser    ${RPA_Challenge_URL}    # opens any available browser
    END

Switch to Movie Search Tab
    Click Link    xpath://a[text()='Movie Search']

Search for Given Movie
    Input Text    name:searchStr    ${movie}
    Press Keys    name:searchStr    ENTER
    Wait Until Page Contains Element    css:div[class='card-image']

# WebDriverException: ... {"code":-32000,"message":"Cannot take screenshot with 0 height."}
Save Movie Card Image
    # Assign ID to Element    //ul[@class='card-image' and ./li[contains(., '_ngcontent-c3')]]    movie_cover
    Capture Element Screenshot    css:div[class='card-image']
    # Capture Element Screenshot    xpath://a[contains(text(), 'image.tmdb.org'.)]
    # Capture Element Screenshot    xpath://a[tag:img]
    # Capture Element Screenshot    css:div[class='card-image' and contains(text(), 'image')]    # invalid

# Element with locator 'css:div[class='card-title-activator']' not found.
Save Movie Title
    Capture Element Screenshot    css:div[class='card-title-activator']

# Return JSON
    # print(driver.find_element_by_xpath("//div[@id='json']").text)
    # Does the above need to go in a .py file?

*** Tasks ***
Return movie information from RPA Challenge site
    TRY
        Open RPA Challenge Page
        Switch to Movie Search Tab
        Search for Given Movie
        # Save Movie Title
        Save Movie Card Image
    EXCEPT
        ${err_ss} =    Set Variable    ${OUTPUT_DIR}${/}error.png
        Capture Page Screenshot    ${err_ss}
        Fail    Checkout the screenshot: ${err_ss}
    END
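You can use a FOR loop (docs) to get the titles, descriptions, and so on. For example, I can get all of the movie titles as a list with the following code: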
Get movie titles
    # on a 'Search for Movies' page
    ${mymovietitles}=    Create List    # creates an empty list
    # targets all 3 (Dune) movie titles
    ${titleselements}=    SeleniumLibrary.Get WebElements    xpath=//span[@class='card-title activator']
    FOR    ${titleelement}    IN    @{titleselements}
        ${title}=    SeleniumLibrary.Get Text    ${titleelement}
        Collections.Append To List    ${mymovietitles}    ${title}
    END
    Log To Console    My movie titles are ${mymovietitles}
Output:
...My movie titles are ['Dune', 'Dune', 'Dune: Part Two']
I used SeleniumLibrary (plus the Collections library for the list), but similar keywords are also available in RPA.Browser.Selenium.
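Since the end goal is a JSON file, and the question asks whether that part needs a .py file, here is a minimal sketch of one way to do it in Python. It assumes you have already collected three parallel lists (titles, descriptions, image URLs) with loops like the one above. The function names and the `title`/`description`/`image` keys are hypothetical and were chosen only for illustration.

```python
import json
from pathlib import Path


def build_movie_records(titles, descriptions, image_urls):
    """Combine three parallel lists into one dict per movie card.

    Assumes the lists are in the same (page) order, as returned by
    locators that match every card.
    """
    if not (len(titles) == len(descriptions) == len(image_urls)):
        raise ValueError("expected the same number of titles, descriptions and image URLs")
    return [
        # Strip stray whitespace such as the trailing space in "Dune ".
        {"title": t.strip(), "description": d.strip(), "image": u}
        for t, d, u in zip(titles, descriptions, image_urls)
    ]


def save_movies_json(records, path):
    """Write the records to `path` as UTF-8 JSON and return the path as a string."""
    target = Path(path)
    target.write_text(json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8")
    return str(target)
```

If this were saved as, say, movie_json.py next to the .robot file, importing it with `Library    movie_json.py` should expose the functions as the keywords `Build Movie Records` and `Save Movies Json`. The descriptions and image URLs could probably be collected the same way as the titles, for example with `Get WebElements` on `css:div.card-content p`, and with `Get Element Attribute` reading `src` from each element matched by `css:div.card-image img`. Those two locators are guesses based on the HTML in the question, not something I have run against the site.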