How to read content from a web page and then output it using urllib

Question (votes: 1, answers: 2)

I am trying to get random text from a website. My code:

import urllib.request
import re

url = "https://www.randomlists.com/random-words"

print(url)
request = urllib.request.urlopen(url).read()
request.decode("utf-8")



print(request)

But the output is:

https://www.randomlists.com/random-words
b'<!doctype html> <html lang="en-US"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <title>Random Word Generator &mdash; Get a list of random words</title> <meta name="description" content="A word randomizer for finding quick inspiration. Generate a random list of words from 2500+ of the most common English words. Also filter by part of speech!" /> <link rel="canonical" href="https://www.randomlists.com/random-words" /> <meta property="og:locale" content="en_US" /> <meta property="og:type" content="website" /> <meta property="og:title" content="Random Word Generator &mdash; Get a list of random words" /> <meta property="og:description" content="A word randomizer for finding quick inspiration. Generate a random list of words from 2500+ of the most common English words. Also filter by part of speech!" /> <meta property="og:url" content="https://www.randomlists.com/random-words" /> <meta property="og:site_name" content="Random Lists" /> <meta property="og:image" content="https://www.randomlists.com/img/og-image.jpg" /> <meta property="og:image:secure_url" content="https://www.randomlists.com/img/og-image.jpg" /> <meta property="og:image:height" content="1200" /> <meta property="og:image:width" content="1200" /> <meta property="og:image:alt" content="Playing cards randomly spread about" /> <meta property="og:image:type" content="image/jpg" /> <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png"> <link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png"> <link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png"> <link rel="manifest" href="/manifest.json"> <link rel="mask-icon" href="/safari-pinned-tab.svg" color="#000000"> <meta name="msapplication-TileColor" content="#ffffff"> <meta name="theme-color" content="#ffffff"> <link rel="preload" data-rand_json href="/data/words.json" as="fetch"> <style>@charset 
"UTF-8";.Header-content,.layout-main,.main_width{width:100%}@media only screen and (min-width:500px){.Header-content,.layout-main,.main_width{width:calc(100% - 120px - 1rem)}}@media only screen and (min-width:700px){.Header-content,.layout-main,.main_width{width:calc(100% - 160px - 1rem)}}@media only screen and (min-width:900px){.Header-content,.layout-main,.main_width{max-width:calc(100% - 300px - 1rem)}}.Header-content,.layout-main,.main_width{width:100%}@media only screen and (min-width:500px){.Header-content,.layout-main,.main_width{width:calc(100% - 120px - 1rem)}}@media only screen and (min-width:700px){.Header-content,.layout-main,.main_width{width:calc(100% - 160px - 1rem)}}@media only screen and (min-width:900px){.Header-content,.layout-main,.main_width{max-width:calc(100% - 300px - 1rem)}}html{font-size:14px}@media only screen and (min-width:728px){html{font-size:16px}}body{background:#efefea}body,input{margin:0;font-family:Roboto;line-height:1.4em}h1,h2,h3,p{margin-top:0;margin-bottom:1rem}h1,h2,h3{font-family:"Roboto Condensed";font-weight:400;line-height:1.1em}h1{font-size:1.8rem}h2{font-size:1.4rem}h3{font-size:1.2rem}strong{font-weight:500}.section_wide{max-width:1200px;margin-left:auto;margin-right:auto;box-sizing:border-box}.section_wide.section_gutter{box-sizing:content-box}.section_gutter{padding-left:1rem;padding-right:1rem;box-sizing:border-box}.button{display:-webkit-inline-box;display:inline-flex;-webkit-box-orient:vertical;-webkit-box-direction:normal;flex-direction:column;-webkit-box-align:center;align-items:center;text-align:center;-webkit-box-pack:center;justify-content:center;text-decoration:none;color:#fff;box-sizing:border-box;border-width:0;background:#0e1a35;cursor:pointer;height:44px;width:48px;padding:2px;font-size:8px;line-height:1.25em;-webkit-transition:all ease .2s;transition:all ease 
.2s;text-transform:uppercase;font-family:Arial}.button:before{font-family:icon;font-size:1.2rem;line-height:1em;margin-bottom:.4rem}.button:hover{background:#224287}.button:focus{outline-style:solid;outline-width:2px}.button--yo{background:#1985a3!important}.button--yay{background:#19a337!important}.button--yay:before{content:"\xef\x80\x8c"!important}.button--edit:before{content:"\xef\x81\x80"!important}.button--wide{width:64px}.button:disabled{background:#222;cursor:no-drop;opacity:.8}.js_notice{position:fixed;bottom:0;left:0;width:100%;padding:1rem;background:#a32019;color:#fff;text-align:center}.js_notice a{color:currentColor}@font-face{font-family:icon;src:url(/vendor/icomoon/fonts/icomoon.ttf?cq58c8) format("truetype"),url(/vendor/icomoon/fonts/icomoon.woff?cq58c8) format("woff"),url(/vendor/icomoon/fonts/icomoon.svg?cq58c8#icomoon) format("svg");font-weight:400;font-style:normal}.layout{display:-webkit-box;display:flex;flex-wrap:wrap;-webkit-box-pack:justify;justify-content:space-between}.layout-main{background:#fff;box-shadow:0 0 1rem rgba(0,0,0,.2)}.layout-top{margin:.5rem 0 1.5rem;display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;-webkit-box-pack:center;justify-content:center}.adsbygoogle--top{width:320px;min-height:50px;max-height:100px}@media only screen and (min-width:630px){.adsbygoogle--top{width:468px;height:60px;min-height:0;max-height:none}}@media only screen and (min-width:1065px){.adsbygoogle--top{width:728px;height:90px}}.adsbygoogle--side1,.adsbygoogle--side2,.layout-side{display:none}@media only screen and (min-width:500px){.layout-side{display:-webkit-box;display:flex;display:block;margin:-2rem 0 0;width:120px}.adsbygoogle--side1,.adsbygoogle--side2{display:block;width:120px;height:600px}}@media only screen and (min-width:700px){.layout-side{width:160px}.adsbygoogle--side1,.adsbygoogle--side2{width:160px;height:600px}}@media only screen and 
(min-width:900px){.layout-side{width:300px}.adsbygoogle--side1{width:300px;height:250px}.adsbygoogle--side2{width:300px;height:600px}}.Header{background:#182e5e;color:#fff;border-bottom:solid 1px rgba(0,0,0,.1)}.Header a{color:inherit}.Header-content{display:-webkit-box;display:flex;height:3rem}.Header-logo{margin-left:-1rem;margin-right:auto}.Header-logo a{display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;font-size:1.5rem;line-height:1em;height:3rem;padding:0 1rem;text-decoration:none;white-space:nowrap;position:relative;margin-left:-.2rem}.Header-logo a:before{content:\'\';background:url(data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48c3ZnIHZlcnNpb249IjEuMSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgeD0iMHB4IiB5PSIwcHgiIHZpZXdCb3g9IjAgMCAyNCAyNCIgc3R5bGU9ImVuYWJsZS1iYWNrZ3JvdW5kOm5ldyAwIDAgMjQgMjQ7IiB4bWw6c3BhY2U9InByZXNlcnZlIj48cmVjdCB4PSIxIiB5PSIxIiBmaWxsPSIjZmZmIiB3aWR0aD0iMjIiIGhlaWdodD0iMjIiLz48Zz48cGF0aCBkPSJNMjMsMEgxQzAuNCwwLDAsMC40LDAsMXYyMmMwLDAuNiwwLjQsMSwxLDFoMjJjMC42LDAsMS0wLjQsMS0xVjFDMjQsMC40LDIzLjYsMCwyMywweiBNMjIsMjJIMlYyaDIwVjIyeiIvPjxwYXRoIGQ9Ik0xMiwxNGMxLjEsMCwyLTAuOSwyLTJzLTAuOS0yLTItMnMtMiwwLjktMiwyUzEwLjksMTQsMTIsMTR6Ii8+PHBhdGggZD0iTTE3LDE5YzEuMSwwLDItMC45LDItMnMtMC45LTItMi0ycy0yLDAuOS0yLDJTMTUuOSwxOSwxNywxOXoiLz48cGF0aCBkPSJNNyw5YzEuMSwwLDItMC45LDItMlM4LjEsNSw3LDVTNSw1LjksNSw3UzUuOSw5LDcsOXoiLz48cGF0aCBkPSJNMTcsOWMxLjEsMCwyLTAuOSwyLTJzLTAuOS0yLTItMnMtMiwwLjktMiwyUzE1LjksOSwxNyw5eiIvPjxwYXRoIGQ9Ik03LDE5YzEuMSwwLDItMC45LDItMnMtMC45LTItMi0ycy0yLDAuOS0yLDJTNS45LDE5LDcsMTl6Ii8+PC9nPjwvc3ZnPg==) no-repeat 0 0;background-size:1.25rem 1.25rem;width:1.25rem;height:1.25rem;margin-right:.2rem}.Header-menu{display:-webkit-box;display:flex;margin-right:-1rem}.Header-menu>a{height:3rem;width:3rem;margin-left:2px}.Header-menu [href*=search]:before{content:"\xef\x80\x82"}.Header-menu 
[href*=browse]:before{content:"\xef\x83\x89";margin-bottom:.2rem;line-height:1.4rem}.Header-suggested{margin:0 .4rem 0 0;overflow:hidden;height:3rem;font-size:.9rem}.Header-suggested ul{margin:0;padding:0;list-style:none;display:-webkit-box;display:flex;flex-wrap:wrap;display:flex;-webkit-box-pack:end;justify-content:flex-end}.Header-suggested li{display:-webkit-box;display:flex;padding:0}.Header-suggested a{display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;line-height:1em;height:3rem;padding:0 .5rem}@media only screen and (max-width:620px){.Header-suggested{display:none}}.Header-search{display:none}.Rand-header{margin:1rem 0 2rem;padding-top:1rem;display:-webkit-box;display:flex;-webkit-box-pack:justify;justify-content:space-between}.Rand-headline{margin:0 1rem 0 0;align-self:center}.Rand-tools1{align-self:flex-start;margin:0;white-space:nowrap;position:relative}.Rand-tools2{display:-webkit-box;display:flex;-webkit-box-pack:center;justify-content:center;position:relative;margin:2rem 0;text-align:center}.ShareButtons:not([hidden]){display:-webkit-box;display:flex;-webkit-box-pack:center;justify-content:center;text-align:center;position:absolute;top:calc(100%);left:50%;-webkit-transform:translateX(-50%);transform:translateX(-50%);z-index:10;padding:4px;background:rgba(255,255,255,.8)}.Rand-tools1 button:not(:first-child),.Rand-tools2 button:not(:first-child),.ShareButtons button:not(:first-child){margin-left:4px}.Rand-ad{display:-webkit-box;display:flex;margin:1rem -1rem;-webkit-box-pack:center;justify-content:center}.adsbygoogle--rand{width:300px;height:250px}@media only screen and (min-width:350px){.adsbygoogle--rand{width:336px;height:280px}}@media only screen and (min-width:1065px){.adsbygoogle--rand{width:728px;height:90px}}@media only screen and (min-width:468px){.Rand-grid{display:-webkit-box;display:flex;-webkit-box-pack:justify;justify-content:space-between;-webkit-box-align:start;align-items:flex-start}.Rand-grid>*{width:calc(50% 
- 1.5rem)}}.RandOptions+.RandCopy{padding-top:1rem}.RandCopy-nav .select-nav a:after{content:","}.RandCopy-nav *{display:inline;padding:0}.RandOptions{background:rgba(1,1,1,.02);padding:1rem}.RandOptions h2{margin:0 0 1rem}.RandOptions-row{margin:0 0 1rem}.RandOptions-row input:not([type=checkbox]),.RandOptions-row select,.RandOptions-row textarea{padding:.5rem;box-sizing:border-box;min-width:140px;max-width:100%}.RandOptions-row label:first-child{display:block;margin-bottom:.1rem}.RandOptions-row input:not(:first-child),.RandOptions-row select{display:block}.RandOptions-row textarea{min-height:200px;width:100%}.RandOptions-row input[type=number]{min-width:0;width:5em}.RandOptions-row input[type=checkbox]{width:1.5rem;height:1.5rem;vertical-align:bottom}.RandOptions-helper{display:block;font-size:.7rem;line-height:1.1em;margin-top:.1rem}[data-action=rerun]:before{content:"\xef\x81\xb4"}[data-action=options]:before{content:"\xef\x80\x93"}[data-action=share]:before{content:"\xef\x87\xa0"}[data-share=copy]:before{content:"\xef\x83\x85"}[data-share=facebook]:before{content:"\xef\x82\x9a"}[data-share=twitter]:before{content:"\xef\x82\x99"}[data-share=reddit]:before{content:"\xef\x8a\x81"}.Rand-stage-loading{width:100%;text-align:center;font-size:12px;text-indent:.8em;text-transform:uppercase;line-height:1em;background:transparent url(/img/loading.gif) no-repeat 50% .5rem;padding:2rem 1rem 1rem}.Rand-stage--no_images{text-align:left!important}.Rand-stage--no_images .img,.Rand-stage--no_images img{display:none!important}.Rand-stage--highlight,.RandOptions--highlight{background:rgba(1,1,1,.04)!important}.Rand-stage{display:-webkit-box;display:flex;margin:0;background:rgba(1,1,1,.02);padding:1rem .5rem 1rem 0}.Rand-stage ol,.Rand-stage ul{display:-webkit-box;display:flex;flex-wrap:wrap;width:100%;margin:0;padding:0;list-style:none}.Rand-stage li{position:relative;margin:0 0 .5rem;padding:.5rem 1rem;box-sizing:border-box;word-break:break-word;min-width:20%}@media only screen 
and (min-width:900px){.Rand-stage li{min-width:10%}}.Rand-stage ol{counter-reset:li;padding-left:.5rem}.Rand-stage ol>li{position:relative}.Rand-stage ol>li:before{counter-increment:li;content:counter(li);position:absolute;right:calc(100% - .8rem + .5ex);top:calc(.5rem + .5em);font-size:.8rem;line-height:1em;font-family:monospace;opacity:.5;text-align:right;white-space:nowrap}.Rand-stage img{max-width:100%;height:auto}.Rand-stage>ol>li img{display:inline-block;margin:0 0 4px}.Rand-stage>ol>li a.biglink{display:block;position:relative;padding:0 1rem 0 0;text-decoration:none}.Rand-stage>ol>li a.biglink:after{content:"\xef\x82\x8e";font-family:icon;position:absolute;bottom:0;right:0;font-size:1rem;color:#000;-webkit-transition:opacity .3s ease,color .3s ease;transition:opacity .3s ease,color .3s ease;opacity:.75}.Rand-stage>ol>li a.biglink:hover:after{color:#0e1a35;opacity:1}span.rand_large,span.rand_medium,span.rand_small{display:block}.rand_huge{font-size:2.5rem;line-height:1.1em}.rand_large{font-size:1.5rem;line-height:1.25em}.rand_medium{font-size:1rem;color:#000}.rand_small{font-size:.8rem;color:#999}.monospace{font-family:monospace}</style> <link rel="preload" href="/css/defer.css?v=1575659034" as="style" onload="this.rel=\'stylesheet\'"> <link rel="preload" href="https://fonts.googleapis.com/css?family=Roboto%2BCondensed%7CRoboto%3A400%2C500&display=swap" as="style" onload="this.rel=\'stylesheet\'"> <script> (function(){ const re = new RegExp("items=[^\\&]"); if(re.test(location.search)){ var style = document.createElement("style"); document.getElementsByTagName(\'head\')[0].appendChild(style); style.innerHTML = ".adsbygoogle{display:none !important;height:0 !important;width:0 !important;}"; } }()); </script> <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> <script> (adsbygoogle = window.adsbygoogle || []).push({ google_ad_client: "ca-pub-5711305288167877", enable_page_level_ads: true }); </script> <script 
src="/js/defer.js?v=1575659071" defer></script> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-40634703-1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag(\'js\', new Date()); gtag(\'config\', \'UA-40634703-1\'); </script> </head> <body> <header class="Header"> <div class="section_wide"> <div class="Header-content section_gutter"> <div class="Header-logo"> <a href="/">Random Lists</a> </div> <nav class="Header-suggested"> <ul> <li><a href="/nouns">Nouns</a></li> <li><a href="/pictionary-words">Pictionary</a></li> <li><a href="/random-verbs">Verbs</a></li> <li><a href="/random-vocabulary-words">Vocabulary Words</a></li> <li><a href="/compound-words">Compound Words</a></li> </ul> </nav> <div class="Header-menu"> <a href="/search" class="button">Search</a> <a href="/#browse" class="button">Menu</a> </div> <div class="Header-search"> <form method="get" action="/search"> <div class="Header-search_flex"> <input type="text" name="q" required placeholder="Search&hellip;"> <button class="button"><span>Search</span></button> </div> </form> </div> </div> </div> </header> <div class="layout section_wide"> <div class="layout-main"> <div class="layout-top"> <ins class="adsbygoogle adsbygoogle--top" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="3781211666"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> <main class="layout-part3 section_gutter"> <article> <div class="Rand-header"> <h1 class="Rand-headline">Random word generator:</h1> <aside class="Rand-tools1"> <button data-action="rerun" class="button">Rerun</button><button data-action="options" class="button">Options</button> </aside> </div> <div class="Rand-stage"> <div class="Rand-stage-loading"> Loading&hellip; </div> </div> <aside class="Rand-tools2"> <button data-action="rerun" class="button">Rerun</button><button data-action="options" class="button">Options</button> 
<button data-action="share" id="Rand-ShareButtonsLabel" aria-expanded="false" aria-controls="Rand-ShareButtons" class="button">Share</button> <div class="ShareButtons" id="Rand-ShareButtons" aria-labelledby="Rand-ShareButtonsLabel" hidden> <button data-share="copy" class="button">Copy URL</button> <button data-share="facebook" class="button">Facebook</button> <button data-share="twitter" class="button">Twitter</button> <button data-share="reddit" class="button">Reddit</button> </div> </aside> <aside class="Rand-ad"> <ins class="adsbygoogle adsbygoogle--rand" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="1928302466"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </aside> <div class="Rand-grid"> <aside class="RandOptions"> <h2>Edit Settings</h2> <form action="#" id="rand_options"> <p class="RandOptions-row"> <label for="rand_jumper">Dataset</label> <select id="rand_jumper"></select> </p> <p class="RandOptions-row"> <label for="rand_options_qty">Quantity</label> <input id="rand_options_qty" type="number" name="qty" value="12"> </p> <p class="RandOptions-row"> <input type="checkbox" name="dup" id="rand_options_dup"> <label for="rand_options_dup">Duplicates</label> </p> </form> <p><button data-action="rerun" class="button">Rerun</button></p> </aside> <div class="RandCopy"> <h2>Random Word Generator</h2> <p>Supposedly there are over one million words in the English Language. We trimmed some fat to take away really odd words and determiners. Then we grabbed the most popular words and built this word randomizer. 
Just keep clicking generate&mdash;chances are you won\'t find a repeat!</p> <h3>Random Word Games</h3> <p>As an exercise for English students, generate a list of ten random words and have the student write a story that incorporates those words in the order they\'re generated.</p> <p>You could also take the hard work out of playing MadLibs but for that you\'ll need to separate out the parts of speech. There\'s generators for each one, just jump over using the options below.</p> <div class="RandCopy-nav"> Also try: <nav class="select-nav"> <a href="https://www.randomlists.com/random-words">Words</a> <ul> <li><a href="https://www.randomlists.com/random-adjectives">Adjectives</a></li> <li><a href="https://www.randomlists.com/random-adverbs">Adverbs</a></li> <li><a href="https://www.randomlists.com/compound-words">Compound Words</a></li> <li><a href="https://www.randomlists.com/nouns">Nouns</a></li> <li><a href="https://www.randomlists.com/random-prepositions">Prepositions</a></li> <li><a href="https://www.randomlists.com/spanish-words">Spanish Words</a></li> <li><a href="https://www.randomlists.com/random-verbs">Verbs</a></li> <li><a href="https://www.randomlists.com/random-vocabulary-words">Vocabulary Words</a></li> </ul> </nav> or just <a href="/list-randomizer">create your own list</a>. 
</div> </div> </div> </article> <div id="rand-image-preload"></div> <script>const rand = "words";</script> <script type="application/ld+json">{"@context":"https:\\/\\/schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Random Lists","item":"https:\\/\\/www.randomlists.com\\/"},{"@type":"ListItem","position":1,"name":"Words","item":"https:\\/\\/www.randomlists.com\\/random-words"}]}</script> </main> <div class="layout-prefooter"> <ins class="adsbygoogle adsbygoogle--prefooter" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="3781211666"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> <footer class="footer section_gutter"> <nav> <ul> <li><a href="/">Home</a></li> <li><a href="/faq">FAQ</a></li> <li><a href="/privacy">Privacy Policy</a></li> <li><a href="/contact">Contact</a></li> <li><a href="/list-randomizer">Randomize your custom list</a></li> <li class="footer-sitemap"><a href="/all">Sitemap</a></li> </ul> </nav> <p class="disclosure">This site is offered as is to those who visit it. We make no guarantees regarding its services. 
Just enjoy yourself.</p> </footer> </div> <div class="layout-side"> <ins class="adsbygoogle adsbygoogle--side1" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="2304478467"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> <ins class="adsbygoogle adsbygoogle--side2" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="2304478467"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> </div> <script> const el_RS=document.querySelector(".Rand-stage"),el_RO=document.getElementById("rand_options"),useWebP=function(){var e=document.createElement("canvas");return!(!e.getContext||!e.getContext("2d"))&&0==e.toDataURL("image/webp").indexOf("data:image/webp")}();function randomiseNumbers(e,t,n,r){e=parseInt(e),t=parseInt(t),n=parseInt(n);var o=t-e+1,a=new Array;if(r){for(var i=0;i<n;i++)a[i]=e+Math.floor(Math.random()*o);return a}var s=1;for(i=0;i<o;i++)s=(n-a.length)/(o-i),Math.random()<=s&&a.push(i+e);return randomise(a,n,r)}function randomise(e,t,n){if(Array.isArray(e)||(e=e.split("")),e.shuffle(),n){for(var r=new Array,o=0;o<t;o++)r[o]=e[Math.floor(Math.random()*e.length)];return r}return t>0&&t<e.length?e.slice(0,t):e}Array.prototype.shuffle=function(){var e,t,n=this.length;if(0!=n)for(;--n;)e=Math.floor(Math.random()*(n+1)),t=this[n],this[n]=this[e],this[e]=t};const rand_json=function(){var e=[];const t=document.head.querySelectorAll("[data-rand_json]");for(var n=t.length-1;n>=0;n--)e.unshift(t[n].getAttribute("href"));return e}();function getRandOptions(){if(!el_RO)return;var e={};const t=el_RO.querySelectorAll("[name]");for(var n=t.length-1;n>=0;n--){const r=t[n].getAttribute("name"),o=t[n].getAttribute("id"),a=document.getElementById(o);"checkbox"===t[n].type?e[r]=a.checked:e[r]=a.value}return e}function strip(e){return(new DOMParser).parseFromString(e,"text/html").body.textContent||""}function triggerEvent(e,t){t=void 0===t?window:t;var 
n=document.createEvent("HTMLEvents");n.initEvent(e,!1,!0),t.dispatchEvent(n)}function getJSON(e,t){var n=new XMLHttpRequest;n.open("GET",e,!0),n.onload=function(){if(n.status>=200&&n.status<400){const e=JSON.parse(n.responseText);t(e)}},n.send()}function copyText(e){var t=document.createElement("input");t.setAttribute("style","opacity:0;position:absolute;"),t.value=e,document.body.appendChild(t),t.select(),document.execCommand("copy"),t.parentNode.removeChild(t)}if(function(){if(!el_RO)return;var e,t=(e||document.location.search).replace(/(^\\?)/,"").split("&").map(function(e){return this[(e=e.split("="))[0]]=e[1],this}.bind({}))[0];const n=el_RO.querySelectorAll("[name]");for(var r=n.length-1;r>=0;r--){var o=n[r].getAttribute("name");if(!t[o])return;if("checkbox"===n[r].type){const e=t[o]&&"false"!=t[o];n[r].checked=e}else{var a=decodeURIComponent(t[o]).replace(/\\+/g," ");n[r].value=strip(a)}}}(),el_RS){const e=function(){var e=document.getElementById("style_rand_stage");e?e.innerHTML="":((e=document.createElement("style")).setAttribute("id","style_rand_stage"),document.getElementsByTagName("head")[0].appendChild(e));var t=document.querySelector(".Rand-stage > ol");if(t){var n=t.offsetWidth,r=function(e,t,n){var r=document.querySelectorAll(e);if(0!=r.length){for(var o=0,a=0;a<r.length;a++){var i=r[a].offsetWidth;i>o&&(o=i)}var s=1/Math.floor(t/o);s=100*(s<1?s:1),n.innerHTML+=e+"{min-width:"+s+"%;}"}};r(".Rand-stage > ol > li",n,e),r(".Rand-stage > ol ol > li",n,e)}};window.addEventListener("rand_ran",e),window.addEventListener("resize",e),window.addEventListener("fonts_loaded",e)}!function(){const e=document.head.querySelectorAll("[as=style]");for(var t=e.length-1;t>=0;t--)e[t].setAttribute("rel","stylesheet")}();var runRand=function(){var a=getRandOptions();getJSON(rand_json,function(n){var 
e=n.data||n.RandL.items,s=!!(n&&n.RandL&&n.RandL.meta&&n.RandL.meta.img)&&n.RandL.meta.img;s.local&&s.suffix&&useWebP&&(".png"!=s.suffix&&".jpg"!=s.suffix||(s.suffix=".webp"));var r=function(a,n){return a};s&&(r=function(a,n){var e=n.prefix;return n.local?n.merge_underscores?e+=a.replace(/\\W+/g,"_").toLowerCase():e+=a.replace(/\\W/g,"_").toLowerCase():e+=a,n.suffix&&(e+=n.suffix),e});for(var i=randomiseNumbers(0,e.length-1,a.qty,a.dup),l="",t=0;t<i.length;t++){var m="";if("string"==typeof e[i[t]]||e[i[t]]instanceof String)s&&n.RandL.meta.img.local?(m+="<span class=\'img\'><img src=\'"+r(e[i[t]],s)+"\' alt=\'\'></span>",m+="<span class=\'rand_medium\'>"+e[i[t]]+"</span>"):m+="<span class=\'rand_large\'>"+e[i[t]]+"</span>";else{var d=!1;if(e[i[t]].img?(m+="<span class=\'img\'><img src=\'"+r(e[i[t]].img,s)+"\' alt=\'\'></span>",d=!0):s&&n.RandL.meta.img.local&&(m+="<span class=\'img\'><img src=\'"+r(e[i[t]].name,s)+"\' alt=\'\'></span>",d=!0),e[i[t]].yt&&(m+="<div class=\'yt\'><iframe width=\'420\' height=\'315\' src=\'//www.youtube.com/embed/"+e[i[t]].yt+"\' frameborder=\'0\' allowfullscreen></iframe></div>",d=!0),e[i[t]].name)m+="<span class=\'"+(d?"rand_medium":"rand_large")+"\'>"+e[i[t]].name+"</span>";if(e[i[t]].detail)m+="<span class=\'"+(d?"rand_small":"rand_medium")+"\'>"+e[i[t]].detail+"</span>";e[i[t]].url&&(m="<a class=\'biglink\' href=\'"+e[i[t]].url+"\' target=\'_blank\'>"+m+"</a>")}l+="<li>"+m+"</li>"}el_RS.innerHTML="<ol>"+l+"</ol>",triggerEvent("rand_ran")})};runRand(); </script> <noscript> <div class="js_notice">You need to enable JavaScript. See: <a href="https://www.enable-javascript.com/" target="_blank">How to enable JavaScript in your browser</a></div> </noscript> </body> </html>'

That is just the raw HTML. I also tried:

import urllib.request
import re

url = "https://www.randomlists.com/random-words"

print(url)
request = urllib.request.urlopen(url)

word = request.read()
word.decode("utf-8")


print(word)

But it prints the same thing. How can I read the content of the page and then output it?

Example of the desired output:

machine
somber
fancy
bitter
limping
lip
clear
worry
belief
arm
jam
board

The page shows 12 randomly generated words at a time.

python urllib
2 Answers
1
vote

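As often happens with modern web applications, the site you are using has an undocumented JSON API, which is much better suited to being read by an external program than the HTML page, which is designed to be rendered by a browser.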
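
A minimal sketch of that idea, under stated assumptions. The HTML printed in the question contains `<link rel="preload" data-rand_json href="/data/words.json" as="fetch">`, and the page script reads the entries from `n.data||n.RandL.items`, so the code below assumes the word list is served at that path with one of those two shapes. The endpoint is undocumented and may change or disappear. The function names are hypothetical, and the `User-Agent` header is an assumption in case the server rejects urllib's default one. Note also that `bytes.decode` returns a new `str` rather than modifying the object in place, which is why the code in the question still prints the `b'...'` literal.

```python
import json
import random
import urllib.request

# Assumption: path taken from the <link data-rand_json href="/data/words.json"> tag
# in the HTML shown in the question; it is not a documented API.
WORDS_URL = "https://www.randomlists.com/data/words.json"


def extract_items(payload):
    """Return the list of entries, mirroring the page script: n.data || n.RandL.items."""
    items = payload.get("data")
    if not items:
        items = payload.get("RandL", {}).get("items", [])
    return items


def item_text(item):
    """Entries may be plain strings or objects with a "name" field, as in the page script."""
    return item if isinstance(item, str) else item.get("name", "")


def pick_words(payload, qty=12):
    """Pick up to qty distinct random entries and return them as strings."""
    items = extract_items(payload)
    chosen = random.sample(items, min(qty, len(items)))
    return [item_text(entry) for entry in chosen]


def fetch_payload(url=WORDS_URL):
    """Download the JSON document and parse it into Python objects."""
    # Assumption: a browser-like User-Agent; the default one may be refused.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request) as response:
        raw = response.read()       # bytes
    text = raw.decode("utf-8")      # decode() returns a new str, so keep the result
    return json.loads(text)
```

If the endpoint still behaves as assumed, `print("\n".join(pick_words(fetch_payload())))` should print twelve words, one per line. The random selection happens locally because, judging by the page script, the browser also picks the words itself from the full list.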


0
votes

You need to parse the HTML yourself, or use a library that parses it for you, such as BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
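
A hedged sketch of the parsing approach. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. One caveat visible in the HTML from the question: the word container (`<div class="Rand-stage">`) holds only a "Loading…" placeholder, and the words are filled in later by JavaScript, so parsing the downloaded HTML does not yield the words themselves. What it can yield is the data file the page loads, which is marked with the `data-rand_json` attribute. The class and function names below are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DataLinkFinder(HTMLParser):
    """Collect the href of every <link> tag that carries a data-rand_json attribute."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; valueless attributes have value None.
        attr_map = dict(attrs)
        if tag == "link" and "data-rand_json" in attr_map:
            href = attr_map.get("href")
            if href:
                self.hrefs.append(href)


def find_data_urls(html, base_url):
    """Return absolute URLs of the data files referenced by the page."""
    finder = DataLinkFinder()
    finder.feed(html)
    return [urljoin(base_url, href) for href in finder.hrefs]
```

Here `html` is expected to be the decoded page text, for example `urllib.request.urlopen(url).read().decode("utf-8")`, and `base_url` is the page address used to resolve relative paths.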
