我有以下网址https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc.
当我转到URL时,搜索变得像这个site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.
这是我的代码:
url="https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc."
url=url.replace("%3A",":").replace("%20"," ").replace("%2C+",", ")
search=re.search('.*?q=(.*)',url).groups()[0]
我觉得这样做的方法很糟糕,是否有更合理的编码技术方法
使用Python 3:
>>> import urllib.parse
>>> url="https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc."
>>> urllib.parse.unquote_plus(url)
'https://www.bing.com/search?q=site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.'
或者提取查询和unquote_plus
它:
>>> urllib.parse.unquote_plus(urllib.parse.urlsplit(url).query[2:])
'site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.'