I have a problem with the script below: it returns an empty list.
I tried searching the web, but none of the results answered my question.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib
import re

site = "http://www.hurriyet.com.tr"
regex = "<span class='news-title'>(.+?)</span>"
comp = re.compile(regex)
print(comp)
print(regex)
htmlkod = urllib.urlopen(site).read()
titles = re.findall(regex, htmlkod)
print(titles)
i = 1
for title in titles:
    print str(i), title.decode("iso8859-9")
    print(title)
    i += 1
I expected it to return the news titles, but it gives me an empty list, "[]".
I recommend using BeautifulSoup instead of a regular expression:
from urllib import urlopen
from bs4 import BeautifulSoup

site = "http://www.hurriyet.com.tr"
openurl = urlopen(site)
soup = BeautifulSoup(openurl, "html.parser")
getTitle = soup.findAll('span', attrs={'class': 'news-title'})
for title in getTitle:
    print title.text
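
As for why the regex might come back empty: the question does not show the page source, so this is only a guess, but a pattern that hard-codes single quotes and an exact class value will miss a span written with double quotes or with extra classes. Here is a minimal sketch on made-up markup (the `sample_html` string and the `tolerant` pattern are my own illustrations, not taken from the real site):

```python
import re

# Hypothetical markup: the real page source is not shown in the question.
sample_html = (
    '<span class="news-title">First headline</span>'
    '<span class="news-title big">Second headline</span>'
)

# The pattern from the question: single quotes, exact class value.
strict = re.compile(r"<span class='news-title'>(.+?)</span>")

# A more tolerant pattern: either quote style, optional extra classes.
tolerant = re.compile(
    r"""<span\s+class=["']news-title[^"']*["'][^>]*>(.+?)</span>"""
)

strict_titles = strict.findall(sample_html)
tolerant_titles = tolerant.findall(sample_html)

print(strict_titles)    # []
print(tolerant_titles)  # ['First headline', 'Second headline']
```

Even the tolerant pattern breaks as soon as the attribute order changes or tags are nested, which is why a parser such as BeautifulSoup is usually the safer choice.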