If you must use a regular expression, this works:
<meta.*property="og:image".*content="(.*)".*\/>
Regex demo: http://regex101.com/r/rX1zK7
PHP example:
$html = '<html>
<head>
<meta property="og:image" content="http://www.moneycontrol.com/news_image_files/2013/s/Syrian_diesel_trucks_190.jpg" />
</head>
<body>
</body>
</html>';
preg_match_all('/<meta.*property="og:image".*content="(.*)".*\/>/', $html, $matches);
echo $matches[1][0];
Output:
http://www.moneycontrol.com/news_image_files/2013/s/Syrian_diesel_trucks_190.jpg
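One caveat with the pattern above: the greedy `.*` can run past the end of a tag, so when several meta tags share one line it may capture the wrong value. A minimal sketch of a stricter variant follows. The sample HTML and the example.com URL are made up for illustration, and the pattern still assumes double-quoted attributes with `property` appearing before `content`.

```php
<?php
// Hypothetical input: three meta tags on a single line, which trips up greedy `.*`.
$html = '<meta property="og:title" content="Title" />'
      . '<meta property="og:image" content="http://example.com/a.jpg" />'
      . '<meta name="x" content="y" />';

// [^>]* cannot leave the current tag; [^"]* cannot leave the current attribute value.
$pattern = '/<meta\b[^>]*\bproperty="og:image"[^>]*\bcontent="([^"]*)"[^>]*>/i';

if (preg_match($pattern, $html, $m)) {
    echo $m[1], "\n"; // the content value of the og:image tag only
}
```

If attribute order or quoting can vary in your input, a real HTML parser is the safer choice.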
Use the DOMDocument class:
<?php
$html='<meta property="og:image" content="http://www.moneycontrol.com/news_image_files/2013/s/Syrian_diesel_trucks_190.jpg" />';
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('meta') as $tag) {
    // Only the Open Graph image tag is of interest here.
    if ($tag->getAttribute('property') === 'og:image') {
        echo $tag->getAttribute('content');
    }
}
Output:
http://www.moneycontrol.com/news_image_files/2013/s/Syrian_diesel_trucks_190.jpg
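On real-world pages, `loadHTML()` can emit warnings for markup that libxml does not recognise. Here is a rough sketch of the same lookup with those warnings collected internally and the tag selected through DOMXPath instead of a loop. The sample page and URL are invented for illustration.

```php
<?php
// Hypothetical page; the <section> element stands in for "modern" markup.
$html = '<html><head>'
      . '<meta property="og:image" content="http://example.com/a.jpg">'
      . '</head><body><section>hi</section></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select the content attribute of the og:image meta tag directly.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//meta[@property="og:image"]/@content');
$image = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

echo $image, "\n";
```

`$image` stays `null` when the page has no og:image tag, so the caller can tell a missing tag apart from an empty value.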
(<meta[^>]*>)
What it means: match "<meta", then any character other than ">" zero or more times, then the final ">".
It works for both:
<meta .... >
and
<meta ..../>
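As a quick illustration, the pattern can be passed to `preg_match_all` to collect every meta tag in a string, whichever way it is closed. This is only a sketch; the sample input and URL are assumptions.

```php
<?php
// Hypothetical input with one non-self-closing and one self-closing meta tag.
$html = '<meta charset="utf-8">'
      . '<meta property="og:image" content="http://example.com/a.jpg" />';

// Group 1 holds each complete <meta ...> tag.
preg_match_all('/(<meta[^>]*>)/i', $html, $matches);

foreach ($matches[1] as $tag) {
    echo $tag, "\n";
}
```

Each captured tag can then be inspected for the attributes you need, for example by checking for `property="og:image"`.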