How do I get the HTML from a URL in PHP?

Question

I want the HTML code from a URL.

What I actually want is to extract some information from the data at that URL:

1. blog title
2. blog image
3. blog posted date
4. blog description or actual blog text

I tried the code below, but had no luck.

<?php
    $c = curl_init('http://54.174.50.242/blog/');
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    //curl_setopt(... other options you want...)

    $html = curl_exec($c);

    if (curl_error($c))
        die(curl_error($c));

    // Get the status code
    $status = curl_getinfo($c, CURLINFO_HTTP_CODE);

    curl_close($c);

    echo "Status :".$status; die;
?>

Please help me get the necessary data from the URL (http://54.174.50.242/blog/).

Thanks in advance.

Answer 1

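You're halfway there. Your curl request is working, and the $html variable contains the source of the blog page. Now you need to extract the data you want from that HTML string. One way to do this is with the DOMDocument class.

Here is something to start from: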

$c = curl_init('http://54.174.50.242/blog/');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($c);

$dom = new DOMDocument;

// disable errors on invalid html
libxml_use_internal_errors(true);

$dom->loadHTML($html);

$list = $dom->getElementsByTagName('title');
$title = $list->length ? $list->item(0)->textContent : '';

// and so on ...
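
To go beyond the page title, one option is to query the parsed document with DOMXPath. The sketch below is only an illustration: the real blog's markup is not shown in the question, so the `article`, `h2`, `img`, `time` and `entry-content` names are assumptions, and an inline HTML string stands in for the `$html` returned by curl. Adjust the XPath expressions to whatever the actual page source contains.

```php
<?php
// Hypothetical markup: the tag and class names below are assumptions,
// not taken from the real blog page.
$html = <<<'HTML'
<html><head><title>Blog</title></head><body>
<article>
  <h2 class="entry-title">First post</h2>
  <img src="/images/first.jpg" alt="">
  <time datetime="2015-06-01">June 1, 2015</time>
  <div class="entry-content">Hello world.</div>
</article>
</body></html>
HTML;

// Buffer parser warnings instead of printing them
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$posts = [];

// One entry per <article>; each query is evaluated relative to that node
foreach ($xpath->query('//article') as $article) {
    // string(...) yields '' when nothing matches, so no null checks are needed
    $posts[] = [
        'title' => trim($xpath->evaluate('string(.//h2)', $article)),
        'image' => $xpath->evaluate('string(.//img/@src)', $article),
        'date'  => $xpath->evaluate('string(.//time/@datetime)', $article),
        'text'  => trim($xpath->evaluate('string(.//div[@class="entry-content"])', $article)),
    ];
}

print_r($posts);
```

Wrapping each expression in `string(...)` keeps the loop short, because a missing element simply produces an empty string rather than a null node.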

You can also simplify this by using the loadHTMLFile method of the DOMDocument class, so that you don't have to deal with all the curl boilerplate:

$dom = new DOMDocument;

// disable errors on invalid html
libxml_use_internal_errors(true);

$dom->loadHTMLFile('http://54.174.50.242/blog/');

$list = $dom->getElementsByTagName('title');
$title = $list->length ? $list->item(0)->textContent : '';
echo $title;

// and so on ...
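
Note that `libxml_use_internal_errors(true)` does not discard parser warnings; it buffers them so they can be inspected later. The following is a minimal sketch of that pattern, using a small inline HTML string as a stand-in for the fetched page (an assumption made so the example runs without network access). As far as I know, passing a URL to `loadHTMLFile` also depends on `allow_url_fopen` being enabled, so checking the return value is worthwhile there as well.

```php
<?php
// Stand-in for a fetched page; the unclosed <b> is deliberately sloppy markup.
$html = '<html><head><title>Demo</title></head><body><p>Unclosed <b>tag</p></body></html>';

// Buffer libxml warnings and remember the previous setting
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$loaded = $dom->loadHTML($html);

// Collect whatever the parser complained about, then reset libxml state
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

if (!$loaded) {
    exit("Could not parse the document\n");
}

// Report buffered parser messages, if there were any
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

$list = $dom->getElementsByTagName('title');
$title = $list->length ? $list->item(0)->textContent : '';
echo $title, "\n";
```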

Answer 2

You should use Simple HTML DOM Parser, and extract the HTML with it like this:

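// file_get_html() comes from the Simple HTML DOM Parser library;
// the include path below assumes simple_html_dom.php sits next to this script
include 'simple_html_dom.php';

$html = @file_get_html($url);
foreach ($html->find('article') as $element) {
    // first <h2> inside the current article
    $title = $element->find('h2', 0)->plaintext;
    // ....
}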

I use this myself; I hope it works for you.
