How do I scrape the titles and contents of a bulletin board with PHP Simple HTML DOM?

<?php
# Include simplehtmldom
include('./simplehtmldom/simple_html_dom.php');

# Fetch the page by URL
$html = file_get_html('https://sample-ex.com/bbs/board.php?bo_table=free');
# Empty array to hold the results
$parsing = [];

# Loop over each .na-table li and extract its fields
foreach ($html->find('.na-table li') as $li) {
  # Temporary array for this row's values
  $tmp = [];

  $number = str_replace('번호', '', trim($li->find('div', 0)->text()));
  $title = trim($li->find('a', 0)->text());
  $writer = trim($li->find('.sv_member', 0)->text());
  $wrtieDate = str_replace('등록일', '', trim($li->find('div', 5)->text()));
  $count = str_replace('조회', '', trim($li->find('div', 6)->text()));
  $link = $li->find('a', 0)->href;

  $detail_html = file_get_html($link);
  echo $detail_html->find('.view-content', 0)->text();

  $tmp['number'] = $number;
  $tmp['title'] = $title;
  $tmp['writer'] = $writer;
  $tmp['wrtiedate'] = $wrtieDate;
  $tmp['count'] = $count;
  $tmp['link'] = $link;

  # Append to $parsing[]
  $parsing[] = $tmp;
}

# Reset $html to prevent a memory leak
$html->clear();
unset($html);
?>
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Document</title>
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" crossorigin="anonymous">
</head>
<body>
  <div class="container">
    <table class="table table-hover">
      <thead>
        <tr>
          <th>No.</th>
          <th>Title</th>
          <th>Name</th>
          <th>Date</th>
          <th>Views</th>
          <th>Link</th>
        </tr>
      </thead>
      <tbody>
        <?php foreach ($parsing as $data) { ?>
        <tr>
          <td><?php echo $data['number'] ?></td>
          <td><?php echo $data['title'] ?></td>
          <td><?php echo $data['writer'] ?></td>
          <td><?php echo $data['wrtiedate'] ?></td>
          <td><?php echo $data['count'] ?></td>
          <td><?php echo $data['link'] ?></td>
        </tr>
        <?php } ?>
      </tbody>
    </table>
  </div>
</body>
</html>

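Hello, I hope you're having a good day.

I have a question. I'm learning from the official Simple HTML DOM documentation. I managed to scrape the list of posts on the bulletin board, but scraping each post's content through its own link fails.

When I add the following code: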

$detail_html = file_get_html($link);
echo $detail_html->find('.view-content', 0)->text();

I get this error:

Fatal error: Uncaught Error: Call to a member function text() on null in /home/simple/public_html/test.php:23 Stack trace: #0 {main} thrown in /home/simple/public_html/test.php on line 23

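If you have any solutions or suggestions, please let me know. This question was translated with ChatGPT, so it may read a little awkwardly. Thank you.

I tried to scrape the content using each post's link.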

php simple-html-dom
1 Answer

When you retrieve the link, it looks like this:

https://sample-ex.com/bbs/board.php?bo_table=free&amp;wr_id=238

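Note the HTML-encoded &amp; in the query string. Requesting that URL as-is does not load the post page, so .view-content is not found and find() returns null. You can use str_replace to convert &amp; back to a plain &, like this: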

$detail_html = file_get_html(str_replace('&amp;', '&', $link));
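
As a slightly more defensive variant, here is a minimal sketch that decodes the href with PHP's built-in html_entity_decode and checks for a missing element before calling text(). The helper names decode_href and fetch_post_content are hypothetical (they are not part of Simple HTML DOM), and the .view-content selector is taken from the question. fetch_post_content assumes simple_html_dom.php has already been included, as in the question's code.

```php
<?php
// Hypothetical helper: decode HTML entities such as &amp; that appear in an
// href value read from the page source.
function decode_href(string $href): string
{
    return html_entity_decode($href, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Hypothetical helper: fetch a post page and return its body text, or null
// when the page cannot be loaded or the selector matches nothing.
// Assumes simple_html_dom.php is already included (it provides file_get_html).
function fetch_post_content(string $href, string $selector = '.view-content'): ?string
{
    $detail_html = file_get_html(decode_href($href));
    if (!$detail_html) {
        return null; // the page could not be loaded
    }

    $node = $detail_html->find($selector, 0);
    $text = $node ? trim($node->text()) : null; // find() gives null on no match

    // Release the parsed document, as the question does for $html.
    $detail_html->clear();

    return $text;
}

// Decoding the example link from this answer:
$link = 'https://sample-ex.com/bbs/board.php?bo_table=free&amp;wr_id=238';
$url = decode_href($link);
echo $url, PHP_EOL;
// prints https://sample-ex.com/bbs/board.php?bo_table=free&wr_id=238
```

Inside the question's foreach loop, this could be used as $tmp['content'] = fetch_post_content($link); so that a post whose page lacks .view-content yields null instead of a fatal error.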
