PHP cURL: making a request look like it comes from a browser


I am trying to retrieve the contents of the following page with PHP cURL:

https://www.whitepages.com/name/Antonio-Dalesio

The problem is that the page detects that the request does not come from a browser and responds with the following error:

Enable JavaScript and cookies to continue

How can I use PHP cURL to make the request look like it comes from a browser such as Firefox?

My code:

$url="https://www.whitepages.com/name/Antonio-Dalesio";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
// You can add your GET or POST param

// Retrieving session ID 
$strCookie = 'PHPSESSID=' . $_COOKIE['PHPSESSID'] . '; path=/';    

// We pass the sessionid of the browser within the curl request
curl_setopt( $curl, CURLOPT_COOKIE, $strCookie ); 

// We receive the answer as if we were the browser
$curl_response = curl_exec($curl);

return $curl_response;

I tried this code, but I still get the following error:

Enable JavaScript and cookies to continue

php web-scraping curl firefox browser
1 Answer

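The page is protected by Cloudflare. Forget cURL: you need a real web browser that can run JavaScript and keep cookies. For example, use Chromium with the https://github.com/chrome-php/chrome library: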

<?php
declare(strict_types=1);
require_once 'vendor/autoload.php';

$factory = new HeadlessChromium\BrowserFactory('chromium-browser');
$browser = $factory->createBrowser([
    'noSandbox' => true,
    'headless' => false,
    'customFlags' => [
        '--disable-dev-shm-usage', // for docker
    ],
]);
$page = $browser->createPage();
$page->navigate('https://www.whitepages.com/name/Antonio-Dalesio')->waitForNavigation(
    \HeadlessChromium\Page::LOAD,
);
$html = $page->getHTML();
$dom = new \DOMDocument('', '');
@$dom->loadHTML($html);
$xp = new \DOMXPath($dom);
// <script type="application/ld+json">
$js = $xp->query('//script[@type="application/ld+json"]')->item(0)->textContent;
$data = json_decode($js, true, 999, JSON_THROW_ON_ERROR);
var_export($data, false);

This prints:

array (
  0 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio Dalesio',
    'givenName' => 'Antonio',
    'familyName' => 'Dalesio',
    'additionalName' => NULL,
    'URL' => '/name/Antonio-Dalesio/Youngstown-OH/PN3VzweYkm3',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Youngstown',
          'addressRegion' => 'OH',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(330) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Dino Mario Dalesio',
        'givenName' => 'Dino',
        'familyName' => 'Dalesio',
        'additionalName' => 'Mario',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Debra Ann Dalesio',
        'givenName' => 'Debra',
        'familyName' => 'Dalesio',
        'additionalName' => 'Ann',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Carol Louise Dalesio',
        'givenName' => 'Carol',
        'familyName' => 'Dalesio',
        'additionalName' => 'Louise',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Carmen C Dalesio',
        'givenName' => 'Carmen',
        'familyName' => 'Dalesio',
        'additionalName' => 'C',
      ),
    ),
  ),
  1 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio D Dalesio',
    'givenName' => 'Antonio',
    'familyName' => 'Dalesio',
    'additionalName' => 'D',
    'URL' => '/name/Antonio-D-Dalesio/Dyer-IN/PA8mLDAbYP3',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Dyer',
          'addressRegion' => 'IN',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(219) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Caterina D Dalesio',
        'givenName' => 'Caterina',
        'familyName' => 'Dalesio',
        'additionalName' => 'D',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Shannon Marie Wilson',
        'givenName' => 'Shannon',
        'familyName' => 'Wilson',
        'additionalName' => 'Marie',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Nico Dalesio',
        'givenName' => 'Nico',
        'familyName' => 'Dalesio',
        'additionalName' => NULL,
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Katya D Alesio',
        'givenName' => 'Katya',
        'familyName' => 'Alesio',
        'additionalName' => 'D',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Nicola D Dalesio',
        'givenName' => 'Nicola',
        'familyName' => 'Dalesio',
        'additionalName' => 'D',
      ),
    ),
  ),
  2 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio Dalesio',
    'givenName' => 'Antonio',
    'familyName' => 'Dalesio',
    'additionalName' => NULL,
    'URL' => '/name/Antonio-Dalesio/League-City-TX/Pd96ROrjNn9',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'League City',
          'addressRegion' => 'TX',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(XXX) XXX-XXXX',
    'relatedTo' => 
    array (
    ),
  ),
  3 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio P Dalesio',
    'givenName' => 'Antonio',
    'familyName' => 'Dalesio',
    'additionalName' => 'P',
    'URL' => '/name/Antonio-P-Dalesio/Hockessin-DE/PG9RlDZYzZ3',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Hockessin',
          'addressRegion' => 'DE',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(302) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Marika P Dalesio',
        'givenName' => 'Marika',
        'familyName' => 'Dalesio',
        'additionalName' => 'P',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Stefania Difreca Panza',
        'givenName' => 'Stefania',
        'familyName' => 'Panza',
        'additionalName' => 'Difreca',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Carlo Dalesio',
        'givenName' => 'Carlo',
        'familyName' => 'Dalesio',
        'additionalName' => NULL,
      ),
    ),
  ),
  4 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony M Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'M',
    'URL' => '/name/Anthony-M-Dalesio/Coraopolis-PA/PbyPWOlXP29',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Coraopolis',
          'addressRegion' => 'PA',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(412) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Angela D Dalesio',
        'givenName' => 'Angela',
        'familyName' => 'Dalesio',
        'additionalName' => 'D',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Stephen James Dalesio',
        'givenName' => 'Stephen',
        'familyName' => 'Dalesio',
        'additionalName' => 'James',
      ),
    ),
  ),
  5 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => NULL,
    'URL' => '/name/Anthony-Dalesio/Tyrone-PA/Pe9x2nNpgW3',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Tyrone',
          'addressRegion' => 'PA',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(XXX) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Lorenzo J Dalesio',
        'givenName' => 'Lorenzo',
        'familyName' => 'Dalesio',
        'additionalName' => 'J',
      ),
    ),
  ),
  6 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony P Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'P',
    'URL' => '/name/Anthony-P-Dalesio/Weirton-WV/Pl8aDbgEE3b',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Weirton',
          'addressRegion' => 'WV',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(304) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Nicole A Dalesio',
        'givenName' => 'Nicole',
        'familyName' => 'Dalesio',
        'additionalName' => 'A',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Charlotte E Dalesio',
        'givenName' => 'Charlotte',
        'familyName' => 'Dalesio',
        'additionalName' => 'E',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Marsha C Needy',
        'givenName' => 'Marsha',
        'familyName' => 'Needy',
        'additionalName' => 'C',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Carmela D Alesio',
        'givenName' => 'Carmela',
        'familyName' => 'Alesio',
        'additionalName' => 'D',
      ),
    ),
  ),
  7 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony J Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'J',
    'URL' => '/name/Anthony-J-Dalesio/Willingboro-NJ/Pk9AxD64o8A',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Willingboro',
          'addressRegion' => 'NJ',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(609) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Veronica Canonigo Dalesio',
        'givenName' => 'Veronica',
        'familyName' => 'Dalesio',
        'additionalName' => 'Canonigo',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Jacquelin M McCann',
        'givenName' => 'Jacquelin',
        'familyName' => 'McCann',
        'additionalName' => 'M',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Margaret M Dalesio',
        'givenName' => 'Margaret',
        'familyName' => 'Dalesio',
        'additionalName' => 'M',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Michael D Dalesio',
        'givenName' => 'Michael',
        'familyName' => 'Dalesio',
        'additionalName' => 'D',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Michael R Dalesio',
        'givenName' => 'Michael',
        'familyName' => 'Dalesio',
        'additionalName' => 'R',
      ),
    ),
  ),
  8 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony N Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'N',
    'URL' => '/name/Anthony-N-Dalesio/Baldwin-NY/PE3EXoPOvyB',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Baldwin',
          'addressRegion' => 'NY',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(718) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Sean M Lynch',
        'givenName' => 'Sean',
        'familyName' => 'Lynch',
        'additionalName' => 'M',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Patricia Ann Dalesio',
        'givenName' => 'Patricia',
        'familyName' => 'Dalesio',
        'additionalName' => 'Ann',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Maureen P Mullaney',
        'givenName' => 'Maureen',
        'familyName' => 'Mullaney',
        'additionalName' => 'P',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Patricia A Morgera',
        'givenName' => 'Patricia',
        'familyName' => 'Morgera',
        'additionalName' => 'A',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Nicole M Dalesio',
        'givenName' => 'Nicole',
        'familyName' => 'Dalesio',
        'additionalName' => 'M',
      ),
    ),
  ),
  9 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony T Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'T',
    'URL' => '/name/Anthony-T-Dalesio/Hatfield-PA/Pl3lmaeRl8E',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Hatfield',
          'addressRegion' => 'PA',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(215) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Susan M Dalesio',
        'givenName' => 'Susan',
        'familyName' => 'Dalesio',
        'additionalName' => 'M',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Amie Jennifer Natali',
        'givenName' => 'Amie',
        'familyName' => 'Natali',
        'additionalName' => 'Jennifer',
      ),
    ),
  ),
  10 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony C Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'C',
    'URL' => '/name/Anthony-C-Dalesio/Broadview-Heights-OH/PvyBGdBJka3',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Broadview Heights',
          'addressRegion' => 'OH',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(440) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Everett W Dalesio',
        'givenName' => 'Everett',
        'familyName' => 'Dalesio',
        'additionalName' => 'W',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Emery George Dalesio',
        'givenName' => 'Emery',
        'familyName' => 'Dalesio',
        'additionalName' => 'George',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Emery P Dalesio',
        'givenName' => 'Emery',
        'familyName' => 'Dalesio',
        'additionalName' => 'P',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Kathleen M Dalesio',
        'givenName' => 'Kathleen',
        'familyName' => 'Dalesio',
        'additionalName' => 'M',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Judy E Viancourt',
        'givenName' => 'Judy',
        'familyName' => 'Viancourt',
        'additionalName' => 'E',
      ),
    ),
  ),
  11 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Anthony Russell Dalesio',
    'givenName' => 'Anthony',
    'familyName' => 'Dalesio',
    'additionalName' => 'Russell',
    'URL' => '/name/Anthony-Russell-Dalesio/Youngstown-OH/PG9RJnOdew9',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Youngstown',
          'addressRegion' => 'OH',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(330) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Suzanne R Dalesio',
        'givenName' => 'Suzanne',
        'familyName' => 'Dalesio',
        'additionalName' => 'R',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Vincent W Dalesio',
        'givenName' => 'Vincent',
        'familyName' => 'Dalesio',
        'additionalName' => 'W',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Nicholas P Dalesio',
        'givenName' => 'Nicholas',
        'familyName' => 'Dalesio',
        'additionalName' => 'P',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Sophia M D Dalesio',
        'givenName' => 'Sophia',
        'familyName' => 'Dalesio',
        'additionalName' => 'M D',
      ),
    ),
  ),
  12 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio T Daloisio',
    'givenName' => 'Antonio',
    'familyName' => 'Daloisio',
    'additionalName' => 'T',
    'URL' => '/name/Antonio-T-Daloisio/Tuckahoe-NY/P53WWaQGR3G',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Tuckahoe',
          'addressRegion' => 'NY',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(914) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Giuseppe R Daloisio',
        'givenName' => 'Giuseppe',
        'familyName' => 'Daloisio',
        'additionalName' => 'R',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Clementina D Daloisio',
        'givenName' => 'Clementina',
        'familyName' => 'Daloisio',
        'additionalName' => 'D',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Clementina D Daloisio',
        'givenName' => 'Clementina',
        'familyName' => 'Daloisio',
        'additionalName' => 'D',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Virgilio D Aloisio',
        'givenName' => 'Virgilio',
        'familyName' => 'Aloisio',
        'additionalName' => 'D',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Nicole Ret Oliva',
        'givenName' => 'Nicole',
        'familyName' => 'Oliva',
        'additionalName' => 'Ret',
      ),
    ),
  ),
  13 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio Daloisio',
    'givenName' => 'Antonio',
    'familyName' => 'Daloisio',
    'additionalName' => NULL,
    'URL' => '/name/Antonio-Daloisio/Yonkers-NY/Pg3bLXllAd9',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Yonkers',
          'addressRegion' => 'NY',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(914) XXX-XXXX',
    'relatedTo' => 
    array (
    ),
  ),
  14 => 
  array (
    '@context' => 'http://schema.org',
    '@type' => 'Person',
    'name' => 'Antonio Salvatore Daloisio',
    'givenName' => 'Antonio',
    'familyName' => 'Daloisio',
    'additionalName' => 'Salvatore',
    'URL' => '/name/Antonio-Salvatore-Daloisio/Manahawkin-NJ/PX3vL0O7L9k',
    'homeLocation' => 
    array (
      0 => 
      array (
        '@type' => 'Place',
        'address' => 
        array (
          '@type' => 'PostalAddress',
          'addressLocality' => 'Manahawkin',
          'addressRegion' => 'NJ',
          'addressCountry' => 'US',
        ),
      ),
    ),
    'telephone' => '(609) XXX-XXXX',
    'relatedTo' => 
    array (
      0 => 
      array (
        '@type' => 'Person',
        'name' => 'Sabrina A Daloisio',
        'givenName' => 'Sabrina',
        'familyName' => 'Daloisio',
        'additionalName' => 'A',
      ),
      1 => 
      array (
        '@type' => 'Person',
        'name' => 'Maria D Daloisio',
        'givenName' => 'Maria',
        'familyName' => 'Daloisio',
        'additionalName' => 'D',
      ),
      2 => 
      array (
        '@type' => 'Person',
        'name' => 'Victoria A Daloisio',
        'givenName' => 'Victoria',
        'familyName' => 'Daloisio',
        'additionalName' => 'A',
      ),
      3 => 
      array (
        '@type' => 'Person',
        'name' => 'Danielle A Daloisio',
        'givenName' => 'Danielle',
        'familyName' => 'Daloisio',
        'additionalName' => 'A',
      ),
      4 => 
      array (
        '@type' => 'Person',
        'name' => 'Philip Lorenzo Daloisio',
        'givenName' => 'Philip',
        'familyName' => 'Daloisio',
        'additionalName' => 'Lorenzo',
      ),
    ),
  ),
)
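
The extraction step in the script above assumes that the page contains at least one application/ld+json script element. If the browser still receives the "Enable JavaScript and cookies to continue" page, ->item(0) returns null and the script fails. Below is a minimal sketch of a more defensive version of that step. The function name extractJsonLd and both HTML snippets are made up for illustration and are not part of the chrome-php library or of the real page:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the decoded contents of every
 * <script type="application/ld+json"> element found in $html.
 * Blocks that are empty or not valid JSON are skipped.
 *
 * @return list<mixed>
 */
function extractJsonLd(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly
    // instead of silencing everything with the @ operator.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $blocks = [];
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $script) {
        try {
            $blocks[] = json_decode($script->textContent, true, 512, JSON_THROW_ON_ERROR);
        } catch (JsonException $e) {
            // Skip a malformed block rather than aborting the whole page.
        }
    }

    return $blocks;
}

// Made-up stand-in for a challenge page: no JSON-LD at all.
$challengeHtml = '<html><body><noscript>Enable JavaScript and cookies to continue</noscript></body></html>';

// Made-up stand-in for a result page with one JSON-LD block.
$sampleHtml = '<html><head><script type="application/ld+json">'
    . '[{"@type":"Person","name":"Jane Doe"}]'
    . '</script></head><body></body></html>';

var_dump(count(extractJsonLd($challengeHtml)));
echo extractJsonLd($sampleHtml)[0][0]['name'], PHP_EOL;
```

In the script above, this would be called as extractJsonLd($page->getHTML()); an empty result is a hint that the challenge page was returned instead of the search results.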