PHP cURL: authorize and parse a product table from another link (I get the login page again)

$ch = curl_init();

$post_field = 'ajax=SearchArticulo&cntrSgn=DeExMEkRabGEO396gOLDMqUZiXe2BibRjqgUXwZlQmMgrw4jJmdAwbUD11%2BddBhn&srcInicio=false&isSimple=false&codMarca=0&field=nombre&value=&oferta=false&pvpSubido=False&detallada=false&codPedido=';
$post_field .= '&cat1=5&cat2=87&cat3=314&token=';
curl_setopt($ch, CURLOPT_URL, 'https://actibios.com/WebForms/Clientes/GenerarPedidosVentas_new.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_field);

$headers = array();
$headers[] = 'Accept: */*';
$headers[] = 'Accept-Language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7';
$headers[] = 'Cache-Control: no-cache';
$headers[] = 'Connection: keep-alive';
$headers[] = 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8';
$headers[] = 'Cookie: grpAct@Session=sessionhere; grpAct@CodEmpresa=1; grpAct@CodDelegacion=1; grpAct@Year=; grpAct@Version=2024.01.25.005; SERVERUSED=srv2|ZicBV|ZicBR';
$headers[] = 'Dnt: 1';
$headers[] = 'Origin: https://actibios.com';
$headers[] = 'Pragma: no-cache';
$headers[] = 'Referer: https://actibios.com/WebForms/Clientes/GenerarPedidosVentas_new.aspx';
$headers[] = 'Sec-Fetch-Dest: empty';
$headers[] = 'Sec-Fetch-Mode: cors';
$headers[] = 'Sec-Fetch-Site: same-origin';
$headers[] = 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36';
$headers[] = 'X-Requested-With: XMLHttpRequest';
// No backslashes here: inside single quotes PHP keeps \" literally, which would corrupt the header value.
$headers[] = 'Sec-Ch-Ua: "Google Chrome";v="123", "Not:A-Brand";v="8", "Chromium";v="123"';
$headers[] = 'Sec-Ch-Ua-Mobile: ?0';
$headers[] = 'Sec-Ch-Ua-Platform: "Windows"';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

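$result = curl_exec($ch);
if ($result === false) {
    // Stop here: there is nothing to parse when the request itself failed.
    exit('Error: ' . curl_error($ch));
}
curl_close($ch);
$dom = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; collect parser warnings instead of printing them
$dom->loadHTML($result);
libxml_clear_errors();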
$tables = $dom->getElementsByTagName('table');
$table_array = array();
foreach ($tables as $table) {
    $rows = $table->getElementsByTagName('tr');
    foreach ($rows as $row) {
        $cols = $row->getElementsByTagName('td');
        $row_array = array();
        foreach ($cols as $col) {

            $row_array[] = $col->nodeValue;
        }
        if ($row_array) { // skip rows without <td> cells (e.g. header rows)
            $table_array[] = $row_array;
        }
    }
}

foreach ($table_array as &$item) {
    // get description
    $curl = curl_init();

    curl_setopt($curl, CURLOPT_URL, 'https://actibios.com/WebForms/Clientes/Indicacion.aspx?cp='.$item[0]);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POST, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, "ajax=GetIndicacion&codArticulo=".$item[0]);

    $header = array();
    $header[] = 'Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8';
    $header[] = 'Accept-Language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7';
    $header[] = 'Cache-Control: no-cache';
    $header[] = 'Connection: keep-alive';
    $header[] = 'Cookie: grpAct@CodEmpresa=1; grpAct@CodDelegacion=1; grpAct@Year=; grpAct@Version=2024.01.25.005; ASP.NET_SessionId=l11hf41gpuxnha32puvxnvn4; grpAct@Session=sessionhere; SERVERUSED=srv1|ZiYu5|ZiYu2';
    $header[] = 'Dnt: 1';
    $header[] = 'Pragma: no-cache';
    $header[] = 'Referer: https://actibios.com/WebForms/Clientes/Indicacion.aspx?cp='.$item[0];
    $header[] = 'Sec-Fetch-Dest: image';
    $header[] = 'Sec-Fetch-Mode: no-cors';
    $header[] = 'Sec-Fetch-Site: same-origin';
    $header[] = 'Sec-Fetch-User: ?1';
    $header[] = 'Upgrade-Insecure-Requests: 1';
    $header[] = 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36';
    // No backslashes here: inside single quotes PHP keeps \" literally, which would corrupt the header value.
    $header[] = 'Sec-Ch-Ua: "Google Chrome";v="123", "Not:A-Brand";v="8", "Chromium";v="123"';
    $header[] = 'Sec-Ch-Ua-Mobile: ?0';
    $header[] = 'Sec-Ch-Ua-Platform: "Windows"';
    $header[] = 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8';
    $header[] = 'Origin: https://actibios.com';
    $header[] = 'X-Requested-With: XMLHttpRequest';
    curl_setopt($curl, CURLOPT_HTTPHEADER, $header);

    $Indicacion = curl_exec($curl);
    if (curl_errno($curl)) {
        echo 'Error:' . curl_error($curl);
    }
    curl_close($curl);
    $page_content = preg_replace('/<(pre)(?:(?!<\/\1).)*?<\/\1>/s','',$Indicacion);
    $desc = explode(':',$page_content);
    // map the numeric column indexes to named keys
    $item['image'] = 'https://actibios.com/WebForms/Controls/imgArticulo.aspx?ca='.$item[0];
    $item['Codigo'] = $item[0];
    unset($item[0]);
    $item['Name'] = preg_replace('/\xc2\xa0/', '', $item[1]);
    unset($item[1]);
    $item['Marca'] = $item[2];
    unset($item[2]);
    $item['Stk24 (Stock1)'] = $item[3];
    unset($item[3]);
    $item['Stk24 (Stock2)'] = $item[4];
    unset($item[4]);
    $item['Stock'] = $item[5];
    unset($item[5]);
    $item['PVF'] = $item[6];
    unset($item[6]);
    $item['PVP'] = $item[7];
    unset($item[7]);
    unset($item[8]);
    unset($item[9]);
    unset($item[10]);
    $item['Category1'] = 'cat1';
    $item['Category2'] = 'cat2';
    $item['Category3'] = 'cat3';
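    $item['desc'] = str_replace(",'marca'", "", $desc[3] ?? ''); // the response may have fewer than four ':'-separated parts
}
unset($item); // break the reference left behind by foreach (... as &$item)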
$json = json_encode($table_array);
//var_dump($table_array);
echo $json;

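When I use this code everything works, but by the next day it stops working.

How can I solve this?

I need to log in first and then parse the HTML table of products.

I also tried writing the cookie to a file and reading it back from the file, but that did not work either.

I tried adding the username and password, but the problem is the same.

If I change the value in Cookie: grpAct@Session=sessionhere, it works again.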

php parsing curl
1 Answer

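Let curl handle the cookies for you: use CURLOPT_COOKIEFILE to load cookies and CURLOPT_COOKIEJAR to store them. Otherwise you have to parse the response headers manually and update the cookie values yourself.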
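
A minimal sketch of that idea, under stated assumptions: one curl handle is reused for every request, its cookie engine is backed by a file, and the script logs in again whenever the response looks like the login page. The helper names (newSession, postForm, looksLikeLoginPage, postWithLogin), the login URL, the credential field names, and the "password input means login page" heuristic are all hypothetical; the real login request has to be taken from the browser's network tab, and an .aspx login form may also expect hidden fields to be posted back. The hard-coded Cookie: header from the question should be dropped, since it keeps sending the stale session value.

```php
<?php
// Sketch only: the login URL, field names and login-page heuristic are assumptions.

/** Create one reusable handle whose cookie engine is backed by $cookieFile. */
function newSession(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // load saved cookies and enable the cookie engine
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies when the handle is destroyed
    ]);
    return $ch;
}

/** POST $fields to $url on the shared handle and return the response body. */
function postForm($ch, string $url, array $fields): string
{
    curl_setopt_array($ch, [
        CURLOPT_URL        => $url,
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($fields),
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

/** Heuristic (assumption): a login page contains a password input. */
function looksLikeLoginPage(string $html): bool
{
    return preg_match('/<input[^>]+type=["\']?password/i', $html) === 1;
}

/** POST to $url; if the session has expired, log in and retry once. */
function postWithLogin($ch, string $url, array $fields, string $loginUrl, array $credentials): string
{
    $body = postForm($ch, $url, $fields);
    if (looksLikeLoginPage($body)) {
        postForm($ch, $loginUrl, $credentials); // the server sets a fresh session cookie
        $body = postForm($ch, $url, $fields);
    }
    return $body;
}
```

With these helpers, the product request from the question would go through postWithLogin() on a handle from newSession(), and the description requests inside the loop would reuse the same handle via postForm(), so every request carries whatever session cookie the server issued most recently.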
