PHP cURL multi-threaded and single-threaded function help. How should I do this?


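I found a function here: http://archevery.blogspot.com/2013/07/php-curl-multi-threading.html

I use it to send an array of URLs so that they run and get processed as quickly as possible through multi-threaded curl requests. This works great.

Some of the URLs I want to send must be processed sequentially, in order, rather than all at the same time.

How can I achieve this?

Example:

URL-A URL-B URL-C -> all start at the same time

URL-D URL-E -> URL-E must wait for URL-D to finish before it fires.

My purpose is a task management system that lets me add PHP applications as "tasks" in a database. My tasks have a header/detail relationship, so a task with one header and one detail can be sent off multi-threaded, but a task with one header and multiple details must be sent in the order of its detail tasks.

I can do this by calling curl requests in a loop, but I also want the base request (the first task of each sequence) to fire as part of the multi-threaded function. I don't want to wait for all the sequential tasks to pile up and then be processed one after another. The first task of every sequence should start in parallel with everything else, but each later task in a sequence must wait for the previous one to finish before it starts.

I tried sending the multi-detail tasks to the single-threaded function shown further down, but it waits for each task to finish before moving on to the next. I need to combine it somehow with the multi-threaded function from the URL above. Here is my multi-threaded curl function: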

function runRequests($url_array, $thread_width = 10) {
    $threads = 0;
    $master = curl_multi_init();
    $curl_opts = array(CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS => 5,
        CURLOPT_CONNECTTIMEOUT => 15,
        CURLOPT_TIMEOUT => 15);
    $results = array();
    $count = 0;
    foreach($url_array as $url) {
        $ch = curl_init();
        $curl_opts[CURLOPT_URL] = $url; //set the URL without discarding the shared options above
        curl_setopt_array($ch, $curl_opts);
        curl_multi_add_handle($master, $ch); //push URL for single rec send into curl stack
        $results[$count] = array("url" => $url, "handle" => $ch);
        $threads++;
        $count++;
        if($threads >= $thread_width) { //start running when stack is full to width
            while($threads >= $thread_width) {
                //usleep(100);
                while(($execrun = curl_multi_exec($master, $running)) === -1){}
                curl_multi_select($master);
                // a request was just completed - find out which one and remove it from stack
                while($done = curl_multi_info_read($master)) {
                    foreach($results as &$res) {
                        if($res['handle'] == $done['handle']) {
                            $res['result'] = curl_multi_getcontent($done['handle']);
                        }
                    }
                    curl_multi_remove_handle($master, $done['handle']);
                    curl_close($done['handle']);
                    $threads--;
                }
            }
        }
    }
    do { //finish sending remaining queue items when all have been added to curl
        //usleep(100);
        while(($execrun = curl_multi_exec($master, $running)) === -1){}
        curl_multi_select($master);
        while($done = curl_multi_info_read($master)) {
            foreach($results as &$res) {
                if($res['handle'] == $done['handle']) {
                    $res['result'] = curl_multi_getcontent($done['handle']);
                }
            }
            curl_multi_remove_handle($master, $done['handle']);
            curl_close($done['handle']);
            $threads--;
        }
    } while($running > 0);
    curl_multi_close($master);
    return $results;
}

Here is the single-threaded curl function.

function runSingleRequests($url_array) {
    foreach($url_array as $url) {
        // Initialize a cURL session.
        $ch = curl_init();

        // Page contents not needed, so do not capture the response as a string.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);

        // Set the URL to request.
        curl_setopt($ch, CURLOPT_URL, $url);

        // Process the request; this blocks until it completes.
        $result = curl_exec($ch);

        // Free the handle before moving on to the next URL.
        curl_close($ch);
    }
}

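Both take an array of URLs as input.

I currently have one array containing all the single tasks and another array containing all the multi-detail tasks, with a "header ID" that tells me which header task each detail task belongs to.

Any help with theory or code would be greatly appreciated. Thanks!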

php curl
2 Answers

1 vote

Why not use a basic task scheduler to schedule your requests and their follow-ups, instead of running everything at once?

See it in action: https://ideone.com/suTUBS

<?php
class Task 
{
    protected $follow_up = [];
    protected $task_callback;

    public function __construct($task_callback) 
    {
        $this->task_callback = $task_callback;
    }

    public function addFollowUp(Task $follow_up) 
    {
        $this->follow_up[] = $follow_up;
    }

    public function complete() 
    {
        foreach($this->follow_up as $runnable) {
            $runnable->run();
        }
    }

    public function run() 
    {
        $callback = $this->task_callback;

        $callback($this);
    }
}



$provided_task_scheduler_from_somewhere = function() 
{
    $tasks = [];

    $global_message_thing = 'failed';

    $second_global_message_thing = 'failed';

    $task1 = new Task(function (Task $runner) 
    {
        $something_in_closure = function() use ($runner) {
            echo "running task one\n";
            $runner->complete();
        };
        $something_in_closure();
    });

    /**
     * use $global_message_thing as reference so we can manipulate it
     * This will make sure that the follow up on this one knows the status of what happened here
     */
    $second_follow_up = new Task(function(Task $runner) use (&$global_message_thing)
    { 
        echo "second follow up on task one.\n";
        $global_message_thing = "success";
        $runner->complete();
    });

    /**
     * Just doing things in random order to show that order doesn't really matter with a task scheduler
     * just the follow ups
     */
    $tasks[] = $task1;

    $tasks[] = new Task(function(Task $runner) 
    {
        echo "running task 2\n";
        $runner->complete();
    });

    $task1->addFollowUp(new Task(function(Task $runner) 
    { 
        echo "follow up on task one.\n";
        $runner->complete();
    }));

    $task1->addFollowUp($second_follow_up);

    /**
     * Adding the references to our "status" trackers here to know what to print
     * One will still be on failed because we did nothing with it. this way we know it works properly
     * as a control.
     */
    $second_follow_up->addFollowUp(new Task(function(Task $runner) use (&$global_message_thing, &$second_global_message_thing) {
        if($global_message_thing === "success") {
            echo "follow up on the second follow up, three layers now, w00007!\n";
        }
        if($second_global_message_thing === "success") {
            echo "you don't see this\n";
        }
        $runner->complete();
    }));
    return $tasks;
};
/**
 * Normally you'd use some aggregating function to build up your task
 * list or a collection of classes. I simulated that here with this callback function.
 */
$tasks = $provided_task_scheduler_from_somewhere();

foreach($tasks as $task) {
    $task->run();
}

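This way you can nest tasks that need to follow one another, and with clever use of closures you can pass parameters both to the executing function and to the enclosing objects outside it.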

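In my example, the Task object itself is passed to the executing function, so the executing function can call complete when it has finished its work. When complete is called, the task checks whether it has follow-up tasks scheduled, and if so, those are invoked automatically, working down the chain in the same way.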
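To make the completion mechanism concrete, here is a stripped-down sketch of the same idea without the Task class: each step receives a callback and calls it when it has finished, which starts the next step. The function name runSequence and the URL-D / URL-E labels are illustrative assumptions, and the steps only record their names instead of making real requests.

```php
<?php
// Hypothetical helper: run the steps strictly one after another.
// Each step is a callable that receives a $done callback and must call it when finished.
function runSequence(array $steps): void
{
    if (!$steps) {
        return; // nothing left to run
    }
    $step = array_shift($steps);
    // The next step starts only when the current one reports completion.
    $step(function () use ($steps): void {
        runSequence($steps);
    });
}

$log = [];
runSequence([
    function (callable $done) use (&$log) { $log[] = 'URL-D'; $done(); },
    function (callable $done) use (&$log) { $log[] = 'URL-E'; $done(); },
]);
// $log now holds the steps in the order they ran: URL-D first, then URL-E.
```

Because a step decides when to call its callback, the same shape also works when the work inside a step finishes later, for example when a transfer completes.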

This is a basic task scheduler, but it should help you plan your steps in the order you want them executed.
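The follow-up idea can also be applied to curl_multi directly. The sketch below is one possible way to do it, not code from the question or from the linked blog post: every chain starts its first URL in parallel with the others, and the next URL of a chain is added to the multi handle only after the previous one has finished. The function names groupIntoChains and runChains, and the row keys header_id, seq and url, are assumptions made for illustration.

```php
<?php
// Hypothetical helper: group task rows into ordered chains keyed by header ID.
// Each row is assumed to look like ['header_id' => 7, 'seq' => 1, 'url' => '...'].
function groupIntoChains(array $tasks): array
{
    usort($tasks, function (array $a, array $b): int {
        return [$a['header_id'], $a['seq']] <=> [$b['header_id'], $b['seq']];
    });
    $chains = [];
    foreach ($tasks as $task) {
        $chains[$task['header_id']][] = $task['url'];
    }
    return $chains;
}

// Run all chains concurrently, but the URLs inside one chain strictly in order.
function runChains(array $chains, int $width = 10): array
{
    $master  = curl_multi_init();
    $results = [];                   // chain key => list of ['url' => ..., 'result' => ...]
    $active  = [];                   // transfers in flight: [handle, chain key, url]
    $waiting = array_keys($chains);  // chains that have not started yet

    // Start the next URL of a chain, if the chain still has one.
    $startNext = function ($key) use (&$chains, &$active, $master): void {
        if (empty($chains[$key])) {
            return;
        }
        $url = array_shift($chains[$key]);
        $ch  = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => 15,
            CURLOPT_TIMEOUT        => 15,
        ]);
        curl_multi_add_handle($master, $ch);
        $active[] = [$ch, $key, $url];
    };

    do {
        // Top up to $width concurrent transfers with chains that have not started.
        while (count($active) < $width && $waiting) {
            $startNext(array_shift($waiting));
        }
        curl_multi_exec($master, $running);
        if ($running > 0) {
            curl_multi_select($master, 1.0); // wait for activity instead of spinning
        }
        while ($done = curl_multi_info_read($master)) {
            foreach ($active as $i => [$ch, $key, $url]) {
                if ($ch !== $done['handle']) {
                    continue;
                }
                $results[$key][] = ['url' => $url, 'result' => curl_multi_getcontent($ch)];
                curl_multi_remove_handle($master, $ch);
                curl_close($ch);
                unset($active[$i]);
                $startNext($key); // the follow-up in this chain starts only now
                break;
            }
        }
    } while ($active || $waiting);

    curl_multi_close($master);
    return $results;
}
```

With the example from the question, a call could look like runChains([['URL-A'], ['URL-B'], ['URL-C'], ['URL-D', 'URL-E']]): the first four requests start together and URL-E is added once URL-D completes. A single-detail task is simply a chain of length one, so both arrays can go through the same function.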


0 votes

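Here is an easier-to-understand example, from: http://arguments.callee.info/2010/02/21/multiple-curl-requests-with-php/

The curl_multi_init family of functions lets you combine cURL handles and execute them simultaneously.

Example

Build the individual requests, but do not execute them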

$ch_1 = curl_init('http://webservice.one.com/');
$ch_2 = curl_init('http://webservice.two.com/');
curl_setopt($ch_1, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch_2, CURLOPT_RETURNTRANSFER, true);

Build the multi-curl handle, adding both $ch handles

$mh = curl_multi_init();
curl_multi_add_handle($mh, $ch_1);
curl_multi_add_handle($mh, $ch_2);

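Execute all queries simultaneously, and continue when all are complete

$running = null;
do {
    curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of spinning the CPU
    }
} while ($running);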

Close the handles

curl_multi_remove_handle($mh, $ch_1);
curl_multi_remove_handle($mh, $ch_2);
curl_multi_close($mh);

All of our requests are done, so we can now access the results

$response_1 = curl_multi_getcontent($ch_1);
$response_2 = curl_multi_getcontent($ch_2);
echo "$response_1 $response_2"; // output results

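If both websites take one second to respond, running the two requests concurrently like this, instead of one after the other, cuts the page load time roughly in half!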
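The steps above can be wrapped in one reusable function that also reports which transfers failed. This is a rough sketch rather than code from the linked article; the function name fetchAll, its parameters and the shape of the returned array are assumptions.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently and return, per input index,
// either the response body or a cURL error message.
function fetchAll(array $urls, float $selectTimeout = 1.0): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $i => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$i] = $ch;
    }

    $codes = []; // input index => CURLE_* result code of the finished transfer
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, $selectTimeout); // sleep until there is activity
        }
        while ($info = curl_multi_info_read($mh)) {
            $i = array_search($info['handle'], $handles, true);
            $codes[$i] = $info['result'];
        }
    } while ($running > 0 && $status === CURLM_OK);

    $out = [];
    foreach ($handles as $i => $ch) {
        $code = $codes[$i] ?? null; // null means the transfer never reported completion
        if ($code === CURLE_OK) {
            $out[$i] = ['url' => $urls[$i], 'ok' => true, 'body' => curl_multi_getcontent($ch)];
        } else {
            $message = $code === null ? 'transfer did not finish' : curl_strerror($code);
            $out[$i] = ['url' => $urls[$i], 'ok' => false, 'error' => $message];
        }
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $out;
}
```

A call such as fetchAll(['http://webservice.one.com/', 'http://webservice.two.com/']) would return one entry per URL, so the caller can check the ok flag before using a body.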

Reference: https://www.php.net/manual/en/function.curl-multi-init.php
