PHP CURL Retrieves Partial pages

I have the following CURL code:

$ch = curl_init(); 
curl_setopt($ch, CURLOPT_URL, $url);
if ($postParameters != '') {
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postParameters);
}
curl_setopt($ch, CURLOPT_COOKIEFILE, __DIR__.'/cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, __DIR__.'/cookie.txt');
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt($ch, CURLOPT_REFERER, $referer);
$pageResponse = curl_exec($ch); 
curl_close($ch); 

When I try to fetch pages, most of the time I get the entire page I asked for. However, from time to time I will get only parts of the page, for example:

DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en"> head> meta http-equiv="Content-Type" content="text/html; charset=windows-1251" /> meta name="generator" content="

I removed the "<" in front of the tags so the HTML would display on Stack Exchange. Does anybody know why it suddenly stops receiving? I noticed that the data often stops abruptly right after an opening double quote (e.g. content=" or username="), though I'm not 100% sure it always happens that way. Could this be an encoding issue? Any other ideas?

Any help would be appreciated.


You can try to add some debugging.

Add these options:

curl_setopt($ch, CURLOPT_VERBOSE, true);
$f = fopen(__DIR__ . "/error.log", "w+"); // verbose output is written here
curl_setopt($ch, CURLOPT_STDERR, $f);

And these before curl_close():

if($errno = curl_errno($ch)) {
    $error_message = curl_strerror($errno);
    echo "cURL error ({$errno}):\n {$error_message}";
}

If that reveals nothing, try increasing the timeout and see whether the problem goes away:

curl_setopt($ch, CURLOPT_TIMEOUT, 300); 

If increasing the timeout helps, find out why the transfer takes so long.
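
To tell a cut-off transfer from a complete one without inspecting the HTML by eye, one option is to compare what cURL reports for the transfer. The following is only a minimal sketch: describeTransfer() is a hypothetical helper name, and the length check assumes the server sends a Content-Length header (when it does not, cURL reports -1 and nothing can be concluded). The error numbers 18 and 28 are libcurl's CURLE_PARTIAL_FILE and CURLE_OPERATION_TIMEDOUT.

```php
<?php
// Hypothetical helper (not part of the cURL extension): classify a finished
// transfer from the values returned by curl_errno($ch) and curl_getinfo($ch).
function describeTransfer(int $errno, array $info): string
{
    if ($errno === 28) {          // libcurl CURLE_OPERATION_TIMEDOUT
        return 'timeout';
    }
    if ($errno === 18) {          // libcurl CURLE_PARTIAL_FILE
        return 'partial';
    }
    if ($errno !== 0) {           // any other cURL error
        return 'error';
    }

    // Both values count body bytes as received on the wire, so they remain
    // comparable even when the response is compressed.
    $expected = (float) ($info['download_content_length'] ?? -1);
    $received = (float) ($info['size_download'] ?? 0);

    if ($expected < 0) {          // no Content-Length header: cannot verify
        return 'unknown';
    }
    return $received < $expected ? 'partial' : 'complete';
}

// Intended use, before curl_close($ch):
// echo describeTransfer(curl_errno($ch), curl_getinfo($ch));
```

If this reports "partial" or "timeout" for the truncated pages, the problem lies in the transfer itself rather than in the character encoding.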
