Using streams in WordPress HTTP requests

The problem

Making server-to-server requests is a common task. You might want to call remote REST services or download content from other servers. Sometimes you need to download large data, such as images or large text files, directly from your PHP code. In this scenario you have to mind the execution limits of a PHP script, namely the memory and time allowed by your server configuration. Sometimes you are also using WordPress and its high-level interface (wp_remote_request). What if you have to use wp_remote_request to download a big file in a server-to-server request? How do you avoid a classic error like the one below?

PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 229901081 bytes)  

Some approaches

The reason to keep using wp_remote_request (or wp_remote_get, wp_remote_post, etc.) is that your application runs behind a proxy and you want a single place to configure it, and that place has to be at the WordPress level. So the prerequisite is that we have to use the wp_remote_request family of functions. Under the hood, this family of functions uses three transports: curl, streams and fsockopen. Each transport is implemented in its own WordPress class and they should all work the same way, so I'll ignore transport-dependent variations. When you download content server side, your code looks like this:

$args = array( 'timeout' => 30, 'blocking' => true, 'user-agent' => 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0' ); 
$url = 'http://www.otherserver.com/image_very_big.jpg';
$response = wp_remote_get($url, $args);

We are telling WordPress to make a server-to-server request to the URL defined in $url with the given arguments. Once we have the response we can retrieve the image and write it to a file:

if(! ($response instanceof \WP_Error) ){  
   $fp = fopen('/path/to/save/file.jpg', "w"); 
   if($fp === false){ 
       return null;
   } 
   fwrite($fp, $response['body']); 
   fclose($fp); 
}

The problem here is that, if the file is too big, $response['body'] cannot hold it and PHP fails with the memory error shown above. Diving into the WordPress code you will come across the wp_remote_request parameters, among them 'stream' and 'filename'. On the WordPress Codex site these two parameters are not documented anywhere or, at least, I wasn't able to find them. They work this way anyway. If you specify:

$args['stream'] = true; 
$args['filename'] = '/path/to/download/file.jpg';

WordPress will use the stream opened by cURL to write the body content directly to the path given in the 'filename' argument instead of populating the body of the response. In particular it skips the following line of code:

$this->body .= $data;

which is evil, because it accumulates the whole body in memory! So let's rewrite our code with some fallbacks:

$attachment_download_path = '/path/to/download/file.jpg'; 
$args['stream'] = true; 
$args['filename'] = $attachment_download_path;
$response = wp_remote_get($url, $args);
if($response instanceof \WP_Error){  
    $args['stream'] = false; 
    $args['filename'] = null; 
    $response = wp_remote_get($url, $args);
    if($response instanceof \WP_Error){ 
        return null; 
    } 
    $fp = fopen($attachment_download_path, "w");
    if($fp === false){ 
        return null; 
    } 
    fwrite($fp, $response['body']); 
    fclose($fp); 
}

If the new approach actually downloads the file, everything is OK; otherwise we fall back to the old approach and read the body of the response (which can fail, of course!).
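A WP_Error only covers transport-level failures. As far as I can tell, a streamed request writes whatever body the server sends to the target file, so an error page could end up there too. Below is a minimal sketch of an extra check, not a definitive implementation: my_streamed_download_ok is a hypothetical helper name, and treating only status 200 and a non-empty file as success is an assumption you may want to relax.

```php
<?php
// Hypothetical helper (not part of WordPress): returns true only when the
// streamed download looks usable.
function my_streamed_download_ok( $response, $path ) {
    // Transport-level failure (DNS error, timeout, ...).
    if ( is_wp_error( $response ) ) {
        return false;
    }
    // Assumption: only a plain 200 counts as success here.
    if ( 200 !== (int) wp_remote_retrieve_response_code( $response ) ) {
        return false;
    }
    // Drop PHP's cached stat data before asking for the file size.
    clearstatcache( true, $path );
    return is_file( $path ) && filesize( $path ) > 0;
}
```

You would call it right after the streamed wp_remote_get, passing the response and the download path, and delete the partial file before falling back if it returns false.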

Problems, again

This approach is error prone too. The problem is that if you set the 'blocking' parameter to false, your file will not be downloaded at all, at least in WordPress 3.6. According to the documentation, the 'blocking' parameter acts as follows:

The 'blocking' argument allows you to trigger a non-blocking request. The default is true; setting it to false will generally allow PHP to continue execution while the transport is working. The key is that when you set blocking to false, then it will just send the request and won't bother you with the details. This is useful for sending a POST request, where you aren't concerned with whether it succeeded or not, or if you don't want to slow down the processing time of the page. (Note that not all the transports support non-blocking requests, and so you may still be blocked anyway. The alternative of setting an ultra-low timeout is not recommended, since low timeouts may cause the request to not be sent at all with some transports.)

I don't know how this could be expected to work with the stream parameter set to true. The WordPress code does not download the response if blocking is set to false, and I believe this is the right thing to do, according to the documentation. So, if you want to use streams to download a large file, your request needs to be blocking.
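One way to make sure nobody passes a non-blocking request by mistake is to build the arguments in a single place. This is only a sketch: my_stream_args and my_stream_download are hypothetical names, not WordPress functions, and the only WordPress call used is wp_remote_get.

```php
<?php
// Hypothetical helper: returns a copy of $args prepared for a streamed download.
function my_stream_args( array $args, $path ) {
    $args['stream']   = true;
    $args['filename'] = $path;
    // A streamed download has to be blocking, otherwise nothing is written.
    $args['blocking'] = true;
    return $args;
}

// Hypothetical wrapper: streams $url to $path and returns the response
// array or WP_Error exactly as wp_remote_get does.
function my_stream_download( $url, $path, array $args = array() ) {
    return wp_remote_get( $url, my_stream_args( $args, $path ) );
}
```

Calling my_stream_download($url, '/path/to/download/file.jpg', $args) then overrides any 'blocking' => false left in $args while keeping the other settings, such as 'timeout' and 'user-agent'.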