Package home | Report new bug | New search | Development Roadmap Status: Open | Feedback | All | Closed Since Version 1.6.6

Bug #21081 Fix performance bottlenecks in pipes handling
Submitted: 2016-06-14 17:31 UTC
From: alec Assigned: alec
Status: Closed Package: Crypt_GPG (version 1.4.1)
PHP Version: Irrelevant OS:
Roadmaps: 1.4.2    
Subscription  


 [2016-06-14 17:31 UTC] alec (Aleksander Machniak)
Description: ------------ I continued my investigation on why Crypt_GPG is much slower than gnupg in command line. See #21077 and #21080. Here I finally found where's the bottleneck and have some code that makes Crypt_GPG overhead non-existent for encrypting and/or signing when working with both strings and files. Consider this line of code: https://github.com/pear/Crypt_GPG/blob/master/Crypt/GPG/Engine.php#L1335. For input string of 60MB (and chunk size 65KB) it will be executed at least 1000 times. Such operation is very expensive when working with such a big data. Getting rid of this memory re-allocations makes performance more or less the same as in command line. Hurray! One issue still, for investigation. The process uses 150MB of peak memory, better than before but still it's 2-3 times the data size, could be better. There's another bottlenect when working with files instead of strings. Reading from GPG output is faster than writting to a file, which makes that $inputBuffer grows too much, making the same issue as above + big memory peak usage when it's not expected at all - we use files instead of strings for a reason, right? So, my patch will fix this too, making encryption of 60MB file to use 1.5MB of peak mem. diff --git a/Crypt/GPG/Engine.php b/Crypt/GPG/Engine.php index ed01eba..3fe3585 100644 --- a/Crypt/GPG/Engine.php +++ b/Crypt/GPG/Engine.php @@ -1210,6 +1210,7 @@ class Crypt_GPG_Engine // select loop delay in milliseconds $delay = 0; + $inputPosition = 0; while (true) { @@ -1311,7 +1312,7 @@ class Crypt_GPG_Engine $chunk = Crypt_GPG_ByteUtils::substr( $inputBuffer, - 0, + $inputPosition, self::CHUNK_SIZE ); @@ -1332,15 +1333,29 @@ class Crypt_GPG_Engine $this->_closePipe(self::FD_INPUT); } else { $this->_debug('=> wrote ' . $length . ' bytes'); - $inputBuffer = Crypt_GPG_ByteUtils::substr( - $inputBuffer, - $length - ); + // Move the position pointer, don't modify $inputBuffer + // when working on a string + if (is_string($this->_input)) { + $inputPosition += $length; + } + else { + $inputPosition = 0; + $inputBuffer = Crypt_GPG_ByteUtils::substr( + $inputBuffer, + $length + ); + } } } // read input (from PHP stream) - if (in_array($this->_input, $inputStreams, true)) { + if (in_array($this->_input, $inputStreams, true) + // If inputBuffer is too big wait until it's smaller, we do this + // because when working with files we don't want to use too much + // memory, this can happen because encoding is faster than writing + // to a file + && Crypt_GPG_ByteUtils::strlen($inputBuffer) < self::CHUNK_SIZE + ) { $this->_debug('input stream is ready for reading'); $this->_debug( '=> about to read ' . self::CHUNK_SIZE . I'll apply the patch soon. Need to investigate what's the impact of #21080 and #21077 with above fix applied.

Comments

 [2016-06-14 17:40 UTC] alec (Aleksander Machniak)
I did some tests. It looks that improvements from #21077 and #21080 are completely negligible now. When working with smaller data buffer, the chunk size as well as mbstring use does not cause any perf. increase nor degradation.
 [2016-06-14 17:48 UTC] alec (Aleksander Machniak)
One correction. This "There's another bottlenect when working with files instead of strings. Reading from GPG output is faster than writting to a file" should be "There's another bottlenect when working with files instead of strings. Reading from file is faster than writting to gnupg process".
 [2016-06-15 11:22 UTC] alec (Aleksander Machniak)
-Status: Open +Status: Closed -Assigned To: +Assigned To: alec
This bug has been fixed in SVN. If this was a documentation problem, the fix will appear on pear.php.net by the end of next Sunday (CET). If this was a problem with the pear.php.net website, the change should be live shortly. Otherwise, the fix will appear in the package's next release. Thank you for the report and for helping us make PEAR better.
 [2016-06-15 14:18 UTC] gauthierm (Michael Gauthier)
This is awesome!
 [2016-06-16 08:24 UTC] alec (Aleksander Machniak)
For completeness. My statement "The process uses 150MB of peak memory, better than before but still it's 2-3 times the data size, could be better." was not really correct. The memory use here is precisely as expected, because it's 60MB of input string + 84MB of output string. I described my findings also in my blog: https://kolabian.wordpress.com/2016/06/15/how-i-made-crypt_gpg-100-times-faster/