Mathjax

Wednesday, October 23, 2013

Improving the performance of reversehuffman.encrypt, Part II

In the encrypt algorithm, one stage requires a scan of the frequency table to generate a list of potential next characters based off of the previous n-1 characters. In the previous versions of the algorithm, this was performed using a linear scan. Since the frequency table is sorted in string order, an obvious optimization is to use a binary search and terminate the linear scan early.

The results:

 ImplementationUser Time
(Baseline) String.indexOf(...) === 0   1m21.828s
String.beginsWith(...)1m10.723s
beginsWith(str, prefix)1m11.457s
Binary search0m01.713s

Now, that's a nice result.

For completeness, my updated detailed profile results:

Statistical profiling result from /home/starr/svn/plainsight-addon/lib/v8.log, (1550 ticks, 0 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
    433   27.9%    0.0%  /usr/lib/libv8.so.3.14.5
     16    1.0%    0.0%  /lib/x86_64-linux-gnu/libc-2.17.so
     13    0.8%    0.0%  7fff58bfe000-7fff58c00000
      1    0.1%    0.0%  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.18
      1    0.1%    0.0%  /usr/bin/nodejs
      1    0.1%    0.0%  /lib/x86_64-linux-gnu/ld-2.17.so

 [JavaScript]:
   ticks  total  nonlib   name
    115    7.4%   10.6%  LazyCompile: Object.defineProperty.value /home/starr/svn/plainsight-addon/lib/reversehuffman.js:28
     83    5.4%    7.6%  LazyCompile: huffmantree /home/starr/svn/plainsight-addon/lib/textutils.js:220
     71    4.6%    6.5%  LazyCompile: *InsertionSort native array.js:764
     68    4.4%    6.3%  LazyCompile: *nextChars /home/starr/svn/plainsight-addon/lib/reversehuffman.js:170
     54    3.5%    5.0%  LazyCompile: *binaryInsertionPoint /home/starr/svn/plainsight-addon/lib/textutils.js:20
     53    3.4%    4.9%  Stub: CompareICStub
     47    3.0%    4.3%  Stub: FastNewClosureStub
     41    2.6%    3.8%  LazyCompile: *sort native array.js:741
     39    2.5%    3.6%  LazyCompile: *reduce native array.js:1381
     37    2.4%    3.4%  Stub: CompareICStub {2}
     35    2.3%    3.2%  LazyCompile: *self.push /home/starr/svn/plainsight-addon/lib/priority_queue.js:129
     35    2.3%    3.2%  LazyCompile: *QuickSort native array.js:793
     34    2.2%    3.1%  LazyCompile: *self.pop /home/starr/svn/plainsight-addon/lib/priority_queue.js:57
     31    2.0%    2.9%  Builtin: A builtin from the snapshot


The function nextChars is where I implemented the binary search.

Improving the performance of reversehuffman.encrypt, Part I

The Reverse Huffman encryption and decryption routines are the most expensive operations within the add-on. One of my beta testers reports that it took him 108 seconds to encrypt 116 words (of Lorem Ipsum) on his computer. Let's bring that down.

I created this script for profiling use:

 var fs = require("fs");  
 var rh = require("./reversehuffman");  
 var textutils = require("./textutils");  
 let source = fs.readFileSync("../doc/independence.txt", {encoding: "utf-8", flag: "r"});  
 let ft = textutils.frequencyTable(4, source);  
 for(let i = 0; i < 1000; i++) {  
      rh.encrypt("this is some plain text", ft);  
 }  

Running my initial implementation through the profiler, I get the following report (report cropped):

Statistical profiling result from /home/starr/svn/plainsight-addon/lib/v8.log, (77854 ticks, 0 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
  31451   40.4%    0.0%  /usr/lib/libv8.so.3.14.5
     36    0.0%    0.0%  /lib/x86_64-linux-gnu/libc-2.17.so
     16    0.0%    0.0%  7fff2d9fe000-7fff2da00000
      3    0.0%    0.0%  /usr/bin/nodejs
      1    0.0%    0.0%  /lib/x86_64-linux-gnu/libpthread-2.17.so
      1    0.0%    0.0%  /lib/x86_64-linux-gnu/libm-2.17.so


 [JavaScript]:
   ticks  total  nonlib   name
  21179   27.2%   45.7%  LazyCompile: encrypt /home/starr/svn/plainsight-addon/lib/reversehuffman.js:60
   8920   11.5%   19.2%  LazyCompile: *indexOf native string.js:118
   5013    6.4%   10.8%  Stub: KeyedLoadElementStub
   3718    4.8%    8.0%  Stub: CEntryStub
    395    0.5%    0.9%  KeyedLoadIC: args_count: 0
    177    0.2%    0.4%  LazyCompile: huffmantree /home/starr/svn/plainsight-addon/lib/textutils.js:219
    128    0.2%    0.3%  Stub: FastNewClosureStub
     85    0.1%    0.2%  LazyCompile: *sort native array.js:741
     79    0.1%    0.2%  LazyCompile: *slice native string.js:510
     78    0.1%    0.2%  LazyCompile: *InsertionSort native array.js:764
     60    0.1%    0.1%  Builtin: A builtin from the snapshot
     58    0.1%    0.1%  Stub: FastCloneShallowObjectStub
     54    0.1%    0.1%  LazyCompile: *self.push /home/starr/svn/plainsight-addon/lib/priority_queue.js:129
     52    0.1%    0.1%  LazyCompile: *QuickSort native array.js:793
     51    0.1%    0.1%  LazyCompile: huffmanencodetoken /home/starr/svn/plainsight-addon/lib/textutils.js:179


Within the table, the 'ticks' represents the number of times the sampling profiler detected that the code was within that function. Overall, the baseline test takes 1 minute, 21.828 seconds of user time.

(By impact, I should focus on encrypt first,  but I'll go after indexOf first because the costs to improve are much smaller.)

A surprising amount of time is spent in indexOf, within the encryption algorithm, we are scanning the key value and extracting strings that start with a prefix. Mozilla provides a String.startsWith function, but it is implemented in terms of computing indexOf and comparing against zero.

As an alternative, I implement a beginsWith alternate function:

 if (!String.prototype.beginsWith) {  
   Object.defineProperty(String.prototype, 'beginsWith', {  
     enumerable: false,  
     configurable: false,  
     writable: false,  
     value: function (prefix) {  
       "use strict";  
       if(this.length < prefix.length) {  
         return false;  
       } else {  
         for(let i = 0; i < prefix.length; i++) {  
           if(this.charAt(i) != prefix.charAt(i)) {  
             return false;  
           }  
         }  
         return true;  
       }  
     }  
   });  
 }  

Re-running the tests, I get:

ImplementationUser Time
(Baseline) String.indexOf(...) === 0   1m21.828s
String.beginsWith(...)1m10.723s

That shows a nice gross improvement, but what do the more detailed results say?

Statistical profiling result from /home/starr/svn/plainsight-addon/lib/v8.log, (65316 ticks, 0 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
   2253    3.4%    0.0%  /usr/lib/libv8.so.3.14.5
    524    0.8%    0.0%  /lib/x86_64-linux-gnu/libc-2.17.so
     57    0.1%    0.0%  7ffff93fe000-7ffff9400000
      2    0.0%    0.0%  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.18
      1    0.0%    0.0%  /lib/x86_64-linux-gnu/libm-2.17.so

 [JavaScript]:
   ticks  total  nonlib   name
  28448   43.6%   45.5%  LazyCompile: Object.defineProperty.value /home/starr/svn/plainsight-addon/lib/reversehuffman.js:22
  18774   28.7%   30.0%  LazyCompile: encrypt /home/starr/svn/plainsight-addon/lib/reversehuffman.js:81
   4278    6.5%    6.8%  Stub: KeyedLoadElementStub
   3993    6.1%    6.4%  Stub: FastNewBlockContextStub
   3549    5.4%    5.7%  Stub: CompareICStub {1}
    774    1.2%    1.2%  Stub: CompareICStub {2}
    398    0.6%    0.6%  Stub: ToBooleanStub_Bool
    390    0.6%    0.6%  KeyedLoadIC: args_count: 0
    142    0.2%    0.2%  LazyCompile: huffmantree /home/starr/svn/plainsight-addon/lib/textutils.js:219
    137    0.2%    0.2%  Stub: FastNewClosureStub
     93    0.1%    0.1%  LazyCompile: *InsertionSort native array.js:764
     90    0.1%    0.1%  LazyCompile: *sort native array.js:741
     82    0.1%    0.1%  LazyCompile: *slice native string.js:510


The Object.defineProperty.value refers to the new beginsWith code. Why is that so expensive? To see if it is a one-time expense versus a high overhead, I incremented the iterations from 1000 to 10000 in my profiling script and re-ran. That run took 11m43.463s of user time (703.463 seconds), which is almost exactly ten times the previous amount of time. The detailed profiler results showed the same -- the number of ticks is scaling with the number of iterations, so it is not a setup expense.

Furthermore, making the function stand-alone is not helpful (although it does make the detailed results easier to understand since the actual function being executed isn't hiding behind Object.defineProperty.value):

ImplementationUser Time
(Baseline) String.indexOf(...) === 0   1m21.828s
String.beginsWith(...)1m10.723s
beginsWith(str, prefix)1m11.457s

So, at this point, we've made a slight improvement in performance but we need to now work on the main algorithm rather than improving a subsidiary function.




Tuesday, October 22, 2013

Profiling Mozilla Add-ons (Stopgap)

There is currently no good way to profile Firefox JavaScript add-on code. For code that runs within the context of a web page, there are a number of options available (e.g. Firebug), but those solutions do not work for code that runs within the browser itself. For many add-ons, this is acceptable since they do not include any complex computations, but my add-on, Hide in Plain Sight, includes some expensive operations so I needed a profiler in order to perform evidence-based optimization.

My stopgap measure is to use the V8 JavaScript engine profiler. This is the JavaScript engine used in Chrome. Profiling code that will run in FireFox with the Chrome engine is not ideal, but I hypothesize that the two engines have similar gross performance characteristics.

Installation

In Ubuntu 13.10, using apt-get, I installed nodejs. (The version of nodejs installed with 12.04 many versions behind.)

Getting tools/linux-tick-processor (and the other functions necessary to parse the profiling output) was more involved.

Based on my history (i.e. I haven't double-checked these steps), this is what I did:

 $ apt-get source libv8  
 $ cd libv8-3.8.9.20  
 $ make dependencies  
 $ make native werror=no  

Running the Profiler
First, you need some code to profile. Not ideally, I wrote a driving script for frequencyTable, one of the functions I wanted to profile:

 var fs = require("fs");  
 var textutils = require("./textutils");  
   
 let source = fs.readFileSync("../doc/independence.txt", {encoding: "utf-8", flag: "r"});  
 
 for(let i = 0; i < 10000; i++) {  
      textutils.frequencyTable(4, source);  
 }

The "fs" library is provided by node.js.

Inside my add-ons lib/ directory, I ran the command:

 $ nodejs --harmony --use_strict --prof prof_frequencyTable.js  

The '--harmony' and '--use_strict' arguments are necessary to force node.js to allow "let" and other modern JavaScript features.

This command will run your script and create a "v8.log" file in your current directory.

To extract some useful data from v8.log, change directory to your libv8 directory and run:

 $ tools/linux-tick-processor path/to/your/v8.log  

This will write to stdout a profiling report.


Monday, October 21, 2013

Limited Beta Release of Hide in Plain Sight add-on

Today, we initiated a limited beta of the Hide in Plain Sight Firefox add-on. This release implements a Reverse Huffman steganographic algorithm and key management capabilities.

After limited beta, our plan is to push the add-on to addons.mozilla.org to simplify installation and maintenance and improve security.

Superior algorithms and releases for other browsers (Chrome has the highest priority) are on the road map.


Sunday, October 13, 2013

Impact of Corpus Size on Quality of Frequency Table

Hypothesis

The reverse Huffman steganography technique uses as input a n-character long frequency table drawn from a corpus. We know that the quality of the output increases as n grows, albeit not monotonically. n cannot grow beyond the size of the corpus and the number of branches declines as n grows leading to greater spatial inefficiency in the encoding process. We hypothesize that there are similar declines in the marginal improvement as the corpus size, s, increases.

Methodology
During the encoding process, the algorithm walks through a Huffman encoding of the frequency list to determine what tokens to emit based on a binary interpretation of the input. The actual frequency values are relatively unimportant; it is the frequency values in relation to each other that impact how the Huffman tree is generated. To measure quality, we compared the frequency tables for different sizes of s, where the different versions of the corpus are drawn from the same work.

As our distance measure, we used the Euclidean distance of the normalized frequency vectors. The tokens represent dimensions and the frequency counts are the lengths. For instance, the distance between u={a:4, b:2, c:1} and v={a:8, b:4, c:2} is zero.

Choice of Corpus
We used Darwin's The Origin of Species as the corpus for this test. Species is a fairly large work with greater than 893,000 characters in size, which allowed us to break it into 88 "chunks" of 10,149 characters. We then aggregated the chunks in increasingly large sizes via the Fibonacci sequence to arrive at nine corpora segments. The sizes of each of the corpora segments, along with sizes of representative works for comparison, are depicted in Figure 1.

Figure 1: Size of corpora increases per Fibonacci sequence
Sizes of representative works were taken from the "Plain Text UTF-8" version posted to Project Gutenberg.

For each of the nine corpora, we generated a frequency list for n=2 to 5, inclusive. We then compared the frequency lists for each pair of corpora.

Results
In the tables below, the rows and columns indicate the corpora number and the contents are the distance score for that pair (lower numbers mean that the corpora are more similar to each other).

Figure 2: Distance when n=2
Figure 3: Distance when n=3
Figure 4: Distance when n=4

Figure 5: Distance when n=5

In Figure 6 below, we show the same data as in the tables above, except the x-axis represents the ratio of size (in characters) of the two corpora being compared and the y-axis is the computed distance. Although it appears that distances converge to some average value as the ratio of sizes increase, that is almost certainly an artifact of the fact that we have fewer instances of those ratios.

Figure 6: Ratio of Corpora Sizes versus Distances
Conclusion
Figure 6 suggests that a frequency table generated from one segment of a single corpus will remain fairly distinct from a frequency table generated from another segment of the same corpus and that distinction is fairly scale invariant. This is evidence for rejecting the hypothesis. The results could be peculiar to our choice of corpus; for instance, one chapter might be written in a very different style than another chapter in the same book. However, it is also promising because it suggests that if the key is constructed from a subset of a corpus, a leak of the identity of that corpus may reveal less information to the attacker than previously assumed.