Monday, July 2, 2012

Big Improvements!

Update- Blog moved to: http://saleemabdulhamid.com/blog/2012/7/big-improvements

Introduction

Another day, another (two) optimizations- one of which turned out to be a really big improvement.
Previous posts in this series:

Caching the File Contents

The first optimization concerns reading files in from the file system. Unlike a traditional static file server, we need the contents as a string in JavaScript memory (as opposed to being stored as a buffer) for two reasons- (a) we need to parse files for dependencies (which could be done before the server starts listening, since we cache those dependencies) and (b) we need to dynamically join multiple files into optimized bundles. In the non-optimized version of mundlejs, a file is read from the file system every time it is requested, or every time a file that has it as a dependency is requested. As of this commit, the file contents are kept in memory, so each file only needs to be read from the file system the first time it is needed. In the future, as part of a deploy mode, we could load all the file contents into memory before the server starts listening. Here are the results with this optimization:
Again we have a definite improvement, but we're still not on the same order of magnitude as our competitors.

Caching the Bundled JavaScript in Buffers

The last very obvious, low-hanging-fruit-with-a-lot-of-promise optimization has two components.
Although bundling of JavaScript happens dynamically, for a given file request and list of already-loaded dependencies, the bundle that comes back will always be the same. Depending on the workflows users take through a client-side application, there is usually a pretty high chance of multiple clients requesting the same bundle in this fashion. So it's not necessary to compare the already-loaded files to the dependencies of the requested file and build the bundle from scratch every time- we can just cache the bundle the first time it is created. Since we are already caching the dependency list and the file contents, this is probably not a huge optimization on its own, but it should still have some positive effect in cutting down the amount of computation spent per request.
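Since the bundle is fully determined by the requested file plus the set of modules the client already has, that pair makes a natural cache key. A sketch, with hypothetical names (`buildBundle` stands in for whatever mundlejs actually does to assemble a bundle):

```javascript
// Hypothetical bundle cache keyed by (requested file, already-loaded deps).
const bundleCache = {};

function getBundle(requestedFile, loadedDeps, buildBundle) {
  // Sort the loaded-deps list so that ordering differences between
  // clients don't defeat the cache.
  const key = requestedFile + '|' + loadedDeps.slice().sort().join(',');
  if (!(key in bundleCache)) {
    // Build from scratch only the first time this combination is seen.
    bundleCache[key] = buildBundle(requestedFile, loadedDeps);
  }
  return bundleCache[key];
}
```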
Caching these bundles opens up an opportunity to optimize our biggest bottleneck. Because the server response needs to be built up dynamically, we build it as a string in memory. When it is served, V8 has to make a copy of the string- because the address of a JavaScript variable in memory can change at runtime, the data can't be sent directly to the network. This copy is very expensive and is incurred on every request. The solution: since we're already caching the bundles, instead of caching them as strings in memory, convert them to buffers and cache those. Buffers point to real memory locations outside of the V8 heap and can therefore be sent directly to the network on every subsequent request for that particular bundle.
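Concretely, the buffer caching might look like the sketch below (illustrative only- `getBundleBuffer` is a made-up name, and modern `Buffer.from` stands in for the older Buffer constructor API):

```javascript
// Hypothetical cache that stores bundles as Buffers rather than strings,
// so node can hand the bytes to the socket without V8 copying the
// string on every response.
const bufferCache = {};

function getBundleBuffer(key, bundleString) {
  if (!(key in bufferCache)) {
    // Pay the string-to-buffer copy once, when the cache entry is filled...
    bufferCache[key] = Buffer.from(bundleString, 'utf8');
  }
  // ...then every later response can write this same Buffer directly.
  return bufferCache[key];
}
```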
Here are the results of this optimization:
Suddenly, we're on the same order of magnitude performance as Apache and Connect, and in fact seem to be beating Connect!
Now that we have performance in the same ballpark as these others, I will try to rerun the benchmarks on a higher concurrency range, with a larger number of iterations, and on a clean reboot for each run. Perhaps I'll also benchmark some other popular servers, and tune the other servers' settings, to get more accurate results.

Next Steps

I'll probably take a little break from optimizations now to focus on improving test coverage, especially of the client-side code, which is only covered by manual tests at this time.

Sunday, July 1, 2012

Optimizing Mundlejs: Caching the dependency parse

Update- Blog moved to: http://saleemabdulhamid.com/blog/2012/7/optimizing-mundlejs-cacheing-the-dependency-parse

Introduction

Following my previous efforts at establishing a baseline benchmark for mundlejs, I took my first step at optimization. Previously, the contents of every JavaScript file requested were parsed for dependencies, and this occurred on every request from a different client. As of this commit, the direct dependencies of each file are cached, so the next time that file is requested it is not necessary to parse it again. This is a precursor to being able to pre-parse all files before the server starts serving, which will be one of the features of the "deploy mode." Because the cache is not yet shared between worker processes (we don't have cluster support built in yet), in the benchmarks below all parses are done 8 times- or however many cores are on the testing machine.
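The shape of this cache is simple- a sketch with hypothetical names, where a naive regex stands in for mundlejs's real dependency parser:

```javascript
// Hypothetical per-file dependency cache. The regex below is only a
// stand-in for the real parse of require() calls in the source.
const depCache = {};

function getDependencies(path, source) {
  if (!(path in depCache)) {
    // Parse only on the first request for this file.
    const deps = [];
    const re = /require\(\s*['"]([^'"]+)['"]\s*\)/g;
    let match;
    while ((match = re.exec(source)) !== null) {
      deps.push(match[1]);
    }
    depCache[path] = deps;
  }
  return depCache[path];
}
```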

Results

We're still not on the same order of magnitude as Apache and Connect, but there was a significant improvement. The little dip at one data point is probably just an artifact of something else happening on my machine at the same time; I don't think I'll explore it further, preferring to spend my time on the next step in optimization.