Friday, 19 June 2009

Google Appengine (III) – Tuning means code refactoring

One of the last tuning gigs I was involved with was related to a content recommendation system. The original implementation was a one by one translation of a statistics algorithms into SQL sentences. It generated a huge number of precalculations stored in a multi million record table that were later on reused when visitors were looking for content. It was even worse than that, as the result size grew exponentially with the number of users.

The solution involved creating a set of stored procedures that were able to calculate in submillisecond times the same results without needing to store any intermediate results. Ah, and I discovered a few bugs on the original implementation that rendered the results of all those precalculations incorrect. Overall the customer was very satisfied with the end result.

After we were done with the testing and verification of results, something that is not an easy task with recommendation systems as it basically involves asking people if the results are more or less matching with their tastes, we set out to incorporate the changes to the live system.
Apart of creating the stored procedures, the changes in client application code amounted to five lines of code. Total. This is not unusual. And application down time was zero. Nothing. Users were using the old recommendation and in their next request they were using the new.

Application code changes are expensive, so usually the ratio of cost/benefit has to be quite high to be worth doing any performance optimization work in this area (except if you're dealing with SAP, of course, where everything is ABAP code)

Fact is, by its own design, SQL is an intermediate middleware layer sitting between application code and your data. Performance tuning happens mostly at the SQL layer or below. Seldom application code is changed.

Not so with Google AppEngine.

GAE SQL Dialect



GAE implements something remotely similar to a very small subset of SQL. Now, SELECT, FROM and WHERE have become so popular keywords that each time anyone wants to implement something even remotely related to a query language is using those keywords.

Be it WMI, Lotus Notes ODBC, GAE, Hibernate or whatever, they want to make it easy to learn at least the basics.

But beyond those keywords, they don't provide anything remotely similar to the richness and expressiveness of the SQL language. Simply put, GAE SQL dialect (called GQL) does not go beyond the simplest of data accesses. Fundamentally, GQL is not a set oriented language. Furthermore, whereas SQL provides many different ways of reaching the same final result, GQL usually provides on, if any at all.

Object model



Say that you're playing with performance improvements. You're looking for ways of minimizing the number of IO operations, the number of indexes or both.

Due to the limitations of GQL, none of the different options can be expressed using the query language. The possible choices range from using references, property lists, inheritance, or a combination of them all. The key part of in all of them is that you'll need to change application code to even try them.

This has a number of implications that make the evaluation of different options much more convoluted than in the traditional SQL server scenario.
  • Each time you want to test something, you need to do an extensive data conversion from one object model to another. This means time invested in data conversion code, time invested in running and verifying that code. Furthermore, if you finally decide to go for a change in pursuit of better performance, you have to execute that code in the live environment, and that probably means application downtime.
  • No development environment. GAE SDK does not implement the BigTable functionality or indexing. You cannot simply compare performance of different approaches on the local development environment. You need to use the live GAE for that.
  • Cost. Free GAE has limitations in the amount of IO you can do and the storage you can use, if you reach those during testing, you'll exceed your quota. Either you'll have to pay or wait for quota renewal.

All this is a bit contradictory. While GAE really makes rapid application development easy and attractive, it is still too young to have developed a set of performance best practices. Your best bet is to reach out for all the documentation available and try to get it right in the early stages.

Good documentation, wrong format



Fortunately, Google has created quite good documentation and tutorials and you can get a lot of learning out of those. Especially the Google IO sessions are valuable insights into how the App Engine operates. Unfortunately, most of that is in video format.

Mind you, I'm not whining about the quality of the conferences. They are excellent, and I'm surprised about how good the GAE team are at that. It is a common topic among the developer community that software development profiles do not value much the quality of verbal communication. Not so with the GAE team.

But video as a documentation platform has a lot of problems. First, you have to watch them end to end to really decide if the content is interesting as they are not structured in a book format with chapters, sections, etc. You cannot read the first paragraphs to decide if it's relevant to the topic you're interested in. And second, they cannot be used as reference material or something that you can bookmark to return later. Third, I can read a lot faster than listen to an speech, so what could have been digested in 15 minutes takes an hour instead.

The supporting materials, mostly slides, help to alleviate those problems somewhat. Ironically, they probably also provide the text that also makes those contents easily searchable on Google.

Now, if I just had time to put together a real GAE app.....