Google App Engine

When Google App Engine first launched I didn’t really have much to say about it. It was neat, but I failed to fully comprehend what it represented. Only after I read a transcript of Steve Yegge’s recent talk, in which he briefly mentioned it, did I realize the significance of this service. It was really an off-hand comment, but something just popped in my head the moment he mentioned it. Which is not surprising at all:

[Image: croppercapture77-custom.jpg]

Btw, I just wanted to mention that Google App Engine and Google Apps for Your Domain are really heading for a branding conflict. Not even the Google search engine itself can easily distinguish between the two. One ought to be renamed at some point to spare users the confusion.

But why is this service interesting? It gives us a brand new hosting model. You want to build small, throwaway web applications? Look no further – all you need is a valid Gmail account and some knack for Python scripting. That’s all. No configuration, no maintenance, and no hassle – you even get a free subdomain at appspot.com.

In the past you really just had to buy yourself some hosting space. There was no way around it. In most cases you ended up with a heavy-duty plan with your own domain name, lots of space, a full LAMP stack (or something equivalent that needs to be maintained), ssh access and various other benefits. That was often overkill, and a considerable investment of both time and money compared to what you actually needed.

Google, on the other hand, provides you with a basic environment that’s relatively easy to pick up, and lets you start hacking immediately. It is a bit like what Geocities was to us in the early days of the web – free hosting where anyone can put their personal web app. Especially if you don’t want to pay a monthly fee because you are broke, or because your parents won’t let you use their credit card to buy hosting. It’s a place for young and inspired teenagers, or lazy and bored developers with a cool idea but no inclination to host and maintain. The mundane details are abstracted – you don’t need a source code repository, because each time you deploy Google stores a copy of your old code so that you can roll back.

You don’t need to worry about database-specific stuff – like setting up users, connecting to the db, etc. – it’s all built in, automated and behind the scenes. It’s actually even easier than in RoR, where you still have to create a database configuration file, build migration files and then “rake” them. Google App Engine just requires that you import the db module, and write a Model class specifying data types for the fields you want to store – right there in your code. And the fun part is that you can modify your data schema by just modifying the code. On top of that you get a neat web interface letting you maintain and browse through the stored data.
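To give a feel for what "schema in your code" means, here is a rough sketch in plain Python. It is not the App Engine SDK: `Property`, `Model` and `Greeting` below are toy stand-ins I made up so the example runs anywhere, and the in-memory store is obviously an assumption. In the real SDK the base class and the field types come from the db module (for example `db.Model` and `db.StringProperty`), but the shape of the code is much the same.

```python
import datetime


class Property:
    """Toy typed field; a stand-in for the SDK's property classes."""

    def __init__(self, py_type, default=None):
        self.py_type = py_type
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the attribute name this field was assigned to.
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        # Reject values of the wrong type, like a typed column would.
        if value is not None and not isinstance(value, self.py_type):
            raise TypeError(f"{self.name} expects {self.py_type.__name__}")
        instance.__dict__[self.name] = value


class Model:
    """Toy base class: keeps entities in an in-memory dict, keyed by kind."""

    _store = {}

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    def put(self):
        Model._store.setdefault(type(self).__name__, []).append(self)

    @classmethod
    def all(cls):
        return list(Model._store.get(cls.__name__, []))


class Greeting(Model):
    # The "schema" is just these class attributes. Adding a field
    # here is all it takes to change the data model.
    author = Property(str)
    content = Property(str, default="")
    posted = Property(datetime.datetime)
```

Creating and storing a record is then just `Greeting(author="luke", content="hello").put()`, and `Greeting.all()` gives the stored entities back. No config file, no migration step.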

You don’t need to worry about an authentication scheme since you can just drop in Google’s authentication instead. This way they store the passwords, and they are responsible for making things secure and efficient. Implementing an authentication scheme yourself is always a hassle – so “outsourcing” it this way is generally a good idea – especially for small projects, or when you are feeling lazy.

Getting dugg, or DDOS’ed (aren’t the two the same though?) is no longer your problem, but Google’s problem – and they are pretty good at load balancing, throttling and dealing with situations like that. The scalability of Google App Engine can actually be a major draw – for anyone. For example, Jaiku – a somewhat popular Twitter clone – is migrating to the Google App Engine platform for that very reason: scalability. This move will allow them to concentrate on delivering value to end users via new features and bug fixes rather than fighting a constant logistical battle trying to build up their infrastructure to handle ever increasing server load. It’s not an easy battle to win, as Twitter users can attest. Scaling up is arguably the most expensive and most labor-intensive element of running a popular web application. If Jaiku can offload this cost onto Google they gain an important advantage over Twitter.

So what Google is offering us is a really powerful and versatile package, and it’s free. All you need to do is create an account and start hacking. If you look through their app library you will see all kinds of interesting ideas popping up. Some are big apps like Jaiku, and some are small hobby projects. Out of the latter my favorite must be the TweetWheel. Just try it and you will see that it is an interesting concept, but not really something you would actually want to pay to host.

This is a profound step toward increasingly complex scriptable server-side systems. It will never really replace traditional hosting. Since Google App Engine is a sandbox environment, a lot of apps would be simply impractical or inconvenient to run there. But it offers a compelling new alternative for established applications, startups and small hobby projects alike. And it is more than that. This is an experiment in allowing users to run and host their code on your servers.

We all know that moddable and scriptable systems are valuable because the community that grows around them adds immense value. Look at Firefox for example – with millions of extensions and add-ons that do everything from checking your email to playing music. Some are actually incredible, paradigm-breaking ideas such as Greasemonkey, which allows you to inject your own Javascript into a page while it is rendered, allowing you to control its layout and even add features never considered by its creators. This is the type of creativity that can be spurred only by a fully scriptable system. If you want another success story look at Emacs – a text editor which started as a very basic text processing engine, and grew into the monster that we know and love today.

So scriptable and moddable applications on the desktop are a good thing. And we have pretty much figured out ways to make sure scripts and mods can’t totally break your app – at least not easily. We also know that moddable web applications are great too – everyone loves Twitter apps, Facebook apps and Google Maps mashups. These are great examples of how a community can add to a great service. This is why every service which wants to be popular creates a public API. But if you want to write a Facebook app, or a Twitter app or something like that, you have to host it somewhere. And your hosting plan must be robust enough to survive a sudden spike in popularity. I have seen many a Facebook app go under or become unusable because its servers could not handle the load. And this is a problem – not everyone can afford to have a dedicated server just to host an add-on for a popular online service.

This is the problem that Google App Engine solves. Google is one of the few companies out there which have figured out how to safely let people run arbitrary code on their servers without getting pwned by script kiddies. They have found a way to safely, securely and efficiently sandbox user-created web applications from the rest of their system. And this is huge. It is a proof of concept that this sort of thing can be done, and it can be done safely.

The implication of this experiment is that people can now start building extendable systems which not only expose a public API but will host and maintain user-submitted add-ons, making them much more reliable, responsive and flexible. After all, if you are hosting the app there is no reason to communicate via XML, JSON or whatever your public API is using. You can simply let users call a function and return an object they can manipulate, eliminating the overhead of TCP/IP transmission and serializing/deserializing data. I don’t think anyone is doing that yet, but Google App Engine is a first step in that direction. In the future we may see a brand new species of online services – ones that grow, transform themselves and adapt, allowing a vibrant community to add new features and modules. Picture this: Emacs-like flexibility and malleability, but for a web service. This could be huge…
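Here is a tiny sketch of the difference I have in mind, in plain Python. Everything in it is hypothetical: `recent_posts` stands in for some function inside the service, `api_endpoint` stands in for its public JSON API (a real one would sit behind an HTTP request), and the two add-ons are made-up examples.

```python
import json


def recent_posts(user):
    """Hypothetical service function returning plain Python objects."""
    return [{"user": user, "text": "hello"}, {"user": user, "text": "world"}]


def api_endpoint(user):
    """What a remote add-on sees: the same data serialized to JSON text."""
    return json.dumps(recent_posts(user))


def remote_addon(user):
    """Add-on hosted elsewhere: must deserialize the response before use."""
    payload = api_endpoint(user)  # stands in for a network round trip
    return [post["text"].upper() for post in json.loads(payload)]


def hosted_addon(user):
    """Add-on hosted by the service itself: calls the function directly."""
    return [post["text"].upper() for post in recent_posts(user)]
```

Both add-ons produce the same result, but the hosted one skips the encode, transmit and decode steps entirely and works on the original objects.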

[tags]google app engine, scripting, python, web services, web applications[/tags]




7 Responses to Google App Engine

  1. Naum says:

    been meaning to do a writeup of GAE as I have tinkered a bit with it, and generally like it, but there are severe shortcomings if the platform is to be used for anything other than trivial, toy-like applications…

    it’s also rekindled an interest in python, which I did some coding in nearly 10 years ago, way back before python v2 even existed. oddly, my coder brain seems to wrap itself around python much more naturally than ruby, which i love too (but recognize its warts)…

    first the GOOD:

    1. as you’ve alluded to, drop dead simple to develop, without all the annoyances and grievances of deployment. one-click upload and deploy.

    2. open to using whatever platform (well, at least the controller/view deal, not the database implementation, where you can adapt a library of your choosing but you’re stuck with the GQL scheme (which has serious drawbacks as I explain further down))

    3. python is a great language and though some complain about lack of alternatives, python works for me

    4. auto user functionality built-in

    however, app engine is an immature platform and faces obstacles for any serious usage:

    the BAD:

    1. the datastore model – you’re really limited to an indexed flat file / single table storage deal. no joins (which i don’t have a big beef with), 1000-result fetch limit (you shouldn’t really be doing more DB access in an online transaction!), and those are not show stoppers. what is, is that you are unable to do aggregate functions like count, average, sum, etc… without hacking in an inelegant solution that totally violates atomicity. worse, if you mangle the DB (as i will get to in #2 in a second), no way to really clean it up… in terms of table relations, you can have foreign keys and even setup a many to many scheme, but it’s fugly and again you will have to tread carefully and clean up yourself after deletes, and still are limited to 1000:1 ratio

    2. #1 wouldn’t be bad if you had capacity to run background/batch jobs (not asking for ever-executing daemons, just cron type scripts to run and perform summarize | DB scrub | etc….). I understand that the Google Lords don’t want people spidering the entire net, but they could limit such executions to 15 (or 10 or whatever cap) minutes or less and still charge for the CPU|GPU cost incurred…

    3. without ability to do serious work in a transaction, how is it possible to bulk upload? even data files are limited to 1M. in a toy app i coded, i needed to hit /usr/share/dict/words and in the virtual space provided, didn’t see how to get it on server but had to restrict my word list to less than 9 chars so the file could fit under 1M…

    4. the given examples use a webapp framework that is really limited when it comes to handling sessions, cookies or anything more detailed than a simple app with a handful of pages. this can easily be mitigated as Google themselves have a Django app-engine helper that makes it easy (though it’s still a scaled down Django without the models, middleware, and other parts). i would highly recommend using that or employing web2py (or another simple python framework like web.py).

  2. Luke Maciak says:

    Very good points. The lack of aggregate queries in GQL is indeed incredibly annoying. I think the idea is to do a SELECT * and then aggregate in your code or something – which is boneheaded. I’m guessing since the indexes are just flat files like you said, they just can’t delegate this sort of thing to the database – and perhaps they didn’t feel like implementing parsing for COUNT(), AVG(), etc., seeing how it would still boil down to a select statement and a simple loop.

    I never really expected cron-job like behavior on this service. I think it is one of those limitations I talked about. But you are right – most people would be perfectly happy with a quota of some sort – even if it was 1 operation per hour. But this sort of thing is ripe for abuse so I’m not surprised they didn’t offer it.

    #3 is interesting – Jaiku is moving to Google App Engine. I suspect they are not bound by these limitations. So it’s possible that in the future Google may offer Pro accounts without these limits.

    As for #4 you pretty much answered it. Even though their framework is limited, you can easily drop in and use another one – as long as you can get it working with the data store, and change all calls to urllib2 into urlfetch :P

  3. Naum says:

    if you watch the video presentations from google io, the datastore lead walks through some hacks to implement aggregate type functions and many to many relationships — they’re ugly hacks, though the many to many is passable if your data model is going to be less than or equal to 1000 to 1. the aggregate hack is fugly and violates any aspect of atomicity. worse, if you mess up, there is literally no way to repair your data.

    now, on the google group forums, there is quite a bit of clever engineering around those restrictions by chaining requests.

    i’m rooting for them to succeed but given current platform features, i doubt anybody will use other than some toy apps. and given the traffic i see on the google group boards, it doesn’t appear that developers are exactly flocking there…

    another point i didn’t mention is that there is a cap on each transaction even if you are under the DB call quota you still can be flagged for excessive CPU. for example, i coded a simple app that does ONE DB FETCH and unpickles/pickles (Python equivalent of PHP serialize, Ruby marshal) a 200K object graph for which my logs show that i exceed the minimum quota and will be charged out of another queue that could shutdown my app. they don’t list the caps for that special bucket either, so i guess it’s at whim for now…

    batch job or some sort of DB export/import is needed… …in 2008, just not customer friendly to not offer migration/export function and that limits appengine to just a toy that will get some oohs and ahs for things like wordle but fairly limited in what can be accomplished (how many bookmarking/vote a link sites do we really need?)

  4. Jules Carney says:

    I have to say, you did a really nice job of explaining something that can be really tricky at times. There are times that I struggle with wrapping my head around topics like this, so thank you for summing it up well.

    Thanks!

  5. Pingback: Terminally Incoherent » Blog Archive » Your Homepage on Google AppEngine

  6. delareyeslevi says:

    Nice, really.
    Many like me have tried and realized the same as in this blog:-
    http://bygsoft.wordpress.com/2010/01/09/cloudy-combo-google-app-engine-and-amazon-s3-combo-pack/

  7. RoHa says:

    Hi Luke, I released a tiny framework to manage a homepage on Google AppEngine online. Once installed, you can upload and edit files / scripts without the SDK.

