Tuesday, October 23, 2012

Python Templates Benchmark

Python template engines make markup code highly reusable. Content developers rely on the following features most of the time:
  • Includes: useful to incorporate some snippets of content that in most cases are common to the site, e.g. footer, scripts, styles, etc.
  • Extends: useful to define a master layout for the majority of the site content with placeholders, e.g. sidebar, horizontal menu, content, etc. The content developers extend the master layout by substituting available placeholders.
  • Widgets: usually small snippets of highly reusable markup, e.g. list item, button, etc. The content developers use widgets to increase readability and enforce consistency of design.
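To make the three features concrete, here is an engine-agnostic sketch in which plain Python functions stand in for templates. It is not taken from any of the benchmarked engines, and every name in it (footer, button, master, page) is hypothetical.

```python
# Engine-agnostic sketch: plain functions stand in for templates.
# All names below are hypothetical and exist only for illustration.

def footer():
    # "Include": a shared snippet pulled into many pages.
    return "<footer>(c) example</footer>"


def button(label):
    # "Widget": a small, highly reusable piece of markup.
    return '<button class="btn">%s</button>' % label


def master(content, sidebar=""):
    # "Extends": a master layout with placeholders that pages fill in.
    return "<html><body><aside>%s</aside><main>%s</main>%s</body></html>" % (
        sidebar, content, footer())


def page():
    # A page extends the master layout by substituting its placeholders.
    return master(content="<h1>Hello</h1>" + button("OK"), sidebar="menu")


html = page()
```

Each engine in the list below expresses the same three ideas with its own syntax; the benchmark measures what that machinery costs at render time.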
We will examine all of the features mentioned above. The tests are executed in an isolated environment using CPython 2.7. Latest available versions (November 23, 2013):
  1. django 1.6
  2. jinja2 2.7.1
  3. mako 0.9.0
  4. tenjin 1.1.1
  5. tornado 3.1.1
  6. wheezy.template 0.1.135
Includes & Extends: in this test case the initial version of the HTML content is refactored to use the include and extend features of the respective template engine.
Widgets: this test case looks at how a widget is built and used.
02-single - the widget is built so that the loop is inside it; 03-loop - the widget represents a single item and is rendered in a loop.
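The difference between the two widget styles can be sketched with plain Python functions. This is only an illustration under assumed names (names_list, name_item, and the calls counter are hypothetical), not the benchmark's own templates.

```python
# Hypothetical widget functions standing in for template widgets.
calls = {"single": 0, "loop": 0}


def names_list(names):
    # 02-single style: one widget call, the loop lives inside the widget.
    calls["single"] += 1
    return "<ul>%s</ul>" % "".join("<li>%s</li>" % n for n in names)


def name_item(name):
    # 03-loop style: the widget renders one item and the caller loops.
    calls["loop"] += 1
    return "<li>%s</li>" % name


names = ["name-%d" % i for i in range(10)]
single = names_list(names)
loop = "<ul>%s</ul>" % "".join(name_item(n) for n in names)
```

Both variants produce the same markup, but the second invokes the widget once per item. That is one plausible reason why 03-loop shows more calls than 02-single once the list grows.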
Here are the raw numbers:
05-template

len(items) == 0

01-initial         msec    rps  tcalls  funcs
django            14241   7022     182     43
jinja2             2818  35484      27     21
mako               7223  13844      47     35
tenjin             2140  46738      27     21
tornado            2500  39996      30     18
wheezy.template     453 220703      14      8

02-include         msec    rps  tcalls  funcs
django            26690   3747     380     49
jinja2            13397   7465     109     39
mako              28374   3524     149     49
tenjin             7326  13650      88     27
tornado            2446  40889      37     18
wheezy.template     937 106685      29     13

03-extends         msec    rps  tcalls  funcs
django            45915   2178     638     72
jinja2            16550   6042     162     45
mako              38184   2619     225     63
tenjin            11088   9019     118     35
tornado            2550  39208      42     18
wheezy.template    1721  58110      47     18

04-preprocess      msec    rps  tcalls  funcs
django                 not available
jinja2                 not available
mako                   not available
tenjin                 not available
tornado                not available
wheezy.template     438 228383      14      8

len(items) == 10

01-initial         msec    rps  tcalls  funcs
django           129787    770    1586     52
jinja2            10932   9147     118     21
mako              12324   8114     136     35
tenjin             5153  19406      46     21
tornado            7493  13345     181     18
wheezy.template    1737  57568      73      8

02-include         msec    rps  tcalls  funcs
django           143243    698    1797     68
jinja2            21771   4593     200     39
mako              33042   3026     238     49
tenjin            10572   9459     107     27
tornado            7503  13329     188     18
wheezy.template    2268  44084      88     13

03-extends         msec    rps  tcalls  funcs
django           163957    610    2042     76
jinja2            25039   3994     275     45
mako              43356   2307     314     63
tenjin            14875   6723     155     38
tornado            7635  13097     193     18
wheezy.template    3236  30901     106     18

04-preprocess      msec    rps  tcalls  funcs
django                 not available
jinja2                 not available
mako                   not available
tenjin                 not available
tornado                not available
wheezy.template    1734  57680      73      8

06-widgets

len(names) == 0

01-initial         msec    rps  tcalls  funcs
django             6070  16475      80     36
jinja2             2010  49761      17     15
mako               5203  19219      35     32
tenjin             1596  62640      24     20
tornado            1920  52078      18     15
wheezy.template     250 399690       8      7

02-single          msec    rps  tcalls  funcs
django             9919  10081     130     37
jinja2             5846  17105      44     37
mako              11594   8625      74     50
tenjin             3341  29932      45     24
tornado            5287  18916      49     28
wheezy.template     447 223704      14      9

03-loop            msec    rps  tcalls  funcs
django             6120  16340      80     36
jinja2             4372  22871      33     31
mako              11091   9016      65     49
tenjin             1677  59645      24     20
tornado            1834  54540      18     15
wheezy.template     289 345959       9      8

len(names) == 1

01-initial         msec    rps  tcalls  funcs
django            12617   7926     154     48
jinja2             2756  36280      22     19
mako               6086  16431      42     35
tenjin             2176  45961      26     21
tornado            2527  39573      30     18
wheezy.template     368 271877      12      8

02-single          msec    rps  tcalls  funcs
django            16757   5967     204     49
jinja2             6797  14712      49     41
mako              12581   7948      81     52
tenjin             3989  25069      47     25
tornado            5955  16793      61     31
wheezy.template     590 169409      18     10

03-loop            msec    rps  tcalls  funcs
django            16679   5995     204     49
jinja2             6572  15215      49     40
mako              12437   8041      81     52
tenjin             3884  25744      43     25
tornado            5973  16742      61     31
wheezy.template     638 156695      18     10

len(names) == 10

01-initial         msec    rps  tcalls  funcs
django            64694   1546     802     48
jinja2             8324  12014      67     19
mako              10209   9795     105     35
tenjin             4201  23806      44     21
tornado            6474  15447     138     18
wheezy.template    1193  83819      48      8

02-single          msec    rps  tcalls  funcs
django            68542   1459     852     49
jinja2            11350   8810      94     41
mako              16666   6000     144     52
tenjin             5961  16776      65     25
tornado           10105   9896     169     31
wheezy.template    1518  65883      54     10

03-loop            msec    rps  tcalls  funcs
django           106102    942    1302     49
jinja2            23026   4343     193     40
mako              21423   4668     225     52
tenjin            22085   4528     214     25
tornado           42007   2381     448     31
wheezy.template    2713  36858      99     10
msec - total time taken in milliseconds; rps - requests processed per second; tcalls - total number of calls made by the corresponding template engine; funcs - number of unique functions used.
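For readers who want to collect the same four columns for their own code, the following is a minimal sketch using only the standard library (timeit, cProfile, pstats). It is an assumption about how such numbers can be gathered, not a copy of the benchmark script; render and measure are hypothetical names. The published figures are consistent with 100,000 renders per test (for example, 7022 rps x 14.241 s is about 100,000), but that iteration count is inferred from the table rather than stated in the post.

```python
import cProfile
import pstats
import timeit


def render():
    # Hypothetical stand-in for a template render call.
    return "".join("<li>%d</li>" % i for i in range(10))


def measure(func, number=1000):
    # msec: total wall time for `number` calls; rps: calls per second.
    seconds = timeit.Timer(func).timeit(number=number)
    msec = seconds * 1000.0
    rps = number / seconds
    # tcalls / funcs: profile one call and read the totals from pstats.
    profile = cProfile.Profile()
    profile.runcall(func)
    stats = pstats.Stats(profile)
    return msec, rps, stats.total_calls, len(stats.stats)


msec, rps, tcalls, funcs = measure(render)
print("%-16s %6.0f %6.0f %7d %6d" % ("render", msec, rps, tcalls, funcs))
```

Timing and profiling are kept separate here because the profiler adds overhead that would distort the msec and rps columns.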

Setup and Run

Prerequisites for running this on a clean Debian testing installation:
apt-get install make python-dev python-virtualenv \
    unzip
The source code is hosted on Bitbucket. Clone it into some directory and set up the virtual environment (this will download all necessary package dependencies for each framework listed above).
hg clone https://bitbucket.org/akorn/helloworld
cd helloworld
make env -sC 05-template
make env -sC 06-widgets
Note that you can run this benchmark using any version of Python, including PyPy. Here are the make targets to use:
make env VERSION=3.3
make pypy
Once the environment is ready, cd to the benchmark directory and run:
env/bin/python benchmark.py
Environment Specification:
  • Intel Core 2 Duo @ 2.4 GHz x 2
  • OS X 10.9, Python 2.7.6
Python has a number of template engines. A trivial use case gives you an idea of where a particular template engine stands in terms of performance and internal efficiency.

13 comments:

  1. I followed your series of web components benchmarks but I'm not sure how relevant they are to the "real world". If you take the slowest template engine in the slowest test it is 596 RPS = about 2ms per request. And for a fast website using SQL it takes let's say 50ms (probably more) to render complete HTML response, with majority of time spent waiting for SQL server and logging to files.

    So even the slowest template engine takes only 4% of time, and you're optimizing this 4% part. I'm not saying it's useless to optimize these (for sure there are use cases and loads when it matters), only that for most people functionality is what matters much more.

    1. Why not put cache between web application and database? That changes use case dramatically. Don't you think?

    2. I think when cached values are kilobytes in size, even memcached gives >1ms latencies. Also, a single uncached HDD access means at least 8-10ms latency (not necessarily SQL access, it can be e.g. logging). I would say that 50ms for computing a server response for a real-world website is very fast.

    3. I would suggest taking a look at the C10K topic. The question is about efficiency.

  2. Thank you for the nice benchmark, especially for handing out the source. It would be interesting to see how django templates perform if you activate the cached template loader (I assume it will be considerably faster):

    TEMPLATE_LOADERS = (
        ('django.template.loaders.cached.Loader', (
            'django.template.loaders.filesystem.Loader',
            'django.template.loaders.app_directories.Loader',
        )),
    )

    1. I believe this is how it is used, actually. See the source:
      https://bitbucket.org/akorn/helloworld/src/tip/05-template/django/app.py

  3. I always get annoyed by these "how relevant are your benchmarks to an application?" / "Use a cache" comments. Well, template rendering IS a part of an application, and it's easier and wiser to benchmark isolated parts than the whole. Everyone knows what caching can do for a web application and that a web application is not just templating. He's not making claims about whole-application performance, but about TEMPLATING. So thanks for the test, it is very relevant.

    By the way, not all applications using templates are web ones, and not every template is cacheable.

  4. Nice article! It would be interesting if Tenjin were included in it.

    1. I have updated the post; tenjin is in the list now.

  5. Regarding Jinja: is it Jinja2 without template compilation to Python code?

    As far as I know, if you compile templates to .py it runs amazingly fast. Probably your tests reflect just the compilation part and not the render itself.

    1. The tests do not reflect the compilation part. Consider taking a look at the benchmark's source code.

    2. From a quick look at the source code I do not see that you use https://github.com/MiCHiLU/jinja2-precompiler - it should give a serious speedup.

    3. The first use of a template triggers compilation (the warm-up in the benchmark); the result (the compiled template) is reused on every render after that.
