Tuesday, October 23, 2012

Python Templates Benchmark

Python template engines make markup code highly reusable. Content developers rely on the following features most of the time:
  • Includes: useful for incorporating snippets of content that are usually common to the whole site, e.g. footer, scripts, styles, etc.
  • Extends: useful for defining a master layout with placeholders for the majority of the site content, e.g. sidebar, horizontal menu, content, etc. Content developers extend the master layout by filling in the available placeholders.
  • Widgets: usually small snippets of highly reusable markup, e.g. list item, button, etc. Content developers use widgets to increase readability and enforce consistency of design.
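To make the include and extend ideas concrete, here is a minimal sketch that uses only the standard library's string.Template. It is not the syntax of any engine benchmarked here; the snippet names (FOOTER, MASTER) and the helper functions are hypothetical and exist only for illustration.

```python
from string import Template

# Hypothetical shared snippet, common to every page of the site.
FOOTER = Template("<footer>$year</footer>")

# Hypothetical master layout with two placeholders: content and footer.
MASTER = Template("<html><body>$content$footer</body></html>")


def include_footer(year):
    """Include: render a shared snippet so any page can reuse it."""
    return FOOTER.substitute(year=year)


def extend_master(content, year):
    """Extend: fill the master layout's placeholders with page content."""
    return MASTER.substitute(content=content, footer=include_footer(year))


page = extend_master("<h1>Hello</h1>", 2015)
print(page)
```

Real engines resolve includes and layout inheritance by template name and compile the result, but the division of labor is the same: shared snippets live in one place and pages only supply what differs.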
We will examine all of the features mentioned above. The tests are executed in an isolated environment using CPython 2.7. Latest available versions (April 2, 2015):
  1. django 1.8
  2. jinja2 2.7.3
  3. mako 1.0.1
  4. tenjin 1.1.1
  5. tornado 4.1
  6. wheezy.template 0.1.159
Includes & Extends: in this test case, an initial version of the HTML content is refactored to use the include and extend features of the respective template engine.
Widgets: this test case looks at how a widget is built and used.
02-single - the widget is built so that the loop is inside it; 03-loop - the widget represents a single item that is rendered in a loop.
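The difference between the two widget variants can be sketched with plain Python functions standing in for template widgets. The function names and the markup below are assumptions made for illustration, not code from the benchmark.

```python
def list_widget(names):
    """02-single style: the widget owns the loop and is called once."""
    return "".join("<li>%s</li>" % name for name in names)


def item_widget(name):
    """03-loop style: the widget renders one item; the caller owns the loop."""
    return "<li>%s</li>" % name


names = ["a", "b", "c"]

# One widget call in total.
single = "<ul>%s</ul>" % list_widget(names)

# One widget call per item.
looped = "<ul>%s</ul>" % "".join(item_widget(name) for name in names)

print(single == looped)
```

Both variants produce the same markup, but the second one invokes the widget once per item. That may be part of why 03-loop shows noticeably more calls than 02-single when len(names) == 10.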
Here are the raw numbers:
05-template

len(items) == 0

01-initial         msec    rps  tcalls  funcs
django            14148   7068     162     52
jinja2             2508  39874      27     21
mako               6634  15073      47     35
tenjin             2178  45911      27     21
tornado            2441  40965      30     18
wheezy.template     428 233700      14      8

02-include         msec    rps  tcalls  funcs
django            30875   3239     384     65
jinja2            12579   7950     109     39
mako              25971   3850     149     49
tenjin             7022  14241      88     27
tornado            2452  40787      37     18
wheezy.template     930 107577      29     13

03-extends         msec    rps  tcalls  funcs
django            47068   2125     621     83
jinja2            15531   6439     162     45
mako              36221   2761     225     63
tenjin            10336   9675     118     35
tornado            2530  39521      42     18
wheezy.template    1693  59065      47     18

04-preprocess      msec    rps  tcalls  funcs
django                 not available
jinja2                 not available
mako                   not available
tenjin                 not available
tornado                not available
wheezy.template     428 233559      14      8

len(items) == 10

01-initial         msec    rps  tcalls  funcs
django           109984    909    1321     61
jinja2             9093  10997     118     21
mako              11140   8976     136     35
tenjin             4723  21175      46     21
tornado            7438  13444     181     18
wheezy.template    1663  60130      73      8

02-include         msec    rps  tcalls  funcs
django           129224    774    1543     72
jinja2            19485   5132     200     39
mako              30257   3305     238     49
tenjin             9721  10287     107     27
tornado            7506  13323     188     18
wheezy.template    2188  45703      88     13

03-extends         msec    rps  tcalls  funcs
django           145839    686    1780     88
jinja2            23096   4330     275     45
mako              41884   2388     314     63
tenjin            13674   7313     155     38
tornado            7721  12952     193     18
wheezy.template    3056  32723     106     18

04-preprocess      msec    rps  tcalls  funcs
django                 not available
jinja2                 not available
mako                   not available
tenjin                 not available
tornado                not available
wheezy.template    1636  61125      73      8

06-widgets

len(names) == 0

01-initial         msec    rps  tcalls  funcs
django             7003  14280      78     40
jinja2             1740  57481      17     15
mako               5237  19096      35     32
tenjin             1470  68009      24     20
tornado            1895  52782      18     15
wheezy.template     212 471948       8      7

02-single          msec    rps  tcalls  funcs
django            12089   8272     131     41
jinja2             5190  19269      44     37
mako              11703   8545      74     50
tenjin             2958  33802      45     24
tornado            5133  19483      49     28
wheezy.template     390 256581      14      9

03-loop            msec    rps  tcalls  funcs
django             7459  13407      78     40
jinja2             4017  24896      33     31
mako              11387   8782      65     49
tenjin             1446  69149      24     20
tornado            1629  61386      18     15
wheezy.template     286 349846       9      8

len(names) == 1

01-initial         msec    rps  tcalls  funcs
django            12767   7833     138     56
jinja2             2502  39961      22     19
mako               5888  16985      42     35
tenjin             1971  50726      26     21
tornado            2315  43191      30     18
wheezy.template     360 277594      12      8

02-single          msec    rps  tcalls  funcs
django            21315   4691     191     57
jinja2             6184  16169      49     41
mako              11839   8447      81     52
tenjin             3615  27663      47     25
tornado            5511  18146      61     31
wheezy.template     551 181354      18     10

03-loop            msec    rps  tcalls  funcs
django            18787   5323     191     57
jinja2             6096  16404      49     40
mako              12241   8169      81     52
tenjin             3468  28838      43     25
tornado            6080  16448      61     31
wheezy.template     561 178407      18     10

len(names) == 10

01-initial         msec    rps  tcalls  funcs
django            52911   1890     660     56
jinja2             6754  14805      67     19
mako               9697  10312     105     35
tenjin             3636  27505      44     21
tornado            6618  15111     138     18
wheezy.template    1073  93161      48      8

02-single          msec    rps  tcalls  funcs
django            60454   1654     713     57
jinja2            11482   8709      94     41
mako              15761   6345     144     52
tenjin             5867  17045      65     25
tornado            9884  10118     169     31
wheezy.template    1401  71395      54     10

03-loop            msec    rps  tcalls  funcs
django           114332    875    1190     57
jinja2            20982   4766     193     40
mako              20198   4951     225     52
tenjin            22042   4537     214     25
tornado           45955   2176     448     31
wheezy.template    2596  38516      99     10
msec - total time taken in milliseconds, rps - requests processed per second, tcalls - total number of calls made by the corresponding template engine, funcs - number of unique functions used.
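For readers who want to collect similar columns for their own code, the following is a minimal sketch using only the standard library (timeit, cProfile, pstats). It is not the benchmark's actual harness: render is a hypothetical stand-in for a template render call, and the default of 100,000 iterations is an assumption, chosen because the published figures are consistent with roughly that count (for example, 14148 msec and 7068 rps).

```python
import cProfile
import pstats
import timeit


def render():
    """Hypothetical stand-in for a template render call."""
    return "".join("<li>%d</li>" % i for i in range(10))


def measure(func, number=100000):
    """Return (msec, rps) for `number` calls of func."""
    func()  # warm up, so one-time setup such as compilation is not timed
    seconds = timeit.timeit(func, number=number)
    return int(seconds * 1000), int(number / seconds)


def count_calls(func):
    """Return (tcalls, funcs) for a single profiled call of func."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # total_calls sums every recorded call; stats.stats has one entry
    # per distinct function seen by the profiler.
    return stats.total_calls, len(stats.stats)


msec, rps = measure(render, number=10000)
tcalls, funcs = count_calls(render)
print("%-16s %6d %6d %7d %6d" % ("stand-in", msec, rps, tcalls, funcs))
```

Timing and profiling are done in separate passes here because the profiler adds overhead that would distort the msec and rps figures.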

Setup and Run

Prerequisites for running this on a clean Debian testing installation:
apt-get install make python-dev python-virtualenv \
    unzip
The source code is hosted on Bitbucket. Clone it into some directory and set up the virtual environments (this will download all necessary package dependencies for each framework listed above).
hg clone https://bitbucket.org/akorn/helloworld
cd helloworld
make env -sC 05-template
make env -sC 06-widgets
Note that you can run this benchmark using any version of Python, including PyPy. Here are the make targets to use:
make env VERSION=3.3
make pypy
Once the environment is ready, cd to the benchmark directory and run:
env/bin/python benchmark.py
Environment Specification:
  • Intel Core 2 Duo @ 2.4 GHz x 2
  • OS X 10.10, Python 2.7.9
Python has a number of template engines. A trivial use case gives you an idea of where a particular template engine stands in terms of performance and internal efficiency.

13 comments:

  1. I followed your series of web components benchmarks but I'm not sure how relevant they are to the "real world". If you take the slowest template engine in the slowest test it is 596 RPS = about 2ms per request. And for a fast website using SQL it takes let's say 50ms (probably more) to render complete HTML response, with majority of time spent waiting for SQL server and logging to files.

    So even the slowest template engine takes only 4% of time, and you're optimizing this 4% part. I'm not saying it's useless to optimize these (for sure there are use cases and loads when it matters), only that for most people functionality is what matters much more.

    Replies
    1. Why not put cache between web application and database? That changes use case dramatically. Don't you think?

    2. I think when cached values are kilobytes in size, even memcached gives >1ms latencies. Also, a single uncached HDD access means at least 8-10ms latency (not necessarily SQL access, it can be e.g. logging). I would say that 50ms for computing a server response for a real-world website is very fast.

    3. I would suggest taking a look at the C10K topic. The question is about efficiency.

  2. Thank you for the nice benchmark, especially for handing out the source. It would be interesting to see how django templates perform if you activate the cached template loader (I assume it will be considerably faster):

    TEMPLATE_LOADERS = (
        ('django.template.loaders.cached.Loader', (
            'django.template.loaders.filesystem.Loader',
            'django.template.loaders.app_directories.Loader',
        )),
    )

    Replies
    1. I believe this is how it is used, actually. See the source:
      https://bitbucket.org/akorn/helloworld/src/tip/05-template/django/app.py

  3. I always get annoyed by these "how relevant are your benchmarks on an application?" / "Use a cache" comments. Well, template rendering IS a part of an application, and it's easier and wiser to benchmark isolated parts than the whole. Everyone knows what caching can do for a web application and that a web application is not just templating. He's not claiming application-for-application performance, but TEMPLATING. So, thanks for the test, it is very relevant.

    By the way, not all applications using templates are web ones, and not every template is cacheable.

  4. Nice article! It would be interesting if Tenjin were included in it.

    Replies
    1. I have updated the post; tenjin is in the list now.

  5. Regarding Jinja - is it Jinja2 without compilation of templates to Python code?

    As far as I know if you compile templates to .py it runs amazingly fast. Probably your tests reflect just compilation part and not the render itself.

    Replies
    1. The tests do not reflect the compilation part. Consider taking a look at the benchmark source code.

    2. From a quick look at the source code I do not see that you use https://github.com/MiCHiLU/jinja2-precompiler - it should give some serious speed-up.

    3. The first use of a template is compilation (the warm-up in the benchmark); the result (the compiled template) is reused from then on.
