Tuesday, November 20, 2012

Python Web Frameworks Excessive Complexity

Cyclomatic (or conditional) complexity is a metric that indicates the complexity of source code. In this post we will look at the source code of several web frameworks and estimate their excessive complexity, that is, anything beyond the recommended level of 10 (the threshold at which code is considered too complex and refactoring is suggested). Here is the list of web frameworks examined:
  1. bottle
  2. cherrypy
  3. circuits
  4. django
  5. falcon
  6. flask
  7. pyramid
  8. pysi
  9. tornado
  10. turbogears
  11. web.py
  12. web2py
  13. webapp2
  14. wheezy.web
The source code is hosted on bitbucket; let's clone it into some directory and set up a virtual environment (this will download the source code of each framework listed above).
hg clone https://bitbucket.org/akorn/helloworld
cd helloworld/04-pep8 && make env up
The makefile has a target for the McCabe metric, so run make mccabe.
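To give a feel for what the metric counts, here is a minimal sketch that approximates cyclomatic complexity as one plus the number of decision points in a function. It is built on Python's standard ast module and is a simplification, not the implementation behind make mccabe; the names approx_complexity and complexities, and the choice of which nodes count as decision points, are assumptions made for illustration.

```python
import ast

# Node types treated as decision points in this simplified count (an assumption).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def approx_complexity(func_node):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of and/or adds one more path.
            score += len(node.values) - 1
    return score


def complexities(source):
    """Map function name -> approximate complexity for every def in source."""
    tree = ast.parse(source)
    return {
        node.name: approx_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in (2, 3):
        if n % d == 0 and n > d:
            return "composite"
    return "other"
'''

print(complexities(SAMPLE))
```

Nested functions are counted both on their own and as part of the enclosing function here, so real tools may report different numbers for the same code.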
Since some web frameworks consist of several packages (developed by the same team), I have combined them this way:
flask += jinja2 + werkzeug
pyramid += chameleon + webob
Here are the raw numbers (November 24, 2013), given as the number of McCabe errors followed by the overall complexity:
bottle => 11 152
chameleon => 15 211
cherrypy => 90 1403
circuits => 26 347
django => 234 3411
falcon => 0 0
flask => 6 84
jinja2 => 27 377
pyramid => 36 570
pysi => 1 10
tornado => 33 491
turbogears => 11 144
web2py => 325 6070
webapp2 => 104 1604
webob => 13 174
webpy => 25 346
werkzeug => 39 495
wheezy.caching => 0 0
wheezy.core => 1 11
wheezy.html => 0 0
wheezy.http => 5 61
wheezy.routing => 1 12
wheezy.security => 1 11
wheezy.template => 0 0
wheezy.validation => 1 12
wheezy.web => 2 21
Python has a number of web frameworks. A trivial McCabe test gives you an idea of where a particular web framework stands in terms of internal complexity. There is a wide field for improvement.
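The excessive complexity figure can be derived from the raw numbers with a few lines of Python. This is a minimal sketch of the formula given in the replies below (cumulative complexity of the flagged functions minus the threshold of 10 for each of them). The function names are illustrative, and treating a score equal to the threshold as flagged is an assumption suggested by the pysi row (1 error, complexity 10).

```python
THRESHOLD = 10  # recommended McCabe limit quoted in the post


def excessive_complexity(errors, total, threshold=THRESHOLD):
    """Cumulative complexity of flagged functions minus threshold per function."""
    return total - errors * threshold


def summarize(scores, threshold=THRESHOLD):
    """Return (errors, total, excess) for a list of per-function scores.

    Assumption: a function is flagged when its score is >= threshold.
    """
    flagged = [s for s in scores if s >= threshold]
    total = sum(flagged)
    return len(flagged), total, excessive_complexity(len(flagged), total, threshold)


# A few rows from the raw numbers above: (errors, overall complexity).
raw = {"bottle": (11, 152), "flask": (6, 84), "django": (234, 3411)}
for name, (errors, total) in raw.items():
    print(name, excessive_complexity(errors, total))
```

Because the threshold is subtracted once per flagged function, many functions that are slightly over the limit can add up to more excess than a few functions that are far over it: summarize([11] * 1000) reports an excess of 1000, while summarize([60] * 10) reports 500.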


  1. I couldn't find the Makefile nor how you calculated "excessive" complexity. Could you please elaborate?

    1. Let me demonstrate with the bottle web framework: there are 8 McCabe errors with a cumulative complexity of 132. The selected threshold is 10, so excessive complexity is calculated as 132-8*10=52; this is the number you see in the chart.

  2. Why do you combine packages (i.e. pyramid + chameleon) but then break out wheezy.* into its constituent packages?

    1. Eh, then wheezy.web should have 8 errors and a complexity of 99.

      Also, there's no requirement to use Jinja2 with Flask, you can use Mako just fine if you want to so it shouldn't get penalised for that, just as wheezy doesn't.

      The comparison imho should be split up farther to take template engines out of the equation since even with Django you can just swap them out and use something else.

    2. Let me please correct your math for wheezy.web: there are 10 McCabe errors accumulating 120 points, thus excessive complexity is 120-10*10=20

      There is no requirement to use Jinja2 with Flask but it is developed by the same core team and is a natural choice. wheezy.web integrates with both Jinja2 and Mako but wheezy.template is counted.

      The options can be endless; however, each web framework offers a single preferred stack of tools, at least according to its core developers, even if they do not say so loudly.

  3. Would have been nice to have your conclusion on that because the numbers here don't speak by themselves ... Results are, sorry, not very clear (or not very clearly presented)

    1. Spend 5 mins and run it yourself: the source is hosted on bitbucket.org and should be pretty easy.

  5. I don't understand this. Isn't the McCabe metric simply a measure of software complexity? Higher isn't necessarily worse; it just means the program is more complex, which is just as likely to be due to more functionality as to poor coding style. Also, as I understand it, the "10 or below" suggestion applies to individual modules, not entire projects; 10 is simply the complexity at which a module ought to be split into submodules.


    1. A program can be supremely complex but its routines, if broken down properly, rarely (if ever) MUST exceed 10. Any routine that scores 10+ can most probably benefit from some refactoring to reduce that number, without sacrificing the feature implemented by said routine.

  6. Pyramid has no ORM, django has one. As an ORM solves a complicated problem, the solution is complex (and complicated). So don't compare apples with oranges. Level the scope of the frameworks (e.g. add SQLAlchemy to pyramid) and compare again.

    1. The web framework developers determine the scope of a web framework. Some prefer to reuse 3rd party packages as simple package dependencies, some include 3rd party source code in their own code base. SQLAlchemy is not an essential part of the pyramid web framework and is not built by its core team. Django promotes its own ORM, and from the web framework's standpoint it is an essential and irreplaceable part of it; even if you try to get rid of it in your own application, it is still there.

  7. It would be nice if you would add some version numbers in there. Also, you've not commented on how complete or standalone the web frameworks are. Django by itself is bound to be much more complex than a web framework that uses an external ORM (or no ORM at all). Not including these kinds of comments really makes the numbers themselves a bit useless as they are not comparing like for like.

    1. I am using the latest available source code straight from each web framework's source control; this is why I state only the date when the check was performed.

      I believe an ORM (or admin app) should not be part of a web framework, and evolving all three separately would not cause confusion.

    2. I can see how some would disagree with you on that remark.

      How "full-featured" a framework should be is mostly a matter of opinion. I would argue that micro-frameworks like Bottle are an entirely different category than application frameworks like Django or Zope. Pyramid probably represents a middle ground where the framework itself is more limited but presents many extension points.

      I definitely agree Django could (and should?) be much more modular but ultimately there simply is a trade-off between internal complexity and performance / ease of use. I wouldn't want to implement a large CRUD application in bottle, for example.

  8. Yeah, you are using the 10 rule, designed for *modules* and arbitrarily applying it to complete systems. As a result, your resulting metric conveys no meaningful information.

    1. The threshold of 10 for CC is applicable to methods, not modules.

      Cyclomatic complexity points you to methods, which reside in modules, which in turn are combined into the packages of the system. Thus modules and packages serve only as a means of aggregation.

  9. CC is a tool to detect overcomplicated methods and functions in order to refactor them. You are misusing this tool in an attempt to estimate the complexity of a whole project. I'm not even sure summing the CC of every method makes sense. The average would have been more useful, along with the number of lines of code.

    1. Right, however the excessive complexity gives you the overall picture of the project. This has nothing to do with the number of lines of source code, since CC does not take that into account by definition.

    2. So if I understand this correctly, your metric would claim that a package made up of 1000 functions or methods, each with a mccabe complexity score of 11 would have an excessive complexity of 1000, while a package with 10 functions each with a mccabe complexity of 60 would have an excessive complexity of 500? In that case, it has everything to do with number of lines of source code, and it seems pretty obvious that the larger frameworks are going to be penalized. Maybe I'm misunderstanding.

    3. A higher excessive complexity leads to a higher refactoring effort. During refactoring you never deal with the whole source base, usually just a single function that requires your attention, so the work is localized. Refactoring a function with high complexity ends with several simple functions being extracted (they are easier to cover with tests), which makes the source more readable while resulting in more lines of code. Excessive complexity especially concerns larger frameworks, since the probability of having multiple places that need refactoring is higher (as is the cost to support or fix them), so attention to source quality is essential.

  10. How come you did not include Zope or Plone?

  11. While many will snipe - you took an effort to understand and I applaud and appreciate it.