Wednesday, February 6, 2013

Thoughts on SQL vs ORM

The question of persistence implementation arise often. I found repository pattern very valuable due to separation of concerns, mediate between domain model and data source (mock, file, database, web service, etc). The database data source is somewhat specific since you can proceed with SQL functions or ORM. Here are some thoughts why you might prefer SQL functions over ORM in your next project:
  1. It is not valid to think that relational model in database is domain model of application. They are different (except some trivial cases).
  2. Domain model must not contain metadata: table names, constraints, validation properties, etc. That `extra` information impacts read/write operations with network cache, makes your code hard to read over the time. Design your domain model with plain objects only.
  3. Most queries are simple... what is the reason for ORM?
  4. Complex query is easier to write in SQL, rather than search how to make it with ORM... save your time.
  5. Earlier or later the need to optimize certain query arise... in this case it is a bit odd try find out what is resulting SQL of my ORM call (it looks ugly as a rule). We try escape from SQL with ORM but returning in any case.
  6. A developer who writes repository layer can not be ignoramus in database nature: he must understand impact of what and how he is doing... only in this case there is a chance for success. If database/SQL knowledge is so fundamental and clear what is the reason to bring Chinese speech?
  7. The question of database refactoring arise over the time. The need for additional knowledge (ORM) complicate things. The knowledge is the power, not always.
  8. Alternative for database migration sounds quite good, but in reality is it demanded?
  9. ORM does not give me an answer to the question of security: if the set of my operations is limited to data read why there is no way to set these boundaries (and thus restrict)? Database get this resolved quite flexibly.
  10. Shards topic is quite popular... take a look at ORM implementation (pick any) for this simple idea... looks pretty scary.
  11. It is not valid to wrap web handler with decorator that manages database connections. How about user input validation, template rendering? The use of database connection session (unit of work) must be minimal, so it can serve other concurrent requests.
The design principals of wheezy.web demo template project follows separation of concerns in mind: strictly defined actors with theirs unique duties. The mock repository can be easily switched to one implemented with sql: web/ui developer is not blocked since database design is not completed yet.

25 comments :

  1. Thank you for sharing this list.

    It's a really good set of reasons why we should not use ORM in our projects.

    But it would be fair if you could show opposite arguments.

    For example, ORM may work pretty good if:

    1. You have clear separation between domain model and persistent model.

    2. You have to support multiple RDBMS at the same time. Maintain repositories for MySQL plus SQL Server, ProstgreSQL and SQLite in a single application most likely will be relly painful if you do not use good ORM such as SQLAlchemy/Python or NHibernate/.NET or Hibernate/Java.

    3. Unit of work pattern implementation would be pretty cheap becase it's provided by ORM.

    4. ORM may save development time for application which _does not_ have complicated business logic.

    ReplyDelete
    Replies
    1. Try disagree to most items in the list and you will get fair reasons to use ORM.

      Keep in mind: over the time the complexity of project increases and long term maintenance is rarely provided by the same development team.

      Delete
    2. Actually I'm totally agree with all your statements listed in this post.

      I was big fan of ORM some time before and made several projects in .NET/NHibernate. There was really heavy lifting for whole team to train NHibernate. I spent huge amount of time on learn this ORM and then on knowledge sharing and teammates training.

      I've used a lot of advanced aspects of NHibernate and also work a bit with SQLAlchemy.

      But recently I just dropped all this things and switched back into dumb repositories with raw SQL queries.

      To be clear. I'm not gonna to use ORMs in any next new projects.


      I just wanted know your thoughts on when ORM usage could be appropriate.

      If your position is "Never ever use an ORMs. Period." I'm OK with it.

      But I would glad to see your comments/additions on my list above :)

      Delete
    3. 2. You have to support multiple RDMS at the same time ... blah blah blah. Really?!

      I disagree with this point. Basically, there are more ORM frameworks to support than RDMS.

      Have a look at SQL dialects, MySQL, PostgreSQL, Sqlite3 etc. They all have a pretty similar syntax.

      ORM frameworks (Hibernate, Sqlalchemy, Entity Framework etc) on the other hand have completely different syntax and have their own learning curve.

      From my experience, it is much leaner and easier to maintain if we create repositories with plain old SQL queries rather than ORM bullshit.

      Delete
  2. Nicely done, Can you possibly enumerate why ORM's are good? Take me for example: I don't know much SQL but I can plug an ORM into my project and get a database layer. Writing that felt scary, it could be argued that I shouldn't use a database. However, as a one man team, I don't have many options

    ReplyDelete
    Replies
    1. Read SQL manual instead of ORM's one, you will be surprised how fast you get up to speed.

      Delete
    2. That doesn't really answer Mike's question. Failure to provide a balanced argument does ding the article's credibility.

      There are few absolutes in life, the choice between an ORM and raw SQL being a great example. Neither is 100% appropriate for all cases, but you've only shown the cases where SQL is the better choice.

      Delete
    3. @Greg... I disagree regarding the article's credibility. Had Andriy titled this article "The pros and cons of ORMs" I might agree, but he's just stating his preference. And anyway, if you wanted a balance view, why pick a person that has already decided which way to go.
      For the record... I'm anti-ORM.

      Delete
  3. These days a lot of projects need to migrate from SQL to NoSQL because of scalability issues (see Reddit, Foursquare etc.). By using the repository pattern it is much more easy to migrate another database like MongoDB or Cassandra. So that makes another point against ORMs.

    ReplyDelete
    Replies
    1. Your point concerning repository benefit is right. However I would not advise blindly NoSQL path. The managed shards still need to be considered for SQL horizontal scalability (the ease of data moving between shards buys).

      We usually start with mock implementation of repository just to make the view happy with rendering, later on switch implementation to SQL, add caching layer, etc. Practically the development team is okay to use mock feeding some random data, unit tests for SQL repository implementation plus QA work make such `division` quite optimal.

      https://bitbucket.org/akorn/wheezy.web/src/tip/demos/template/src/membership/repository

      Delete
  4. I strongly agree.

    I honestly don't understand why people make simple things complicated and create tools like ORM.

    ReplyDelete
  5. > Design your domain model with plain objects only.

    - It's a DataMapper http://martinfowler.com/eaaCatalog/dataMapper.html
    Storm ORM, SQLAlchemy is a DataMapper.

    ReplyDelete
    Replies
    1. Correct. Data mapper allows you keep domain model and persistence access independent.

      Delete
  6. How you can provide ACID without ORM? https://en.wikipedia.org/wiki/ACID

    Do you have mechanisms to synchronize state of objects in memory with it's data on disk?

    Do you have "IdentityMap"? http://martinfowler.com/eaaCatalog/identityMap.html

    How you implement bussines transactions? http://martinfowler.com/eaaCatalog/pessimisticOfflineLock.html

    How you build SQL without "Query Object" http://martinfowler.com/eaaCatalog/queryObject.html for patterns like "Class Table Inheritance" http://martinfowler.com/eaaCatalog/classTableInheritance.html or uinion query of "Concrete Table Inheritance" http://martinfowler.com/eaaCatalog/concreteTableInheritance.html

    How you extend SQL, for example, to change "ORDER BY" clause?

    ReplyDelete
    Replies
    1. >> How you can provide ACID without ORM?
      ACID is a property of persistence (e.g. RDMS), not a way you access it.

      >> Do you have mechanisms to synchronize state
      >> of objects in memory with it's data on disk?
      There is no such need. The lifetime is per request, so they are reconstructed again with repository access (caching applies).

      >> Do you have "IdentityMap"?
      There is no such need.

      >> How you implement bussines transactions?
      Unit of work: here, here and session.

      >> How you build SQL without "Query Object"
      The SQL is not "built"... there is no dynamic SQL (this is what ORM is about). Just call a function.

      >> How you extend SQL, for example, to change "ORDER BY" clause?
      There is no need to extend SQL... you just call a function that returns the order you need.

      Delete
    2. > ACID is a property of persistence

      - Why? If you work with DB, and you have concurrent access, then you should provide ACID too. Else you can lost data. For example, SQLAlchemy, Storm ORM provides two-level transactions (rollback for DB and also for pyhon objects), and synchronize objects in memory with data on disk on each commit.

      > The lifetime is per request

      - How for background tasks, event based non-blocking servers, GUI, non-web applications?

      > "IdentityMap"? There is no such need.

      M. Fowler in «Patterns of Enterprise Application Architecture» proves the opposite.

      > bussines transactions? Unit of work: here, here and session.

      - Where is here bussines transactions? I mean bussines, not system transactions...

      > There is no need to extend SQL... you just call a function that returns the order you need.

      - Can you give me example of function to build SQL? Very interesting.

      Delete
  7. By the way, pattern Repository http://martinfowler.com/eaaCatalog/repository.html requires pattern Query Object http://martinfowler.com/eaaCatalog/queryObject.html to hidde implementation of mapper.
    See chapter "Repository" «Patterns of Enterprise Application Architecture» Martin Fowler, David Rice, Matthew Foemmel, Edward Hieatt, Robert Mee, Randy Stafford

    ReplyDelete
    Replies
    1. The repository can serve data mapper purpose (unless you use ORM).

      Delete
    2. "Under the covers, Repository combines Metadata Mapping (329) with a Query Object (316) to automatically generate SQL code from the criteria. Whether the criteria know how to add themselves to a query, the Query Object (316) knows how to incorporate criteria objects, or the Metadata Mapping (306) itself controls the interaction is an implementation detail." M.Fowler.

      Delete
    3. That usually applies when you use ORM. When you use SQL function you only provide a mapping to domain model.

      Delete
    4. It describes pattern "Repository", not ORM. Good ORM contains patterns DataMapper, Repository, UnitOfWork, IdentityMap, QueryObject, Gateway etc.

      You can use one of existent ORM, or can use patterns in particular way. In any case, Repository should fully hide source of data. I don't know how to make it without encapsulation criterias to single object.

      Delete
    5. Indeed, repository mediates between the domain model and persistence and as such hides source of data.

      Delete
    6. See schema on image: http://martinfowler.com/eaaCatalog/repository.html

      In Repository should be "Im memory Strategy", ie DataMapper.

      Delete
  8. By the way, in you case, I think that is just a datamapper, not repository. Repository is a "Context" for "Strategy" (see Strategy pattern) like DataMapper. I don't see here Repository https://bitbucket.org/akorn/wheezy.web/src/9b914ed7c5ff55a6ecc9609ff71246aac15cb075/demos/guestbook/repository.py?fileviewer=file-view-default#repository.py-7

    ReplyDelete
    Replies
    1. Nevertheless, I prefer call it repository, a way it mediates in between.

      Delete