For some reason, as of late, I can’t seem to attend any user group or conference without a speaker slating ORMs. Several speakers at the PHP UK Conference this year expressed their disapproval, as did the speaker at this month’s PHP London talk. However, no one is giving me a strong enough argument not to use an ORM. Remarks such as “That’s a whole other talk” or “Don’t get me started on ORMs” seem to be thrown about, but whenever I get a chance to talk through the concerns or issues they’re having, the conversation just seems to deflate. Am I missing something really terrible about ORMs that’s going to creep up and bite me?
This growing dislike within the community is concerning, as the only reason I can think of for why it’s gathering pace is that developers haven’t had a chance to try out Doctrine 2 yet.
I’ve been using it commercially for about six months now and have found it so much better than the predecessors that gave ORMs a bad name. Things have changed a huge amount, and I think it’s time for the community to give ORMs another go. If you’ve not had a chance yet, I’d advise staying away from it until you have a reasonable chunk of time to dedicate. It’s a complete paradigm shift from previous implementations and at first glance can be a little daunting. Trying to understand what your repositories are, why you can’t edit your proxy classes or how to set up an event might confuse you a little, or even throw you off track, but if you stick with it, I promise you, the time invested in getting over the learning curve is certainly worthwhile.
So in their defence, and with Doctrine now firmly established on my tool belt, I’m going to start banging the ORM drum. I feel they’re a great addition to a project and in my opinion, a worthy abstraction. They carry a great suite of benefits to an application.
Once you’ve done your research, you’ll start to see how the ORM should be used, and will be better protected against a flaky implementation. One of the most important things when adopting any third-party library is to protect yourself from “doing it wrong” by RTFM. I feel it’s exactly this problem that has led developers to be dismissive of ORMs and has fuelled the general disapproval. Getting a prototype up and running may take no time at all. Once you’ve established your properties and relations, and all your getters and setters are in place, you can easily traverse your objects like a deck of cards, but it doesn’t end there. Developers seem to be skipping over the fine-tuning part and then blaming the ORM for being slow. Things are often overlooked, such as making sure you’re not eagerly loading relationships, using DQL correctly for complex queries, or applying the right hydrator for your use case.
I thought I’d share my response to some of the common reasons developers come up with for not using an ORM in their application. I’ll also share my personal experiences of using Doctrine (version 2.2) and, hopefully, convince you that ORMs are not all bad, maybe just slightly misunderstood.
“Using ORMs means having a one to one relation between object and table”
In the general sense this is probably the most common use case, but in my experience, you’re not forced to architect this way. When working on a project many moons ago I needed to build a scheduling tool. It was to be used to schedule various components such as articles, images or products to go live on a site at a specified time. These components had no relation to each other, needed their own space for applying business logic, and should all share the common ability of being scheduled. Using column aggregation meant I could have three separate classes (ScheduleProduct, ScheduleImage, ScheduleArticle) that fed data into a single scheduling table, distinguished by a type column. I could now apply any logic specific to scheduling an image, keep it separate from my other models (articles / products), and persist into a single table. Inheritance mapping in Doctrine enables you to structure your objects in a completely different manner to your underlying table schema. You create your object graph, and then you tell Doctrine how it should be persisted.
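As a rough sketch of how the scheduling example might look in Doctrine 2, assuming single table inheritance is the equivalent of the column aggregation described above. The table name, the `type` discriminator column, the `goLiveAt` and `altText` fields and the `isDue()` method are all made up for illustration; only the class names come from the text.

```php
<?php
namespace Entities;

// Doctrine's mapping annotations, imported under the ORM alias.
// They live in docblocks, so this file also runs as plain PHP without Doctrine.
use Doctrine\ORM\Mapping as ORM;

/**
 * One table ("schedule") stores all three classes; the "type" column
 * tells Doctrine which class to build for each row.
 *
 * @ORM\Entity
 * @ORM\Table(name="schedule")
 * @ORM\InheritanceType("SINGLE_TABLE")
 * @ORM\DiscriminatorColumn(name="type", type="string")
 * @ORM\DiscriminatorMap({
 *     "product" = "ScheduleProduct",
 *     "image"   = "ScheduleImage",
 *     "article" = "ScheduleArticle"
 * })
 */
abstract class Schedule
{
    /** @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue */
    protected $id;

    /** @ORM\Column(type="datetime") */
    protected $goLiveAt;

    public function __construct(\DateTime $goLiveAt)
    {
        $this->goLiveAt = $goLiveAt;
    }

    // Shared behaviour: any scheduled component can be asked whether it is due.
    public function isDue(\DateTime $now)
    {
        return $now >= $this->goLiveAt;
    }
}

/** @ORM\Entity */
class ScheduleProduct extends Schedule
{
}

/** @ORM\Entity */
class ScheduleArticle extends Schedule
{
}

/** @ORM\Entity */
class ScheduleImage extends Schedule
{
    /**
     * Hypothetical image-only column; nullable because rows of the
     * other two types share the same table.
     *
     * @ORM\Column(type="string", nullable=true)
     */
    protected $altText;

    // Image-specific logic stays out of the product and article classes.
    public function setAltText($altText)
    {
        $this->altText = trim($altText);
    }

    public function getAltText()
    {
        return $this->altText;
    }
}
```

The object model has three classes with their own behaviour, while the schema has one table, which is the point being made about objects and tables not having to line up one to one.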
“ORMs produce sub-optimal SQL and far too many queries”
This is not true at all. Doctrine gives you an abstraction of SQL through DQL, which lets you select exactly the fields you want and join whichever relations you need. You pretty much get to produce the SQL yourself, albeit through an abstraction layer, and with that abstraction comes a bunch of helpful functions / operators that work across database vendors.
I think the problem lies again in implementation. Take, for example:
$users = $em->getRepository('Entities\User')->findAll();
foreach ($users as $user) {
    $profile = $user->getProfile();
}
If the relation between users and profile is set to lazy load then this will incur an additional query to retrieve each profile row: one query for the users, plus one more per user. We know this can be done much better by simply applying a join in a DQL query.
$q = $em->createQuery('SELECT u,p FROM Entities\User u LEFT JOIN u.profile p');
$users = $q->getResult();
These are exactly the kinds of things you need to think about when doing your data retrieval. The tools are there to help you tailor the result set to your needs, but you can’t expect Doctrine to know what you plan to use it for.
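To make the difference concrete, here is a small sketch of roughly what happens at the SQL level in each case. It uses plain PDO with an in-memory SQLite database rather than Doctrine itself, and the `users` / `profiles` tables and their columns are made up for the example, so treat it as an illustration of the query counts and not of the SQL Doctrine actually generates.

```php
<?php
// In-memory SQLite stands in for the real database (requires pdo_sqlite).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE profiles (id INTEGER PRIMARY KEY, user_id INTEGER, bio TEXT)');

// Seed three users, each with one profile.
for ($i = 1; $i <= 3; $i++) {
    $pdo->exec("INSERT INTO users (id, name) VALUES ($i, 'user$i')");
    $pdo->exec("INSERT INTO profiles (user_id, bio) VALUES ($i, 'bio$i')");
}

// Lazy loading: one query for the users, then one more per user for its profile.
$lazyQueries = 0;
$users = $pdo->query('SELECT id, name FROM users')->fetchAll(PDO::FETCH_ASSOC);
$lazyQueries++;
$profileStmt = $pdo->prepare('SELECT bio FROM profiles WHERE user_id = ?');
foreach ($users as $user) {
    $profileStmt->execute(array($user['id']));
    $profile = $profileStmt->fetch(PDO::FETCH_ASSOC);
    $lazyQueries++;
}

// Fetch join: users and profiles come back together in a single query.
$joinQueries = 0;
$rows = $pdo->query(
    'SELECT u.id, u.name, p.bio FROM users u LEFT JOIN profiles p ON p.user_id = u.id'
)->fetchAll(PDO::FETCH_ASSOC);
$joinQueries++;

echo "lazy: $lazyQueries queries, join: $joinQueries query\n";
```

With three users the gap is small, but the lazy version grows by one query for every extra user while the join stays at one.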
“Using ORMs means using active record”
This was certainly the case some years ago, but not anymore. One of Doctrine’s major failings in the 1.x release was the huge weight your models would inherit by extending the Doctrine_Record class. This gave your models the ability to save themselves (being active) and, if you so wished, they could act like a service layer, pulling in / saving any data you wanted. With this came a bunch of problems, the most frustrating for me being memory consumption. With memory leaks frequently appearing due to circular references, you could just forget about using it for data processing.
The Doctrine 2.x release focuses strongly on being a data mapper. This means your model classes (entities) are plain old PHP objects that only deal with themselves. They have no external dependencies and are not coupled to anything in the ORM library.
“ORM is slower than just using SQL, Unlike other abstraction layers, which make up for their performance hit with faster development, ORM layers add almost nothing.”
This is taken from a post by Laurie Voss on seldo.com named “In defence of SQL”, which was later followed up with “ORM is an antipattern”. It’s common knowledge that, in a general sense, abstractions will slow things down. Any additional layers added to your stack will take you further away from the metal, and will generally incur a speed cost to your application. Whenever you implement an abstraction you should be able to justify the decrease in speed with a benefit gained from adding it. Common arguments for adding an ORM might be:
- Getting your application released to market quicker.
- Readability / Maintainability through code clarity and clean design.
- Testability / Stability, keeping a clean domain model encapsulating your logic makes for easy testing.
- The option to farm off intensive processing. A good abstraction may enable you to snip out the intensive (CPU / Memory / Disk IO) parts of your application and have them processed asynchronously on a separate resource to your application.
All of these points are valid in the case of incorporating an ORM into your project, but there’s one other that you might not have expected. Doctrine can actually improve the speed of your processing! You wouldn’t have thought it, but it’s true. Let me explain.
I’m sure you’ve often come across the situation where you need to write a large set of rows to the same table. A typical programmer might iterate over a dataset and insert each row one at a time. A better programmer might prepare the data for a single insert. Now, take that idea and apply it across the entire runtime of your application. This is exactly what Doctrine does using the unit of work pattern. When you persist your entities, or retrieve them from the database, they are in a managed state. Changes can be applied several times throughout the runtime of your application. Once you’ve finished tinkering with your objects, your manager can be flushed and optimal queries are written to perform the inserts / updates.
This is a great feature of Doctrine which hugely reduces the round trips you’d typically be making to your database. A benchmark was done to compare 20 inserts using Doctrine’s unit of work vs mysql_query, and Doctrine came up trumps with a completion time of 0.0094s, little more than half of what it took to do 20 individual MySQL inserts (0.0165s).
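The idea of deferring writes until a flush can be sketched with a toy class. This is not Doctrine’s implementation, just a minimal illustration of the pattern under simple assumptions: the `ToyUnitOfWork` class, its `persist()` / `flush()` methods and the `users` table are all hypothetical, and an in-memory SQLite database stands in for MySQL.

```php
<?php
// Toy illustration of deferring writes; NOT Doctrine's actual implementation.
class ToyUnitOfWork
{
    private $pdo;
    private $pendingNames = array();

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Only records the intent to insert; no SQL is executed here.
    public function persist($name)
    {
        $this->pendingNames[] = $name;
    }

    // Writes everything queued so far in one transaction,
    // reusing a single prepared statement for every row.
    public function flush()
    {
        $written = count($this->pendingNames);
        if ($written === 0) {
            return 0;
        }
        $this->pdo->beginTransaction();
        try {
            $stmt = $this->pdo->prepare('INSERT INTO users (name) VALUES (?)');
            foreach ($this->pendingNames as $name) {
                $stmt->execute(array($name));
            }
            $this->pdo->commit();
        } catch (Exception $e) {
            $this->pdo->rollBack();
            throw $e;
        }
        $this->pendingNames = array();
        return $written;
    }
}

// In-memory SQLite stands in for the real database (requires pdo_sqlite).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

$uow = new ToyUnitOfWork($pdo);
for ($i = 1; $i <= 20; $i++) {
    $uow->persist('user' . $i);
}

// Nothing has reached the database until flush() is called.
$rowsBeforeFlush = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
$written = $uow->flush();
$rowsAfterFlush = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

In Doctrine the equivalent calls are `$em->persist($entity)` followed by a single `$em->flush()`; the real unit of work also tracks updates and deletes and works out the order in which to write them.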
More information about Doctrine’s unit of work can be found here.
“But just pulling out arrays is quicker”
One argument I’ve heard for not using an ORM is a dislike of it always retrieving a data object by default. An object carries more weight than an associative array and is not always necessary. Although this can be changed by using the array hydrator, I find that in the majority of cases my application requires an object.
I think if your data is being punched straight into a view with no manipulation then you’re right, an array would make more sense, but this is seldom the case. You’ll often find your data requires additional manipulation. If you’re going to be processing an order, or calculating a total cost, then you’ll need to apply business logic, which should be encapsulated in your domain model. No matter how hard you try, you’ll seriously struggle to encapsulate this logic in an array! Remember, objects can provide state and behaviour well beyond what a key / value pair can, so should really be favoured.
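As a small illustration of the order example, here is a sketch in plain PHP with no ORM involved. The `Order` and `OrderLine` classes, their methods and the choice of storing prices as integer pence are all invented for the example.

```php
<?php
// Hypothetical domain classes; prices are held as integer pence to avoid float rounding.
class OrderLine
{
    private $unitPricePence;
    private $quantity;

    public function __construct($unitPricePence, $quantity)
    {
        // The object guards its own invariants; a bare array cannot do this.
        if ($quantity < 1) {
            throw new InvalidArgumentException('Quantity must be at least 1');
        }
        $this->unitPricePence = (int) $unitPricePence;
        $this->quantity = (int) $quantity;
    }

    public function getTotalPence()
    {
        return $this->unitPricePence * $this->quantity;
    }
}

class Order
{
    private $lines = array();

    public function addLine(OrderLine $line)
    {
        $this->lines[] = $line;
    }

    // The total is calculated in one place instead of wherever the data is used.
    public function getTotalPence()
    {
        $total = 0;
        foreach ($this->lines as $line) {
            $total += $line->getTotalPence();
        }
        return $total;
    }
}

$order = new Order();
$order->addLine(new OrderLine(250, 2)); // 2 x 2.50
$order->addLine(new OrderLine(999, 1)); // 1 x 9.99
$totalPence = $order->getTotalPence();  // 1499
```

With arrays, both the quantity check and the total calculation would have to live in whatever code happens to touch the data.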
“Incorrect abstraction – if you don’t need relational data features you’re using the wrong data store”
I completely agree with this, and Doctrine has the perfect solution. The project I’m working on at the moment has a few entities that have no relations and will only be used for reporting. Consistency is not important, and we’re not going to be providing any real-time statistics. However, we will need to ensure availability under heavy load. We’ve settled on using MongoDB for storing these entities, and implementing this using Doctrine couldn’t be simpler. Once I’d bootstrapped the Doctrine MongoDB ODM, all I had to do was remove the ORM annotations from my entities and replace them with annotations compatible with the ODM. Now these were happily persisted to MongoDB and ready to scale with load.
As the annotation mappings are pulled into your entity as a namespace, you could maintain both ORM and ODM definitions (if you wanted). This flexibility could be useful for using SQLite to quickly test your entities’ business logic.
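A sketch of what carrying both sets of mappings on one class might look like. The `PageView` class, its fields and the table / collection names are made up for the example, and the exact field annotations available can differ between ODM versions, so check them against the version you have installed.

```php
<?php
namespace Entities;

// Each mapping library is imported under its own alias, so both sets of
// annotations can sit side by side. They are docblock comments, so this
// file also runs as plain PHP with neither library installed.
use Doctrine\ORM\Mapping as ORM;
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ORM\Entity
 * @ORM\Table(name="page_view")
 * @ODM\Document(collection="page_views")
 */
class PageView
{
    /**
     * @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue
     * @ODM\Id
     */
    private $id;

    /**
     * @ORM\Column(type="string")
     * @ODM\Field(type="string")
     */
    private $url;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function getId()
    {
        return $this->id;
    }

    public function getUrl()
    {
        return $this->url;
    }
}
```

The class itself stays a plain PHP object either way; which store it ends up in depends only on which manager (the ORM’s entity manager or the ODM’s document manager) it is handed to.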
Closing note
So please don’t follow the general opinion on ORMs before giving them a try. Things have changed for the better and it’s time to give them another go. Remember, just as with any application that doesn’t use an ORM, you still need to optimise. And as I said, it can take some time to pick up, but you’ll find plenty of help and examples in the online documentation. There is also a reasonably active IRC channel where a lot of the core developers loiter.
RE: “ORMs produce sub-optimal SQL and far too many queries”
Why have an ORM if you are going to write out your query in the first place?
I’m happy that you took the time to defend the ORM pattern/tool.
I agree with all your arguments.
I’ve been using Propel since 2006 and since then it has been the natural choice in every project I’ve worked on.
What I like most is that you can query your database in any way you need while keeping your code highly readable. When you add junior developers to the team you can save a lot of time by keeping them at a distance from SQL :)
Best regards
Interesting to hear someone defend ORMs; I heard the same things at the PHPLondon events. You’ve made me think I should set aside some time to try some stuff out in Doctrine 2, so at least I can make my mind up based on my own experience.
I would not say that ORMs are bad; they really do save you some precious time.
Writing the same CRUD queries a million times can be very annoying.
But my advice would be:
- be wise: don’t rely too much on an ORM, use it as a tool that you can, or may have to, replace
- know your tools: ORMs are great but always have a drawback somewhere.
Doctrine is great but can seem bloated and complex at times; prepare yourself to read the docs often and search for help on the web.
And … why not write your own ORM? I simply use Doctrine because of the fancy tools provided and because my clients need it.
I often use my own simple ORM, which fits my needs perfectly 90% of the time.
I totally agree with you on the benefits of ORM, it is a great starting point and if you need greater performance you can start handcrafting SQL queries to speedup the application. I have argued the same at ORM Anti-pattern – Right conclusion, wrong reasons – Rebuttal
ORMs are misused in many cases where SQL queries would be faster. Personally I do not use ORMs for the following, as straight SQL is always better:
a) List Generation and pagination – no need to hydrate objects I just want the data
b) Report generation
c) Summations and Aggregation of data – I need only one computed value
Quick note:
SELECT u FROM Entities\User u LEFT JOIN u.profile p
will not fetch-join profiles. You need to include the profile entity in the select list like so:
SELECT u,p FROM Entities\User u LEFT JOIN u.profile p
This is a nice feature, as you may want to filter by some join condition, but not want to hydrate the associated entities.
Good point, I’ll update, thanks.
Well said!
I’ve used Doctrine 2 for a little while now and it’s worth its weight in gold.
Stephen, if you are using a good ORM, you can do all the things you listed without hydrating objects. Stating that SQL is faster is not always true, and even when it is, speed is often less important than maintainability. Hence using ORMs for lists, pagination, reports and aggregate functions. ORMs are more than just mapping tables to objects, or should be (good ones are).
Martin Fowler has also written about ORM hate, and I totally agree with him on this:
http://martinfowler.com/bliki/OrmHate.html