ORM for Database Abstraction or Stored Procedures for Data Structure Abstraction?

“Object-relational mapping (ORM, O/RM, and O/R mapping) in computer software is a programming technique for converting data between incompatible type systems in object-oriented programming languages. This creates, in effect, a “virtual object database” that can be used from within the programming language.”

With the ever rising popularity of ORMs, I have to ask myself the question:  When is it a good idea to to use an ORM?  It’s a difficult question to answer because it relies on a number of variables, not all of which are known at design time.  When designing enterprise systems, I’ve found a use for both architecture styles and have developed my own personal biases for each.  The following is by no means a comprehensive list, just a few high level points I’ve considered when making my decisions.

ORM

The concept of the ORM is to simply map data fields to logical object.  Nearly every ORM has the capability to map disparate data fields to logical objects, yet few take an ORM to this level of complexity.  The common ORM implementation is a one to one mapping of objects to data tables.  This helps with LINQ queries, IntelliSense, and readability while minimizing confusion of where the data ends up in the database.  However, this approach tightly couples business objects to the data structure which can cause bigger headaches in the long run.  If the need to flatten a data structure for performance, or normalize a table to minimize duplicated data arises the chance for bugs and logic conflicts increases drastically.  Here are some of the

The Upside

Business logic stays in code. Probably the best argument for an ORM is that it helps to enforce the idea that business logic belongs in a business logic layer rather than the database.  With an ORM, all logic is handled in application/service code which makes this rule easy to enforce.

No DBA required.  Since the ORM brings all the heavy lifting to the application code and the queries are managed by the , there’s no compelling reason to staff a SQL expert.  In my experience, having a SQL expert and an ORM is usually like mixing potassium and water.

Changing data stores is purported to be minimal effort.  In theory, a properly implemented ORM can repointed from once data source to another (with the same structure) without drastic manipulation to code.  Personally, I’ve never encountered a situation in which I’ve had all of these variables perfectly in place, or the business need to do this sort of change-over.  Changing from MSSQL to Oracle is a big deal for numerous reasons and is likely to be accompanies with business logic changes.  However, this seems to be one of the selling points for the ORM approach.

Transactions are relatively straight forward.  ORMs store all changes for a transaction in memory until the application commits those changes to the database.  In Entity Framework, this is handled with the AcceptAllChanges()command.  This allows for a group of related changes to be staged, committed, and rolled back with relative ease.

The Downside

A change in data architecture can take down an application. With the application and data entities being tightly coupled to the data structure, any change to the data store has the potential to break portions of the application.  This means that any change in the database architecture requires a change to the ORM and referencing logic.  This increases development/test time for all changes of this nature.

Slower than stored procedures. Because the queries generated are plain SQL queries, they don’t have the benefit of being compiled or cached at the database server.  Because of the nature of the ORM data access, the queries are generated at runtime.

Complex data retrieval or storage queries can generate some nasty SQL.  ORMs can produce some hairy queries.  Although it’s improved with recent versions, Entity Framework has a difficult time handling multiple joins efficiently.  This causes the runtime preparation of complex queries to have a significant impact on application performance.

Altering data structures is no fun.  I’ve yet to work on an application that didn’t require additional tables, additional columns, data type changes, or dome alteration of the data structure.  Moving things around with an ORM is less than intuitive and requires a good amount of coordination and testing.

Stored Procedures

There are a few big upsides to using stored procedures as an abstraction layer that have pushed my designs to this direction for several large systems.  Implementing this can require additional resources and time in development but allows for data structure flexibility and optimal performance.

“The data architect’s answer to why use stored procedures is that T-SQL is only language that you KNOW will last as long as the data. A stored procedure that was written ten years ago is still just as easy to call as one that will be written today, or in the future. Anything less is building a long term solution on today’s fad, and it’s just plain foolishness.” from Paul Nielsen’s article “Why use Stored Procedures?” on sqlblog.com  Read more…

The Up Side

Compiled on the database and data execution plans are cached.  Because of this, speeds will out-perform any ORM.

Written in native database language with access to additional performance optimizations and language features. ORMs have to translate logic to queries.  Even at their best, the can’t take everything into account.

Decouple application code from database structure.  The data structure can be changed and manipulated to improve performance without having to update code; only the referencing stored procedures.

Business objects represent business needs.  There’s no direct need to have a one-to-one relationship of business objects to the data structure.  This allows a bit more flexibility in design and alteration.

The Down Side

If you don’t already have a dedicated SQL developer on staff, it will behoove you to hire one.  Most application developers can handle simple to moderate stored procedures.  Once the application concerns switch to performance, traffic and data size, you’ll want to make sure you have a specialist on hand.  Data integrity is priority for most organizations.  Data management, storage, and integrity for a high volume database is a job for a specialist not a part time concern for application developers.

You will inevitably and repetitively be urged to put business logic into stored procedures.  Whether it makes sense for the function or the DBA is telling you that he could do things faster and easier in the stored procedures, this will come up.  Spreading business logic around to various layers of an applications is not only bad form, but it makes it exponentially harder to troubleshoot when a problem arises.

Object mapping needs to be handled manually. Without an ORM, the output of a stored procedure needs to be mapped to an object manually.

Messaging needs to be handled.  Unless you have a desire to parse through SQL errors, you’ll need to device a common messaging structure to communicate detailed information to the application.  Depending on the level of detail, this can be a significant effort.

Transactions become a bit more cumbersome.  Simple/moderate transactions can be handled relatively easy with an ORM.  With Entity Framework, SaveChanges() commits all in memory changes to the database in a transactional manner.

The Hybrid

A very viable alternative to going whole hog on one of the previous designs is to take a little of both.  Pick a less volatile section of the application and use an ORM.  Leave your DBA out of non-critical features and let them focus on the high traffic, performance dependent areas of your application.

The Up Side

The best of both worlds.  Performance for the areas that need it, rapid development for the features that don’t.  Every application has features or areas that draw less traffic and might require faster development cycles; perfect for an ORM.

Most production ready ORM solutions can handle Stored Procedure output mapping.  The big players can be used to map the output record set(s) to business objects, saving you the hassle.  This adds quite a bit of overhead, and there are lower cost alternatives.

The Downside

Confusion.  Unless there are clear boundaries drawn, it can be very confusing to developers as to which method of data access to use where.

Sharing data structure makes things complicated.  If the different data access methodologies share a data structure that needs to change, this creates a good amount of work and a great deal of testing.  Say a UserProfile table needs to be altered to require a Date of Birth and a phone number.  Not only to all stored procedures need to be altered to incorporate this structure change, but the ORM entities need to change as well or inserts will potentially fail.  I’m sure in the real world changes like this will likely be part or a larger development effort anyway, but it’s worth noting.

Other Options

In-Line SQL

I’ve added this just so I can make a arguments against this choice.  Explicitly writing SQL queries inline can be difficult to manage and very difficult to keep in sync with the data structure.  The chance for developers to write inefficient or buggy queries increases and the adherence to uniformity drops.  I go out of my way to avoid this option unless absolutely necessary.

No-SQL

Another viable option that’s rapidly gaining popularity are the No-SQL or non-relational persistence models.  MongoDB… CouchDB… RavenDB… it seems like a new NoSQL database is popping up every week targeted to a slightly different audience to solve slightly different problems.  It’s a completely different way of storing data and one that works extremely well in many instances.  From a rapid development site to one that needs high availability and easy sharding, it’s a very viable alternative to traditional relational databases.  If you can get away from the idea that data needs to be normalized and that complex queries are necessary to run an application, this could work for you.  I’ll save this debate for a different post, but here’s a very thorough article to scratch the surface on the pro’s and con’s: SQL vs. NoSQL: Which Is Better?

Discussion

Let it be known that I’m neither advocating nor opposing any of these approaches as the only viable solution.  I also recognize that not all viable database servers offer stored procedures, and that the above selections aren’t the only viable solutions for data storage and access.  If you have another point to highlight or an option that I haven’t listed, please feel free to share in the comments.