Systems Engineering and RDBMS

Using Stored Procedures vs Dynamic SQL generated by ORM

Posted by decipherinfosys on March 27, 2007

This is one of those topics that always generates a lot of heat among developers/architects/DBAs etc/. People typically take a very strong stance on one approach over the other. We would take a neutral approach and highlight the pros and the cons in a humble effort to put our opinion forth. As most of the IT situations, the answer to this question is : “It Depends”. It really depends upon the type of your application and what the application is supposed to do. We are only talking about OLTP applications here…for BI applications where heavy data churning is required, it is always advisable to keep the code in the DB layer to use SET based logic to do your processing logic or use an ETLM tool for the processing logic.

The advantages of using stored procedures are:

1) Network Bandwidth: Assume that the application server(s) and the database server(s) are separate servers. Since the source code is actually stored on the database server, the application only needs to send the name and the parameters for executing it and this in turn reduces the amount of data that is sent to the server. When you use embedded SQL or dynamically generated SQL through an ORM, then the full source of commands must be transmitted and in a scenario where there is a lot of concurrent activity going on and the load on the system requires a lot of users, this can very soon become a performance bottleneck. This can be mitigate in part by a judicious usage of views as well.

2) Abstraction Layer: It helps in separating the business logic from data manipulation logic. Since the interface to the application remains the same, changes done internally to the stored procedures/packages do not effect the application and in turn leads to easy deployment of changes.

3) It offers simplified batching of the commands. Since stored procedures/packages are meant to do a “Unit of Work”, this procedural logic can be simple to maintain and offers additional advantages like making use of the rich feature functionality that the database engines provide. SQL is a SET based language and using SET based procedural logic is the easiest and most performant way of dealing with the data. With every new release of Oracle, SQL Server or DB2 LUW, new features are being introduced in PL/SQL, T-SQL and/or SQL/PL which makes handling of different types of requirements very easy in the DB layer code.

4) Increased data security and integrity: One can secure the tables for direct access and only grant privileges on the stored procedures/packages.

5) By using bind variables (parameterized queries), SQL injection issues are mitigated as well.

6) Code Instrumentation and tracing logic can be built in very easily using the stored procedures. This is one thing that we implemented for one of our clients recently. We created a table which had a list of the DB code that was being used in the schema and this table had a trace_flag column in it which could have 4 different values:

0 (no code instrumentation),

1 (log the sequence of events),

2 ( log the sequence of events and the time taken by those SQL statements),

3 ( log the sequence of events + the time taken + the execution plan from that point of time – since the execution plan can very easily be different at the time of execution under a load scenario vs when you actually run it separately), and

4 (Start the trace – example: starting 10046 level 12 trace in the case of Oracle).

Using this logic, code instrumentation and troubleshooting production issues became very easy. One could then run reports against the data that was logged and present it to the end user or the support personnel. Code instrumentation can be done in the application tier as well using the same guidelines (or using logging blocks like MSFT logging block in .Net) and a good programmer would always instrument their code. However, for the DB code, this code instrumentation becomes a lot more easier to implement.

Cons of using stored procedures:

1) If your application runs on multiple RDBMS, example: You are a vendor and you need to provide your product that runs on Oracle, SQL Server and DB2 LUW in order to expand your customer base, then in that scenario, you have to code or put fixes in for three different code bases. Not only that, you need to have proper staffing to ensure that the code written is optimal since the locking and concurrency mechanisms are very different between these RDBMS. Also, the language used by all these “big 3″ is very different as well.

2) We have seen client shops which offload all of their CRUD operations on the DB tier – as a result they end up with one-liner stored procedures and if you have say 200 objects, you now have 200 * 4 (select/insert/update/delete) stored procedures or one procedure per object with the flag option to indicate the operation and need to code the procedure to use dynamic SQL in order to take into account the conditional parameter logic.  Maintenance becomes a nightmare in this case.  Also, developers/DBAs sometimes get carried away with this approach and forget that SQL is a set based language – one example is that using this scheme, a client shop was doing purges of the data and kept calling the delete procedure by passing in the ID (surrogate Key) value when they actually could have purged and archived the data using a SET based logic.  Procedures are supposed to do unit of work – having one liner procedures and invoking that many calls in a batch does not yield any benefit.  In addition, it has to un-necessarily incur the cost of checking permissions and plan associated with that one liner procedure – the cost is albeit a very miniscule one.

3) Parsing of strings is not one of the forte’s of the DB languages – that code is better suited in the application tier unless you start using CLR or Java code.

So, the bottom line is that one needs to carefully evaluate which portion(s) of the application really belong as stored procedure/packages. In applications that work on volume of data and do bulk data processing, it is always advisable to have that logic in stored procedures/packages that reside on the DB layer so that one can take advantage of the SET based approach of the SQL language.

One can also use ORMs (Object Relational Mappers) like Hibernate to prepare their data access strategy – one can then extend it to make calls to the database stored procedure (if so desired in certain situations), have externalized SQL statements, have dynamic SQL statements being prepared by the ORM etc.. One just has to make the right decision depending upon the application and the requirements that are needed.

Another thing to remember is when people point out that an advantage of stored procedures is that the code is always pre-compiled, that is not always true, there can be scenarios that can lead to re-compilation. Also, if proper bind variables are being used for the queries built using an ORM, it serves the same purpose (i.e. has the same advantage as that of a parsed/compiled stored procedure query) since the plan for that SQL statement is parsed and compiled. Depending upon a lot of factors (cache sizes, concurrent load, statistics updates etc.), that plan may or may not be available the next time the call gets made.

A good mix of an ORM plus DB code is usually the best compromise to get the best of both the worlds. Deciding what goes where is the key and though there are guidelines on that, it can vary from application to application depending upon the requirements.

About these ads

Sorry, the comment form is closed at this time.

 
Follow

Get every new post delivered to your Inbox.

Join 82 other followers

%d bloggers like this: