Systems Engineering and RDBMS

Parsing of SQL Statements

Posted by decipherinfosys on April 19, 2007

Any SQL statement goes through three stages: Parsing, Compilation and Execution. During the parsing phase, the statement is broken down into it’s component parts – determining what type of a statement it is – whether it is a DML or a DDL or a select query. The checks that are next done on it are the syntax and the semantics checks. The syntax check validates the statement based on the SQL grammar. The semantics checks make sure whether the statement is a valid one in the light of the schema objects that you have i.e. do those tables, views etc. exist in the schema or not and whether you have the access to those objects and are proper privileges in place for you to execute that statement, whether there are ambiguities in the statement – example: If you have say two tables T1 and T2 and both have a column col1, then the query “select col1 from T1, T2 where…” is an ambiguous query since the engine does not know which table to use to get col1 from.

The difference between the two types of checks (syntax and semantics) can be shown via an example:

This statement fails with a syntax error:

select from where

go
Server: Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword ‘from’.

While this statement failed with a semantic error, if the table I_DO_NOT_EXIST existed and we had permission to access it, this statement would have succeeded:
select * from I_DO_NOT_EXIST
go
Server: Msg 208, Level 16, State 1, Line 1
Invalid object name ‘I_DO_NOT_EXIST’.

The next stage in the parsing process is either a soft parse or a hard parse. If the statement that we are currently parsing has been processed by some other session, then we can skip the next two steps which are optimization and row source generation – if we are able to skip those two steps, it is called as a soft parse (this term is not frequently used and is more common in the Oracle world). If we cannot find a suitable plan in the memory, then it has to do all the steps i.e. parse, optimize and generate the plan for the query. This is called as a hard parse. In OLTP applications, one should always attempt to minimize these hard parses and those can be minimized by using bind variables (also called as parameterized queries). So, in such applications, the parse to execute ratio should be low which means that we parse the statement once and execute multiple times. Unless the instance (database for Oracle) has been re-started, this ratio should be low.

The next two steps in the life-cycle of a SQL statement execution are: Compilation and Optimization. These are the steps in which the optimizer breaks down the statement and uses the meta-data and the statistics to generate a query graph and the execution plan. We will cover these two in future blog posts and will cover the importance of good statistics collection.

One Response to “Parsing of SQL Statements”

  1. […] Another advantage (besides the BIG performance advantage) of using bind variables in the code is security. Using parameters or parameter markers to hold values by end users is more secure than concatenating the values into a string that is then executed using either a data-access API method, or the sp_executesql stored procedure (SQL Server). Not using parameterized queries also exacerbates locking issues in the system because of poor response timings due to the need for repeating the parsing and the compilation steps in the SQL statement processing. […]

Sorry, the comment form is closed at this time.

 
%d bloggers like this: