Ordering of the data
Posted by decipherinfosys on August 4, 2008
This might seem like an obvious thing to a majority of our readers but since this question has come up many times in e-mails from some of our readers as well as at client sites, we thought it needs to be mentioned in here. At a client site, one of the end users of the system had put in a bug report that when configuring data for an application, the same data was sometimes getting repeated on different pages. For the application in question, the query for the page was made up of a single configuration table which had close to 1000 rows and using a paging mechanism, 25 records were being displayed per page from that table.
The query that was being used for that page did not have an order by clause. Also, data was updated as well as new data was inserted into this table as more stores were added to this retail application. As a result of this, when paging was done and since the SQL did not have an order by clause, at times, the same data appeared across the pages. This was a web based stateless application. When paging was done, each page fired off the same SQL but showed different sets of the data (first 25 records on the first page, 26-50 on the next and so on). Since there was no ORDER BY clause in the SQL, you are not guaranteed to get the records in “exactly” the same order each and every time. The developer of the page had made an assumption that since the query is based off a single table, the data will be presented in the same order as that of the primary key index. That is not always true. As updates are made to the data records or as deletes happen, the internal rowids can change and one cannot rely on the data being returned in the order of the primary key index. Also, if there is a filter condition that is used by the end user, depending upon the SQL that gets formed, the execution plan generated could be by-passing the PK index and returning all the data from a separate index itself (a covered index).
Moreover, if tomorrow, there is a need to add more tables in this SQL query, then also this assumption will fail. So, the bottom line is that if you want the records to be returned in a particular order, always use an ORDER BY clause. In the case of the above application since the application was using an ORM (Object Relational Mapper) to build up the SQL dynamically and was using a configuration table to specify the default order by condition (in case none was specified by the end user), fixing it was a simple data entry change by putting the primary key column in the list.