Systems Engineering and RDBMS

Archive for December 7th, 2008

High Availability – what do these nines mean?

Posted by decipherinfosys on December 7, 2008

If you have ever worked in an environment where you have had to either interact with the Hosting Providers (Perot Systems, Sungard, DST etc.) at their data centers or have to host your servers yourself, one question that will be asked early on is – what nines of availability do you need?  A nine is essentially the number of nines in a percentage that represents the uptime of a system or network or an application or a solution etc..  So, 99.999% of uptime is essentially 5 nines of availability which translates to that that application/solution is up 99.999% in a year.  The remaining 0.001% is the downtime which could because of planned maintenance, upgrades or failure related downtime.

However, this is also one of those things which is not well understood by some shops and they un-necessarily go for 5 nines even when they do not have a business need to do so.  We have seen clients pay a lot of money to get the 5 nines availability even when just having 99% would have sufficed for them. Sometimes, this happens because of lack of communication between the business side and the IT side – business really wants close to 0% downtime between the business hours of 8 A.M. – 6:00 P.M. but is ok with maintenance outages in the night shift.  This should not be interpreted as 5 nines of availability.

In terms of minutes/hours, this is what the table looks like when you compute the availability in terms of nines:  Considering a normal year (not taking leap years into account for this calculation):  365 days * 24 hours * 60 minutes = 525,600 minutes per year.  So, now if we want to calculate 5 nines availability, it means:

525600 – ((99.999 * 525600)/100) = 5.256 minutes of downtime (includes both planned and unplanned downtime).

So, do you really need 5 nines availability?   A majority of the applications don’t.   It’s not that it cannot be achieved – it most certainly can but then the cost goes up exponentially so you should evaluate carefully when the business area or the CxO (CEO, COO, CTO, CIO) proposes a 5 nine solution for the application.  A 99.9% (3 nines availability) translates to 525.60 minutes or roughly 8.7 hours of downtime in a year which for even the stringent requirements for uptime is acceptable.  Even 99.0% translates to roughly 87.6 hours of downtime per year which was what our client finally agreed to when we showed them the calculations for the application upgrades, patches etc. vs the cost that they would incur with the data center as well in terms of planning and resources in order to get the 5 nines of availability.

Posted in Technology | 2 Comments »

Using ROWID in the application

Posted by decipherinfosys on December 7, 2008

In one of the client engagements, we saw that the code was using ROWID for doing the updates for the tables i.e. it used to retrieve several columns including ROWID and then later on use ROWID to identify the rows it needs to update.  As you already know, ROWID is assigned to a row when a record gets created (ROWID is unique within a table) and remains the same for that record unless that row is deleted and re-inserted in which case it is a new row and hence a new ROWID.  Here are some of the common examples when this deletion + insertion can happen:

1) Suppose that you have implemented partitioning and the code updates the partitioned key – the update in this case will cause the row to move from one partition to another in which case the ROWID value will change.

2) If the DBA uses the “ALTER TABLE <Table_Name> MOVE” command, then also the ROWID could change.

3) If you have to use the Flashback feature and have to flashback the table to a point in time, then also the ROWID can change.

4) If the DBA uses the “ALTER TABLE <Table_Name> SHRINK SPACE COMPACT”, then also the ROWID can change.

5) If you have the “Enable Row Movement” set for a table, then also the ROWID can change for the record(s).

The reason why the client was using ROWID for the updates was because it is very fast however just using the ROWID does not guarantee that you will always be updating the same record.  Here is an example – consider these 3 sessions:

First Session: Fetches a row from table A with rowid A1
Second Session: Deletes the row from table A with rowid A1 and commits the transaction

NOTE ==> Please note that the rowid A1 is now free for re-use by sessions

Third Session: Creates a new row with rowid A1 and commits the transaction
First Session: Now tries to fire the update statement: “update A set …. where rowid = A1”

And this command will not be updating the same record as it intended to.  So, be aware of these facts in case you are also using only ROWID in your applications to do the updates.

Posted in Oracle | Leave a Comment »