Systems Engineering and RDBMS

Archive for November, 2008

Tempdb usage

Posted by decipherinfosys on November 30, 2008

Not many people understand how tempdb gets used in a SQL Server instance. Here are the basics –

a) Data in tempdb does not persist across service re-starts.

b) Like the user databases, tempdb inherits its database properties from the model database – for tempdb this happens upon every re-start of the service, whereas for a user database it happens once, upon the creation of that database.

c) Only one filegroup is allowed for the data files and one for the log file. You can configure the file sizes as well as autogrow but when the service re-starts, the size is reset to the configured value.

d) Tempdb is used for a lot of things – temp tables (and table variables beyond a threshold), cursors, row versioning if you are using the snapshot isolation levels, internal objects such as the work tables which store the results of sorts or spools (for operations like joins, union, order by, group by etc.), and the row versions generated by features like MARS and online index operations.

e) Not all database options can be changed for tempdb (for example, AutoShrink is always off for tempdb) – likewise, there are quite a few restrictions on the kind of operations that can be performed on tempdb – you can see the list on the BOL page here.
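As a quick sketch of point #c above – tempdb file sizes are set with a regular ALTER DATABASE statement (the logical file name tempdev used here is the SQL Server default for the tempdb data file; verify yours in sys.master_files before running anything like this):

```sql
-- Set the configured size and growth increment for the tempdb data file.
-- tempdb returns to this configured size on the next service re-start.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);
```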

The most challenging aspect of managing tempdb is understanding which operations use tempdb space. Point #d briefly covered those – the consumers essentially fall under 3 categories: User Objects, Internal Objects, and Row Version data. There is a very good whitepaper from MSFT which covers these in detail along with how you can go about monitoring the usage of your tempdb database – it’s one of the must-reads if you are responsible for managing/monitoring your SQL Server instances. You can access this whitepaper here.

And regarding monitoring the space, the whitepaper has a lot of SQL scripts as well as a stored procedure to help you collect and then analyze the data. From a DMV perspective, in SQL Server 2005 (we haven’t checked what has been added in SQL Server 2008 in this regard) one can make use of this DMV:

sys.dm_db_file_space_usage

It reports the space used in tempdb and gives you the page counts for all three categories – version store, user objects and internal objects.
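As a rough sketch of how one might use it (the page counts are 8 KB pages; the column names below are from the SQL Server 2005 documentation for this DMV):

```sql
-- Summarize tempdb space usage (in MB) by category across all tempdb files.
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024.0 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024.0 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024.0 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024.0 AS free_space_mb
FROM   sys.dm_db_file_space_usage;
```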

Some other useful posts for tempdb:

  • Changing Tempdb collation – our blog post here.
  • Understanding the new isolation levels in SQL Server 2005 – our blog post here.
  • Moving Tempdb files – our blog post here.
  • Working with tempdb – MSFT whitepaper here
  • Tempdb capacity planning and concurrency considerations – SQLCAT whitepaper here.

Posted in SQL Server | 2 Comments »

SQL Server 2008 upgrade guide

Posted by decipherinfosys on November 29, 2008

MSFT has released an upgrade guide for those who need to upgrade from SQL Server 2000/SQL Server 2005 to SQL Server 2008. It is a very comprehensive guide and covers all the aspects related to the upgrade – deprecated features, setup requirements, different upgrade strategies, troubleshooting guidelines etc. You can download this guide from here.

Posted in SQL Server | Leave a Comment »

Happy Thanksgiving

Posted by decipherinfosys on November 27, 2008

To all our readers in the USA, we wish you and your loved ones a very Happy Thanksgiving. For the non-US readers, here is a small snippet (from Wikipedia) explaining this tradition:

The First Thanksgiving

The first American Thanksgiving was celebrated in 1621, to commemorate the harvest reaped by the Plymouth Colony after a harsh winter. In that year Governor William Bradford proclaimed a day of thanksgiving. The colonists celebrated it as a traditional English harvest feast, to which they invited the local Wampanoag Indians.

Days of thanksgiving were celebrated throughout the colonies after fall harvests. All thirteen colonies did not, however, celebrate Thanksgiving at the same time until October 1777. George Washington was the first president to declare the holiday, in 1789.

A New National Holiday

By the mid–1800s, many states observed a Thanksgiving holiday. Meanwhile, the poet and editor Sarah J. Hale had begun lobbying for a national Thanksgiving holiday. During the Civil War, President Abraham Lincoln, looking for ways to unite the nation, discussed the subject with Hale. In 1863 he gave his Thanksgiving Proclamation, declaring the last Thursday in November a day of thanksgiving.

In 1939, 1940, and 1941 Franklin D. Roosevelt, seeking to lengthen the Christmas shopping season, proclaimed Thanksgiving the third Thursday in November. Controversy followed, and Congress passed a joint resolution in 1941 decreeing that Thanksgiving should fall on the fourth Thursday of November, where it remains.

Posted in General | Leave a Comment »

Windows Firewall and SQL Server

Posted by decipherinfosys on November 26, 2008

At one of the client sites, they were using Windows Firewall and the client machines were having a problem connecting to the SQL Server instance. In order to allow SQL Server to be accessed by the client machines, one needs to configure the Windows Firewall on the SQL Server box for DB engine access and to allow remote connections. In this blog post, we will go over the steps to ensure that access to SQL Server is not hindered when you are using Windows Firewall.

Step #1: Go to Start/Run, Type Firewall.cpl and this will bring up the Windows Firewall configuration:

fire_1

I am running Windows Server 2003 on this machine and by default port 1433 is closed, to prevent hackers from connecting to the instance from the net. As an aside – if you have your production SQL Server instances running on the default port, you should consider changing it; there are other security guidelines as well from MSFT on this topic that you can find on TechNet (search for “SQL Server” and Security).

Step #2: Click on Exceptions Tab and then click on Add Port:

fire_2

In the Add Port dialog, there are two entries that you will need to make:

a) Add the name of the instance and then the port number for TCP (1433 if you are running it on the default port or the specific port number that you are running your instance on). If you are not sure what port you are running your SQL Server instance on, see this blog post.

b) Next is to open up the SQL Server Browser Service – so for UDP, type “SQL Server Browser” in the name and then the port number 1434 (make sure UDP is selected instead of TCP). You can read more on the SQL Server Browser Service in this description of all the SQL Server services. This service is used to allow the clients to connect to an instance of SQL Server which is not running on the default port of 1433. If you want to further secure things, you can stop the SQL Server Browser Service and have the clients connect by specifying the port number in their connection strings.
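As an aside on point #a – if you are not sure which port your instance is listening on, one quick check (a sketch; it assumes at least one client is currently connected over TCP) is to query sys.dm_exec_connections:

```sql
-- Shows the local address/port that the current TCP connections came in on.
SELECT DISTINCT local_net_address, local_tcp_port
FROM   sys.dm_exec_connections
WHERE  net_transport = 'TCP';
```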

Step #3: Next thing to do would be to click on Add Program and you will get the following dialog box:

fire_3

Using the browse button, navigate to the Binn folder, select SQLServr.exe and add it to the list. By doing this, we are adding the Windows Firewall exception for accessing a SQL Server instance running on a dynamic port.

Step #4: The next thing would be to ensure that the SQL Server instance is open for accepting remote connections. This you can do via the SAC (Surface Area Configuration) tool. After you click on Surface Area Configuration for Services and Connections, you will get the different options as in the image shown below:

fire_4

Check Remote Connections and make sure that you have TCP/IP selected (if allowing only TCP/IP) – if you use both TCP/IP and named pipes, you can choose the last option instead. You will need to re-start the SQL Server service in order for these changes to take effect. You can re-start the service from the SAC tool itself – go to Service under Database Engine as shown below in the image and stop and start the service:

fire_5

So, this is a simple series of steps to configure the Windows Firewall exceptions for enabling access to the SQL Server engine. A majority of shops, though, use other options like a Cisco firewall instead of Windows Firewall since those are more robust – the steps are essentially the same over there as well; the GUI options are of course different but the basics remain the same.

Posted in SQL Server | Leave a Comment »

Kilimanjaro – an add on release for SQL Server 2008

Posted by decipherinfosys on November 25, 2008

Yet one more upgrade for SQL Server. MSFT has stated that this is not going to be a major release but an “add-on” kind of upgrade to SQL Server 2008. You can read more here:

  • Redmond Magazine article – here.
  • eWeek article – here.
  • and the MSFT press release over here.

Posted in News, SQL Server, Technology | 1 Comment »

Searching Audios and Videos

Posted by decipherinfosys on November 24, 2008

While helping a client build an application, one of the questions that came up was whether it is possible to search audio and video content and, if so, whether anything like that already exists as open source code that can be incorporated within the product. The back-end is a SQL Server database with SQL Server Analysis Services being used for the OLAP piece. The product is an aggregator of information from a wide variety of sources and one of the key items is the search patterns for news, blogs, articles, magazines, tweets etc. Capturing the data set from audio and video searches would further enhance the capabilities of this product.

While researching ways to go about doing it, we came across a couple of options which we wanted to share with you, in case you have a similar need or find the idea exciting and want to play with it on the sites of some of the providers who already offer such capabilities:

1) Google: When it comes to innovation, how can Google not be involved? Here is a link to GAUDI (Google Audio Indexing). It uses speech technology to find the keywords that you search for.

2) VideoSurf: This company has a video search technology that analyzes the images in videos.

3) Blinkx: This was spun off from Autonomy. It has a video search engine.

4) EveryZing: Another audio and video search engine.

5) Delve Networks: They also provide keyword search capabilities inside the videos.

The data warehouse for this client is already close to 1TB with 6 months of data – with much more search data coming our way, there will be interesting sets of challenges in the days to come 🙂

Posted in News, SQL Server, Technology | Leave a Comment »

NLS_Database_Parameters and v$nls_parameters

Posted by decipherinfosys on November 23, 2008

We have discussed the NLS parameters in the past, including a discussion on the NLS_LANG setting. One of the questions that a reader recently asked us was:

“What is the difference between v$nls_parameters and NLS_Database_Parameters and how can I see the differences in the settings for these?”

The globalization guide on OTN (Oracle Technology Network) covers all of this in detail and we would recommend going over it, since this is a topic that we have seen create a lot of confusion among DBAs and database developers. NLS_DATABASE_PARAMETERS gives the permanent NLS parameters of the database whereas v$nls_parameters gives the current values of the NLS parameters for the session. So, if I change the session settings – via a logon trigger or via an “ALTER SESSION” command – then you will see a difference between the parameter values for the session vs. what is set at the database level.
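For example (a sketch using NLS_DATE_FORMAT as the parameter being changed – any session-level NLS parameter would do):

```sql
-- Changes the setting only for the current session; the permanent value
-- in NLS_DATABASE_PARAMETERS stays as it was.
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
```

After running this, v$nls_parameters shows the new session value while NLS_DATABASE_PARAMETERS continues to report the database default.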

And how can you write a SQL statement to see all such differences? Simple enough – do a FULL OUTER JOIN between the two and compare the values:

select coalesce(perm.parameter, sess.parameter) parameter,
       perm.value permanent_value,
       sess.value session_value
  from nls_database_parameters perm
  full outer join v$nls_parameters sess
    on perm.parameter = sess.parameter
 where decode(perm.value, sess.value, 0, 1) = 1;

Posted in Oracle | 1 Comment »

Good SQL Server 2008 training videos

Posted by decipherinfosys on November 22, 2008

In the latest issue of the SQL Server Magazine, there was an attachment for some of the training videos which pointed to this site:

http://getonthecase.com

and when you go there, it is a re-direction to the ISV Innovation site:

http://www.isvinnovation.com/SQL/Default.aspx

Over there, you will get 8 different video tutorials. In case you have not gone over those yet, take a look – they are very well done and give you a lot of very good information about the new feature sets of SQL Server 2008. These are the 8 tutorials:

1) Uncovering T-SQL on SQL Server 2008.
2) Working with unstructured data in SQL Server 2008.
3) Working with asynchronous data in SQL Server 2008.
4) Deploying managed code to SQL Server 2008.
5) Developing for SQL Server 2008 using Visual Studio.
6) SQL Server 2008 and ADO.Net Entity Framework Integration.
7) Communicating with SQL Server 2008 using HTTP.
8) Working with SQL Server 2008 and Disconnected Clients.

Posted in SQL Server, Technology, Visual Studio | 3 Comments »

Query Execution Time outs because of lack of memory

Posted by decipherinfosys on November 22, 2008

We have discussed the different kinds of timeouts before and have also covered how one can prevent a runaway query by using some simple options. You can read those posts here and here. In today’s post, we will cover the query execution timeouts that you might have seen at times on busy systems, which surface with an error like:

[State:42000 Error:8645] [Microsoft][SQL Native Client][SQL Server]A time out occurred while waiting for memory resources to execute the query. Rerun the query.

If you are new to SQL Server memory management and would like to get a good understanding of it, here is an MSDN post on that topic. If you remember from this post, the steps performed (at a very high level) when a query gets submitted to the engine are parsing, compilation and then execution. Prior to the parsing, if the execution plan is already in the cache, the engine simply selects it. If it is not in the cache, the engine parses the statement into a parse tree, which is then algebrized and used to generate the execution plan. The execution of the query then depends upon whether enough memory is available from the buffer pool to satisfy the requirements of that query. If enough memory is not available, the query is put into a queue with a timeout value and has to wait for the memory to become available. This timeout value is based on the estimated query cost – the higher the query cost, the higher the timeout value. So, if enough memory does not become available before that timeout value is hit, you will see the error message shown above.
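As a side note, the upper bound on this wait can be influenced via the “query wait (s)” server configuration option – per BOL, the default of -1 means the timeout is calculated as 25 times the estimated query cost, in seconds. A sketch of checking (and, commented out, changing) it:

```sql
-- 'query wait (s)' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'query wait (s)';

-- Example only - cap the memory-grant wait at 60 seconds:
-- EXEC sp_configure 'query wait (s)', 60;
-- RECONFIGURE;
```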

When the error occurs, the query is also removed from the queue. This raises an interesting question which a client DBA had asked us – if there are already quite a few queries in the queue and a new query is submitted for which there is enough memory available for execution, how does the queue work? We haven’t researched that completely yet, but our understanding is that the ranking in the queue is a combination of the cost and the wait time – the lower the cost or the higher the wait time, the higher the ranking. We are not sure about that though – we have to test it out or post a question on the MSDN forums.

However, while looking into the DMVs, we did see sys.dm_exec_query_memory_grants, which provides information about the queries that have acquired a memory grant or are waiting for one. The column “is_next_candidate” can be used to find out whether the query is a candidate for the next memory grant (for queries that have already been granted memory, this column is NULL). Here is a simple query on this DMV to get information on the queries which have not been granted memory yet and are waiting:

select *
from sys.dm_exec_query_memory_grants
where is_next_candidate in (0,1) -- NULL would mean memory has already been granted
order by is_next_candidate desc, queue_id, wait_order;

Also, we can use the CROSS APPLY operator to join this DMV to sys.dm_exec_query_plan in order to get the execution plan for the query, as well as to sys.dm_exec_sql_text to get the exact query text:

SELECT  sdeq.*
,db_name([dest].[dbid]) as database_name
,object_schema_name([dest].[objectid], [dest].[dbid]) as [schema_name]
,object_name([dest].[objectid], [dest].[dbid]) as [object_name]
,[dest].[text]
,[deqp].[query_plan]
FROM    sys.dm_exec_query_memory_grants sdeq
CROSS APPLY sys.dm_exec_sql_text(sdeq.sql_handle) dest
CROSS APPLY sys.dm_exec_query_plan(sdeq.plan_handle) deqp
where sdeq.is_next_candidate in (0,1)
order by is_next_candidate desc, queue_id, wait_order

Posted in SQL Server | Leave a Comment »

Using DISTINCT … just because

Posted by decipherinfosys on November 21, 2008

We have blogged about the DISTINCT clause in the past (common mis-conceptions about the DISTINCT clause), covering what it does and how it gets mis-used at times. One of the common issues we have seen in queries written by junior folks – and at times even by senior folks not well versed with SQL – is adding the DISTINCT clause to the select list just to make sure that “in case” there are duplicates, they all get taken care of by this wonderful clause. Most of the time, the duplicates actually come from the join conditions or from not writing the query a different way. For example: suppose you have a 3-table join where tableA : tableB is 1:N (a one-to-many relationship) and tableB : tableC is also 1:N. The query selects data only from tableA and tableB, and that set of columns is unique – however, the developer also adds tableC to the join because there is a filter condition (where clause) on it. That results in duplicates whenever a record in tableB has many matching records in tableC, because of the 1:N relationship. The developer then ends up adding a DISTINCT clause to the select list to get a distinct set of data. Instead, the developer could have used an EXISTS clause like this:

select …<columns from tableA and tableB>…
from tableA
inner join tableB
on tableA.col1 = tableB.col2
where …<criteria on tableA and tableB – if being used based on the business logic>…
and exists (select 1 from tableC where tableC.col3 = tableB.col4)

and this will remove the need for the DISTINCT clause. There are many other such scenarios as well. The problem with using DISTINCT unnecessarily is that the code then incurs a sorting cost unnecessarily, since the DISTINCT clause requires that the result set be sorted and the duplicates eliminated from it.

We have seen even bigger issues when people start using distinct aggregates when they are not needed or when there are better ways to write the query. By distinct aggregates, we mean clauses like:

count(distinct invoice_nbr) or sum(distinct sales_qty) etc.

The problem is exacerbated when there is a lot of mixing of such distinct aggregates with non-distinct aggregates, for example:

select
count(distinct invoice_nbr),
sum(distinct sales_qty),
count(ship_date),
Avg(sales_price),

…etc.

Mixing these distinct aggregates and non-distinct aggregates in the same SQL statement’s select list leads to spooling and involves several re-reads of the intermediate results, which is very expensive – even more expensive than computing them separately in separate queries.
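One rewrite we have seen help (a sketch only – invoice_tbl and its columns are illustrative names, not a real schema) is to compute the distinct aggregates and the non-distinct aggregates separately and then combine the one-row results:

```sql
-- Each derived table computes its own kind of aggregate over the table,
-- instead of forcing one plan to spool and re-read intermediate results.
SELECT d.distinct_invoice_count,
       d.distinct_qty_sum,
       n.ship_date_count,
       n.avg_sales_price
FROM  (SELECT COUNT(DISTINCT invoice_nbr) AS distinct_invoice_count,
              SUM(DISTINCT sales_qty)     AS distinct_qty_sum
       FROM   invoice_tbl) d
CROSS JOIN
      (SELECT COUNT(ship_date)  AS ship_date_count,
              AVG(sales_price)  AS avg_sales_price
       FROM   invoice_tbl) n;
```

As always, compare the plans and the I/O statistics for your own data before settling on either form.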

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »