Systems Engineering and RDBMS

Virtual Labs for SQL Server 2012

Posted by decipherinfosys on March 23, 2012

MSFT has released a bunch of virtual labs to easily come up to speed with the new features of SQL Server 2012.  So, if you are restrained due to hardware/software and are finding it difficult to convince your boss to provide environments to play around with the new feature sets, here is a great place to get your hands dirty and start learning more about SQL Server 2012 – all these virtual labs run off the MSFT servers in the cloud.

http://www.microsoft.com/sqlserver/en/us/learning-center/virtual-labs.aspx

Posted in SQL Server | Tagged: , | Leave a Comment »

Sample Databases for SQL Server 2012

Posted by decipherinfosys on March 22, 2012

In case you are looking for the AdventureWorks Sample Databases for SQL Server 2012 version, you can download them from codeplex – here is the link for all the downloads (Data files, Script files etc.):

http://msftdbprodsamples.codeplex.com/releases/view/55330

Once you download the “AdventureWorks2012_Data” data file, you can simply attach it using T-SQL or through the GUI.  If you use T-SQL, you can do simply:

USE MASTER
GO
CREATE DATABASE AdventureWorks2012
ON (FILENAME = 'C:\AdventureWorks2012_Data.mdf')
FOR ATTACH_REBUILD_LOG

It will automatically create the log file in the same folder.  And if you want to do it from the GUI, you can right click on databases in Management Studio and Select “Attach” which will bring up the GUI in which you can select the MDF file location.  You will notice that under the details section, it will complain about not finding the Log file.  If you proceed as is, you will get an error since it is expecting the log file to be present.  Simply highlight the log entry line in the details section and click on Remove and then proceed and it will create the log file for you and attach the database:

SQL Server 2012 - Attach Database

Posted in SQL Server | Tagged: , , | Leave a Comment »

Excellent comparison of Pig vs Hive

Posted by decipherinfosys on March 22, 2012

We had recently blogged about Hadoop and the different sources for learning Hadoop and getting up to speed on it.  One of the points that we missed out on was a mention of Pig and Hive.  Hive and Pig were Hadoop sub-projects before but are now open source volunteer projects under the Apache Software Foundation.

Pig is essentially a platform for creating MapReduce programs with Hadoop.  The platform consists of a high level language for data analysis programs and an infrastructure for evaluating those programs.  Since they are amenable to substantial parallel operations, it enables them to handle very large data sets.

Hive is a data warehouse system built for Hadoop that allows easy data aggregation, ad-hoc queries and analysis of large data sets stored in Hadoop compatible file systems.  HiveQL is a SQL “like” language that can be used to interact with the data and it also allows developers to put in their own custom mappers/reducers.

Here is a link that provides an excellent comparison between Pig and Hive by Lars George:

http://www.larsgeorge.com/2009/10/hive-vs-pig.html

Be sure to read the comments as well.

And the getting started guides on Hive and Pig:

Hive: https://cwiki.apache.org/confluence/display/Hive/GettingStarted

Pig: http://pig.apache.org/docs/r0.9.2/start.html

Posted in Big Data | Tagged: , , , | Leave a Comment »

New way of paging in SQL Server 2012 – Offset clause

Posted by decipherinfosys on March 21, 2012

For any web application, paging of records is a necessary requirement.  We have written several posts in the past on how to achieve paging in Oracle, SQL Server and DB2 LUW.  Those posts showed how paging can be achieved using CTE/custom procedural code/row_number() function (SQL Server), using rownum (Oracle) or rownumber()/fetch first clause (DB2).  Here are some of those posts to familiarize yourself with the different ways of paging in different RDBMS:

Whitepaper on paging in Oracle, SQL Server and DB2 LUW (bit dated – wrote it in 2007): https://decipherinfosys.wordpress.com/2007/01/18/paging-data-in-oracle-ms-sql-server-and-db2-luw-using-database-code/

Paging of Record Sets and COUNT(STOPKEY) Optimization in Oracle: https://decipherinfosys.wordpress.com/2007/08/09/paging-and-countstopkey-optimization/

Paging of record sets in SQL Server 2005 or SQL Server 2008: https://decipherinfosys.wordpress.com/2007/08/10/paging-of-record-sets-in-sql-server-2005/

Dynamic SQL, Dynamic Ordering and Paging: https://decipherinfosys.wordpress.com/2008/04/22/dynamic-sql-dynamic-ordering-and-paging/

In today’s post, we will introduce a new clause that is now available in SQL Server 2012 which makes writing the paging logic a lot more simpler to read and implement.  Here is an example:


DECLARE
@NbrofRecords INT = 15,
@PageNbr INT = 3,
@OrgTypeID INT = 23,
@OrgName NVARCHAR(100) = NULL;


SELECT OM.ORG_MASTER_ID, OMTX.ORG_TYPE_LKUP_ID, OTL.ORG_TYPE_NAME
, COALESCE (MFFM.FUND_FAMILY_NAME, OM.ORG_NAME) AS ORG_NAME, MFFM.MUTUAL_FUND_FAMILY_MASTER_ID
, COALESCE (CM.FIRST_NAME, ' ') + COALESCE (CM.MIDDLE_NAME, ' ') + COALESCE (CM.LAST_NAME, ' ') AS CONTACT_NAME
, CM.CONTACT_MASTER_ID FROM dbo.ORG_MASTER AS OM
INNER JOIN dbo.ORG_MASTER_TYPE_XREF AS OMTX
ON OM.ORG_MASTER_ID = OMTX.ORG_MASTER_ID
INNER JOIN dbo.ORG_TYPE_LKUP AS OTL
ON OMTX.ORG_TYPE_LKUP_ID = OTL.ORG_TYPE_LKUP_ID
LEFT OUTER JOIN dbo.ORG_TYPE_CONTACT_XREF AS OTCX
ON OMTX.ORG_MASTER_ID = OTCX.ORG_MASTER_ID
AND OMTX.ORG_TYPE_LKUP_ID = OTCX.ORG_TYPE_LKUP_ID
LEFT OUTER JOIN dbo.CONTACT_MASTER AS CM
ON OTCX.CONTACT_MASTER_ID = CM.CONTACT_MASTER_ID
LEFT OUTER JOIN dbo.MUTUAL_FUND_FAMILY_MASTER AS MFFM
ON OM.ORG_MASTER_ID = MFFM.ORG_MASTER_ID
WHERE OM.MARK_FOR_DELETION = 0
AND OTL.ORG_TYPE_LKUP_ID = @OrgTypeID
AND (OM.ORG_NAME LIKE @OrgName OR MFFM.FUND_FAMILY_NAME LIKE @OrgName OR @OrgName IS NULL)
ORDER BY OM.ORG_NAME ASC
OFFSET (@NbrofRecords * (@PageNbr - 1)) ROWS
FETCH NEXT @NbrofRecords ROWS ONLY;
GO

As you can see from above, the only change was to append the OFFSET clause and the FETCH NEXT to your actual query and use two parameters – one to denote the Page Number that you want and the other to denote how many records do you want per page.  So, say your query returns 1000 records and your per page display of records requirement is 15 records per page and you want to display the data from the third page – in that scenario, @PageNbr will have a value of 3 and @NbrofRecords will have a value of 15 so the offset would be by 15 * (3-1) = 30 i.e. you are going to start from the 31st record using the OFFSET clause and then FETCH the NEXT 15 records.  Pretty easy to write, understand and implement, isn’t it?  Do remember that these clauses go along with the ORDER BY clause i.e. you cannot use these on unordered data (will get an error) – if you want data ordered in a particular way, you should be using order by anyways which is always the case for paging for web applications.

Will need to play around a bit to ascertain the performance impact in the case of complex queries with different columns (from different tables) in the order by clause or conditional ordering to see what the impact is between the old vs the new approach.  Will post the results on the blog when we do those tests.

Posted in DB2 LUW, Oracle, SQL Server | Leave a Comment »

Learning Hadoop

Posted by decipherinfosys on March 20, 2012

In a recently concluded project, we had the opportunity to work on Hadoop.  There was a learning curve since none of us had worked in Hadoop before.  Here are some URLs to help you get started with your learning process in this regard:

Basics of Hadoop:

The article on gigaom or the series of articles on cloudera’s site will get your started:

http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop/

http://www.cloudera.com/what-is-hadoop/

Sign up with Cloudera and you will have access to a lot of very good learning material on Hadoop, example:

http://www.cloudera.com/resource/introduction-to-apache-mapreduce-and-hdfs/  is a good starter’s video on MapReduce and HDFS.

or this one: http://www.cloudera.com/resource/apache-hadoop-ecosystem/ for understanding the Hadoop ecosystem.

And this whitepaper from Gartner on Hadoop and MapReduce for Big Data Analytics:

http://info.cloudera.com/GartnerReportHadoopJanuary2011.html

If you like to have text available for your learning purposes rather than video tutorials, here is a good chapter on HDFS: http://www.aosabook.org/en/hdfs.html

Setting up Hadoop cluster:

And once you are ready to jump in, there are some excellent tutorials by Michael G. Noll to guide you:

To set up your first Hadoop node: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

And then multiple node cluster: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

And here are some additional good tutorial references: http://www.delicious.com/jhofman/tutorials+hadoop

Microsoft and BigData

Recently, MSFT also announced their support for Apache Hadoop.  You can read more on MSFT’s big data solution from here:

http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data-solution.aspx

and the work done by HortonWorks for extending Apache Hadoop to Windows:

http://hortonworks.com/blog/extending-apache-hadoop-to-millions-of-new-microsoft-users/

Posted in Big Data, Linux, SQL Server, Unix, Windows | 1 Comment »

Installing/Enabling .Net 3.5 SP1 on Windows 2008 R2 for SQL Server 2008 R2/SQL Server 2012 RC0 Install

Posted by decipherinfosys on February 24, 2012

A junior team member ran into this issue so thought it was worth a post.  While installing SQL Server 2012 RC0 on a new VM of Windows Server 2008 R2, it complained about installing or enabling .Net Framework 3.5 SP1 which is a pre-requisite for the SQL Server install.  Here are the steps to validate whether it is installed and how to enable it and if it is not installed, how to go about installing it.

  1. Under Administrative Tools, Select Server Manager and click on Features.
  2. All the installed Features are displayed in the right side pane where you can validate if .Net Framework 3.5 SP1 (3.5.1) is installed or not.
  3. If it is not installed, then click on “Add Features” and expand “.Net Framework 3.5.1 Features”.
  4. Check the check box for “.Net Framework 3.5.1 Features” and click on Next and then Install.

5.  Once the installation is complete, click on close and you are done.

Now, you can move on with your SQL Server 2012 RC0 install on Windows Server 2008 R2.  Have fun.

Posted in .Net Development, SQL Server, Windows | Leave a Comment »

Excellent Article on the Basics of Hadoop

Posted by decipherinfosys on February 10, 2012

Read an excellent article at gigaom on Hadoop.  It is true that many people in the industry have heard the word, read about the Big Data and don’t really know what Hadoop is.  This article will give you the basic understanding that you need:

http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop/

Posted in Big Data, Open Source, Technology | Leave a Comment »

Reading from a XML Data Type Column in SQL Server

Posted by decipherinfosys on November 23, 2011

We have covered XML capabilities in SQL Server in several posts before – you can read those here.  Yesterday, while helping a client, the developer needed to read from an XML data type column and the way the code was written was to iterate over the records, fetch one record at a time, assign it to a variable and then using the nodes() method, extract out the data and their hierarchies into a flat row/column structure as a data set.  This approach is not very scalable on large data sets and performance becomes an issue.

One can use the nodes() method and CROSS APPLY/OUTER APPLY to do this in a SET based fashion using a single SQL statement.  We had demonstrated this before in some of our Report Server queries as well.  Here is an example of how to do that:

We will first create a demo table:
IF EXISTS (SELECT 1 FROM SYS.objects WHERE TYPE = 'U' AND NAME = 'AP_DEMO')
BEGIN
DROP TABLE AP_DEMO
END
GO
CREATE TABLE dbo.AP_DEMO
(
TRANSACTION_ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Xml_Data XML NOT NULL
)
GO

And now insert into this table 2 records – note that the XML that is being inserted has a three layer hierarchy – One Transaction can have 1-N Services and 1 Service can have 1-N Responses.  Also note that we are using the row value constructor feature of SQL Server 2008 to do these inserts.

INSERT INTO AP_DEMO (XML_DATA)
VALUES
(
‘<?xml version=”1.0″ ?>
<Root>
<Transaction>
<Tattrib1>Test1</Tattrib1>
<Tattrib2>100</Tattrib2>
<Service>
<Id>1</Id>
<Name>Service1</Name>
<Response>
<Key>KeyTest1</Key>
<Value>Value1</Value>
<Point>10</Point>
</Response>
</Service>
</Transaction>
<Transaction>
<Tattrib1>Test2</Tattrib1>
<Tattrib2>53</Tattrib2>
<Service>
<Id>100</Id>
<Name>Service2</Name>
<Response>
<Key>KeyTest2</Key>
<Value>Value2</Value>
<Point>90</Point>
</Response>
</Service>
</Transaction>
</Root>’)
,
(‘<?xml version=”1.0″ ?>
<Root>
<Transaction>
<Tattrib1>Test3</Tattrib1>
<Tattrib2>1000</Tattrib2>
<Service>
<Id>9</Id>
<Name>Service20</Name>
<Response>
<Key>KeyTest3</Key>
<Value>Value3</Value>
<Point>99</Point>
</Response>
</Service>
</Transaction>
<Transaction>
<Tattrib1>Test4</Tattrib1>
<Tattrib2>999</Tattrib2>
<Service>
<Id>87</Id>
<Name>Service30</Name>
<Response>
<Key>KeyTest4</Key>
<Value>Value4</Value>
<Point>97</Point>
</Response>
</Service>
</Transaction>
</Root>’)

And now, we can extract this data using the nodes() method and applying CROSS APPLY at it to extract out each hierarchy:

SELECT
svc.value('(Tattrib1/text())[1]', 'varchar(100)') as Txn_Attribute_1
, svc.value('(Tattrib2/text())[1]', 'varchar(100)') as Txn_Attribute_2
, rsp.value('(Id/text())[1]', 'int') as Service_Id
, rsp.value('(Name/text())[1]', 'nvarchar(100)') as [Service_Name]
, val.value('(Key/text())[1]', 'nvarchar(100)') as Response_Key
, val.value('(Value/text())[1]', 'nvarchar(100)') as TSR_Value
, val.value('(Point/text())[1]', 'int') as TSR_Point
FROM
AP_DEMO
CROSS APPLY Xml_Data.nodes('//Transaction') AS Txn(svc)
CROSS APPLY svc.nodes('Service') AS svc(rsp)
CROSS APPLY rsp.nodes('Response') as rsp(val)
GO

This will give the following output:

Txn_Attribute_1 Txn_Attribute_2 Service_Id Service_Name Response_Key TSR_Value TSR_Point
Test1           100             1          Service1     KeyTest1     Value1    10
Test2           53              100        Service2     KeyTest2     Value2    90
Test3           1000            9          Service20    KeyTest3     Value3    99
Test4           999             87         Service30    KeyTest4     Value4    97

As you can see from above, one can easily extract out the data from a XML data type column in a table by using a single SQL statement.  Of course, one should ensure that one has proper filter conditions in the WHERE clause on properly indexed columns and is also making use of XML indexes in order to ensure a good execution plan for the SQL.

Hope this small code snippet helps in your work.  Wish all of our readers in the US a very Happy Thanksgiving.

Posted in SQL Server | 1 Comment »

History of Programming Languages

Posted by decipherinfosys on July 28, 2011

Nice infographic from Rackspace on the history of programming languages:

http://www.rackspace.com/cloud/blog/2011/05/17/infographic-evolution-of-computer-languages/

Posted in General, Technology | Leave a Comment »

Cannot Access Newly Created Instance

Posted by decipherinfosys on July 6, 2011

Got a question from a reader yesterday:

“I just joined a new company and was given a VM on which my domain account was part of the Administrators group but the SQL Server 2008 R2 install was done by the IT team and even by using my Windows account, I cannot login to the instance.  I tried using the “sa” account thinking that they might have left the password as blank but keep getting errors.  How can I resolve it?  My understanding was/is that if my Windows account is part of the Administrators group on the local machine/VM, I should be able to log in since the BUILTIN\Administrators group is always by default assigned sysadmin rights on a newly created instance.  Is that no longer the case in SQL Server 2008 R2?”

Yes, as part of the new security features in SQL Server 2008, by default, the local Windows group “BUILTIN\Administrators” is no longer made part of the sysadmin fixed server role.  You have to explicitly add it if you want that to be the case. It is very well documented in BOL and MSDN and we would recommend that you go through this article to familiarize yourself with the security changes in SQL Server 2008 R2:

http://msdn.microsoft.com/en-us/library/cc280562.aspx

While helping the user above to resolve the issue, it turned out that IT Admin had used a domain group that only IT team members were a part of and a) Installed in Windows Authentication mode, and b) Only made that group a part of the sysadmin role.  The user above tried to get in by making changes to the registry to change the authentication to mixed mode by changing the value for this registry setting to 2:

HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\LoginMode

Or in the case of a named instance:

HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server\MSSQL.n\MSSQLServer\LoginMode (n: Instance nbr of the named instance)

Post re-starting the service, it did change the authentication mode to mixed mode, however due to un-successful attempts and the password policy in place, the “sa” account was locked out/disabled.  One can easily enable it but first one needs to be able to get in with sysadmin privs. and then make the changes.  Best option was to get one of the IT team members to login and post it, add additional logins, enable the “sa” account, set it’s password and then make whatever changes needed for the logins/their access levels etc.  The user did that and was on his way to managing the newly created instance.  One can enable a disabled/locked SQL Server account through the GUI or by using the “ALTER LOGIN” statement as we had shown in a previous post here.

Posted in SQL Server | Leave a Comment »