Systems Engineering and RDBMS

SQL Server Virtualization

Posted by decipherinfosys on April 25, 2013

Excellent Tips by Michael Otey on virtualization for SQL Server:

Virtualization for SQL Server


Posted in SQL Server | 3 Comments »

Difference between UNICODE and UTF-8 files in Windows Notepad

Posted by decipherinfosys on February 25, 2013

While generating flat files through SSIS for a feed process at a client site, we noticed that the developer had left the file encoding to be UNICODE with the thought that it would be the best practice and the consumption of the file will not be an issue at the receiving system.  The file was being opened up in Notepad.

When generating a flat file in Windows, you have the option (just like you would when you are using Notepad) to use the encoding of ANSI, UNICODE, UTF-8 or Unicode big-endian.  What is important to understand is that in case you are using UNICODE, it is essentially UTF-16 little-endian and if you are using ANSI, it is Code Page 1252.

Microsoft’s Notepad writes UTF-16 with a Byte Order Mark (BOM) and also looks for that BOM when reading the file.  If you are un-aware of what a BOM is, read this entry in Wikipedia – here.  So, in the case of a UNICODE file, the BOM is what helps in determining whether the file is UTF-16 big-endian or little-endian. Now, if Notepad is not able to find the BOM, then it calls a library function called isTextUnicode and it looks at the data and attempts to determine the encoding.  If the interpretation of this function comes out wrongly, it will cause it to display wrong glyphs.

Best approach in our opinion is to use UTF-8 everywhere.  It is a universally accepted encoding and even if you are sharing files across different operating systems, you would still be assured of providing proper data.

Posted in Windows | Tagged: , , | Leave a Comment »

A new iPhone app

Posted by decipherinfosys on August 31, 2012

Wanted to blog about both our 5 Million views milestone which happened yesterday as well as about a new iPhone app that is available in the market and created by our friends at Of late, we have been pretty busy with our projects and as a result have not been blogging that frequently but will start blogging again now with some of the lessons learnt on the new projects related to predictive intelligence projects.

Coming back to the app – it is called iHelp+ and is available in the iTunes AppStore.  Here is a brief description of the app:

iHelp+ is an iPhone application that allows users to alert a family member or a friend or a monitoring company with the touch of the screen.  The users have the ability to create an audible alert, silent alert or even a delayed alert when you want to make sure someone knows you should have arrived at a location and did not due to an un-forseen incident.  When an alert is triggered, the application will provide your contact/monitoring company with life saving information (Your GPS location and a phone number for the closest Public Service Access Point – PSAP).  iHelp+ also allows you to store your medical information and allows you to share this with first responders in-case of an emergency.  With 4 quick contact links you will also be able to reach out to your friends, family or care givers quickly all at the touch of a button.  This app is available only in the US right now.

This app can be used by any age group including teenagers as well as the elderly.  Here are the facebook, YouTube and twitter links for the app and the iTunes link:

We believe that this app can be very useful for the security and personal safety conscious individuals.

Posted in Mobile Devl | Tagged: , , , , , | Leave a Comment »

Sqoop graduates from Apache Incubator to a Top Level Project

Posted by decipherinfosys on April 5, 2012

Sqoop – the Big Data Tool has moved out of the Apache Incubator to a Top Level Project (TLP).  In case you are not aware of Sqoop, it is the key data tool to transfer volumes of data between Hadoop and structured data stores like RDBMS (Relational Database Management Systems).  This project provides connectors for many popular RDBMS – Oracle, SQL Server, MySQL, DB2 and PostgreSQL.  This is a significant step towards the adoption of Hadoop in the enterprise solutions.

You can read more on this in the eweek article here. And here is the link for learning more on sqoop:

Posted in Big Data, Technology | Tagged: , | 1 Comment »

Bug: Scope_Identity() and @@Identity can return wrong values during parallel execution plans

Posted by decipherinfosys on April 4, 2012

Ran into a weird issue today at a client site and upon further researching it, found out that it is a known issue in the SQL Server version that was being used at the client site.  The issue was that at times, we were getting wrong values returned by the Scope_Identity() function.  Upon further diagnosis, we found that the query in question was using a parallel execution plan and when that happens, there is a known issue in some versions of SQL Server because of which you can get wrong Identity value returned.  The KB article from MSFT that talks about this is available here:;en-US;2019779

It states that the issue is fixed as part of the Cumulative Update Package 5 of SQL Server 2008 R2 SP1 and any new releases post it.  However, this was a production system and getting any such kind of patch update/upgrade would have taken time so we adopted the workaround of using the OUTPUT clause.   Example of the OUTPUT clause usage is in that KB article and you can also read up more on it in our blog post here and here.  You can also opt for Method 2 mentioned in the workaround in the article.  Method 1 and Method 3 would be sub-optimal for your scenario if the parallel plan was desired to begin with with Method 3 being applied at the Server level and hence effecting all queries.

Posted in SQL Server | Leave a Comment »

Online Index Operations involving certain large data types are allowed in SQL Server 2012

Posted by decipherinfosys on April 3, 2012

Prior to SQL Server 2012, when you did online index operations (available in Enterprise Edition), if the index contained certain large data types (specifically varchar(max), nvarchar(max), varbinary(max)), then those online operations were not allowed on that index.  That restriction does not exist anymore in SQL Server 2012 so there is no need to take such indexes out of your online operations.

Example:  Let’s create a sample table which contains two columns – an ID column of type INT and the second column of data type NVARCHAR(MAX) and then let’s try to create an index on it with ONLINE option set to ON.

CREATE TABLE tblTestOnlineIdx
CREATE INDEX tblTestOnlineIdx_IND_1 on tblTestOnlineIdx (ID) INCLUDE (COL1) WITH (ONLINE = ON)

We will see that we will get an error in SQL Server 2008 telling us that:

Msg 2725, Level 16, State 2, Line 1
An online operation cannot be performed for index ‘tblTestOnlineIdx_IND_1’ because the index contains column ‘COL1’ of data type text, ntext, image, varchar(max), nvarchar(max), varbinary(max), xml, or large CLR type. For a non-clustered index, the column could be an include column of the index. For a clustered index, the column could be any column of the table. If DROP_EXISTING is used, the column could be part of a new or old index. The operation must be performed offline.

And this same code will run fine on SQL Server 2012 because this limitation has been removed.  However, even though the BOL link above states that the same is true for indexes containing a XML data type column, that is not the case.  ONLINE=ON is not a valid option for the XML indexes and it will still give an error.  That is reflected in the BOL link here which covers the creation of XML indexes.

Posted in SQL Server | Tagged: , | Leave a Comment »

Editions in SQL Server 2012 and Licensing

Posted by decipherinfosys on March 27, 2012

SQL Server 2012 is available in these editions:

  • Enterprise Edition
  • Standard Edition
  • Business Intelligence Edition
  • Developer Edition
  • Web Edition, and
  • Express Edition

Out of these, the Enterprise Edition is of course the one that has all the high end features for meeting the needs of a Tier-1 large scale deployment.  You would recall that previously there was a DataCenter Edition available – that is no longer available now and all of it’s capabilities are now included in the Enterprise Edition itself.

The Business Intelligence Edition is a new Edition that is offered now.   This is available to offer clients the full suite of powerful BI features in SQL Server 2012 like Power View, PowerPivot, enhanced reporting and analytics.  Any company specifically looking at SQL Server from the perspective of BI and not interested in the full OLTP features that are found in the Enterprise Edition would opt for this one.

The Standard Edition is aimed towards customers who do not need the high end features that are available in the Enterprise or the Business Intelligence Editions.

Out of the remaining three specialized Editions, the Developer Edition (like in previous versions) has all the features of the Enterprise Edition but is meant only to be used in a Dev/QA environment and does allow for a direct upgrade to the Enterprise Edition.  The Web Edition is essentially intended for the hosting service providers.  It is scaled down version in terms of support for processors (max: 4) and the feature sets but is good enough to be used if you are looking for an edition to host web-sites and small scale web applications.  It supports up-to 64GB of memory.  The Express Edition is a free edition and has restrictions on the DB size (max: 10GB), 1 processor and 1GB of memory.

Here are some links comparing the different feature sets across the different editions available in SQL Server 2012:

SQL Server 2012 Editions:

Article on MSDN with the Feature Comparison between the different Editions:

And here is the information from MSFT in regards to licensing:

Posted in SQL Server | Tagged: , , | Leave a Comment »

Virtual Labs for SQL Server 2012

Posted by decipherinfosys on March 23, 2012

MSFT has released a bunch of virtual labs to easily come up to speed with the new features of SQL Server 2012.  So, if you are restrained due to hardware/software and are finding it difficult to convince your boss to provide environments to play around with the new feature sets, here is a great place to get your hands dirty and start learning more about SQL Server 2012 – all these virtual labs run off the MSFT servers in the cloud.

Posted in SQL Server | Tagged: , | Leave a Comment »

Sample Databases for SQL Server 2012

Posted by decipherinfosys on March 22, 2012

In case you are looking for the AdventureWorks Sample Databases for SQL Server 2012 version, you can download them from codeplex – here is the link for all the downloads (Data files, Script files etc.):

Once you download the “AdventureWorks2012_Data” data file, you can simply attach it using T-SQL or through the GUI.  If you use T-SQL, you can do simply:

CREATE DATABASE AdventureWorks2012
ON (FILENAME = 'C:\AdventureWorks2012_Data.mdf')

It will automatically create the log file in the same folder.  And if you want to do it from the GUI, you can right click on databases in Management Studio and Select “Attach” which will bring up the GUI in which you can select the MDF file location.  You will notice that under the details section, it will complain about not finding the Log file.  If you proceed as is, you will get an error since it is expecting the log file to be present.  Simply highlight the log entry line in the details section and click on Remove and then proceed and it will create the log file for you and attach the database:

SQL Server 2012 - Attach Database

Posted in SQL Server | Tagged: , , | Leave a Comment »

Excellent comparison of Pig vs Hive

Posted by decipherinfosys on March 22, 2012

We had recently blogged about Hadoop and the different sources for learning Hadoop and getting up to speed on it.  One of the points that we missed out on was a mention of Pig and Hive.  Hive and Pig were Hadoop sub-projects before but are now open source volunteer projects under the Apache Software Foundation.

Pig is essentially a platform for creating MapReduce programs with Hadoop.  The platform consists of a high level language for data analysis programs and an infrastructure for evaluating those programs.  Since they are amenable to substantial parallel operations, it enables them to handle very large data sets.

Hive is a data warehouse system built for Hadoop that allows easy data aggregation, ad-hoc queries and analysis of large data sets stored in Hadoop compatible file systems.  HiveQL is a SQL “like” language that can be used to interact with the data and it also allows developers to put in their own custom mappers/reducers.

Here is a link that provides an excellent comparison between Pig and Hive by Lars George:

Be sure to read the comments as well.

And the getting started guides on Hive and Pig:



Posted in Big Data | Tagged: , , , | Leave a Comment »


Get every new post delivered to your Inbox.

Join 94 other followers