Systems Engineering and RDBMS

Archive for December 17th, 2007

FILESTREAM Data in SQL Server 2008

Posted by decipherinfosys on December 17, 2007

We have been covering the new features in SQL Server 2008 in some of our blog posts.  In some of those, we had also discussed the new data-types in that release.  You can search for SQL Server 2008 on this blog site to get to all those posts.  In today’s blog post, we will discuss what can be done for designing and implementing FILESTREAM storage in SQL Server 2008.  Prior to SQL Server 2008, storage of items like documents, images was typically done by storing a link in the table column and storing the documents and images in a particular mapped out location on the SAN or other storage device.  For images, some people also used to store it in the imsage data-type but that was not always the most performant solution.

In SQL Server 2008, Filestream enables the application to store such un-structered data on the file system.  So, the application can now take advantage of the streaming APIs and performance of the file-system.  In addition, one can now co-relate the un-structured and structured data easily.  The Win32 filesystem interfaces provide the streaming access to the data and a good thing about filestream is that it does not use the SQL Server buffer cache – it uses the Windows system cache for the caching of the file data.

In order to make use of FILESTREAM, one needs to enable filestream for an instance of SQL Server and then one needs to create a database that supports filestream.  Once that is done, we can use the  FILESTREAM clause next to the VARBINARY(MAX) data-type to take advantage of this new feature.  Let’s follow this up with an example:

1) Let us first enable the instance for filestream:

USE MASTER
GO
EXEC sp_filestream_configure @enable_level = 3

This can also be changed from the GUI from the Advanced level properties for the instance.  Below is the jpg image for it.

filestream_1.jpg

You can see from the drop-down the different levels that are available for FILESTREAM.  We have chosen the highest level – Transact SQL and filesystem (local and remote client access).

2) Now that we have enabled the filestream option at the instance level, let us go ahead and create a new database.  We would need to create a filegroup that has the CONTAINS FILESTREAM as the clause so that we can designate that filegroup (and it’s files) to contain the filestream data.

USE MASTER
GO
CREATE DATABASE DEC_TEST_FS ON PRIMARY
( NAME = DEC_TEST_FS_data,
FILENAME = N’C:\DEC_TEST_FS_data.mdf’),
FILEGROUP DEC_TEST_FS_FG_1
( NAME = DEC_TEST_FS_REGULAR,
FILENAME = N’C:\DEC_TEST_FS_data_1.ndf’),
FILEGROUP DEC_TEST_FS_FG_2 CONTAINS FILESTREAM
( NAME = DEC_TEST_FS_FILESTREAM,
FILENAME = N’C:\DEC_TEST_FS’)
LOG ON
( NAME = DEC_TEST_FS_LOG,
FILENAME = N’C:\DEC_TEST_FS_log.ldf’);
GO

Please note that if you try to create this database by specifying a path for the filestream files that is not on NTFS, you will get the error message:

“The path specified by ‘d:\DEC_TEST_FS’ cannot be used for FILESTREAM files because it is not on NTFS.”

For the Filestream filegroup, the FILENAME refers to the path and not to the actual file name.  It creates that particular folder – from the example above, it created the C:\DEC_TEST_FS folder on the filesystem.  And that folder now contains a filestream.hdr file and also a folder $FSLOG folder.

If you already have a database, you can add a FILESTREAM filegroup to it using ALTER DATABASE command.

3) Now, that we have the instance and the database taken care of, let us create a table to take advantage of this new feature:

USE DEC_TEST_FS
GO

CREATE TABLE DEC_FS_TEST
(
COL1     INT IDENTITY,
COL2    UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
COL3     VARBINARY(MAX) FILESTREAM
);

The table definition needs a ROWGUIDCOL column – this is required by FILESTREAM.  The actual data is stored in the column COL3.

Now, let’s insert data into this table:

INSERT INTO DEC_FS_TEST (COL2, COL3) VALUES (NEWID(), NULL);
INSERT INTO DEC_FS_TEST (COL2, COL3) VALUES (NEWID(), CAST(‘MY DUMMY TEST’ AS VARBINARY(MAX)));

And now we can select out of this table and see how this data is represented:

COL1        COL2                                 COL3
———– ———————————— ———————————————————–
1           78909DBF-7B26-4CA9-A840-4D45930F7523 NULL
2           0B0F5833-1997-4C9C-A9A7-F2536D68CFED 0x4D592044554D4D592054455354

If you see on the filesystem, you will see additional folders have been created under DEC_TEST_FS folder.  That is shown in the jpg image below.

filestream_2.jpg

One can also use T-SQL to update the filestream data if one needs to.  The deletes will also work – the delete also marks the underlying file system files for deletion.  We will be doing some benchmarks on this new feature and will post our findings on the blog.

Posted in SQL Server | Leave a Comment »

 
Follow

Get every new post delivered to your Inbox.

Join 84 other followers