SQL Server 2008 | SQL from the Trenches

T-SQL Toolbelt – Search for objects in databases – V 2.1.0

August 11, 2012 Leave a comment

A few weeks ago, I received a message from an old colleague and friend Eric (Blog | @saidin). He wanted to know if I had a query in my (and I quote) “magic bag of SQL tricks”, to search through object in SQL server. The company he works for (he is a software developer, and independant consultant) wanted to change all stored procedures, that contained functionality to calculate VAT (Value Added Tax).

I remembered that a few years ago, I needed that same functionality, and I wrote a few scripts to search for specific dependencies in views and stored procedures. Next to a query that gets information from
sys.tables and sys.columns, I used these queries to get view and stored procedure content:

SELECT *
FROM sys.syscomments
WHERE text LIKE '%&lt;SearchTerm&gt;%'


SELECT *
FROM INFORMATION_SCHEMA.VIEWS
WHERE VIEW_DEFINITION LIKE '%&lt;SearchTerm&gt;%'

The first query uses information from sys.syscomments. Which, according to MSDN:

“Contains entries for each view, rule, default, trigger, CHECK constraint, DEFAULT constraint, and stored procedure within the database. The text column contains the original SQL definition statements.

The seconds query uses INFORMATON_SCHEMA, that contains SQL Server metadata (see MSDN article):

An information schema view is one of several methods SQL Server provides for obtaining metadata.

The VIEWS view (a view on all views?) returns a row for each view that can be accessed by the current user, in the current database. So this means that the view only returns rows for objects that you have permissions on.

Then I decided to write a script that does this in one query, and more… When working on this script, I thought about adding more functionality to it. Why wouldn’t you want to search for primary or foreign key columns, triggers, functions, etc? But adding more information to the resultset often means that the overview is lost along the way. Because of that I created a switch system. By setting a few bits you can turn on what you need to see, and turn off what you don’t want to see. This way the result is kept clean, and you’re not bothered with unnecessary information.

One of the issues I ran into is how to search for a specific string. Because I wanted to let the user enter the searchterm once, I needed to use a variable. But if you use a variable, and you ad a wildcard (%) as the first and last character, the query returns all objects. It has the same effect as returning all objects, instead of returning objects based on a specific searchterm.

So because of this, I used dynamic SQL to search through the list of objects. In dynamic SQL it’s possible to work with wildcards in a like statement. The only thing I needed to do is change one more table from memory to physical temp table, because it’s used in the dynamic SQL. Apparently dynamic SQL can’t use a memory table (DECLARE @Object TABLE) as a datasource.

So this is what I could build in the past few weeks. The only problem is that fixing issues that I found resulted in adding more and more new functionality to the script. With that in mind, I want to create a toolbelt with useful SQL scripts for myself. But of course, I want to share it with the community, so they can use it if they like.

So the upcoming weeks, I hope to build as much functionality in this script as I can. There are still a few wishes for the future, and a few features that I want to build in, just because they’re cool! For every new version, I will write a blog with releasenotes, so you’re aware of the changes in the script.

For further questions and wishes, please contact me via twitter or this blog. I’d love to hear your ideas and wishes, so that I can implement it in the script!

You can download the script by clicking on the image below.

Downloads

Version 2.1.0:

Filed under SQL Server 2008, SQL Server 2012 Tagged with Colleague, Database, Object, Query, Search, SQL Server, Tables, Temp, TempTable, Toolbelt

Strange behavior of spatial data

June 29, 2012 2 Comments

As of today, I’m kind of forced to admit I have a problem… I’m in love with spatial data. And once you’re hooked, there’s no turning back. It could be worse of course! And in these circumstances, you come across the most interesting cases…

After trying to draw a geometry polygon in SQL Server 2008, I wondered what the difference is between a polygon and a multipolygon. Polygon also accepts more then 1 polygon just like a multipolygon. So what’s the difference then? I think I found the difference, with the help of Andreas Wolter (Blog | @AndreasWolter).

Just look at the query below:

DECLARE @Obj GEOMETRY
SET @Obj = GEOMETRY::STGeomFromText('POLYGON((10 0, 10 10, 0 10, 0 0, 10 0),
											 (10 15, 10 25, 0 25, 0 15, 10 15, 10 15))'
									,4326)

SELECT @Obj.ToString(), @Obj

This query produces a valid set of polygons, even though it’s two objects in a single polygon. It draws the polygons, even though a polygon should only consist of 1 object, and not two as the example above.

This isn’t the weirdest I’ve seen. Just look at the example below:

DECLARE @Obj GEOMETRY
SET @Obj = GEOMETRY::STGeomFromText('POLYGON((10 0, 10 10, 0 10, 0 0, 10 0),
											 (10 15, 10 25, 0 25, 0 15, 10 15, 10 15))'
									,4326)

--==================================================

SELECT
	'Geo Object'					AS Description,
	@Obj							AS GeoObject,
	@Obj.ToString()					AS GeoObject_ToString,
	@Obj.STIsValid()				AS GeoObject_IsValid

UNION ALL

SELECT
	'Geo Object + MakeValid()'		AS Description,
	@Obj.MakeValid()				AS GeoObject,
	@Obj.MakeValid().ToString()		AS GeoObject_ToString,
	@Obj.MakeValid().STIsValid()	AS GeoObject_IsValid

If you run the example, you’ll see a description of the objects, the spatial object itself, the object ToString(), and a bit (boolean) for STValid(). The first record in the resultset is just the same as in the previous example. The second row contains the method .MakeValid().

As you see the first record (2 objects in 1 polygon) is not a valid geometry object. The second records shows that .MakeValid() converts your polygon into a multipolygon. And if you check if the multipolygon is valid, it returns true.

This example was getting weird, but now run the example on a SQL 2012 instance. You will see a difference in the multipolygon coordinates. I’ve ran them on both versions, and you can see the differences in the screenshot below:

SQL Server 2008:

SQL Server 2012:

The conversion to multipolygon causes a different rounding of coordinates. It seems like the converison in SQL 2008 is changed quite a bit for SQL 2012.

So what does this mean for the usage of geometry and geography data? I don’t know… This might mean we always need to use .MakeValid() before storing a polygon. Part of the issue is, that spatial data is pretty new stuff. This means that there are no best practices yet. So time will tell what the best practices will become…

Filed under Spatial Data, SQL Server 2008, SQL Server 2012 Tagged with Engine, Instance, Query, Spatial Data, SQL Server

Reporting Services – Query database

April 19, 2012 1 Comment

After reading a tweet by Ray Martinez, I decided to share the scripts for Reporting Services that I often use.

One of the scripts I used a lot, is a script that shows the Reports that were successfully executed in 2012. For these Reports you will see the last execution time, rows and bytes returned, and how long it took to gather and show the results:

USE ReportServer

SELECT
	CL.Name						AS ReportName,
	CL.Description				AS ReportDescription,
	CL.Path						AS ReportPath,
	CL.CreationDate				AS ReportCreationDate,
	SUM(1)						AS TotalNumberOfTimesExecuted,
	MAX(EL.TimeStart)			AS LastTimeExecuted,
	AVG(EL.[RowCount])			AS AVG_NumberOfRows,
	AVG(EL.TimeDataRetrieval)	AS AVG_DataRetrievalTime,
	AVG(EL.TimeProcessing)		AS AVG_TimeProcessing,
	AVG(EL.TimeRendering)		AS AVG_TimeRendering
FROM ExecutionLog EL
JOIN Catalog CL
	ON CL.ItemID = EL.ReportID
WHERE 1 = 1
AND CL.Name IS NOT NULL
AND EL.Status ='rsSuccess'
GROUP BY
	CL.Name,
	CL.Path,
	CL.CreationDate,
	CL.Description
HAVING YEAR(MAX(EL.TimeStart)) = 2012
ORDER BY COUNT(EL.ReportID) DESC

The second script I want to share with you, shows the first 1000 successfully executed Reports. The data that is returned includes Report format, parameters used to query the data, information about the returned resultset and time needed to return and render the data and Report, etc. I uses a top 1000 because our Report server returned a lot of rows, where I only needed a few for my analysis.

USE ReportServer

SELECT TOP 1000
	EL.InstanceName				AS SQLInstanceName,
	EL.UserName					AS ExecuterUserName,
	EL.Format					AS ReportFormat,
	EL.Parameters				AS ReportParameters,
	EL.TimeStart				AS TimeStarted,
	EL.TimeEnd					AS TimeEnded,
	EL.TimeDataRetrieval		AS TimeDataRetrieval,
	EL.TimeProcessing			AS TimeProcessing,
	EL.TimeRendering			AS TimeRendering,
	EL2.Source					AS Source,
	EL.ByteCount				AS ReportInBytes,
	EL.[RowCount]				AS ReportRows,
	CL.Name						AS ReportName,
	CL.Path						AS ReportPath,
	CL.Hidden					AS ReportHidden,
	CL.CreationDate				AS CreationDate,
	CL.ModifiedDate				AS ModifiedDate,
	EL2.Format					AS RenderingFormat,
	EL2.ReportAction			AS ReportAction,
	EL2.Status					AS ExectionResult,
	DS.Name						AS DataSourceName,
	DS.Extension				AS DataSourceExtension
FROM ExecutionLog EL
JOIN Catalog CL
	ON CL.ItemID = EL.ReportID
LEFT JOIN ExecutionLog2 EL2
	ON EL2.ReportPath = CL.Path
JOIN DataSource DS
	ON DS.ItemID = CL.ItemID
WHERE 1 = 1
AND EL.Status = 'rsSuccess'
ORDER BY EL.TimeStart DESC

The next script returns an overview of the folder and Reports on your Report server. For every object on your server, you can see the creation- and modify date. The joined Executionlog table is used to get the number of times the Report was executed, by which user, and how much data was returned in which time:

USE ReportServer

SELECT
	CASE CL.Type
		WHEN 1 THEN 'Folder'
		WHEN 2 THEN 'Report'
		WHEN 3 THEN 'Resource'
		WHEN 4 THEN 'Linked Report'
		WHEN 5 THEN 'Data Source'
	END									AS ObjectType,
	CP.Name								AS ParentName,
	CL.Name								AS Name,
	CL.Path								AS Path,
	CU.UserName							AS CreatedBy,
	CL.CreationDate						AS CreationDate,
	UM.UserName							AS ModifiedBy,
	CL.ModifiedDate						AS ModifiedDate,
	CE.CountStart						AS TotalExecutions,
	EL.InstanceName						AS LastExecutedInstanceName,
	EL.UserName							AS LastExecuter,
	EL.Format							AS LastFormat,
	EL.TimeStart						AS LastTimeStarted,
	EL.TimeEnd							AS LastTimeEnded,
	EL.TimeDataRetrieval				AS LastTimeDataRetrieval,
	EL.TimeProcessing					AS LastTimeProcessing,
	EL.TimeRendering					AS LastTimeRendering,
	EL.Status							AS LastResult,
	EL.ByteCount						AS LastByteCount,
	EL.[RowCount]						AS LastRowCount,
	SO.UserName							AS SubscriptionOwner,
	SU.UserName							AS SubscriptionModifiedBy,
	SS.ModifiedDate						AS SubscriptionModifiedDate,
	SS.Description						AS SubscriptionDescription,
	SS.LastStatus						AS SubscriptionLastResult,
	SS.LastRunTime						AS SubscriptionLastRunTime
FROM Catalog CL
JOIN Catalog CP
	ON CP.ItemID = CL.ParentID
JOIN Users CU
	ON CU.UserID = CL.CreatedByID
JOIN Users UM
	ON UM.UserID = CL.ModifiedByID
LEFT JOIN ( SELECT
				ReportID,
				MAX(TimeStart) LastTimeStart
			FROM ExecutionLog
			GROUP BY ReportID) LE
	ON LE.ReportID = CL.ItemID
LEFT JOIN ( SELECT
				ReportID,
				COUNT(TimeStart) CountStart
			FROM ExecutionLog
			GROUP BY ReportID) CE
	ON CE.ReportID = CL.ItemID
LEFT JOIN ExecutionLog EL
	ON EL.ReportID = LE.ReportID
	AND EL.TimeStart = LE.LastTimeStart
LEFT JOIN Subscriptions SS
	ON SS.Report_OID = CL.ItemID
LEFT JOIN Users SO
	ON SO.UserID = SS.OwnerID
LEFT JOIN Users SU
	ON SU.UserID = SS.ModifiedByID
WHERE 1 = 1
ORDER BY CP.Name, CL.Name ASC

The last query is a short one. This returns the Reports that don’t inherit permissions, that are set in the site settings menu.

USE ReportServer


SELECT
	Path,
	Name
FROM Catalog
WHERE PolicyRoot = 1

Hopefully there will be more where this came from. So follow my blog, or contact me by email or Twitter!

Filed under SQL Server 2008, SSRS Tagged with Catalog, Execution, Executionlog, Query, Reports, ReportServer, SSRS, Twitter

Importing and using GIS Spatial Data

April 16, 2012 5 Comments

For a project I’m working on right now, we (the team I’m currently a part of) decided to research the use of GIS data. The GIS data is available for free at DIVA-GIS. If you want to download the GIS data, choose the country and in the Subject drop-down, choose “Administrative areas”.

To import the data, I’ve used an easy to use .NET Tool: Shape2SQL. This tool is created by Morten Nielsen (Blog | @dotMorten), and allows the user to import Shapefiles (.SHP) into SQL Server without problems. If you want to download this tool-set, I advise you to download the “SqlSpatialTools”, which also contains “SQLSpatial.exe”, which allows you to query and visualize the data.

After downloading the GIS data and tools, run the “Sharp2Sql.exe”. You will see the following screen pop up (at first run only):

Fill in the server and database information. In my case, I imported the data on a local SQL Server:

After that, you will see a start screen like this:

Now you need to select a SHP file. If you press the button, the following window shows:

As you can see, I picked the GIS data of The Netherlands as an example. After selecting a source file, you need to chance the settings of the import:

I also changed the “Geometry Name” on the right from “geom” to “geog”, just to remind myself that the content of the column is Geography- and not Geometry-data.

Once you decided about the options and naming conventions, press “Upload to Database”, and wait for the file to be processed:

Once the processing is completed, you can start using the GIS data. You can do this straight from SQL Server, but you could also use the “SqlSpatial.exe” that you downloaded as part of the “SqlSpatialTools”. If you choose to use this tool, it would look something like this:

You can run the same query in SQL Server Management Studio (SSMS), and you will get an extra tab in the resultset:

There’s only one more thing to remember: In SQL Server 2008 and 2008 R2 you can only select 1 hemisphere at a time. SQL Server 2012 has a new version of the Geography assembly, and supports querying multiple hemispheres at the same time. For more information about this, read the MSDN article about Spatial Data Types.

Filed under Spatial Data, SQL Server 2008, SQL Server 2012 Tagged with Database, Geography, Geometry, GIS, Import, Spatial Data, SQL Server

Data Type and Operator Precedence

April 8, 2012 Leave a comment

In SQL Server you will encounter a lot of cases, in which an operator combines two expressions of different data types. The rules that specify which value is converted to another data type, can be found on MSDN. But the precedence of data types are different for the different versions of SQL Server. Therefore I created the schema below, so you can compare the different versions:

	SQL Server 2005	SQL Server 2008 / 2008 R2	SQL Server 2012
1	user-defined data types (highest)	user-defined data types (highest)	user-defined data types (highest)
2	sql_variant	sql_variant	sql_variant
3	xml	xml	xml
4	datetime	datetimeoffset	datetimeoffset
5	smalldatetime	datetime2	datetime2
6	float	datetime	datetime
7	real	smalldatetime	smalldatetime
8	decimal	date	date
9	money	time	time
10	smallmoney	float	float
11	bigint	real	real
12	int	decimal	decimal
13	smallint	money	money
14	tinyint	smallmoney	smallmoney
15	bit	bigint	bigint
16	ntext	int	int
17	text	smallint	smallint
18	image	tinyint	tinyint
19	timestamp	bit	bit
20	uniqueidentifier	ntext	ntext
21	nvarchar (including nvarchar(max))	text	text
22	nchar	image	image
23	varchar (including varchar(max))	timestamp	timestamp
24	char	uniqueidentifier	uniqueidentifier
25	varbinary (including varbinary(max))	nvarchar (including nvarchar(max))	nvarchar (including nvarchar(max))
26	binary (lowest)	nchar	nchar
27		varchar (including varchar(max))	varchar (including varchar(max))
28		char	char
29		varbinary (including varbinary(max))	varbinary (including varbinary(max))
30		binary (lowest)	binary (lowest)

The same counts for Operators. There are differences in the precedence between SQL Server versions. I took the data from different versions of SQL Server, and created the schema below:

	SQL Server 2005	SQL Server 2008 / 2008 R2	SQL Server 2012
1	~ (Bitwise NOT)	~ (Bitwise NOT)	~ (Bitwise NOT)
2	* (Multiply), / (Division), % (Modulo)	* (Multiply), / (Division), % (Modulo)	* (Multiply), / (Division), % (Modulo)
3	+ (Positive), – (Negative), + (Add), (+ Concatenate), – (Subtract), & (Bitwise AND)	+ (Positive), – (Negative), + (Add), (+ Concatenate), – (Subtract), & (Bitwise AND), ^ (Bitwise Exclusive OR), \| (Bitwise OR)	+ (Positive), – (Negative), + (Add), (+ Concatenate), – (Subtract), & (Bitwise AND), ^ (Bitwise Exclusive OR), \| (Bitwise OR)
4	=, >, =, <=, , !=, !>, !< (Comparison operators)	=, >, =, <=, , !=, !>, !< (Comparison operators)	=, >, =, <=, , !=, !>, !< (Comparison operators)
5	^ (Bitwise Exlusive OR), \| (Bitwise OR)	Text	Text
6	NOT	NOT	NOT
7	AND	AND	AND
8	ALL, ANY, BETWEEN, IN, LIKE, OR, SOME	ALL, ANY, BETWEEN, IN, LIKE, OR, SOME	ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
9	= (Assignment)	= (Assignment)	= (Assignment)

_{This post was inspired by a SQL Server session by Bob Beauchemin (Blog | @bobbeauch), at the last version of TechDays NL. Bob, thank you for that! 😉}

Filed under SQL Server 2008, SQL Server 2012 Tagged with Data Type, Database, Engine, Execution, Operator, Order, Precedence, Query, SQL Server

Error: Permissions denied mssqlsystemresource – configurations

February 25, 2012 2 Comments

Last week I encountered a strange issue on a production SQL Server. We weren’t able to connect to some of them. If we tried to connect with a normal Windows NT (Active Directory) or SQL account, we got the following error (I recreated the error locally):

Message: The SELECT permission was denied on the object ‘configurations’, database ‘mssqlsystemresource’, schema ‘sys’. (Microsoft SQL Server, Error: 229)

For further information about the mssqlsystemresource database, please read my previous post about this database.

But in my case, there was nothing wrong with this Resource database. After all, no other database was unavailable or corrupt. So I needed a little bit more research. After a while I found out that the issue is coming from a security setting.

We allowed an external company of Data Warehouse consultants (KVL) to access our production SQL Server. They needed to create a Snapshot, and the person responsible for making this possible created 2 Stored Procedures (SP’s) for this task. One SP to create a Snapshot, and one to drop the Snapshot. These SP’s are stored in the Master database.

But because he wanted them to only access the SP’s, the created a user for this (again, created this locally):

And he added a server mapping for the user:

After that he tested it, and it worked like a charm! He e-mailed me the location of the SP’s and the names, so I could send them to our consultants. Then he added one more thing to the User Mappings without any of us knowing:

When the consultants tried to create the Snapshot, they couldn’t get it to work. After some research I found out that the User Mapping on the Master database was set to db_denydatareader. As you all know, deny permissions always overrule access permissions. In this case this worked against us.

So if you ever encounter this issue, please take a look at your security settings. If all of your other databases are still accessible, the error is coming from a lack of permissions. You’ll encounter this issue most of the time when a user (NT or SQL user) is a member of 2 separate groups with different permissions.

Filed under SQL Server 2008 Tagged with Connection, Database, db_datareader, Engine, Instance, Master, mssqlsystemresource, Permissions, Resources, SQL Server, Tables

SQL Server system database – mssqlsystemresource

February 25, 2012 1 Comment

The database mssqlsystemresource is a read-only database that is shipped with SQL Server from version SQL Server 2005. It contains all the system objects that are included in SQL Server. An example of this are the sys.objects. These are stored in the Resource database, but appear in every other database. The sys.objects from a user database refer to the Resource database.

SQL Server cannot backup this database. You can create a backup yourself, but don’t back it up like a normal .MDF file. It also can’t be restored via SQL Server, but you can do it manually.

The Resource database makes upgrading to a new version easier and faster.

The mssqlsystemresource database is invisible for users, even for the System Administrator (sa). The database can be restored by copying the .ldf and .mdf files from the folder “[Drive]\Program Files\Microsoft SQL Server\MSSQL10_50.[Instance Name]\MSSQL\Binn” to a folder of your choice (in this case D:\Databases). After that, start a query with the user sa, and run the following script:

USE [master]
GO

CREATE DATABASE [mssqlsystemresource_RESTORED] ON 
	(FILENAME = N'D:\Databases\mssqlsystemresource.mdf'),
	(FILENAME = N'D:\Databases\mssqlsystemresource.ldf')
FOR ATTACH
GO

Once you’ve done this, you can query the restored version of the Resource database.

Filed under SQL Server 2008 Tagged with Connection, Database, Engine, Execution, Instance, Performance, Query, Resource, SQL Server

Comparison: Ad-hoc vs Stored Procedure vs Dynamic SQL

February 10, 2012 4 Comments

Sometimes when you are creating a SQL query, you wonder if it’s more sensible to use a standard T-SQL, or some other possibility. In this post I will try to compare the three different possibilities, so that you can make your own decision. If you read the comparison below, keep in mind that there is no right or wrong in this. Each situation requires a different point of view, and may offer it’s own difficulties.

	Ad-Hoc Query	Stored Procedure	Dynamic SQL
Use it for	Long, complex queries (OLAP; for example Reporting or Analysis)	Short, simple queries (OLTP; for example Insert, Update, Delete, Select)	Short, simple queries (OLTP; for example Insert, Update, Delete, Select)
Performance	Compiled at runtime, Execution Plan stored in Cache Changed data is no issues because of re-compile	Compiled once at first run, and stored in Procedure Cache Changed data might be a performance bottleneck. Can be solved with recompile	Compiled at runtime, and execution plan is not stored (unless using the more efficient sp_executesql) Changed data is no issue because of re-compile
Security	Permissions (read/write) on all objects (database(s)/table(s))	Execute permissions on SP are enough	Permissions (read/write) on all objects (database(s)/table(s))
Flexibility	If changed, your application needs to be recompiled	If changed, only need to change the SP in the database	If changed, your application needs to be recompiled
Number of Statements	Only 1 statement possible	Multiple statements possible	Multiple statements possible
Memory Usage	Uses more memory then an SP	Uses less memory then an ad-hoc query	Uses more memory then an SP
Network traffic	Query executes server side Query and resultset are send across network/internet	Query executes server side Execute statement and resultset are send across network/internet	Query executes server side Statement and resultset are send across network/internet
Separation	Database logic and business logic are mostly combined in the query	Seperate database logic from business logic	Seperate database logic from business logic
Troubleshoot	Relatively easy to troubleshoot	Relatively easy to troubleshoot	Difficult to troubleshoot
Maintaining	Difficult because of several locations in applications and database	Easy because of single location	Difficult because of several locations in applications and database
Access	Difficult to access multiple objects in different databases, or in dynamic databases	Difficult to access multiple objects in different databases, or in dynamic database	Allows any object (database, table, columns, etc) to be referenced
WHERE clause	Fairly static WHERE clause	Fairly static WHERE clause	Dynamic WHERE clause (add/remove), based on parameters
Versioning	Only possible via Source Controlling your application	Possible via Source Controlling your database, and by commenting your SP	Only possible via Source Controlling your application
CRUD	Can be created by getting all your queries together, and looking for specific keywords (Update, Delete, Select, Etc)	Difficult to catch in a CRUD (Create, Read, Update, Delete) diagram	Difficult to catch in a CRUD (Create, Read, Update, Delete) diagram
Structure Update	Can be changed simultaneously with structure changes	Needs to be altered when the underlying structure is changed	Can be changed simultaneously with structure changes
Searching	No standard way to search through	Possible to use sys.procedures to search through SP contents. Dependency window in SSMS shows SP content	No standard way to search through
Testing	Can be compiled/tested in code	Impossible to automatically compile without 3rd party tools	Difficult to test in code
Mapping	ORM (Object-relational mapping) is possible	ORM (Object-relational mapping) is impossible	ORM (Object-relational mapping) is impossible
Compiling	Compiles the whole statement	Compiles the whole statement	Only static elements can be compiled

For the design of this comparison chart, I need to thank my buddy and colleague Pascal (Blog | @pdejonge). For the record: I’m not a designer, and my “design” was what you guys might call Fugly.

These comparison chart covers the main reasons for me to use or not use a specific option. These are my personal beliefs. If you have any suggestions to add, please don’t hesitate to contact me.

Filed under SQL Server 2008 Tagged with Colleague, Engine, Execution, Performance, Query, SQL Server, Update

Row_Number: Unique ID in select statement

January 16, 2012 2 Comments

Last week I was asked by a colleague, Sander (Blog | @SanderKooij), “What’s the easiest way of adding a unique identifier (or row ID) to a result set?”. That’s an interesting question. There are several ways to do this. For example, you could insert the resultset into a temp table (physical or memory temp table), with an extra column (Identity). But what if you want to do this in a Select statement itself? Here’s a solution.

If you use the scripts I’ve added to this post (check Code Samples), you’ve created a table with country abbreviation codes. This table contains the data as shown below:

If you select the data, and want to add a record ID to your result set, you can use the following script:

SELECT
	DefaultAbbreviation,
	CountryProper,
	CountryNumber,
	ROW_NUMBER() OVER(ORDER BY DefaultAbbreviation ASC) AS RowNumber
FROM CountryAbbreviationCodes

The statement above uses the ROW_NUMBER built-in function to generate a row number for the result set. The ORDER BY clause in the functions is used to tell SQL Server what column(s) to use, to create a row number. In this case, the result set is ordered by the abbreviation column.

But what if you want to select the data with a specific row number? If you try to use the ROW_NUMBER function in the where clause, you get the following error:

“Windowed functions can only appear in the SELECT or ORDER BY clauses.”

So you need another solution. From SQL Server 2005 onwards we can use a Common Table Expression (CTE). With a CTE you can use a select statement as a table. So if you want to return the rows that are numbered 50 through 60, you can use the following query:

WITH OrderedCountries AS
(
	SELECT
		DefaultAbbreviation,
		CountryProper,
		CountryNumber,
		ROW_NUMBER() OVER(ORDER BY DefaultAbbreviation ASC) AS RowNumber
	FROM CountryAbbreviationCodes
)

SELECT
	DefaultAbbreviation,
	CountryProper,
	CountryNumber,
	RowNumber
FROM OrderedCountries
WHERE 1 = 1
AND RowNumber BETWEEN 50 AND 60

The result of this statement is the 10 rows we requested. So with a CTE we can use the ROW_NUMBER function to return specific rows from a table.

With the ROW_NUMBER functions, you can also delete duplicate records from your tables. If interested, post a comment, mail or tweet me, and I might write a blog post about it.

_{Code samples:

CreateSources.sql

SelectStatement.sql

CTESelectStatement.sql}

Filed under SQL Server 2008 Tagged with Colleague, Common Table Expressions, CTE, Engine, Identifier, Order, Query, Row_Number, SQL Server, Tables, Temp, TempTable, Unique

SQL Joins – The Differences

January 11, 2012 Leave a comment

In SQL Server you can find several joins. In this post I will try to explain the diffences between them, and how you can use them in your advantage.

Joins
There are several different types of joins. Some of them look different, but are actually the same. For example, a Left Join is the same as a Left Outer Join. This also counts for Right/Right Outer Joins. And if you ever encounter a statement with just Join in it, this is converted by SQL Server to an Inner Join.

(Inner) Join
An Inner Join takes the results from both tables in the join, and combines data that matches. All data that doesn’t match the ON clause, isn’t shown in the result set.

Left (Outer) Join
A Left Join takes the 2 tables used in the join, and takes all records from the left table (1st table in your join), and matches and adds data from the right (2nd table in your join). All data from the 2nd table that doesn’t match the ON clause is filtered out.

Right (Outer) Join
A Right Join takes the 2 tables used in the join, and takes all records from the right (2nd table in your join), and matches and adds data from the left (1st table in your join). All data from the 1st table that doesn’t match the ON clause is filtered out.

Full (Outer) Join
A Full Join takes the records from both tables, and matches all the data. The data that doesn’t match is also shown, except with NULL values in the columns that are missing on the other side.

Left (Outer) Join – Right Key Is NULL
With this Join (can be found in code samples, download below), you will get all the results that are not included in the Left Join.

Right (Outer) Join – Left Key Is NULL
With this statement (can be found in code samples, download below), you will get all the results that are not included in the Right Join.

Full (Outer) Join – Both Keys Are NULL
With this join you get all the data that isn’t included in the Full Join result set.

Cross Join
The Cross Join is a “special” Join. This Join type selects all the possible combinations it can. This sounds weird, but in the image below and in the code samples, it’s much clearer.