Querying XML Data in SQL Server

There are multiple ways to query and search XML data in SQL Server and today I want to go through some examples. In my experiences, I’ve most often worked with XML where data was split out into multiple files rather than be included all in one (i.e. one record per employee instead of one XML file full of multiple employees) so these examples will reflect that.

I have a Comic_Collection test table loaded up with sample comic book collection data that I’ll be working with:

Each ComicData XML file follows this type of format:

Query()

We can first use the query() method in SQL Server. Let’s say we wanted to find Full_Title and Publisher from our data. We could gather that by running this query:

SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
	,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection;
GO

This returns the values including “Full_Title” and “Publisher” in the results. What if we wanted to use the query() method but only want to return data where the Publisher is Dark Horse Comics? If we run this query:

SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
	,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection
WHERE ComicData.query('/comic/Publisher_Name') = '<Publisher_Name>Dark Horse Comics</Publisher_Name>';
GO

We’ll get an error:

Msg 402, Level 16, State 1, Line 61
The data types xml and varchar are incompatible in the equal to operator.

Instead, we can CAST the XML to VARCHAR for our WHERE clause to work:

SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
	,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection
WHERE CAST(ComicData.query('/comic/Publisher_Name') AS VARCHAR(100)) = '<Publisher_Name>Dark Horse Comics</Publisher_Name>';
GO

Value()

We can also use the value() method to query XML. If we stick to our original goal of listing Full_Title and Publisher we can run this query with the value() method:

SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title', 
	ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection;
GO

Now we only get the values without any tags. Again, if we want to narrow down our results to Dark Horse Comics we can run:

SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS  'Full_Title', 
	ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') = 'Dark Horse Comics';
GO

Search it All

What if we need to search all XML for a particular term or phrase? If we wanted any records where “Dark” is contained anywhere in the XML, we could run this type of query:

SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title', 
	ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE CAST(ComicData AS VARCHAR(MAX)) LIKE '%Dark%';
GO

Let’s Compare

Comparing query() and value(), we see that query() has a much greater cost:

One thing to keep in mind when working with XML is that CASE MATTERS. For these examples, “comic” is lowercase in the XML which at first was an annoying mistake but it reminded me of that point. If we change one of our queries so that the first “comic” is replaced by “Comic” we will not get the same results:

SELECT ComicData.value('(Comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title', 
	ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') = 'Dark Horse Comics';
GO

That has slowed me down before. Something so simple can turn into something so frustrating.

Thanks for reading!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s