There are multiple ways to query and search XML data in SQL Server and today I want to go through some examples. In my experiences, I’ve most often worked with XML where data was split out into multiple files rather than be included all in one (i.e. one record per employee instead of one XML file full of multiple employees) so these examples will reflect that.
I have a Comic_Collection test table loaded up with sample comic book collection data that I’ll be working with:

Each ComicData XML file follows this type of format:

Query()
We can first use the query() method in SQL Server. Let’s say we wanted to find Full_Title and Publisher from our data. We could gather that by running this query:
SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection;
GO

This returns the values including “Full_Title” and “Publisher” in the results. What if we wanted to use the query() method but only want to return data where the Publisher is Dark Horse Comics? If we run this query:
SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection
WHERE ComicData.query('/comic/Publisher_Name') = '<Publisher_Name>Dark Horse Comics</Publisher_Name>';
GO
We’ll get an error:
Msg 402, Level 16, State 1, Line 61
The data types xml and varchar are incompatible in the equal to operator.
Instead, we can CAST the XML to VARCHAR for our WHERE clause to work:
SELECT ComicData.query('/comic/Full_Title') AS 'Full_Title'
,ComicData.query('/comic/Publisher_Name') AS 'Publisher'
FROM Comic_Collection
WHERE CAST(ComicData.query('/comic/Publisher_Name') AS VARCHAR(100)) = '<Publisher_Name>Dark Horse Comics</Publisher_Name>';
GO

Value()
We can also use the value() method to query XML. If we stick to our original goal of listing Full_Title and Publisher we can run this query with the value() method:
SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title',
ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection;
GO

Now we only get the values without any tags. Again, if we want to narrow down our results to Dark Horse Comics we can run:
SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title',
ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') = 'Dark Horse Comics';
GO

Search it All
What if we need to search all XML for a particular term or phrase? If we wanted any records where “Dark” is contained anywhere in the XML, we could run this type of query:
SELECT ComicData.value('(comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title',
ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE CAST(ComicData AS VARCHAR(MAX)) LIKE '%Dark%';
GO

Let’s Compare
Comparing query() and value(), we see that query() has a much greater cost:

One thing to keep in mind when working with XML is that CASE MATTERS. For these examples, “comic” is lowercase in the XML which at first was an annoying mistake but it reminded me of that point. If we change one of our queries so that the first “comic” is replaced by “Comic” we will not get the same results:
SELECT ComicData.value('(Comic/Full_Title)[1]','VARCHAR(100)') AS 'Full_Title',
ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') AS 'Publisher_Name'
FROM Comic_Collection
WHERE ComicData.value('(comic/Publisher_Name)[1]','VARCHAR(100)') = 'Dark Horse Comics';
GO

That has slowed me down before. Something so simple can turn into something so frustrating.
Thanks for reading!
One thought on “Querying XML Data in SQL Server”