3-1. Loading XML Files for Storage in SQL Server


Chapter 3 ■ XML Data Sources



 John Smith

 4, Grove Drive

 Uttoxeter

 Staffs

 1



. . . many other elements were omitted here to save space . . .



 7

 Slow Sid

 2, Rue des Bleues

 Avignon

 Vaucluse

 3





3. Run the following T-SQL code to import the contents of the XML file that is referenced, as well as any other data you wish to store in the table (C:\SQL2012DIRecipes\CH03\XmlImportTestInsert.Sql).

INSERT INTO XmlImportTest(XMLDataStore, Keyword)
SELECT XMLDATAToStore, 'Attribute-Centric' AS ColType
FROM
(
    SELECT CONVERT(XML, XMLCol, 0)
    FROM OPENROWSET (BULK 'C:\SQL2012DIRecipes\CH03\Clients_Simple.Xml', SINGLE_BLOB) AS XMLSource (XMLCol)
) AS XMLFileToImport (XMLDATAToStore);



How It Works

XML files can be loaded directly into an SQL Server table as pure XML data; that is, they are not “shredded” (or broken down) into rows and columns. This is done using OPENROWSET (BULK), which essentially reads the source file into a T-SQL process. The source file can be either an XML “fragment” (content that is not required to have a single root element) or a “well-formed” XML document, which has exactly one root element. Neither has to conform to an XML schema unless the destination column is typed with an XML schema collection. This technique is useful in the following circumstances:





• When you wish to store XML documents or fragments in the database, rather than in the file system.

• When you do not wish to shred the XML data into a fully relational SQL Server table structure.



Such a data load could be an end in itself, where, once loaded into the database, you query this XML data using the subset of the XQuery language that SQL Server uses to query the XML data type. Alternatively, it could be part of a data-staging process, where you want to store the XML fragments in SQL Server before other processing takes place.
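As a quick illustration of the two forms, the following sketch stores both a fragment (several top-level elements) and a document (a single root element) in untyped XML variables. The element names and values are invented for the example and are not taken from the recipe's source file.

```sql
-- Hypothetical sample data: a fragment with two top-level elements
DECLARE @Fragment XML = N'<Client>John Smith</Client><Client>Slow Sid</Client>';

-- Hypothetical sample data: a document with a single root element
DECLARE @Document XML = N'<CarSales><Client>John Smith</Client></CarSales>';

-- Both assignments succeed, because an untyped XML value may hold either form.
-- count(/*) counts the top-level elements of each value.
SELECT
    @Fragment.value('count(/*)', 'INT') AS FragmentTopLevelElements
   ,@Document.value('count(/*)', 'INT') AS DocumentTopLevelElements;
```

The first count should be 2 and the second 1, which is all that distinguishes the two values as far as an untyped XML column is concerned.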






This technique can be extended to update (or rather replace) an XML fragment or document equally easily.

The code required is (C:\SQL2012DIRecipes\CH03\LoadXMLToUpdateInTable.sql):

UPDATE XmlImportTest
SET XMLDataStore =
(
    SELECT XMLDATAToStore
    FROM
    (
        SELECT CONVERT(XML, XMLCol, 0)
        FROM OPENROWSET (BULK 'C:\SQL2012DIRecipes\CH03\Clients_Simple.Xml', SINGLE_BLOB) AS XMLSource (XMLCol)
    ) AS XMLFileToImport (XMLDATAToStore)
)
WHERE Keyword = 'Attribute-Centric';

It could be that you have many XML fragments or documents to load, and do not want to develop an SSIS

package to perform such a task. Fortunately, with a little ingenuity (and some dynamic SQL), the script used in

the current recipe can be extended to load multiple XML fragments or documents. The script to load a series of

files is (C:\SQL2012DIRecipes\CH03\LoadMultiplexmlFiles.sql):

-- Used for dynamic SQL
DECLARE @SQL VARCHAR(8000);

-- Table variable to hold file names
DECLARE @FileList TABLE (XMLFile VARCHAR(150));

-- Table variable to capture "raw" file listing
DECLARE @DirList TABLE (List VARCHAR(250));

-- Load the output from the DIR command into the @DirList table variable
INSERT INTO @DirList
EXEC master.dbo.xp_cmdshell 'Dir D:\BIProject\*.Xml';

-- Parse out the file names. Caveats: no spaces in file names,
-- and all files have the correct and identical structure.
-- LTRIM removes the separating space that LEFT() keeps in front of the name.
INSERT INTO @FileList
SELECT LTRIM(REVERSE(LEFT(REVERSE(List), CHARINDEX(' ', REVERSE(List)))))
FROM @DirList
WHERE List LIKE '%.Xml%';

-- Cursor to loop through file names and load XML files
DECLARE @FileName VARCHAR(150);

DECLARE FileLoad_CUR CURSOR
FOR
SELECT XMLFile FROM @FileList;

OPEN FileLoad_CUR;

FETCH NEXT FROM FileLoad_CUR INTO @FileName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- The XML load process. DIR lists bare file names, so the source
    -- directory is prefixed here to build the full path.
    SET @SQL = 'INSERT INTO AA1 (XMLData) SELECT CAST(XMLSource AS XML) AS XMLSource
    FROM OPENROWSET(BULK ''D:\BIProject\' + @FileName + ''', SINGLE_BLOB) AS X (XMLSource)';

    EXEC (@SQL);

    FETCH NEXT FROM FileLoad_CUR INTO @FileName;
END;

CLOSE FileLoad_CUR;
DEALLOCATE FileLoad_CUR;

The T-SQL that loads multiple files works this way:

• First, the list of files to be processed is defined using xp_cmdshell to call a Dir (directory listing) command.

• This list is then parsed to get usable file names.

• A cursor loops through the list of files and loads them.



The dynamic SQL used here makes certain presumptions:

• The files have the .Xml extension.

• There are no spaces in the file names.
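One way to relax the second presumption, sketched here rather than taken from the recipe, is to ask DIR for its bare format (/B). This returns one file name per line, so nothing has to be parsed on spaces. The directory is the one used in the script; the @SourceDir and @Cmd variables are illustrative names only.

```sql
-- Illustrative variable names; the directory is the one used in the recipe
DECLARE @SourceDir VARCHAR(100) = 'D:\BIProject\';
DECLARE @Cmd VARCHAR(300) = 'DIR /B "' + @SourceDir + '*.Xml"';

-- Table variable to capture the bare file listing
DECLARE @DirList TABLE (List VARCHAR(250));

INSERT INTO @DirList
EXEC master.dbo.xp_cmdshell @Cmd;

-- Each row is now a complete file name, spaces included.
-- The filter also drops any NULL rows returned by xp_cmdshell.
SELECT @SourceDir + List AS FullPath
FROM @DirList
WHERE List LIKE '%.xml';
```

The LIKE comparison assumes a case-insensitive collation, as the recipe's own filter does.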



The script can be extended, if required, to accept multiple extensions or file names with spaces. Please note that this recipe only purports to help you load XML data as XML into an SQL Server table. The many and varied ways of extracting and querying the data are, unfortunately, beyond the scope of this book. However, if you store XML data as an XML data type, do remember that SQL Server allows you to index XML columns for faster querying with the subset of the XQuery language that SQL Server uses.
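As a rough sketch of XML indexing: a primary XML index requires the table to have a clustered primary key, and any secondary XML index builds on the primary one. The table and index names below are hypothetical and are not the recipe's XmlImportTest table.

```sql
-- Hypothetical demo table; the clustered primary key is a prerequisite
-- for creating a primary XML index
CREATE TABLE dbo.XmlIndexDemo
(
    ID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_XmlIndexDemo PRIMARY KEY CLUSTERED
   ,XMLDataStore XML NOT NULL
);

-- Primary XML index on the XML column
CREATE PRIMARY XML INDEX PXML_XmlIndexDemo
ON dbo.XmlIndexDemo (XMLDataStore);

-- Optional secondary index that favors path-based queries
CREATE XML INDEX SXML_XmlIndexDemo_Path
ON dbo.XmlIndexDemo (XMLDataStore)
USING XML INDEX PXML_XmlIndexDemo FOR PATH;
```

Whether the extra storage and load-time cost is worthwhile depends on how often the stored XML is queried, so for a pure staging table it may be better to leave the column unindexed.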

The use of the Keyword column here is, of course, optional, and I have left it there to show how XML and “normal” data can be imported together. It is important to use the SINGLE_BLOB parameter, so that the entire file is handled as the contents of one column. Also, no XML document can be greater than 2 gigabytes in size. Note also that the CONVERT style parameter 0 used here discards insignificant white space and does not allow for an internal DTD (Document Type Definition) subset. I am going to presume that DTDs are not going to be used.
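To see what the style parameter changes, the following sketch converts the same string with style 0 and with style 1, which preserves insignificant white space. The input string is made up for the illustration.

```sql
-- Hypothetical input with white space between the elements
DECLARE @Raw VARCHAR(200) = '<Client>   <ID>3</ID>   <Country>1</Country>   </Client>';

SELECT
    CONVERT(XML, @Raw, 0) AS WhiteSpaceDiscarded   -- style 0: default parsing
   ,CONVERT(XML, @Raw, 1) AS WhiteSpacePreserved;  -- style 1: keeps insignificant white space
```

If a source file did contain an internal DTD subset, style 2 (limited internal DTD processing) would be the one to look at instead of style 0.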

To query the contents of the XML column, you can use the following T-SQL syntax, for example:

SELECT XMLDataStore.query('(/ROOT/Customer/CustomerID)') FROM XmlImportTest;



Hints, Tips, and Traps





• The dynamic SQL is predicated on having the rights to use xp_cmdshell. There are many DBAs who will not countenance this, and so you need to discover vast reserves of charm, or irrefutable technical arguments, to get this allowed. Alternatively, you can write a CLR (Common Language Runtime) routine to list the files in a directory and persuade the DBA that this approach is safer.










• By using SINGLE_BLOB (as opposed to SINGLE_CLOB or SINGLE_NCLOB), you avoid a potential mismatch between the encoding of the XML document (as given in the XML encoding declaration) and the code page of the server.

• Recipes 3-2 and 3-3 show how to take the data from an XML column and shred it into separate columns of a destination table.

• Shock and horror: a cursor! Well, as I say elsewhere in this book, cursors are not inevitably wrong, and there are occasions when they are the easiest solution to use and debug, and where the resource hit is negligible. This whole approach is predicated on the fact that you are probably only loading a few dozen XML files at most. If you will be loading many more files than this, then a cursor-free recipe (Recipe 6-7) can be adapted to your needs.
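If your DBA does agree to xp_cmdshell, one possible arrangement is to switch the feature on only for the duration of the load. This is a sketch using the standard sp_configure options; the placement of the loading script is an assumption.

```sql
-- Changing these options requires ALTER SETTINGS permission
-- (held by sysadmin and serveradmin); agree this with your DBA first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

-- ... run the multiple-file loading script here ...

-- Switch the feature off again once the load has finished
EXEC sp_configure 'xp_cmdshell', 0;
RECONFIGURE;
```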



3-2. Loading XML Data into Rows and Columns

Problem

You want to load data from small- to medium-sized XML files into an SQL Server table, correctly “shredded” into

rows and columns.



Solution

Use OPENXML to load and shred both element-centric and attribute-centric XML files into an SQL Server table.

1. Place your (in this case, element-centric) XML file in the requisite source directory. The file looks like this (C:\SQL2012DIRecipes\CH03\ClientLite.Xml):





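<CarSales>
  <Client>
    <ID>3</ID>
    <ClientName>John Smith</ClientName>
    <Country>1</Country>
  </Client>
  <!-- Data omitted to save space . . . -->
  <Client>
    <ID>7</ID>
    <ClientName>Slow Sid</ClientName>
    <Country>3</Country>
  </Client>
</CarSales>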







2. Load the XML file using the following T-SQL code (C:\SQL2012DIRecipes\CH03\ShredClientLite.Sql):

DECLARE @DocID INT;
DECLARE @DocXML VARCHAR(MAX);

SELECT @DocXML = CAST(XMLSource AS VARCHAR(MAX))
FROM OPENROWSET(BULK 'C:\SQL2012DIRecipes\CH03\ClientLite.xml', SINGLE_BLOB) AS X (XMLSource);

EXECUTE master.dbo.sp_xml_preparedocument @DocID OUTPUT, @DocXML;

SELECT ID, ClientName, Country
INTO   XmlTable
FROM   OPENXML(@DocID, 'CarSales/Client', 2)
       WITH (
            ID VARCHAR(50)
           ,ClientName VARCHAR(50)
           ,Country VARCHAR(10)
       );

EXECUTE master.dbo.sp_xml_removedocument @DocID;

3. Query the XmlTable table and you should see something like the following result:

ID    ClientName        Country
3     John Smith        1
4     Bauhaus Motors    2
5     Honest Fred       3
6     Fast Eddie        2
7     Slow Sid          3



How It Works

Ever since SQL Server 2000 was released, OPENXML has been helping developers and DBAs load (or “shred” as

it is known) XML data into relational tables. This technology is still supported in the current version of SQL

Server, and is easy and efficient to use. The result is totally different from that obtained in Recipe 3-1, as this time

the source file is broken down into its constituent data fragments and each element or attribute loaded into a

separate column in one or more tables.

First, the source file is loaded into the @DocXML variable. Then, the sp_xml_preparedocument stored

procedure is run against this variable to prepare the XML and return a handle, in this case, @DocID. The

handle is passed to the OPENXML command, which reads the XML data and shreds it into rows and columns

in the destination table. Then, in the OPENXML statement, the row pattern ('CarSales/Client') identifies

which nodes to process.

The WITH clause defines the columns to return and can include a ColPattern for each one, which allows you to traverse the hierarchy of the XML and select any element you choose. For attributes, you can merely use a ColPattern like Invoice/@InvoiceNumber. You can traverse the XML up beyond the initial node specified using the RowPattern, by using (for example) ../Invoice, assuming that the row pattern was 'CarSales/Client/Invoice'. If a ColPattern is not specified, the default mapping (attribute-centric or element-centric, as specified by the flags parameter) will take place.
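The following sketch shows ColPatterns in use against a small inline document, so that it can be run without a source file. The XML structure, the Invoice data, and the column names are all invented for the illustration; only the 'CarSales/Client' hierarchy echoes the recipe.

```sql
DECLARE @DocID INT;

-- Hypothetical document mixing attributes and elements
DECLARE @DocXML VARCHAR(MAX) =
'<CarSales>
   <Client ID="3">
     <ClientName>John Smith</ClientName>
     <Invoice InvoiceNumber="A001"><Total>500.00</Total></Invoice>
     <Invoice InvoiceNumber="A002"><Total>125.50</Total></Invoice>
   </Client>
 </CarSales>';

EXECUTE master.dbo.sp_xml_preparedocument @DocID OUTPUT, @DocXML;

-- The row pattern returns one row per Invoice; each ColPattern is
-- evaluated relative to that Invoice node
SELECT InvoiceNumber, Total, ClientName, ClientID
FROM   OPENXML(@DocID, 'CarSales/Client/Invoice', 2)
       WITH (
            InvoiceNumber VARCHAR(20)   '@InvoiceNumber'   -- attribute of the row node
           ,Total         DECIMAL(10,2) 'Total'            -- child element
           ,ClientName    VARCHAR(50)   '../ClientName'    -- element of the parent node
           ,ClientID      INT           '../@ID'           -- attribute of the parent node
       );

EXECUTE master.dbo.sp_xml_removedocument @DocID;
```

Because every column carries an explicit ColPattern here, the flags value of 2 is not used for mapping; it would only decide the default mapping for a column declared without one.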

Be warned, however, that although simple, OPENXML is very memory-intensive. Also OPENXML uses XPath (not

XQuery). Taken together, this signifies that OPENXML is best used in the following situations:





• When you do not wish to import the XML data into SQL Server before using it in a T-SQL query.

• When you have a well-formed XML document.

• When the XML document is less than 2 gigabytes in size.



To extend this example slightly, let us assume that you have an (admittedly fairly simple) attribute-centric

XML file, something like this (C:\SQL2012DIRecipes\CH03\ClientLiteAttributeCentric.Xml):








