2-12. Performing a BULK INSERT with a Format File


Chapter 2 ■ Flat File Data Sources






[Format file listing not preserved. Surviving fragments: xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" and COLLATION = "Latin1_General_CI_AS".]

























The following format file loads only the first field in the source file and ignores the others. Essentially, this technique lets you subset source data vertically and load only the columns that interest you (C:\SQL2012DIRecipes\CH02\InvoiceSubset.Xml).




[Listing of InvoiceSubset.Xml not preserved. Surviving fragments: xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" and COLLATION = "Latin1_General_CI_AS".]
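As a minimal sketch of a format file of this kind (not the book's exact listing), the following assumes a source file with five comma-separated fields, rows ending in a carriage return and line feed, and an integer first field called ID. The field count, lengths, and data types are assumptions made for illustration; adjust them to match your own file.

```xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- RECORD describes every field in the source file, whether loaded or not -->
  <RECORD>
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="12"/>
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50"
           COLLATION="Latin1_General_CI_AS"/>
    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="12"/>
    <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="24"/>
    <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="24"/>
  </RECORD>
  <!-- ROW lists only the field to load: here, source field 1 alone -->
  <ROW>
    <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLINT"/>
  </ROW>
</BCPFORMAT>
```

The point to notice is the asymmetry between the two sections: five FIELD elements, but a single COLUMN element whose SOURCE attribute points at field 1.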

















Even if you are only loading a single field from the source file, all the source fields have to be described in the <RECORD> section.

Should you need to do the opposite of what was described previously, you can load all (or a selection) of the fields from a source file into a destination table that contains many more fields. For this, we will use the Invoices2.txt source file, which has only three fields. The format file to use looks like this (C:\SQL2012DIRecipes\CH02\Invoices2.Xml):




[Listing of Invoices2.Xml not preserved. Surviving fragments: xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" and COLLATION = "Latin1_General_CI_AS".]

















It is worth noting that the source fields are still numbered sequentially, but that the mapping to the destination table columns has changed. As the source file contains only three columns, but the destination table has five, the format file indicates to BCP which columns to use and maps them to the destination using the “NAME” attribute. Of course, you could always restructure the destination table, but this is frowned upon by SQL purists. In any case, should your existing destination table contain hundreds of millions of records (or more), tweaking the format file is by far the easier option.

As the source file does not contain column headers, the BULK INSERT command looks like this (C:\SQL2012DIRecipes\CH02\BulkInsertNoColHeaders.Sql):

BULK INSERT dbo.Invoice
FROM 'C:\SQL2012DIRecipes\CH02\Invoices2.Txt'
WITH (FORMATFILE = 'C:\SQL2012DIRecipes\CH02\Invoices2.Xml');

Note that you can map a source field to any existing destination field, whatever the field order in the destination table, provided that the data types are compatible. You do not need to specify the field and row terminators as BULK INSERT parameters, as these options are defined in the format file.

Not all source files use the same separator for each field. This too can be handled by a format file. The C:\SQL2012DIRecipes\CH02\Invoices5 sample file contains a comma for the first field separator; @#@ for the second; a pipe character (|) for the third; and a tab for the fourth. As the second separator shows, you can even use multiple characters as a field separator if you really need to. The following shows the format file to handle this (C:\SQL2012DIRecipes\CH02\BulkLoadZanySeparators.Xml).




[Listing of BulkLoadZanySeparators.Xml not preserved. Surviving fragments: xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" and COLLATION = "Latin1_General_CI_AS".]
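A format file for mixed separators might look something like the following sketch, which is not the book's exact listing. It assumes five fields per row, a carriage return and line feed as the row terminator, and illustrative column names and data types (Column4 and Column5 are purely hypothetical). Only the four field terminators come from the description of the sample file.

```xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <!-- Each FIELD carries its own terminator, so separators can differ -->
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="12"/>
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="@#@" MAX_LENGTH="50"
           COLLATION="Latin1_General_CI_AS"/>
    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="|" MAX_LENGTH="12"/>
    <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="50"/>
    <!-- The last field ends at the row terminator (assumed here to be CR+LF) -->
    <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="50"/>
  </RECORD>
  <ROW>
    <!-- Column names and types are assumptions for illustration -->
    <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLINT"/>
    <COLUMN SOURCE="2" NAME="InvoiceNumber" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="3" NAME="ClientID" xsi:type="SQLINT"/>
    <COLUMN SOURCE="4" NAME="Column4" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="5" NAME="Column5" xsi:type="SQLVARYCHAR"/>
  </ROW>
</BCPFORMAT>
```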






























If ever you are (un)fortunate enough to receive a source file that encloses one or more fields in quotes, then you will have noticed that the quotes are loaded into the destination table. Assuming that this is not what you want (and that you do not want to have to pre-process the data, as described in Recipe 2-21), you can tweak a format file to strip out the quotes during the data load. For example, I have a file, C:\SQL2012DIRecipes\CH02\Invoices4.Txt, where the second field is enclosed in quotes. The following is the format file that will remove the quotes (C:\SQL2012DIRecipes\CH02\QuotedSecondField.Xml):




[Listing of QuotedSecondField.Xml not preserved. Surviving fragments: xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" and COLLATION = "Latin1_General_CI_AS".]
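One common way to do this, sketched here rather than copied from the book's listing, is to fold each quote into the terminator of the neighboring field. The sketch assumes a three-field row shaped like 1,"INV-001",42 with a carriage return and line feed at the end; the field count, lengths, column names, and data types are all assumptions for illustration.

```xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <!-- Field 1 ends at comma + opening quote, which consumes the opening quote -->
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=",&quot;" MAX_LENGTH="12"/>
    <!-- Field 2 ends at closing quote + comma, so commas inside the quotes survive -->
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="&quot;," MAX_LENGTH="50"
           COLLATION="Latin1_General_CI_AS"/>
    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="12"/>
  </RECORD>
  <ROW>
    <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLINT"/>
    <COLUMN SOURCE="2" NAME="InvoiceNumber" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="3" NAME="ClientID" xsi:type="SQLINT"/>
  </ROW>
</BCPFORMAT>
```

Because the quotes are part of the terminators, a row in which the second field is not quoted will not match the pattern, which is why quoting has to be consistent across the whole file.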

























You will need to add &quot; (the HTML entity for a double quote) for every single occurrence of an opening or a closing double quote around a field. This respects the CSV standard and allows you to load quoted fields that contain commas (or whatever other character is used as the column separator). If a field in one record is enclosed in quotes, the same field in all the records of the source file must also be quoted, or the load will fail.



Hints, Tips, and Traps





Remember to use FIRSTROW = 2 if the source data file contains a header row.



2-13. Loading a Text File Fast Using T-SQL

Problem

You want to perform a fast data load from a text file.



Solution

Use OPENROWSET (BULK) from T-SQL. The following is a code snippet to do this (C:\SQL2012DIRecipes\CH02\BulkInsertWithOpenrowset.sql):






INSERT INTO CarSales_Staging.dbo.Invoices
SELECT      ID, InvoiceNumber, ClientID
FROM        OPENROWSET(BULK 'C:\SQL2012DIRecipes\CH02\Invoices.Txt',
            FORMATFILE = 'C:\SQL2012DIRecipes\CH02\Invoicebulkload.Xml') AS MyDATA;



How It Works

Assuming that you have mastered the basics of format files, you can now proceed to use the OPENROWSET (BULK) T-SQL command, for which a format file is compulsory. The preceding code snippet loads a text file using a predefined format file.



Hints, Tips, and Traps





The alias (MyDATA in this example) must be provided, or you will get an error message.







The T-SQL used with OPENROWSET (BULK) can be extended using WHERE, ORDER BY, CAST, CONVERT, and aliases. This gives you great flexibility when loading the data.



2-14. Executing BULK INSERT from SSIS

Problem

You want to get the extra speed that BULK INSERT provides, but from inside an SSIS process.



Solution

Use the SSIS BULK INSERT task to import data as part of an SSIS process. What follows is the method to do this.

1. In an SSIS package, add a BULK INSERT task to the Control Flow pane. Double-click to edit the task.
2. Click the Connections option on the left to display the Connections pane on the right.
3. Select or create a destination connection. In this example it will be to the CarSales_Staging database. Once the connection is established, select the destination table (dbo.Invoices). Defining a destination connection is described in detail in Recipe 2-2.
4. Select the connection to the source file, or create it if it does not exist.
5. If you are using a format file, create or select the connection to the format file. You should end up with something like Figure 2-14.


