5-3. Loading Large Data Sets Using T-SQL




2. Run the following T-SQL snippet to import the native BCP file (C:\SQL2012DIRecipes\CH05\Clients.bcp in the accompanying examples):

BULK INSERT CarSales_Staging.dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
WITH
(
DATAFILETYPE = 'widenative'
);



How It Works

When a direct connection to another SQL Server instance is not possible, you may have to resort to “indirect” means. In essence, you export the data as a file, copy the file onto the destination server, and import it into the other instance. You can export data as a flat file or as XML; these subjects are handled in Chapter 7, and importing data in these formats is the subject of Chapters 2 and 3. However, if you are sending data between SQL Server databases, instances, and even versions, then the venerable yet magisterial BCP utility (and with it the BCP native file format) really comes into its own. Consequently, I wish to concentrate here on ways of loading BCP files. The reasons for choosing this method are fairly simple:





• Reliability: BCP has been around since the very beginnings of SQL Server, and is remarkably robust, as is its descendant, BULK INSERT.
• Speed: nothing loads as fast as a native BCP file, in my experience.



This does not mean that it is a perfect solution. The following are a few minor quibbles.

• You need to know the structure of a BCP file in advance, and you must obtain separately the DDL that defines the table whose data it contains.
• You cannot just open a native BCP file and reverse-engineer its structure.
• If you are not loading all the columns in the source file into the destination table, or if the source file columns are not in the same order as those of the destination table, you need a format file to perform column mapping.
• This approach is built for bulk loading, not for selecting and/or transforming data en route.
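The column-mapping point above can be illustrated with a minimal sketch. It assumes that a format file describing the source fields has already been generated; the path C:\SQL2012DIRecipes\CH05\Clients.fmt is hypothetical and is not part of the accompanying examples. Because a format file describes each field, this sketch does not specify DATAFILETYPE.

```sql
-- Sketch only: Clients.fmt is a hypothetical format file that maps the
-- fields in the BCP file to the columns of the destination table.
BULK INSERT CarSales_Staging.dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
WITH
(
    FORMATFILE = 'C:\SQL2012DIRecipes\CH05\Clients.fmt'  -- hypothetical path
);
```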



There are several advantages to using BULK INSERT for the import phase of this process.

• As it is nothing but T-SQL, it can be run from a query window, a stored procedure, or even using SQLCMD.
• BULK INSERT runs in-process; this makes it the fastest option.
• You do not need xp_cmdshell and the requisite permissions, as you do when running BCP from T-SQL.
• It is arguably less clunky to use than the BCP command-line executable and all its flags.
• BULK INSERT accepts parallel loads, as described in Chapter 13.
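As an illustration of the stored procedure case, here is a minimal sketch. The procedure name dbo.usp_LoadClientBCP and its @FilePath parameter are hypothetical. BULK INSERT expects the file name as a literal rather than a variable, so the sketch builds the statement dynamically.

```sql
-- Hypothetical wrapper procedure; the name and parameter are illustrative only.
CREATE PROCEDURE dbo.usp_LoadClientBCP
    @FilePath NVARCHAR(260)
AS
BEGIN
    SET NOCOUNT ON;

    -- Build the statement as text, because BULK INSERT needs a literal file name.
    -- Doubling any single quotes keeps the path inside the string literal.
    DECLARE @sql NVARCHAR(MAX) =
          N'BULK INSERT CarSales_Staging.dbo.Client_BCP '
        + N'FROM ''' + REPLACE(@FilePath, N'''', N'''''') + N''' '
        + N'WITH (DATAFILETYPE = ''widenative'');';

    EXEC sys.sp_executesql @sql;
END;
GO

-- Example call, using the file from this recipe
EXEC dbo.usp_LoadClientBCP
     @FilePath = N'C:\SQL2012DIRecipes\CH05\Clients.bcp';
```

The caller still needs the permissions that BULK INSERT itself requires, and the path is resolved on the server, not on the client.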



The main thing to ensure with BULK INSERT is that the DATAFILETYPE parameter matches the flag used when exporting the data. Essentially, -n maps to native and -N maps to widenative.

The BULK INSERT snippet in step 2 of this recipe used only one of the available options. As you may surmise, there are many others. Table 5-2 provides a succinct overview.
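The pairing of export flag and DATAFILETYPE can be sketched as follows. The file Clients_n.bcp is hypothetical and stands for an export made with the -n flag; it is not part of the accompanying examples.

```sql
-- File exported with BCP ... -n (hypothetical file name): load as 'native'
BULK INSERT CarSales_Staging.dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients_n.bcp'
WITH (DATAFILETYPE = 'native');

-- File exported with BCP ... -N (the file used in this recipe): load as 'widenative'
BULK INSERT CarSales_Staging.dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
WITH (DATAFILETYPE = 'widenative');
```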






Chapter 5 ■ SQL Server Sources



Table 5-2. BULK INSERT Options

| Parameter | Definition | Comments |
|---|---|---|
| DATAFILETYPE='native' | The native (database) data types | SQL Server data types are used, and all character fields are non-Unicode. |
| DATAFILETYPE='widenative' | The native (database) data types and Unicode data type for character data | SQL Server data types are used, and all character fields are Unicode. |
| ERRORFILE='pathandfile' | The error file | The file name and full path for the error file used to log errors. |
| KEEPNULLS | Null value handling | Empty columns keep NULLs rather than receiving default values, if defaults are specified. |
| KEEPIDENTITY | Keep identity hint | The identity values in the source file are loaded into the identity column instead of being regenerated. |
| CHECK_CONSTRAINTS | Check and foreign key constraints are applied during the load | The default is that no check constraints or foreign key constraints are applied. |
| FIRETRIGGERS | Triggers fire during the load | The default is not to fire triggers during a data load. |
| FIRSTROW=n | The first row to be loaded | The default is the first row of the file. |
| LASTROW=n | The last row to be loaded | The default is the last row of the file. |
| BATCHSIZE=n | The number of rows per batch of imported data | Each batch is imported and logged in a separate transaction. The whole batch must be successfully imported before any records are committed. |
| ROWS_PER_BATCH=n | The approximate number of rows in the data file | A hint for the query optimizer. Unless BATCHSIZE is specified, the whole file is still loaded as a single transaction. |
| KILOBYTES_PER_BATCH=n | The batch size hint | Approximate number of kilobytes of data per batch. |
| MAXERRORS=n | The maximum number of errors before the load fails | Default = 10. |
| ORDER | Sort hint | Tells BULK INSERT the sort order of the source data. The source data must be sorted if this is used. |
| TABLOCK | Table locking hint | A table-level lock is applied during the load to minimize the resources used on lock escalation. |



To see how you can use multiple load options, consider the following snippet, which takes the original BULK INSERT command and extends it to specify a number of options (C:\SQL2012DIRecipes\CH05\BulkInsertWithOptions.Sql):

BULK INSERT CarSales_Staging.dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
WITH
(
DATAFILETYPE = 'widenative'
,KEEPIDENTITY
,TABLOCK
,KEEPNULLS
,ERRORFILE='C:\SQL2012DIRecipes\CH05\BulkInsertErrors.txt'
,MAXERRORS=50
);
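When a load with several options runs inside a script or procedure, it can help to surface a failure explicitly. The following is a minimal sketch rather than part of the accompanying examples; it assumes SQL Server 2012 or later (for THROW), and the column aliases are illustrative.

```sql
BEGIN TRY
    BULK INSERT CarSales_Staging.dbo.Client_BCP
    FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
    WITH
    (
         DATAFILETYPE = 'widenative'
        ,TABLOCK
        ,MAXERRORS = 50
    );

    -- Quick sanity check: how many rows the destination table now holds
    SELECT COUNT(*) AS RowsInTable
    FROM CarSales_Staging.dbo.Client_BCP;
END TRY
BEGIN CATCH
    -- Report what went wrong, then re-raise the error for the caller
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
    THROW;
END CATCH;
```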



Hints, Tips, and Traps





• In my experience, you should always export data onto a local disk and load data from a local disk unless you have absolutely no alternative. Using data stored locally usually makes for shorter load times and offers less risk of load failure due to a timeout caused by network latency.
• In most cases, you are best advised to drop all indexes on the destination table and re-create them afterward.
• The version of the BCP utility used to read a format file must be the same as, or later than, the version of the format file. For example, SQL Server 2012 BCP can read a version 10.0 format file, which is generated by SQL Server 2008 BCP; but SQL Server 2008 BCP cannot read a version 11.0 format file, which is generated by SQL Server 2012 BCP.







• If you want to create the BCP export file, then you can use the following code:

C:\>BCP "SELECT ID, ClientName, Town, County, ClientSize, ClientSince FROM CarSales.dbo.Client ORDER BY ID" queryout C:\SQL2012DIRecipes\CH05\Clients.bcp -N -SADAM02\AdamRemote -UAdam -PMe4B0ss
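The advice above about dropping and re-creating indexes can be sketched as follows. The index name IX_Client_BCP_ClientName is hypothetical, and the sketch assumes that the destination table has a ClientName column, as the export query suggests.

```sql
USE CarSales_Staging;
GO

-- Drop the (hypothetical) nonclustered index before the load, if it exists
IF EXISTS (SELECT 1
           FROM sys.indexes
           WHERE name = N'IX_Client_BCP_ClientName'
             AND object_id = OBJECT_ID(N'dbo.Client_BCP'))
    DROP INDEX IX_Client_BCP_ClientName ON dbo.Client_BCP;

-- Load the data
BULK INSERT dbo.Client_BCP
FROM 'C:\SQL2012DIRecipes\CH05\Clients.bcp'
WITH (DATAFILETYPE = 'widenative', TABLOCK);

-- Re-create the index once the data is in place
CREATE NONCLUSTERED INDEX IX_Client_BCP_ClientName
    ON dbo.Client_BCP (ClientName);
```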



5-4. Load Data Exported from SQL Server from the Command Line

Problem

You want to load data exported from SQL Server using the command line, without using SSMS or SQLCMD.



Solution

Run BCP.exe from a Command window to load the data, previously exported as a native BCP file. Loading data from a native BCP file can be as simple as (C:\SQL2012DIRecipes\CH05\BCPLoad.cmd):

BCP CarSales_Staging.dbo.Client_BCP IN C:\SQL2012DIRecipes\CH05\Clients.bcp -N -T -SADAM02



How It Works

Should you prefer the time-honored way of importing data, then the venerable BCP can import it for you. The code in this recipe, run from a command prompt, imports the file used in Recipe 5-3 into the destination table whose DDL is in that same recipe. As you can see, it is very close to the T-SQL used for BULK INSERT. There are two major differences: you can run it without SSMS or SQLCMD, and you need to add security information (in this case, -T for integrated security).






The following are things to note from the start:

• The destination table must exist in the destination database.
• It is normally best to drop any indexes before the data load and re-create them afterward.
• You can use a UNC path (\\server\share) rather than a drive letter to reference the source data file. Nonetheless, I suggest that you consider copying the data to a local disk before loading it, for the reasons given in the previous recipe.
• Any of the options described in Table 5-3 can be used when running BCP.
• If tweaking all these case-sensitive flags in a command window is less than unalloyed pleasure for you, then you can always write the BCP command as a script (.cmd) file and run it by double-clicking.



There are many options that you may be required to use when loading data using BCP. Table 5-3 lists the key flags that you might need one day.

Table 5-3. BCP Options

| Argument | Definition | Comments |
|---|---|---|
| -m | Maximum number of errors | Default = 10. |
| -n | The native (database) data types | SQL Server data types are used, and all character fields are non-Unicode. |
| -N | The native (database) data types and Unicode data type for character data | SQL Server data types are used, and all character fields are Unicode. |
| -S | The server or server\instance name | If no server is specified, BCP connects to the default instance of SQL Server on the local computer. |
| -U | User | The login ID used to connect to SQL Server. |
| -P | Password | The user password. |
| -T | Integrated security | If used, then the -P and -U flags cannot be used. |
| -e | An (optional) error file | The file name and full path for the error file used to log errors, if needed. |
| -F | The first row to be loaded | The default is the first row. |
| -L | The last row to be loaded | The default is the last row. |
| -b | The number of rows per batch of imported data | Each batch is imported and logged in a separate transaction. The whole batch must be successfully imported before anything is committed. |
| -V | Version | See Table 5-6. |
| -q | Quoted identifiers | This is used to specify a database, owner, table, or view name that contains a space or a single quotation mark. You must enclose the entire three-part table or view name in quotation marks. |


