2-3. Automatically Determining Data Types


Chapter 2 ■ Flat File Data Sources


1. Follow steps 1 to 4 in Recipe 2-2.

2. Click Advanced.

3. Click Suggest Types. The Suggest Column Types dialog box appears, as shown in Figure 2-6.

Figure 2-6.  Determining column types automatically in SSIS

4. Alter any options you wish to tweak (for instance, the number of rows may be far too small for an accurate sample of a large file). Click OK.

You will see that all the data types (and lengths, where appropriate) have been sampled and adjusted in the Advanced pane of the dialog box (Figure 2-5).




How It Works

You probably noticed that when importing text files, the Flat File connection manager assumes that all source columns in a delimited text file are strings with a length of 50 characters. This default is a poor fit for several reasons:

Many fields are not strings and consequently require data type conversion as part of the SSIS package.

The default length of 50 characters is insufficient for many string fields, and can cause package errors at runtime.

Even if all the source columns can fit into 50 characters, this is frequently too wide, as it reduces the number of records that can fit into the SSIS pipeline buffer and consequently slows down the load process.
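The effect of column width on buffer capacity can be sketched with some rough arithmetic. The snippet below is only an illustration: it assumes the usual SSIS defaults of 10 MB for DefaultBufferSize and 10,000 for DefaultBufferMaxRows, a hypothetical 40-column file, and an assumed average tuned width of 12 bytes per column. The real buffer sizing logic in SSIS is more involved than this.

```python
# Rough illustration only: the real SSIS buffer sizing logic is more involved.
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024   # bytes; assumed default for DefaultBufferSize
DEFAULT_BUFFER_MAX_ROWS = 10_000         # assumed default for DefaultBufferMaxRows


def estimated_rows_per_buffer(column_widths,
                              buffer_size=DEFAULT_BUFFER_SIZE,
                              max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Estimate how many rows fit in one buffer for the given column widths (bytes)."""
    row_width = sum(column_widths)
    if row_width <= 0:
        raise ValueError("row width must be positive")
    return min(buffer_size // row_width, max_rows)


# Hypothetical file with 40 columns.
default_widths = [50] * 40   # every column left as a 50-character string
tuned_widths = [12] * 40     # assumed average width after tuning the types

print(estimated_rows_per_buffer(default_widths))  # 5242
print(estimated_rows_per_buffer(tuned_widths))    # 10000
```

Under these assumptions, the untuned layout fits roughly half as many rows into each buffer as the tuned one.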

Frequently, of course, the problem is solved by the people who provide the source files, who will have thoughtfully handed over a complete data type description of the file contents. However, there just may be times when this is not the case, and you have to deduce, discover, or guess the field (or column if you prefer) types and lengths in a source file. This can be a painful process, especially if it means a trial-and-error-based cycle of loading and reloading a text file until you have finally found, for each source column, an acceptable data type. Consequently, a much simpler solution is to ask SSIS to guess the data types and lengths for you. Admittedly, you can open some files to look at the source data. A quick glance is rarely an accurate analytical sample, however. What is more, there will be some source flat files that are so large that they either take forever to open or simply crash your favorite text editor.
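When a file is too large to open comfortably in an editor, one option outside SSIS is to read only its first few lines. The following is a minimal Python sketch of that idea; the function name peek_lines and the demo file contents are hypothetical, and the encoding is assumed to be UTF-8.

```python
import os
import tempfile
from itertools import islice


def peek_lines(path, count=20, encoding="utf-8"):
    """Return the first `count` lines of a text file without reading the rest."""
    with open(path, "r", encoding=encoding, errors="replace", newline="") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, count)]


# Demo on a small temporary file; in practice, pass the path of the large source file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as demo:
    demo.write("ID,Make,Mileage\n1,Jaguar,300\n2,Aston Martin,45000\n")
    demo_path = demo.name

first_two = peek_lines(demo_path, count=2)
os.remove(demo_path)
print(first_two)
```

Because the file is read lazily, only the requested lines are pulled from disk, however large the file is. As the text notes, such a glance is still not a representative sample.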

It helps to understand your options when determining the data types in the source file, as provided in Table 2-2.

Table 2-2.  Suggested Column Type Options

Number of rows: The number of rows to sample. There seems to be no upper limit in SQL Server 2012. Up to SSIS 2008, it was limited to 1000 records.

Suggest the smallest integer data type: For columns that contain integers only, suggest the smallest integer type that can accommodate the data without overflowing.

Suggest the smallest real data type: For columns that contain real (numeric data type) numbers, suggest the smallest numeric type that can accommodate the data.

Identify Boolean columns using the following values: Indicates which values can be interpreted as Boolean.

Pad string columns / Percent Padding: Takes the length of the longest string ([n]varchar, [n]char) element and extends the length by the percentage you specify to anticipate longer strings in future source files.
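To make the options in Table 2-2 more concrete, here is a minimal Python sketch of the general idea: sample a limited number of rows, pick the narrowest integer type that fits, and pad string lengths by a percentage. This is not the algorithm SSIS uses internally. The function names, the sample data, and the restriction to integer and string columns are all assumptions made for illustration; only the type names (DT_I1, DT_I2, DT_I4, DT_I8, DT_STR) are real SSIS data types.

```python
import csv
import io
from itertools import islice

# Signed ranges for the SSIS integer types DT_I1, DT_I2, DT_I4, and DT_I8.
INTEGER_TYPES = [
    ("DT_I1", -2**7, 2**7 - 1),
    ("DT_I2", -2**15, 2**15 - 1),
    ("DT_I4", -2**31, 2**31 - 1),
    ("DT_I8", -2**63, 2**63 - 1),
]


def smallest_integer_type(values):
    """Return the narrowest integer type that holds every value, or None."""
    try:
        numbers = [int(v) for v in values]
    except ValueError:
        return None  # at least one value is not an integer
    low, high = min(numbers), max(numbers)
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= low and high <= type_max:
            return name
    return None


def suggest_types(text_stream, sample_rows=100, pad_percent=0, delimiter=","):
    """Suggest a type per column from the first `sample_rows` data rows."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    header = next(reader)
    sample = list(islice(reader, sample_rows))
    suggestions = {}
    for index, name in enumerate(header):
        values = [row[index] for row in sample if index < len(row) and row[index] != ""]
        int_type = smallest_integer_type(values) if values else None
        if int_type:
            suggestions[name] = int_type
        else:
            longest = max((len(v) for v in values), default=0)
            # Integer ceiling of the padding so the result never rounds down.
            padded = longest + (longest * pad_percent + 99) // 100
            suggestions[name] = f"DT_STR({padded})"
    return suggestions


# Hypothetical sample data.
SAMPLE = "ID,Make,Mileage\n1,Jaguar,300\n2,Aston Martin,45000\n3,Rolls Royce,120000\n"

print(suggest_types(io.StringIO(SAMPLE), sample_rows=1))
# {'ID': 'DT_I1', 'Make': 'DT_STR(6)', 'Mileage': 'DT_I2'}
print(suggest_types(io.StringIO(SAMPLE), pad_percent=25))
# {'ID': 'DT_I1', 'Make': 'DT_STR(15)', 'Mileage': 'DT_I4'}
```

The first call shows why too small a sample is risky: one row suggests DT_I2 and a six-character string, both of which would fail on the later rows.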

If you already know which data type each column needs, or if you need to set a specific column delimiter for one or more columns, you can fine-tune the settings in the Advanced pane. Table 2-3 describes the available options.




Table 2-3.  Flat File Connection Manager Advanced Pane Options

Name: The column name can be set or overridden here. This is the name that is used from this point on in the SSIS data flow.

Column delimiter: The specific column delimiter for one or more columns. You can select from the list, or enter or paste the column delimiter used in the source data.

Data Type: Select the data type from the list of SSIS data types.

Output Column width: The width of the output column. This is for single-byte characters.

Text Qualified: Specifies whether the text in this column is enclosed in the text qualifier set in the General pane.

For example, if you wish to override the current settings for a column’s data type, you could set the data type, as shown in Figure 2-5.

Should your source data change, you do not have to redo the entire column mapping structure from scratch, as SSIS lets you add or remove columns to adjust the package to changes in the source data structure.

To add a column, do the following:

1. Click the column that precedes (or follows) the column to insert.

2. Click the arrow on the right of the New button. Select Insert Before (or Insert After). A new column is added to the mapping structure.

Clicking New or Add Column inserts a new column after the existing columns.

Removing a column is as simple as the following:

1. Select the column to remove.

2. Click Delete.

The Advanced pane of the Flat File connection manager also lets you specify, for each column, whether the column is enclosed in quotes. All you have to do is set TextQualified to True if this is the case, and to False if there are no quotes.
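The effect of a text qualifier can be illustrated outside SSIS. The sketch below uses Python's csv module to split the same line with and without honoring the quotes. It is an analogy only: the csv module applies the setting to the whole file rather than per column as TextQualified does, and the sample line and the function name split_line are hypothetical.

```python
import csv
import io

# Hypothetical source line: the second field contains the column delimiter.
LINE = '1,"Smith, John",London\n'


def split_line(line, text_qualified):
    """Split one delimited line, honoring double quotes only when text_qualified is True."""
    quoting = csv.QUOTE_MINIMAL if text_qualified else csv.QUOTE_NONE
    reader = csv.reader(io.StringIO(line), delimiter=",", quotechar='"', quoting=quoting)
    return next(reader)


print(split_line(LINE, text_qualified=True))
# ['1', 'Smith, John', 'London']
print(split_line(LINE, text_qualified=False))
# ['1', '"Smith', ' John"', 'London']
```

When the qualifier is ignored, the embedded comma splits the field in two and the quotes stay in the data, which is the kind of misalignment the TextQualified setting is there to prevent.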

Hints, Tips, and Traps

The details of OLEDB Connections in SSIS are explained in Recipe 1-7.

You can use an ADO.NET destination from SQL Server 2008 onward. From SQL Server 2008 R2 and up, this destination component offers the “Use Bulk Insert whenever possible” option to accelerate the data load.

To have SSIS handle varying numbers of column delimiters, see Recipe 2-16.

Adjusting the data types for an existing package generates a warning triangle on the Flat File source component. To make the warning disappear, just double-click the component and confirm that you wish SSIS to resynchronize the metadata.




If you are not using SSIS 2012, then the maximum number of rows to sample is 1000; in some cases, this is not a representative sample. If you are experiencing problems, then it is probably best to set large data types (DT_I8 for integers, large lengths for character fields), and then check the actual maximum data lengths in the table into which the data is imported. Data length analysis like this is described in Chapter 8.

Personally, I find it unlikely that you will ever define Flat File connection managers at project level in SQL Server 2012 (remember that this is done in the Connection Managers folder of the Solution Explorer). This is because they are essentially “one-off” connections. A destination connection is another matter entirely; it would probably benefit from being defined at project level.

If the source file is not Unicode, then do not attempt to set it to Unicode, or you risk losing all your existing column type definitions.

SSIS will not verify that your modifications map to the source file’s structure, so you need to know what you are doing.

The ways to handle column truncation and data type transformation are given in Chapter 9.

SSIS data types are described in Appendix A.

You might have to copy and paste unusual column separators into the Flat File Connection Manager Editor, one row at a time, because multiple pasting is not offered.
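For the hint above about not setting a non-Unicode file to Unicode, one quick check outside SSIS is to look for a byte-order mark (BOM) at the start of the file. The following Python sketch shows the idea; the function name detect_bom is hypothetical. Treat the result as a hint rather than proof, since a Unicode file may have no BOM at all.

```python
import codecs

# UTF-32 must be tested before UTF-16, because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(first_bytes):
    """Return the encoding implied by a leading byte-order mark, or None."""
    for bom, name in BOMS:
        if first_bytes.startswith(bom):
            return name
    return None


# Hypothetical byte samples standing in for the first bytes of a source file.
print(detect_bom(codecs.BOM_UTF16_LE + "ID,Make".encode("utf-16-le")))  # utf-16-le
print(detect_bom(b"ID,Make"))                                           # None
```

To check a real file, open it in binary mode and pass the first four bytes, for example `detect_bom(open(path, "rb").read(4))`.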

2-4. Importing Fixed-Width Text Files


Problem

You want to import a fixed-width text file into SQL Server.

Solution

Create an SSIS package and define a Data Flow using a Flat File connection manager set to accept fixed-width files. The following explains how you can do this.


1. Create an SSIS package as described in Recipe 2-2, preparing a Flat File source and an OLEDB destination.

2. Create an OLEDB connection manager connecting to CarSales_Staging named

3. Create a connection manager for the flat file, either as described earlier in Recipe 2-2 step 3, or by right-clicking the Connection Managers tab and selecting New Flat File Connection, as shown in Figure 2-3.

4. Once you have specified the connection name and the source file (C:\SQL2012DIRecipes\CH02\StockFixedWidth.Txt), specify that the format be fixed width, as shown in Figure 2-7.

Figure 2-7.  Defining a fixed-width data source

5. Display the Columns pane by clicking Columns in the left of the dialog box. You should see something like Figure 2-8.
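For readers less familiar with the format, a fixed-width file carries no delimiters: each field occupies a set number of characters in every record. The Python sketch below shows the principle that the Flat File connection manager applies when you define the column boundaries. The column widths and the sample record are hypothetical and do not describe the actual layout of StockFixedWidth.Txt.

```python
def parse_fixed_width(line, widths):
    """Slice one fixed-width record into trimmed fields."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


# Hypothetical layout: a 5-character ID, a 15-character make, an 8-character mileage.
WIDTHS = [5, 15, 8]
RECORD = "00001" + "Aston Martin".ljust(15) + "45000".rjust(8)

print(parse_fixed_width(RECORD, WIDTHS))
# ['00001', 'Aston Martin', '45000']
```

If any width is wrong, every later field in the record shifts, which is why the column boundaries need to match the source file exactly.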


