12-5. Applying Change Data Capture with SSIS

Chapter 12 ■ Change Tracking and Change Data Capture



Solution

Use SSIS 2012 and its new Change Data Capture tasks. The following steps show how to do this, using the Client table in the CarSales database as the source and the CarSales_Staging database as the destination.

1. Implement CDC at the database and table level (unless this has already been done) using the following code (C:\SQL2012DIRecipes\CH11\EnableCDCSSIS.Sql):

USE CarSales;
GO
EXECUTE sys.sp_cdc_enable_db;
GO
EXECUTE sys.sp_cdc_enable_table
    @source_schema = 'dbo',
    @source_name = 'Client',
    @role_name = 'CDCMainRole',
    @supports_net_changes = 1;
GO
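Before building the packages, it can help to confirm that CDC really is active. The following is a minimal check rather than part of the original recipe. It uses only system catalog views and the CDC help procedure; note that CDC relies on SQL Server Agent jobs to harvest the log, so the Agent service needs to be running for changes to be captured.

```sql
USE CarSales;
GO
-- A value of 1 in is_cdc_enabled means CDC is on for the database
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'CarSales';

-- A value of 1 in is_tracked_by_cdc means the table has a capture instance
SELECT name, is_tracked_by_cdc
FROM sys.tables
WHERE name = 'Client'
  AND SCHEMA_NAME(schema_id) = 'dbo';

-- Lists the capture instance and the captured columns for dbo.Client
EXECUTE sys.sp_cdc_help_change_data_capture
    @source_schema = 'dbo',
    @source_name = 'Client';
GO
```

If either flag returns 0, rerun the enabling script above before continuing.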



2. Using the following DDL, create a table to hold the IDs of records to be deleted in the destination table (C:\SQL2012DIRecipes\CH11\tblCDC_Client_Deletes.Sql):

CREATE TABLE CarSales_Staging.dbo.CDC_Client_Deletes
(
    ID INT NOT NULL
) ;
GO



3. Using the following DDL, create a table to hold the data for records to be updated in the destination table (C:\SQL2012DIRecipes\CH11\tblCDC_Client_Updates.Sql):

CREATE TABLE CarSales_Staging.dbo.CDC_Client_Updates
(
    ID INT NOT NULL,
    ClientName NVARCHAR(150) NULL,
    Address1 VARCHAR(50) NULL,
    Address2 VARCHAR(50) NULL,
    Town VARCHAR(30) NULL,
    County VARCHAR(30) NULL,
    PostCode VARCHAR(10) NULL,
    Country CHAR(3) NULL,
    ClientType NCHAR(5) NULL,
    ClientSize VARCHAR(10) NULL,
    ClientSince SMALLDATETIME NULL,
    IsCreditWorthy BIT NULL,
    DealerGroup HIERARCHYID NULL,
    MapPosition GEOGRAPHY NULL
) ;
GO






4. Create a destination table using the following DDL:

CREATE TABLE CarSales_Staging.dbo.Client_CDCSSIS
(
    ID INT IDENTITY(1,1) NOT NULL,
    ClientName NVARCHAR(150) NULL,
    Address1 VARCHAR(50) NULL,
    Address2 VARCHAR(50) NULL,
    Town VARCHAR(50) NULL,
    County VARCHAR(50) NULL,
    PostCode VARCHAR(10) NULL,
    Country TINYINT NULL,
    ClientType VARCHAR(20) NULL,
    ClientSize VARCHAR(10) NULL,
    ClientSince SMALLDATETIME NULL,
    IsCreditWorthy BIT NULL,
    DealerGroup HIERARCHYID NULL,
    MapPosition GEOGRAPHY NULL,
    CONSTRAINT PK_Client PRIMARY KEY CLUSTERED
    (
        ID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
            IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) ;
GO

A table to track the LSN range used in each delta load is also required (as in Recipe 12-4), but as you will see, the CDC Control task can create this table for you, so I will let it do just that.



5. Create a new SSIS package. Add the following three connection managers at project level (by right-clicking the Connection Managers folder in the Solution Explorer and selecting New Connection Manager). All will have the .conmgr extension in the Connection Managers folder and will appear without this extension in the Connection Managers tab:

Name                      Type       Comments
CarSales_OLEDB            OLEDB      The connection to the CDC-enabled source database.
CarSales_Staging_OLEDB    OLEDB      The connection to the destination (synchronized) database.
CarSales_ADONET           ADO.NET    The connection used for LSN tracking. Here I am placing this in the source database. This must be an ADO.NET connection or the CDC task cannot use it.



6. At package level, add a new String variable named CDCState.






7. Add a CDC Control task named CDC LSN Start to the Control Flow pane. Open it and set the following parameters:

Parameter                                                        Value                      Comments
SQL Server CDC Database/ADO.NET Connection Manager               CarSales_ADONET            This is the previously defined connection to the source database that has CDC enabled.
CDC Control Operation                                            Mark initial load start    This tells the task that this is the start of an initial load.
Variable containing CDC State                                    CDCState                   The variable to store the LSN.
Connection manager for the database where the state is stored    CarSales_ADONET



8. Click New for the “table to use for storing state”. Confirm the default table DDL (see Figure 12-2).



Figure 12-2.  Defining the State Storage table in SSIS 2012 CDC

The code is:

CREATE TABLE dbo.cdc_states
(
    name NVARCHAR(256) NOT NULL,
    state NVARCHAR(256) NOT NULL
) ;
GO
CREATE UNIQUE NONCLUSTERED INDEX cdc_states_name ON dbo.cdc_states
( name ASC )
WITH (PAD_INDEX = OFF) ;
GO

9. Click Run to create the table.

10. Set the State name as CDCState. The dialog box should look like Figure 12-3.



Figure 12-3.  The CDC Control Task Editor

11. Click OK to confirm your changes.

12. Add a Data Flow task to the Control Flow pane, connect the CDC Control task to it, and switch to the Data Flow pane.






13. Add an OLEDB source task, configured as follows:

Name:                        CarSales
OLEDB Connection Manager:    CarSales_OLEDB
Data Access Mode:            Table or View
Name of Table or View:       dbo.Client

14. Add an OLEDB destination task, linked to the source task and configured as follows:

Name:                        CarSales_Staging
OLEDB Connection Manager:    CarSales_Staging_OLEDB
Data Access Mode:            Table or View – fast load
Name of Table or View:       dbo.Client_CDCSSIS
Keep Identity:               Checked

15. Map all the source columns to the destination columns. The data flow should look like Figure 12-4.



Figure 12-4.  Process flow for SSIS 2012 CDC

16. Return to the Control Flow tab. Add a CDC Control task. Name it CDC LSN end. Connect the Data Flow task to it, and then double-click to edit.

17. Set the same parameters as for the CDC LSN Start task (step 7), but ensure that the CDC Control Operation is now “Mark initial load end”. The dialog box should look like Figure 12-5.






Figure 12-5.  The CDC Control Task Editor dialog box to mark the end of the initial load

18. Run the package to start Change Data Capture. The entire package should look like Figure 12-6.






Figure 12-6. Process flow when instantiating CDC
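Once the initial load package has run, the state written by the two CDC Control tasks can be inspected directly. This is a small sketch, not part of the original recipe. It assumes the state table was created in CarSales (because step 7 points the state connection at CarSales_ADONET) and that the state name is CDCState, as set in step 10.

```sql
USE CarSales;
GO
-- One row per state name; the packages in this recipe use the name CDCState
SELECT name, state
FROM dbo.cdc_states
WHERE name = 'CDCState';
GO
```

I would expect the state value to begin with a token such as ILEND (initial load ended) followed by LSN and timestamp details, but treat that format as an assumption to verify against the CDC Control task documentation.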

19. Create a new SSIS package in the same SSIS project (this will enable you to use the same connection managers that you used in the initial load).

20. At package level, add a new String variable named CDCState.



21. Add an Execute SQL task to the Control Flow pane. Configure it as follows:

Name:               Prepare Staging Tables
Connection Type:    OLEDB
Connection:         CarSales_Staging_OLEDB
SQL Statement:

TRUNCATE TABLE dbo.CDC_Client_Updates;
TRUNCATE TABLE dbo.CDC_Client_Deletes;



22. Add a CDC Control task, name it Get starting LSN, and connect the previous task (Prepare Staging Tables) to it. Configure it exactly as described for the CDC Control task in the initial load package (step 7), but set the CDC Control Operation to “Get processing range”.

23. Add a Data Flow task. Connect the CDC Control task to it. Switch to the Data Flow pane.



24. Add a CDC Source task. Open it and configure it as follows:

Name:                             CDC Source
ADO.NET Connection Manager:       CarSales_ADONET
CDC-enabled Table:                dbo.Client
CDC Processing Mode:              Net
Variable containing CDC State:    CDCState






25. The dialog box should look like Figure 12-7. Click OK to confirm your modifications.



Figure 12-7.  The CDC Source dialog box
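To get a feel for what the Net processing mode returns, the net-changes function that CDC generates for the table can be queried by hand. The following is a rough illustration only, not what the CDC Source executes internally. It assumes the default capture instance name dbo_Client (generated when no @capture_instance is supplied in step 1) and that ID and ClientName are among the captured columns of dbo.Client.

```sql
USE CarSales;
GO
-- Read the whole range currently available for the capture instance
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_Client');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

-- With the 'all' row filter, net changes return one row per changed key:
-- __$operation 1 = delete, 2 = insert, 4 = update
SELECT
    CASE __$operation
        WHEN 1 THEN 'Delete'
        WHEN 2 THEN 'Insert'
        WHEN 4 THEN 'Update'
    END AS ChangeType,
    ID,
    ClientName
FROM cdc.fn_cdc_get_net_changes_dbo_Client(@from_lsn, @to_lsn, N'all');
GO
```

This reads every change still held in the change table, whereas the CDC Source reads only the LSN range recorded in the CDCState variable. The three operation codes correspond to the delete, insert, and update outputs used in the data flow.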

26. Add a CDC Splitter task (in the SSIS toolbox, this is with the Other Transforms) and connect the CDC Source task to it.

27. Add an OLEDB destination task and connect the CDC Splitter to it. Name it Inserts and select the InsertOutput as the output to use from the CDC Splitter task.

28. Configure the OLEDB destination task as follows:

Name:                        Inserts
OLEDB Connection Manager:    CarSales_Staging_OLEDB
Data Access Mode:            Table or view – fast load
Name of Table or View:       dbo.Client_CDCSSIS






29. Map all the data columns (but not the CDC-specific columns, whose names begin with a double underscore) and click OK.

30. Add an OLEDB destination task and connect the CDC Splitter to it. Name it Deletes and select DeleteOutput as the output to use from the CDC Splitter task. Configure the OLEDB destination task as follows:

Name:                        Deletes
OLEDB Connection Manager:    CarSales_Staging_OLEDB
Data Access Mode:            Table or view – fast load
Name of Table or View:       dbo.CDC_Client_Deletes

31. Map the ID column only and click OK.



32. Add an OLEDB destination task and connect the CDC Splitter to it. Select the UpdateOutput as the output to use. Name it Updates and configure the OLEDB destination task as follows:

Name:                        Updates
OLEDB Connection Manager:    CarSales_Staging_OLEDB
Data Access Mode:            Table or view – fast load
Name of Table or View:       dbo.CDC_Client_Updates

33. Map all the data columns (but not the CDC-specific columns, whose names begin with a double underscore) and click OK. The Data Flow should look like Figure 12-8.



Figure 12-8.  The CDC upsert process






34. Return to the Control Flow pane. Add a CDC Control task and name it Set end LSN. Connect the Data Flow task to it, and configure it as for the “Get starting LSN” CDC Control task, but be sure to select “Mark processed range” as the CDC Control Operation.



35. Add an Execute SQL task. Connect the “Set end LSN” CDC Control task to it. Configure as follows:

Name:                  Updates
Connection Type:       OLEDB
Connection Manager:    CarSales_Staging_OLEDB
SQL Statement (C:\SQL2012DIRecipes\CH11\SSISCDCUpdates.Sql):

UPDATE D
SET
    D.ClientName = S.ClientName
    ,D.Address1 = S.Address1
    ,D.Address2 = S.Address2
    ,D.Town = S.Town
    ,D.County = S.County
    ,D.PostCode = S.PostCode
    ,D.Country = S.Country
    ,D.ClientType = S.ClientType
    ,D.ClientSize = S.ClientSize
    ,D.ClientSince = S.ClientSince
    ,D.IsCreditWorthy = S.IsCreditWorthy
    ,D.DealerGroup = S.DealerGroup
    ,D.MapPosition = S.MapPosition
FROM dbo.Client_CDCSSIS D
INNER JOIN dbo.CDC_Client_Updates S
    ON S.ID = D.ID;






36. Add another Execute SQL task. Connect the Execute SQL task named Updates to it. Configure as follows:

Name:                  Deletes
Connection Type:       OLEDB
Connection Manager:    CarSales_Staging_OLEDB
SQL Statement:

DELETE D
FROM dbo.Client_CDCSSIS D
INNER JOIN dbo.CDC_Client_Deletes S
    ON S.ID = D.ID;

You can now run the package (which should look like Figure 12-9). Any modifications made to the source data will be reflected in the destination table.



Figure 12-9.  The complete CDC process flow
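A quick way to try out the finished process is to change a source row, run the incremental package, and then compare the two tables. The script below is a hedged sketch rather than part of the recipe: the value 'Stoke' and the assumption that a row with ID 1 exists in dbo.Client are purely illustrative, and only three columns are compared to keep the example short.

```sql
USE CarSales;
GO
-- Hypothetical change: a row with ID = 1 is assumed to exist in the source
UPDATE dbo.Client
SET Town = 'Stoke'
WHERE ID = 1;
GO

-- Run this after the CDC capture job has processed the log and the
-- incremental package has completed. It lists source rows that are
-- missing from, or different in, the destination table.
SELECT ID, ClientName, Town
FROM CarSales.dbo.Client
EXCEPT
SELECT ID, ClientName, Town
FROM CarSales_Staging.dbo.Client_CDCSSIS;
GO
```

An empty result set from the second query suggests that the compared columns are in sync. Rows that appear there point to changes that have not yet been captured or applied.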


