11-8. Detecting and Loading Delta Data Using T-SQL and a Linked Server When MERGE Is Not Practical
Chapter 11 ■ Delta Data Management
Solution
Detect delta identifiers at the source, compare them at the destination, and then request only the delta data for upserts.
The following code shows how it is done, presuming that you have a linked server named ADAMREMOTE containing the
CarSales database and the dbo.Invoice_Lines table. The code is in C:\SQL2012DIRecipes\CH11\TableMergeReplacement.Sql.
IF OBJECT_ID('TempDB..#Upsert') IS NOT NULL DROP TABLE #Upsert

SELECT     ID, VersionStamp
INTO       #Upsert
FROM       ADAMREMOTE.CarSales.dbo.Invoice_Lines

-- Inserts
;
WITH Inserts_CTE
AS
(
SELECT     ID
FROM       #Upsert U
WHERE      ID NOT IN (
                      SELECT ID FROM dbo.Invoice_Lines WITH (NOLOCK))
)
INSERT INTO dbo.Invoice_Lines
(
ID
,InvoiceID
,SalePrice
,StockID
,VersionStamp
)
SELECT
SRC_I.ID
,InvoiceID
,SalePrice
,StockID
,VersionStamp
FROM       ADAMREMOTE.CarSales.dbo.Invoice_Lines SRC_I WITH (NOLOCK)
INNER JOIN Inserts_CTE CTE_I
           ON SRC_I.ID = CTE_I.ID

-- Updates
;
WITH Updates_CTE
AS
(
SELECT     S_U.ID
FROM       dbo.Invoice_Lines S_U WITH (NOLOCK)
INNER JOIN #Upsert U WITH (NOLOCK)
           ON S_U.ID = U.ID
WHERE      S_U.VersionStamp <> U.VersionStamp
)
UPDATE     DST_U
SET        DST_U.InvoiceID = SRC_U.InvoiceID
           ,DST_U.SalePrice = SRC_U.SalePrice
           ,DST_U.StockID = SRC_U.StockID
           ,DST_U.VersionStamp = SRC_U.VersionStamp
FROM       dbo.Invoice_Lines DST_U
INNER JOIN ADAMREMOTE.CarSales.dbo.Invoice_Lines SRC_U
           ON SRC_U.ID = DST_U.ID
INNER JOIN Updates_CTE CTE_U
           ON DST_U.ID = CTE_U.ID

-- DELETES
DELETE FROM dbo.Invoice_Lines
WHERE      ID NOT IN (SELECT ID FROM #Upsert)
How It Works
SSIS is not the only tool that can benefit, in certain circumstances, from an initial detection of delta data before
actually carrying out the required inserts and updates. If you have a linked server connection to your source
data, then you can apply the same approach using T-SQL. You can use a process similar to the one described in the
previous recipe to:
• Collect the primary key and delta-flag columns from the source table.
• Compare these with the destination data table.
• Return to the source to collect new and modified data.
The preceding code snippet applies this logic to detect and upsert/delete delta data only. It is a “pull”
process run from the destination server. This method, of course, uses more round trips to the source server, but
it gives you greater control, is less of a “black box” than the MERGE statement, and may well be easier to
understand. Another point in its favor is that only one temporary table is required on the destination server.
I must nonetheless emphasize that any potential speed gains depend on the infrastructure (the configurations
of the source and destination servers, as well as the network links) and, crucially, on the nature of any indexes
on the source and destination tables. So you should test this approach and compare it with a simple MERGE before
assuming that it is right for you. This process works best when your testing shows that the selective transfer of
data between the two systems is faster than a simple MERGE.
It works like this: first, the primary key and delta detection flag columns are transferred from the source
server to the temporary #Upsert table on the destination server. Then, any new records are identified (by using a
plain old WHERE NOT IN or a similar clause) by comparing the primary key column in the source and destination
tables, and the full rows are fetched from the source server and inserted. Next, any updates are detected by
comparing the delta detection flag column (VersionStamp in this example); the modified rows are fetched from the
source server and applied at the destination. Finally, any deletes (soft or hard) can be applied using another
WHERE NOT IN or similar clause. Remember that you can add indexes to the temporary table if this accelerates the
processing.
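As a minimal sketch of the indexing suggestion above, the following adds an index to #Upsert immediately after it is loaded, and shows one possible "similar clause" for detecting new records. It assumes that ID is unique in the source table; the index name IX_Upsert_ID is hypothetical and is not part of the recipe's script.

```sql
-- Assumption: ID is the primary key at the source, so a unique index is safe.
-- IX_Upsert_ID is a hypothetical name chosen for this sketch.
CREATE UNIQUE CLUSTERED INDEX IX_Upsert_ID ON #Upsert (ID);

-- New records: IDs present in #Upsert but absent from the destination table.
-- NOT EXISTS is used here in place of NOT IN.
SELECT  U.ID
FROM    #Upsert U
WHERE   NOT EXISTS (
            SELECT 1
            FROM   dbo.Invoice_Lines D
            WHERE  D.ID = U.ID
        );
```

Because ID is a non-nullable key column, NOT EXISTS and NOT IN return the same rows here. Whether the index pays for itself depends on the table sizes, so it is worth timing the load with and without it.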
■ Note If you do not have the indexes in place that make MERGE so efficient, then you may find that MERGE is
slower than alternative solutions. Inevitably, this will depend on each set of circumstances, and so you have to test
any possible solutions in your specific environment.
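For the comparison that this recipe recommends, a baseline MERGE over the same linked server might look like the following sketch. It is not taken from the recipe's script. It assumes the same column list as the code in the Solution and uses SET STATISTICS TIME only as one simple way of getting timings.

```sql
-- Sketch only: a single MERGE that pulls every source row across the linked server.
-- Run this against a test copy of the destination and compare the timings
-- with those of the selective approach shown in the Solution.
SET STATISTICS TIME ON;

MERGE dbo.Invoice_Lines AS DST
USING ADAMREMOTE.CarSales.dbo.Invoice_Lines AS SRC
      ON DST.ID = SRC.ID
-- Rows present on both sides whose version flag differs
WHEN MATCHED AND DST.VersionStamp <> SRC.VersionStamp THEN
    UPDATE SET DST.InvoiceID    = SRC.InvoiceID
              ,DST.SalePrice    = SRC.SalePrice
              ,DST.StockID      = SRC.StockID
              ,DST.VersionStamp = SRC.VersionStamp
-- Rows that exist only at the source
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ID, InvoiceID, SalePrice, StockID, VersionStamp)
    VALUES (SRC.ID, SRC.InvoiceID, SRC.SalePrice, SRC.StockID, SRC.VersionStamp)
-- Rows that exist only at the destination
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

SET STATISTICS TIME OFF;
```

The MERGE target must be a local table, which is why this sketch, like the recipe, runs on the destination server and treats the linked server table as the source.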
11-9. Detecting, Logging, and Loading Delta Data
Problem
You want to handle delta data upserts while being able to track the changes and have updates performed as
required.
Solution
Use triggers and trigger metadata tables to detect the data changes and SSIS to perform periodic loads into the
destination database.
1. Create two tables in the destination database (CarSales_Staging), dropping any
previous versions that you have already created, to hold the data used for updates
and deletes. For this example, they are as follows
(C:\SQL2012DIRecipes\CH11\TriggerTablesSource.Sql):
CREATE TABLE CarSales_Staging.dbo.Invoice_Lines_Updates
(
    ID INT NOT NULL,
    InvoiceID INT NULL,
    StockID INT NULL,
    SalePrice NUMERIC(18, 2) NULL,
    CONSTRAINT PK_Invoice_Lines_Updates PRIMARY KEY CLUSTERED
    (
        ID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) ;
GO

CREATE TABLE CarSales_Staging.dbo.Invoice_Lines_Deletes
(
    ID INT NOT NULL,
    CONSTRAINT PK_Invoice_Lines_Deletes PRIMARY KEY CLUSTERED
    (
        ID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) ;
GO
2. Create a tracking table in the source database. For the sake of simplicity, I will
presume that the source data has a single primary key column, and that this column
is an INT data type (C:\SQL2012DIRecipes\CH11\TriggerTablesDestination.Sql).
CREATE TABLE CarSales.dbo.DeltaTracking
(
    DeltaID BIGINT IDENTITY(1,1) NOT NULL,
    ObjectName NVARCHAR(128) NULL,
    RecordID BIGINT NULL,
    DeltaOperation CHAR(1) NULL,
    DateAdded DATETIME NULL DEFAULT (getdate()),
    CONSTRAINT PK_DeltaTracking PRIMARY KEY CLUSTERED
    (
        DeltaID ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) ;
GO
3. Again, for the sake of simplicity, here is a (fairly) generic tracking trigger that can be
added to any table in the source database that has a single primary key column of
an INT data type (C:\SQL2012DIRecipes\CH11\tr_DeltaTracking.Sql):
-- CREATE TRIGGER does not accept a database-name prefix,
-- so switch to the source database first.
USE CarSales
GO

CREATE TRIGGER dbo.tr_DeltaTracking
ON dbo.Invoice_Lines FOR INSERT, UPDATE, DELETE
AS
DECLARE @InsertedCount BIGINT
DECLARE @DeletedCount BIGINT
DECLARE @ObjectName NVARCHAR(128)

SELECT @InsertedCount = COUNT(*) FROM INSERTED
SELECT @DeletedCount = COUNT(*) FROM DELETED

-- Find the name of the table that this trigger is attached to
SELECT     @ObjectName = OBJECT_NAME(parent_id)
FROM       sys.triggers
WHERE      parent_class_desc = 'OBJECT_OR_COLUMN'
           AND object_id = @@PROCID

-- Inserts
IF @InsertedCount > 0 AND @DeletedCount = 0
BEGIN
INSERT INTO dbo.DeltaTracking (RecordID, ObjectName, DeltaOperation)
SELECT
ID
,@ObjectName AS ObjectName
,'I' AS DeltaOperation
FROM INSERTED
END

-- Deletes
IF @InsertedCount = 0 AND @DeletedCount > 0
BEGIN
INSERT INTO dbo.DeltaTracking (RecordID, ObjectName, DeltaOperation)