4-17. Sourcing Data from PostgreSQL


Chapter 4 ■ SQL Databases

Figure 4-44.  Configuring a PostgreSQL ODBC driver


Click Add. Select the PostgreSQL ODBC driver.


Click Finish. The ODBC Connector dialog box will appear. Configure the PostgreSQL ODBC driver so that it contains the elements shown in Figure 4-44. You will use your own specific parameters, of course.


Save your changes.

How It Works

There is an excellent and functional ODBC driver available to download from the PostgreSQL web site (www.postgresql.org), which, once configured, allows you to use SSIS, linked servers, OPENROWSET, and OPENQUERY without any difficulties. As was the case with DB2 and MySQL, no client software is required, which certainly simplifies matters.

So, to avoid fruitless repetition, and assuming that you have downloaded the latest version of this driver, all you

have to do is to create a DSN as described for MySQL—only configured as in Figure 4-44 (step 4 for DSN setup).

The configuration elements are largely self-explanatory, but nonetheless a concise description is given

in Table 4-2.




Table 4-2. PostgreSQL ODBC Configuration

Configuration Element    Description
Data Source              The name you choose to identify the ODBC DSN.
Database                 The database you are connecting to.
Server                   The server hosting the database.
User Name                The user with the required access rights to the data source.
Description              An optional description of the DSN.
SSL mode                 The SSL mode used. SSL disabled works perfectly fine.
Port                     The PostgreSQL port (here, the default is used).
Password                 The user password.

Once you have configured the DSN, you can use SSIS, linked servers, OPENROWSET, or OPENQUERY to import

data from PostgreSQL. This can be done exactly as described for MySQL—only using the DSN name that you gave

to the PostgreSQL driver, of course; so I will not repeat it all here, but refer you back to Recipe 4-9 for the details.
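
As a minimal sketch of the OPENROWSET route, assuming a system DSN that you have named PostgreSQL_CarSales and a PostgreSQL table called public.client (both names are hypothetical and do not come from the sample files), a query through the Microsoft OLE DB Provider for ODBC Drivers (MSDASQL) might look like this:

```sql
-- Assumptions: a system DSN named PostgreSQL_CarSales exists on the SQL Server
-- machine, and the PostgreSQL database contains a table named public.client.
-- Ad hoc distributed queries must be enabled on the SQL Server instance.
SELECT Pg.*
INTO   dbo.PostgreSQL_Client          -- hypothetical destination table
FROM   OPENROWSET('MSDASQL',
                  'DSN=PostgreSQL_CarSales',
                  'SELECT * FROM public.client') AS Pg;
```

The DSN has to be created with the ODBC administrator that matches the bitness (32-bit or 64-bit) of the SQL Server instance. If the user name and password are not stored in the DSN, they can be appended to the provider string as UID and PWD elements, with the usual caveats about clear-text passwords.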


This chapter demonstrated many ways of importing source data from SQL databases. The subject is broad, and—

as I wrote initially—not everything can be covered given the enormous scope of the subject. However, in this

chapter you saw how to download data from many of the major relational databases that are currently available.

Specifically, you saw examples involving the following:







Table 4-3 gives you my take on the various methods outlined in this chapter, listing their advantages and disadvantages.





Table 4-3. Comparison of the Methods Used in This Chapter

Technique                        Advantages                                      Disadvantages
OLEDB providers                  Generally faster.                               More complex to install and configure.
ODBC providers                   Easier to install and configure.                Generally slower.
SSIS                             Fast data load.                                 Longer time to set up a package.
Linked servers                   Easy to use for querying external databases.    Can be complex to configure.
                                                                                 Requires greater permissions.
SQL Server Migration Assistant   Rapid acquisition of source metadata.           Can only open entire tables and datasets.
                                 Can build up a data load project over time.

I have to be fair and warn you that cross-database data migration can truly be a minefield. All too often, a “minor” detail about the source database can hold you up for hours until the issue is resolved. Even more frequently, the source database DBA can prove reluctant to share their knowledge. Nevertheless, if you are patient, and above all do not rush things, then there is nothing to stop you from migrating source data from the databases that we have looked at, and/or connecting to them to define and create a truly heterogeneous data extraction and load process. All I am trying to say is that a little calm and some charm can be your greatest allies in this particular corner of the ETL battlefield.

You will notice that I have not discussed the SQL Server Import Wizard in this chapter. Quite simply, if you

have configured the provider and/or client for an external database, then using the SQL Server Import Wizard

is exactly as described in Recipe 1-2 (among others). All you have to do is use an OLEDB or ODBC connection

(depending on the source database and your specific preferences) as the data source. So I will not waste time

here on pointless repetition, and let you use the SQL Server Import Wizard if you so desire.



Chapter 5

SQL Server Sources

In this book, we look at importing data from several relational databases, and some of the ways in which they can be used as data sources for SQL Server. Yet there is one relational database we have not talked about, and that is SQL Server itself. So to continue our “data source tour,” let’s look at some of the ways in which you can transfer data between SQL Server databases.

This overview includes:

Ad hoc querying of external SQL Server instances

SQL Server linked servers

Bulk loading of data from one SQL Server database to another SQL Server database

Loading data from older versions of SQL Server into SQL Server 2005, 2008, and 2012

Backup using COPY_ONLY

Snapshot replication

Copying and pasting tiny amounts of data between databases

Loading data into SQL Server Azure

The choice of SQL Server as a data source may seem surprising, but in many enterprises, there are dozens—

if not hundreds—of SQL Servers, often running different versions of the Microsoft RDBMS. So you may well need

to know what your options are as far as getting data between versions of SQL Server is concerned.

It is not possible to discuss every aspect of data transfer between SQL Server versions, and there are inevitably certain technologies that fall outside the scope of this book. As my emphasis is on data integration, with a strong focus on ETL, I will not be examining any of the many High Availability options for SQL Server, nor anything touching on Service Broker. Neither will I mention the Import/Export Wizard, as this has been covered extensively in Chapters 1, 2, and 4.

I will look at migrating data to SQL Server Azure in this chapter, however. While the Microsoft database “in

the cloud” will doubtless replace onsite databases from many vendors, it seems most fruitful and comprehensible

to discuss it as a logical destination for data from Microsoft databases.

There are a few points to note as far as following the examples given in this chapter is concerned. First, you

will need another SQL Server 2012 instance for many of the examples. This can be either a separate networked

server or a second installation of SQL Server with a defined instance name. I am using the ADAM02\AdamRemote

instance in the examples. You will need to replace this with the server and possibly instance that you are using.

You will also need to deploy the CarSales example database onto this second instance. All examples presume

that you are using the CarSales database unless another database is indicated. Any sample files used in this

chapter are found in the C:\SQL2012DIRecipes\CH05 directory—assuming that you have downloaded the

samples from the book’s companion web site and installed them as described in Appendix B.



Chapter 5 ■ SQL Server Sources

5-1. Loading Ad Hoc Data from Other SQL Server Instances


Problem

You want to load data on an ad hoc basis from another SQL Server instance quickly and easily.


Solution

Use OPENROWSET and OPENDATASOURCE. These allow you to connect quickly to the source data and select any data subsets using T-SQL.

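This is the code for OPENROWSET (C:\SQL2012DIRecipes\CH05\OpenRowset.Sql):

-- The pass-through query (third argument) is reconstructed from context:
-- the source table is assumed to be CarSales.dbo.Client.
SELECT Lnk.ClientName
INTO   MyTable
FROM   OPENROWSET('SQLNCLI', 'Server=ADAM02;Trusted_Connection=yes;',
                  'SELECT ClientName
                   FROM CarSales.dbo.Client
                   ORDER BY ClientName') AS Lnk;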

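You can use OPENDATASOURCE like this (C:\SQL2012DIRecipes\CH05\OpenDataSource.Sql):

-- The provider name and Data Source value are taken from the other snippets in this recipe.
SELECT ID, ClientName, Town
INTO   MyTable
FROM   OPENDATASOURCE('SQLNCLI',
       'Data Source=ADAM02\AdamRemote;Integrated Security=SSPI').CarSales.dbo.Client;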

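In both cases, SELECT...INTO creates the destination table (MyTable) and loads the source data into it. Because both examples create MyTable, drop it or change the destination name before running the second one.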

How It Works

Should you want to import data as a “one-off,” then a quick connection to another SQL Server instance is, fortunately, extremely easy. There are, as for most external relational sources, two ways of establishing the connection. They are:

OPENROWSET: for occasional queries.

OPENDATASOURCE: for occasional queries that could evolve into linked servers one day.

The following are the relevant prerequisites.

An OLEDB provider must be installed on every external SQL Server instance. Admittedly,

this is normally part of an SQL Server installation, but I prefer to state the obvious.

An OLEDB provider must be installed on every SQL Server that is part of a cluster.




Ad hoc distributed queries must be enabled on the server from which you are running the query. This is done using the following T-SQL snippet:

-- Each sp_configure change only takes effect once RECONFIGURE has been run
EXECUTE master.dbo.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXECUTE master.dbo.sp_configure 'ad hoc distributed queries', 1;
RECONFIGURE;
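
To confirm that the setting is active, a quick check against the sys.configurations catalog view can help. This is only a sketch of one way to verify the option; it is not part of the recipe's sample files:

```sql
-- value is the configured setting; value_in_use is what the instance is
-- actually running with (it changes only after RECONFIGURE has been run).
SELECT name, value, value_in_use
FROM   sys.configurations
WHERE  name = N'Ad Hoc Distributed Queries';
```

If value_in_use still shows 0, the option has been set but not yet applied with RECONFIGURE.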




For an occasional ad hoc query, you may find that OPENROWSET is the easiest solution. To clarify, the

parameters for OPENROWSET are essentially in three parts:

The OLEDB provider

A provider string, containing server and security parameters

A T-SQL query to retrieve the data

As the provider string only specifies the server, you are probably best advised to use three-part notation to

specify the database, schema, and table or view from which you wish to source data. If the login defaults to that

database and schema, then of course, you will have no problems; but I advise this as a best practice habit. If you

wish to use SQL Server security rather than a trusted connection, then replace Trusted_Connection=yes with

logon and password details like this:
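
As a minimal sketch, assuming that Adam is a SQL Server login on the remote instance and using a placeholder for its password (the password value is purely illustrative), an OPENROWSET call with SQL Server security might look like this:

```sql
-- Assumptions: Adam is a SQL Server login on the remote instance, and
-- <YourPassword> is a placeholder that you replace with the real password.
SELECT Lnk.ClientName
FROM   OPENROWSET('SQLNCLI',
                  'Server=ADAM02;UID=Adam;PWD=<YourPassword>;',
                  'SELECT ClientName FROM CarSales.dbo.Client') AS Lnk;
```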


Note that the security information is all part of the second parameter, and the parameter elements are

separated by a semicolon. Also, at the risk of stating the obvious, leaving security information in clear text like

this is extremely risky. If you have no other choices, then you should consider wrapping the SELECT statement in

a stored procedure created using the WITH ENCRYPTION option, which hides the text of the stored procedure from

many—but not all—prying eyes. Alternatively, the stored procedure could reside on the remote server.

In this case, it would need to be created by the team that administers that server. A stored procedure is generally

the better option for security because you would not be passing details of your schema over the network.
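
The local wrapping approach can be sketched as follows. The procedure name is hypothetical, and the login and password are placeholders rather than values from the sample files:

```sql
-- Hypothetical procedure name; the login and password are placeholders.
CREATE PROCEDURE dbo.pr_LoadRemoteClients
WITH ENCRYPTION      -- obfuscates the procedure text stored in the database
AS
BEGIN
    SET NOCOUNT ON;

    SELECT Lnk.ClientName
    FROM   OPENROWSET('SQLNCLI',
                      'Server=ADAM02;UID=Adam;PWD=<YourPassword>;',
                      'SELECT ClientName FROM CarSales.dbo.Client') AS Lnk;
END;
GO

-- Callers run the procedure without ever seeing the connection details
EXECUTE dbo.pr_LoadRemoteClients;
```

Because the text of an encrypted procedure cannot be read back from the database, it is wise to keep the original script somewhere safe, such as a source control system.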

Remember that you are using pure T-SQL, and so can extend the SELECT statements (both the one passed to the external server and the one wrapping the OPENROWSET command) to include WHERE, ORDER BY, and GROUP BY clauses, as well as column aliases. These techniques are described in greater detail in Recipes 1-4 and 1-5.
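
To illustrate, here is a sketch that filters in the pass-through query and then aggregates, sorts, and aliases in the wrapping query. The filter and the alias names are purely illustrative:

```sql
-- The inner query runs on the remote server; the outer query runs locally.
-- The NULL filter and the column aliases are illustrative assumptions.
SELECT   Lnk.Town AS ClientTown,
         COUNT(*) AS NumberOfClients
FROM     OPENROWSET('SQLNCLI', 'Server=ADAM02;Trusted_Connection=yes;',
                    'SELECT ClientName, Town
                     FROM CarSales.dbo.Client
                     WHERE Town IS NOT NULL') AS Lnk
GROUP BY Lnk.Town
ORDER BY NumberOfClients DESC;
```

Filtering inside the pass-through query generally reduces the amount of data sent over the network, since only the matching rows leave the remote server.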

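If you are using OPENDATASOURCE, you can use SQL Server security, with all the caveats that leaving passwords in clear text implies. Here is a snippet to show it:

-- <YourPassword> is a placeholder; substitute the password for the Adam login
SELECT ID, ClientName, Town
FROM   OPENDATASOURCE('SQLNCLI',
       'Data Source=ADAM02\AdamRemote;User ID=Adam;Password=<YourPassword>').CarSales.dbo.Client;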


Hints, Tips, and Traps

If you suspect that an ad hoc query may have to become part of something more

permanent one day, then setting it up using OPENDATASOURCE allows you to make the

change to a linked server more easily. Its use of the SQL Server four-part notation allows

you to replace the SQL snippet with a linked server reference at a later date.
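
The switch described in this hint can be sketched as follows, assuming that you later define a linked server named after the remote instance. The linked server definition shown is a bare-bones assumption, and your security settings may well require more:

```sql
-- Ad hoc version, using OPENDATASOURCE as the first part of the four-part name
SELECT ID, ClientName, Town
FROM   OPENDATASOURCE('SQLNCLI',
       'Data Source=ADAM02\AdamRemote;Integrated Security=SSPI').CarSales.dbo.Client;
GO

-- Later: define the linked server once (this requires appropriate server permissions)
EXECUTE master.dbo.sp_addlinkedserver
        @server = N'ADAM02\AdamRemote',
        @srvproduct = N'SQL Server';
GO

-- Only the first part of the four-part name changes
SELECT ID, ClientName, Town
FROM   [ADAM02\AdamRemote].CarSales.dbo.Client;
```

Everything after the server part of the name (CarSales.dbo.Client) stays the same, which is what makes the later change so painless.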


