1-4. Specifying the Excel Data to Load During an Ad-Hoc Import




Solution

Use SQL Server’s OPENROWSET or OPENDATASOURCE function as part of a SELECT statement. This lets you use standard T-SQL to subset the source data. For example, you can run the following code snippets:

1. In the CarSales_Staging database, create a destination table named LuxuryCars defined as follows (C:\SQL2012DIRecipes\CH01\tblLuxuryCars.Sql):

CREATE TABLE dbo.LuxuryCars
(
    InventoryNumber int NULL,
    VehicleType nvarchar(50) NULL
) ;
GO



2. Enable ad hoc distributed queries, either by running the Facets/Surface Area Configuration tool (or the Surface Area Configuration tool directly in SQL Server 2005), or by running the following T-SQL (C:\SQL2012DIRecipes\CH01\AllowDistributedQueries.Sql):

EXECUTE master.dbo.sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXECUTE master.dbo.sp_configure 'ad hoc distributed queries', 1;
GO
RECONFIGURE;
GO



3. Run the following SQL snippet (C:\SQL2012DIRecipes\CH01\OpendatasourceInsertACE.Sql):

INSERT INTO CarSales_Staging.dbo.LuxuryCars (InventoryNumber, VehicleType)
SELECT CAST(ID AS INT) AS InventoryNumber, LEFT(Marque, 50) AS VehicleType
FROM OPENDATASOURCE(
    'Microsoft.ACE.OLEDB.12.0',
    'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xls;Extended Properties = Excel 12.0')...Stock$
WHERE MAKE LIKE '%royce%'
ORDER BY Marque;
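Before running step 3, it can be worth confirming that the setting from step 2 has taken effect. A minimal check, assuming SQL Server 2005 or later (where the sys.configurations catalog view is available), might look like this:

```sql
-- Confirm that ad hoc distributed queries are enabled.
-- value is the configured setting; value_in_use is what the running instance applies.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'Ad Hoc Distributed Queries';
```

Both value and value_in_use should show 1 once RECONFIGURE has run.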



How It Works

There are times when quick access to the data in an Excel worksheet is all you need. This could be because you

need to perform a quick SELECT...INTO or INSERT INTO...SELECT using Excel as the data source. In this case,

firing up SSIS—or even running the Import Wizard (see Recipe 1-2)—to load data can seem like overkill. This

is where judicious application of SQL Server’s OPENDATASOURCE and OPENROWSET commands as part of a SELECT

statement can be extremely useful.






CHAPTER 1 ■ Sourcing Data from MS Office Applications



Indeed, as you will see shortly, once you know how to connect to the source file, even quite complex T-SQL

SELECT statements can be used on Excel source data. And, as you are writing standard SQL commands, they can

be run from a query window or as part of a stored procedure. This is particularly useful when:





• You want to read the contents of an Excel worksheet, but don’t want to clutter up your database with extra tables of information.
• The data will be read infrequently.
• You know the file (workbook) and worksheet names, and have a good idea of the data structures. In other words, you can open the file to read it.
• You want to perform ad hoc querying, choosing the columns and filtering the data using standard SQL commands.



Without attempting to be exhaustive, there are some variations on this theme. I use the Jet driver and the ACE driver interchangeably. I use Excel worksheets in both 97–2003 and 2007–2010 formats because the techniques described work with all these formats. I am not adding INSERT INTO or SELECT...INTO code here, but presume that you will be selecting one or the other in a real-world scenario.
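As a rough sketch of how one of these queries could be wrapped for a real load, the following uses SELECT...INTO. The destination table name dbo.StockSample is hypothetical, and the query assumes the sample workbook path and the Stock worksheet used elsewhere in this recipe.

```sql
-- Hypothetical destination table: SELECT...INTO creates dbo.StockSample,
-- so the table must not already exist.
SELECT ID, Marque
INTO dbo.StockSample
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
     'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',
     'SELECT ID, Marque FROM [Stock$]');
```

To append to an existing table instead, drop the INTO clause and put an INSERT INTO in front of the SELECT, as in step 3 of the Solution.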



■ Note  As this is, after all, an ad hoc scenario, you could well have to run SSMS in "Administrator" mode, by right-clicking SQL Server Management Studio in the Start menu and selecting "Run as Administrator". This is because the user running SSMS must have read and write permissions on the TEMP directory used by the SQL Server startup account.

Assuming that you have a named range (TinyRange in the sample file), then you can return the data in the

range using T-SQL like this:

SELECT ID, Marque FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',

'Excel 8.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xls', TinyRange);

If the range does not contain column headers, then you will need to add the HDR = NO property to the T-SQL, as follows; otherwise, the first row is presumed to contain column headers. Without headers, the provider names the columns F1, F2, and so forth, so these are the names to select:

SELECT F1, F2 FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;HDR = NO;Database = C:\SQL2012DIRecipes\CH01\CarSales.xls', TinyRange);

If you know the Excel range references corresponding to the data that you want to return, then you can use

an SQL snippet like this:

SELECT ID, Marque FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',

'SELECT * FROM [Stock$A2:B3]');

You must remember to provide the worksheet as well as the range, as no default worksheet is presumed.

Similarly, remember to add HDR = NO if the range does not contain column headers.

As the previous snippet showed, you can pass an entire SELECT statement via the OLEDB driver to Excel. This

presents a whole range of possibilities, such as choosing individual columns. For example:

SELECT ID, Marque FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',

'SELECT ID, Marque FROM [Stock$A1:C3]');






Just as in a standard T-SQL statement, you can alias the columns returned. For example:

SELECT InventoryNumber,VehicleType FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',

'SELECT ID AS InventoryNumber, Marque AS VehicleType FROM [Stock$A2:C3]');

The “pass-through” query that you send to Excel can also sort the data that is returned. The following

example sorts by Marque:

SELECT ID, Marque FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',

'SELECT ID, Marque FROM [Stock$A2:C3] ORDER BY Marque');

Finally, if you want to add a WHERE clause, you can do so:

SELECT InventoryNumber,VehicleType FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',

'SELECT ID AS InventoryNumber, Marque AS VehicleType

FROM Stock$ WHERE MAKE LIKE ''%royce%'' ORDER BY Marque');

In the provider options, you need to check Supports 'Like' operator for such a filter to work. Note also that you must double the single quotes around the LIKE pattern, because the pass-through query is itself a T-SQL string literal.
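The quote doubling becomes easier to follow when the statement is built in stages. The following is only a sketch: OPENROWSET does not accept variables for its arguments, so a variable filter has to go through dynamic SQL, and each level of nesting doubles the quotes once more. The variable names are hypothetical, and the inline DECLARE initialization assumes SQL Server 2008 or later.

```sql
-- Hypothetical filter value; in practice this might be a procedure parameter.
DECLARE @Pattern nvarchar(50) = N'%royce%';

-- Level 1: the query that Excel receives. The pattern sits inside single quotes,
-- so any quote it contains is doubled once.
DECLARE @Inner nvarchar(500) =
      N'SELECT ID AS InventoryNumber, Marque AS VehicleType FROM [Stock$] '
    + N'WHERE MAKE LIKE ''' + REPLACE(@Pattern, N'''', N'''''') + N''' ORDER BY Marque';

-- Level 2: the T-SQL statement. The whole inner query becomes a string literal,
-- so every quote in it is doubled again.
DECLARE @Sql nvarchar(max) =
      N'SELECT InventoryNumber, VehicleType FROM OPENROWSET(''Microsoft.ACE.OLEDB.12.0'', '
    + N'''Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx'', '
    + N'''' + REPLACE(@Inner, N'''', N'''''') + N''');';

PRINT @Sql;                        -- inspect the generated statement first
-- EXECUTE sys.sp_executesql @Sql; -- then run it once it looks right
```

Printing the statement before executing it makes it easy to compare the generated text with the hand-written version above.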

You might have a source file without headers for the data. In this case, all you need to do is add HDR = NO; to the syntax. The OLEDB provider then names the columns F1, F2, and so forth, so it is probably best to use column aliases to give the output data greater readability. Because the header names are no longer available, any WHERE or ORDER BY clause in the pass-through query must also use these generated names. For example:

SELECT InventoryNumber,VehicleType FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0;HDR = NO;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',
'SELECT F1 AS InventoryNumber, F2 AS VehicleType FROM [Stock$A2:C3] ORDER BY F2');

HDR is not the only property that you might need to know about when importing Excel data. Table 1-2

describes your options. Understanding the IMEX (mixed data types) property is also useful in some cases.

Table 1-2.  Jet and ACE Extended Properties

Property Name   Description                                                          Examples
HDR             Specifies if the first row returned contains headers.                HDR = NO
IMEX            Allows for mixed data types to be imported inside a single column.   IMEX = 1



Extended properties do require further explanation. Here, HDR merely indicates to the driver whether your source data contains header rows. As the presumption (at least using the Jet and ACE drivers) is that there are header rows, setting this property to NO when there are no headers avoids not only having the first record appear as the column names, but also a potential mismatch of data types. It is worth noting that you do not need to specify the Excel file type (.xls/.xlsx/.xlsm/.xlsb) as the ACE driver will recognize the file type automatically.

IMEX is marginally trickier. Strictly speaking, it does not force the data in a column to be imported as text; it forces the mixed data type defined in the registry for this OLEDB driver to be used for columns containing mixed data types. As this registry entry is text by default, it nearly always brings such data in as text. It will not convert other columns to text. Depending on the driver (that is, when using the Jet driver in most cases), not setting IMEX = 1 can cause a load failure or return NULLs instead of numeric values in a column containing text and numbers.
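Both properties go into the provider string alongside the Excel version, separated by semicolons. A minimal sketch, assuming the sample workbook and the Stock worksheet used in this recipe:

```sql
-- HDR = YES states explicitly that the first row holds the column names.
-- IMEX = 1 asks the driver to apply the registry's ImportMixedTypes setting
-- (Text by default) to columns in which it detects mixed data types.
SELECT ID, Marque
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
     'Excel 12.0;HDR = YES;IMEX = 1;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx',
     'SELECT ID, Marque FROM [Stock$]');
```

With OPENDATASOURCE, a multi-part Extended Properties value has to be wrapped in double quotes inside the connection string, for example Extended Properties = "Excel 12.0;HDR = YES;IMEX = 1", so that the semicolons are not read as the end of the property.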






1-5. Planning for Future Use of a Linked Server

Problem

You want to import only a subset of data from an Excel spreadsheet, but you suspect that you will need to carry

out this operation repeatedly, and eventually migrate it to a linked server solution. You do not want to have to

rewrite everything further down the line.



Solution

Use SQL Server’s OPENDATASOURCE command as part of a SELECT statement. For example,

(C:\SQL2012DIRecipes\CH01\OpendatasourceSelect.Sql):

SELECT ID AS InventoryNumber, LEFT(Marque,20) AS VehicleType

INTO RollsRoyce

FROM OPENDATASOURCE(

'Microsoft.ACE.OLEDB.12.0',

'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xls;Extended Properties = Excel 8.0')...Stock$

WHERE MAKE LIKE '%royce%'

ORDER BY Marque;



How It Works

The OPENROWSET command is suited to ad hoc querying. However, you may be evaluating data connection

possibilities with a view to eventually using a linked server. In this case, you may prefer to use the

OPENDATASOURCE command as a kind of “halfway house” to linked servers (described in the next recipe). This sets

the scene for you to update your code to replace OPENDATASOURCE with a four-part linked server reference.
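To give an idea of how small the later change can be, here is a sketch of the linked server equivalent of the Solution query. The linked server name ExcelCarSales is hypothetical, and the parameter values simply mirror the OPENDATASOURCE arguments used above.

```sql
-- Hypothetical linked server name: ExcelCarSales.
EXECUTE master.dbo.sp_addlinkedserver
     @server     = N'ExcelCarSales',
     @srvproduct = N'Excel',
     @provider   = N'Microsoft.ACE.OLEDB.12.0',
     @datasrc    = N'C:\SQL2012DIRecipes\CH01\CarSales.xls',
     @provstr    = N'Excel 8.0';
GO

-- Only the FROM clause changes; the rest of the statement is carried over as is.
SELECT ID AS InventoryNumber, LEFT(Marque, 20) AS VehicleType
FROM ExcelCarSales...Stock$
WHERE MAKE LIKE '%royce%'
ORDER BY Marque;
```

Because the OPENDATASOURCE(...) call and the linked server name occupy the same position in the four-part name, the column list, WHERE clause, and ORDER BY clause need no rewriting.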

Inevitably, there are many variations on this particular theme (which only selects all the data from a source worksheet and uses only the ACE driver), so here are a few of them. As the objective is to import data into SQL Server, I will let you choose whether to include this code in either a SELECT...INTO or an INSERT INTO...SELECT statement. Of course, you can use the Jet driver if you prefer. If you are using Excel 2007/2010, you must set the extended properties in the T-SQL to Excel 12.0.

SELECT ID, Marque FROM OPENDATASOURCE(

'Microsoft.ACE.OLEDB.12.0',

'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xlsx;Extended Properties = Excel 12.0')...Stock$;

To select all the data in a named range, use the following T-SQL:

SELECT ID, Marque

FROM OPENDATASOURCE(

'Microsoft.ACE.OLEDB.12.0',

'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xls;Extended Properties = Excel 8.0')...TinyRange;

To select (and, if you wish, alias) columns in the Excel source data, use T-SQL like the following. Note that the aliasing is applied in the T-SQL itself, and is not part of a pass-through query.

SELECT ID AS InventoryNumber, Marque AS VehicleType

FROM OPENDATASOURCE(

'Microsoft.ACE.OLEDB.12.0',

'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xls;Extended Properties = Excel 8.0')...Stock$;






Finally, to use WHERE and ORDER BY when returning Excel data, merely extend the T-SQL like this:

SELECT ID AS InventoryNumber, Marque AS VehicleType

FROM OPENDATASOURCE(

'Microsoft.ACE.OLEDB.12.0',

'Data Source = C:\SQL2012DIRecipes\CH01\CarSales.xls;Extended Properties = Excel 8.0')...Stock$

WHERE MAKE LIKE '%royce%'

ORDER BY Marque;

In this case, the Excel file must not be password-protected. It is worth noting that OPENDATASOURCE only

works when the DisallowAdhocAccess registry option is explicitly set to 0 for the specified provider, and the Ad

Hoc Distributed Queries advanced configuration option is enabled as described in Recipe 1-3. OPENDATASOURCE

also expects the source data to resemble a table complete with header rows, so ensure that any named ranges

have a header row.

Whether using ACE for Office 2007 or for Office 2010, you must set the Excel version to 12.0, not 14.0 as the download page suggests. Also, if you are using the Jet driver when connecting to Excel (and Access), these approaches will not work in a 64-bit environment in SQL Server (2005–2012), even if the Excel format is 97–2003, because there is no 64-bit version of the Jet provider.

If you have to use a driver that causes problems when there are mixed data types in a column, then you can force the driver to scan a larger number of rows (the default is 8), or indeed the entire worksheet, to test for mixed data types. To do this, edit the following registry setting:

HKEY_LOCAL_MACHINE\Software\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows

Setting this value to a figure other than 8 scans that number of rows. Setting it to 0 scans the entire sheet. This, however, inevitably causes a severe performance hit.

Should you wish to alter the mixed data setting, it is in the following registry key for Office 2010:

HKEY_LOCAL_MACHINE\Software\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\ImportMixedTypes

The usual caveats apply to changing registry settings: back up your registry first, and be very careful!



Hints, Tips, and Traps





• An error message along the lines of “Msg 7314, Level 16, State 1, Line 2 The OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)" does not contain the table "Sheet1$".” means that either the table does not exist or the current user does not have permissions on that file or folder. It could also mean that you have not specified the right file and/or path.
• An error message such as “Msg 7399, Level 16, State 1, Line 4 The OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7303, Level 16, State 1, Line 4 Cannot initialize the data source object of OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "(null)".” could very well mean that the Excel workbook file is open, and so cannot be opened by SQL Server. All you have to do is close the Excel workbook. Alternatively, there could be a permissions problem: are you running SSMS as an Administrator?
• The Excel file must not be password-protected.
• If all you get back is a NULL value (with a column header of F1), then you probably have not specified the correct worksheet name.
• You cannot use UNC paths in ad hoc queries.
• For permissions on folders used by Jet, see http://support.microsoft.com/kb/296711/EN-US
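When an ad hoc query sits inside a longer script, it can help to trap errors like those listed above rather than let them stop the batch. The following is a sketch only, assuming SQL Server 2005 or later (for TRY...CATCH) and the sample workbook path used in these recipes:

```sql
-- Running the query through sp_executesql means that provider errors raised while
-- the statement is compiled can still be trapped by the outer CATCH block.
DECLARE @Sql nvarchar(max);
SET @Sql = N'SELECT ID, Marque FROM OPENROWSET(''Microsoft.ACE.OLEDB.12.0'',
     ''Excel 12.0;Database = C:\SQL2012DIRecipes\CH01\CarSales.xlsx'',
     ''SELECT ID, Marque FROM [Stock$]'');';

BEGIN TRY
    EXECUTE sys.sp_executesql @Sql;
END TRY
BEGIN CATCH
    -- Report the error number and text rather than failing the whole batch.
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

When the provider raises several messages together, as with 7399 followed by 7303, the ERROR_ functions may surface only one of them, so the Messages tab in SSMS remains the place to look for the full text.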


