1-16. Resolving Complex Data Migration Problems During an Access to SQL Server Upgrade


CHAPTER 1 ■ Sourcing Data from MS Office Applications



Figure 1-29.  SSMA report creation



Use SSMA to Apply Custom Data Type Mapping for a Project

Creating custom data type mapping in SSMA is also extremely easy, and can be done as follows:

1. Click Tools ➤ Project Settings.

2. In the Project Settings dialog box, click Type Mapping in the lower-left corner. The dialog box will look like Figure 1-30.






Figure 1-30.  Source and target data type mappings

3. Click the type mapping pair that you wish to modify.

4. Click Edit. The dialog box shown in Figure 1-31 appears.



Figure 1-31.  SSMA type mapping






5. Select a different target type and, if necessary, any other attributes.

6. Click OK twice.



Use SSMA to Create T-SQL Scripts of Destination Tables

If you need the DDL for destination tables, SSMA can create it for you, as follows:

1. Click Create Schema to generate the T-SQL necessary to create a table. To use this code, click the SQL tab in the bottom-right pane. Scroll down to the bottom of the SQL code, and you find both the table DROP and CREATE scripts. These can be copied into SQL Server Management Studio and tweaked to suit your requirements.



Use SSMA to Find All the Access Databases on Any Drives to Which Your Computer Is Connected

SSMA can trawl through all the disks to which your computer is connected and display all the Access databases it finds. This is done in the following way:

1. Click File ➤ Find databases. The Find Databases dialog box appears, as shown in Figure 1-32.



Figure 1-32. SSMA Find Database Wizard






2. Browse for the drive and/or directory you wish to search. Click Add. Repeat this step as many times as there are paths you wish to search.

3. Add all or part of the file name to search for (if required).



4. Select date ranges and file sizes (if required).



5. Click Scan. SSMA finds all Access files corresponding to your search criteria. They will appear in the dialog box, where you can select files to add by Control-clicking.

6. Click Next to display the list of files to verify, and then click Finish to add the selected files to the Access Metadata Explorer pane.



How It Works

SSMA is a very powerful tool, but also very flexible in the many ways it can help you upgrade your Access

databases to SQL Server. As we have seen in the mini-recipes, these include modifying SQL Server tables based

on the metadata in SSMA, migrating data for a single table, creating a report on a table or tables, applying

custom data type mapping for a project, using SSMA for T-SQL scripting of destination tables, and finding Access

databases.

As you have seen, SSMA creates SQL Server tables that are, in essence, the source Access tables that have

been adapted to SQL Server data types. However, you are not obliged to accept the data types or field lengths

proposed by SSMA, and you can modify them to your heart’s content. Be warned that doing this causes the old

version of the table to be dropped and re-created, and any data will have to be reloaded. SSMA converts Access

data according to a tried and tested mapping scheme. However, if you prefer, you can alter this mapping for an

entire project.

One of SSMA’s most valuable functions is its ability to forewarn you of impending problems. You are free to

heed or ignore—or even doubt the sagacity of—the report, but it can be useful to know, in advance, where the

data migration might fail.

There are several known migration issues; however, most of these do not apply to simple data migration, and

are only a problem in the case of a full-blown upgrade to an SQL Server back-end with an Access front-end.

Interestingly, many of the potential problems are the same as those encountered when using the Access

Upsizing Wizard, so first I suggest that you look at the hints given earlier in Recipe 1-10—specifically those

concerning effective ways to prepare an Access database for a trouble-free upgrade.

You can also save yourself time and trouble by looking into a few classic problem areas—shown in Table 1-5.



Table 1-5.  SSMA Problem Areas



Problem | Solution

Access object names can be SQL Server keywords. | Access and SQL Server have different reserved keywords, so you need to double-check that Access is not using a reserved word that causes SQL Server to throw a fit.

Field and table names | Access field and table names can contain characters not generally considered acceptable for SQL Server. You should modify the T-SQL generated by SSMA and then use it instead to create the destination tables.

Dates | The SQL Server DATETIME type accepts dates in the range of 1 Jan 1753 to 31 Dec 9999 only. However, Access accepts dates in the range of 1 Jan 100 to 31 Dec 9999. Map Access date fields to DATETIME2 fields in SQL Server 2012.
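The date-range problem can be checked before migration. The following is a minimal Python sketch, not part of SSMA, that assumes the Access date values have already been extracted into Python date objects by some other means. The function name suggest_sql_date_type is hypothetical. It only tests the lower bound of the DATETIME range given above.

```python
from datetime import date

# Lower bound of the SQL Server DATETIME type (1 Jan 1753).
SQL_DATETIME_MIN = date(1753, 1, 1)


def suggest_sql_date_type(values):
    """Suggest a target type for a column of extracted Access dates.

    Hypothetical helper: returns 'DATETIME' when every non-null value
    fits the DATETIME range, otherwise 'DATETIME2'.
    """
    if all(v is None or v >= SQL_DATETIME_MIN for v in values):
        return "DATETIME"
    return "DATETIME2"


# A column containing a seventeenth-century date needs DATETIME2.
modern_column = [date(2012, 7, 4), None]
historic_column = [date(2012, 7, 4), date(1652, 3, 15)]
print(suggest_sql_date_type(modern_column))
print(suggest_sql_date_type(historic_column))
```

A check like this is only a pre-flight aid; the SSMA report remains the place to look for the full list of conversion warnings.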






Remember that if you are faced with a potentially tricky Access database to load into SQL Server, you might find it easier to create a copy of the database, and then look at ironing out any problems in the copied Access database. This could include removing indexes and constraints, renaming tables and fields, and so on. In my experience, this is often the fastest way to get your source data into SQL Server.



Hints, Tips, and Traps





• You cannot alter the field names, as one important function of SSMA is to migrate data and metadata to SQL Server while leaving Access as a data interface. Altering field names would make that interface unworkable.

• If you really have to alter field names, and wish to use SSMA as a metadata scripting tool, then you have to use the T-SQL scripts generated by SSMA, and tweak them before running them in SSMS. Yes, you have to copy and paste each one, individually, into SSMS.

• Although you must remember to click Apply to complete all the modifications for a table, you can synchronize with the database just once. Just remember to right-click the database name in the SQL Server metadata window to do this.

• You see a warning dialog box if the table into which you are copying data is not empty. If you continue, the table will be truncated.

• Each report is saved to disk in a subdirectory named “report” in the project directory.

• You can also add and remove type mappings, but I would not advise this in a real-world scenario.

• Changing type mappings in the Default Project Settings alters the data type mappings for all future SSMA projects.

• To create tables without indexes or primary keys, select only the code for the CREATE statement, and do not copy the ALTER TABLE statement.

• SSMA cannot migrate databases that use workgroup protection. You have to remove workgroup protection before using SSMA. I suggest that you consult the Microsoft web site for the best way of doing this.



Summary

This chapter has contained many ways of getting Office data into SQL Server. Some of them may seem to be

similar, duplicate methods, or simply plain abstruse. So to give you a clearer overview, Table 1-6 shows my take

on the various methods, and their advantages and disadvantages.






Table 1-6.  Techniques Suggested in This Chapter



Technique | Advantages | Disadvantages

OPENROWSET | Uses pass-through queries. | Can be tricky to set up.

OPENDATASOURCE | Allows source data to be manipulated using T-SQL. | Can be tricky to set up.

Linked Server | Allows source data to be manipulated using T-SQL. | Can be difficult to set up. Frequently very slow in practice.

Import and Export Wizard | Easy to use. Can generate an SSIS package. | Limited conversion options.

SSIS | Full data conversion, modification, and manipulation. Incremental data loads. | Longer to implement. Cannot create all destination objects at once.

SSMA | Iterative development. Extensive options. Can create all destination objects at once. | Limited conversion options. No incremental data loads. No partial data selection.

Access Upsizing Wizard | Easy to use. Can create all destination objects at once. | Very slow. Limited conversion options. No incremental data loads.



Now, which solution is best suited to a particular challenge depends on many factors, such as the time

you have available to develop a solution, the need for resilience and logging—or quite simply the nature of the

requirement. You will probably not try to develop an earth-shattering SSIS package when your requirement

is to carry out a one-off load of a few records from Excel. Conversely, automating the Import/Export Wizard is

impossible, so you will not want to use it for daily data loads.

When you have a regular data load to perform, the choice of data load technique can get trickier. One of the

T-SQL-based solutions (OPENROWSET, OPENDATASOURCE, or a linked server) can simplify deployment considerably.

In these circumstances, you will, in all probability, merely copy a stored procedure over to a production server

and/or execute some simple T-SQL snippets. If, however, you want to be able to track all the details of a data load,

then SSIS with its myriad logging possibilities (explained in the final chapter of this book) is doubtless the tool

to choose.

So “it depends” is probably the answer to the question “which approach should I take?”. The main thing

to remember from this chapter is that there are a variety of tools available and you need to be aware of their

possibilities and limitations. This way, you can choose and apply the most appropriate technique in your specific

circumstances to obtain the best solution to your specific challenge.






Chapter 2



Flat File Data Sources

For more years than most of us care to remember, the single most common source of external data facing an SQL

Server developer or DBA was a text file. The inevitable consequence of the near ubiquity of the “flat” file is the minor

ecosystem of tools and techniques that has grown over time to help us all load text file data sources into SQL Server.

Faced with such a plethora of solutions, the main approaches that I examine in this chapter are the following:





• The Import/Export Wizard

• SSIS

• OPENROWSET and OPENDATASOURCE

• BULK INSERT

• Linked servers

• BCP



All have their advantages and drawbacks, as you would expect. The aim of this chapter is to introduce you to

the uses and usefulness of each, and hopefully to explain how and when each can and probably should be used.

Then, as text files are often not as simple as they could be, I will outline a few techniques to handle some of the

trickier problems that they can pose.

However, before getting into the nitty-gritty of importing data from text files (or flat files—I use the terms

interchangeably), we need to clarify a few basic concepts. First among these: are you dealing with a real CSV file,

and does it matter? It is important to clear this up, as many text files are described as CSV files when this is simply

not the case.

Assuming that the text file that you are receiving does not contain multiple, differing types of records in

a single file, you are probably looking at a delimited file, where the data is “tabular” but the “columns” are

separated by a specific character and the “rows” by another (nonprinting) character. Depending on how such a

file is laid out, it may be considered to be a CSV file.

There is much discussion as to when a text file is a CSV (comma-separated values) file or not. Thanks to

some sterling work a few years ago, there is now a specification, in the form of an RFC (Request for Comments), of

what a CSV file is (see http://tools.ietf.org/html/rfc4180 for details). I suggest considering CSV as a subset

specification of what most flat files are, which is a delimited data format using column separators and record

separators. The RFC specification essentially boils down to the following:





• Each record is located on a separate line, delimited by a line break (CR/LF).

• The last record in the file may or may not have a final line break.

• There may be an optional header line appearing as the first line of the file with the same format as normal record lines. This header contains names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file.

• Within the header and each record, there may be one or more fields separated by commas.

• Each line should contain the same number of fields throughout the file.

• Spaces are considered part of a field and should not be ignored.

• The last field in the record must not be followed by a comma.

• Each field may or may not be enclosed in double quotes. If fields are not enclosed in double quotes, then no double quotes may appear inside the fields.

• Fields containing double quotes, commas, or line breaks (CR/LF) should be enclosed in double quotes.

• If double quotes are used to enclose fields, then a double quote appearing inside a field must be “escaped” by preceding it with another double quote.
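The quoting rules above are easier to see on a concrete file. The following is a minimal sketch using Python's standard csv module, configured to follow the RFC 4180 conventions just listed. The helper names (to_rfc4180, from_rfc4180) and the sample column names and values are hypothetical and are not taken from the book's sample files.

```python
import csv
import io


def to_rfc4180(rows):
    """Serialize rows as CSV text following the RFC 4180 conventions."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter=",",
        quotechar='"',
        doublequote=True,           # an embedded quote is written as two quotes
        quoting=csv.QUOTE_MINIMAL,  # quote only the fields that need it
        lineterminator="\r\n",      # CR/LF ends every record
    )
    writer.writerows(rows)
    return buffer.getvalue()


def from_rfc4180(text):
    """Parse CSV text back into a list of rows."""
    # newline="" leaves CR/LF untouched so quoted line breaks survive.
    return list(csv.reader(io.StringIO(text, newline="")))


sample_rows = [
    ["InvoiceNumber", "ClientName", "Comments"],   # optional header line
    ["A0001", "Smith, John", 'Said "urgent"'],     # comma and quotes in fields
    ["A0002", " padded ", "two\r\nlines"],         # spaces and a line break
]
csv_text = to_rfc4180(sample_rows)
print(csv_text)
```

Reading the text back with from_rfc4180 returns the original rows, including the leading and trailing spaces and the embedded line break, which is the behavior the specification asks for.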



What precedes is the theory. Practice—as you have probably discovered—can be quite another kettle of fish.

As the RFC states, there are “considerable differences among implementations.” Indeed, many flat-file transfers

are called CSV when they are TSV (tab separated), PSV (pipe separated)—or indeed anything-under-the-sun

separated. So my take is that what matters above all else in real-world data transfer scenarios for delimited data

transfer is consistency. It is simply not important whether the file matches the RFC specification or not. Just about

all the tools that SQL Server offers for flat file loads can handle (or be tweaked to handle) a consistent delimited

file format. So in essence, if you are requesting a CSV, TSV, PSV, or indeed any other form of essentially tabular

data file, you need to ask the following:





• Which character is used as a field separator in the source data?

• How is the end of a record indicated?

• How are these first two characters (the field separator and the end-of-record indicator) escaped when they appear inside a field?

• Are quotes used consistently (that is, either for all data in each file “column” or not at all)?



Once you know these things, then you can attempt to load a file, and try and deal with any difficulties that

it may present. In most cases, if you receive a CSV file that does respect the standard, then you could consider

yourself fortunate. If you are really lucky, you can specify the format. Should this be the case, then the four

questions that I just listed should serve as your basic guideline for specifying the source format.
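One way to make the answers to these four questions explicit is to record them in a small structure and use it to check a file for consistency before attempting a load. The following Python sketch is illustrative only: the FlatFileSpec class, the inconsistent_records function, and the sample data are all hypothetical, and the simple split on the record terminator assumes that the terminator never occurs inside a field.

```python
import csv
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlatFileSpec:
    """Hypothetical record of the agreed file format."""
    field_separator: str = ","          # question 1
    record_terminator: str = "\r\n"     # question 2
    escape_char: Optional[str] = None   # question 3 (None: no escape character)
    quote_char: Optional[str] = '"'     # question 4 (None: quotes are not used)


def inconsistent_records(text, spec):
    """Return (record_number, field_count) for every record whose field
    count differs from that of the first record."""
    # Assumption: the terminator never appears inside a field.
    records = [r for r in text.split(spec.record_terminator) if r]
    rows = list(csv.reader(
        records,
        delimiter=spec.field_separator,
        quotechar=spec.quote_char,
        escapechar=spec.escape_char,
        quoting=csv.QUOTE_MINIMAL if spec.quote_char else csv.QUOTE_NONE,
    ))
    if not rows:
        return []
    expected = len(rows[0])
    return [(n, len(row)) for n, row in enumerate(rows, start=1)
            if len(row) != expected]


# A pipe-separated, LF-terminated file without quotes; record 3 is malformed.
pipe_spec = FlatFileSpec(field_separator="|", record_terminator="\n",
                         quote_char=None)
pipe_text = "ID|Name\n1|Aston Martin\n2|Rolls|Royce\n"
print(inconsistent_records(pipe_text, pipe_spec))
```

A check of this kind does not replace the load tools discussed in this chapter; it simply surfaces inconsistent records early, when they are cheapest to query with the data supplier.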

The “classic” field separator is the comma (the “C” in CSV). However, you may frequently find tab separated,

pipe separated, and many other characters—even small strings of characters—used as field separators. If you

are dealing with data from continental Europe (not to mention other areas of the globe), you may need to use a

semicolon as the separator in order to allow the comma to remain as the decimal separator.
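As an illustration of the continental European case, the following Python sketch reads semicolon-separated records in which the comma is the decimal separator. It is a sketch under stated assumptions: the column name TotalAmount and the sample values are hypothetical, and the period is assumed to be used only as a thousands separator.

```python
import csv
from decimal import Decimal


def parse_semicolon_rows(lines, amount_column="TotalAmount"):
    """Read semicolon-separated lines and convert a decimal-comma amount.

    Assumes the first line is a header and that '.' appears in the
    amount only as a thousands separator.
    """
    reader = csv.reader(lines, delimiter=";")
    header = next(reader)
    rows = []
    for record in reader:
        row = dict(zip(header, record))
        raw_amount = row[amount_column]
        # "1.250,75" -> "1250,75" -> "1250.75"
        row[amount_column] = Decimal(raw_amount.replace(".", "").replace(",", "."))
        rows.append(row)
    return rows


european_lines = [
    "InvoiceNumber;TotalAmount",
    "A0001;1.250,75",
    "A0002;99,5",
]
for row in parse_semicolon_rows(european_lines):
    print(row["InvoiceNumber"], row["TotalAmount"])
```

Decimal is used rather than float so that monetary values are not subject to binary rounding before they reach SQL Server.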

The CSV specification defines the end-of-record indicator as the CR/LF character pair. You could also find that either the carriage return (CR) or the line feed (LF) is used on its own.
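If you do not know which record terminator a file uses, you can count the candidates in a sample of its contents. The following Python sketch shows one simple way of doing this. The function name detect_record_terminator is hypothetical, and the approach assumes that line breaks inside fields are rare enough not to distort the counts.

```python
def detect_record_terminator(sample):
    """Return the most frequent of CR/LF, CR, and LF in a text sample.

    A CR/LF pair is counted once as a pair, not as a separate CR and LF.
    If the counts are tied (for instance, in an empty sample), CR/LF
    is returned.
    """
    crlf_count = sample.count("\r\n")
    cr_count = sample.count("\r") - crlf_count   # carriage returns on their own
    lf_count = sample.count("\n") - crlf_count   # line feeds on their own
    counts = {"\r\n": crlf_count, "\r": cr_count, "\n": lf_count}
    return max(counts, key=counts.get)


print(repr(detect_record_terminator("1,Aston Martin\r\n2,Rolls Royce\r\n")))
print(repr(detect_record_terminator("1,Aston Martin\n2,Rolls Royce\n")))
print(repr(detect_record_terminator("1,Aston Martin\r2,Rolls Royce\r")))
```

When reading a real file for this purpose in Python, open it with newline="" so that the line endings reach the function unaltered.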

This is where, in practice, things can get sticky. You might find that a field separator is escaped by a

backslash (\) or that the whole field is enclosed in quotes, with all quotes in the body of a field doubled. This is

often the single trickiest aspect of flat-file loads. Various techniques to deal with this are described in this chapter.
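The two escaping conventions just mentioned can be contrasted in a few lines of Python. This is a sketch using the standard csv module, not one of the SQL Server techniques covered later in the chapter. The function names and the sample records are hypothetical.

```python
import csv


def parse_backslash_escaped(lines):
    """Fields are unquoted; a backslash escapes an embedded separator."""
    return list(csv.reader(lines, delimiter=",", escapechar="\\",
                           quoting=csv.QUOTE_NONE))


def parse_quote_doubled(lines):
    """Fields may be quoted; an embedded quote is written twice."""
    return list(csv.reader(lines, delimiter=",", quotechar='"',
                           doublequote=True))


backslash_lines = [r"Smith\, John,42"]
quoted_lines = ['"Smith, John","Said ""urgent""",42']

print(parse_backslash_escaped(backslash_lines))
print(parse_quote_doubled(quoted_lines))
```

Applying the wrong convention to a file usually shows up as a changed field count on the affected records, which is one reason to confirm the escaping rule with the data supplier before loading.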

Sometimes you might have to deal with text files that contain more than one type of record, and are not as

simply tabular as is a “classic” CSV file. These will inevitably require some custom processing, and are examined

at the end of this chapter. Fortunately, SQL Server 2012 has made considerable progress in handling text files

containing varying columns (or multiple column delimiters if you prefer to think of it that way). We will also see

this at the end of the chapter.

To follow the examples given in this chapter, you need to create the sample CarSales and CarSales_Staging

databases that are described in Appendix B. You will also need to download the sample files from the book’s

companion web site and place them in the C:\SQL2012DIRecipes\CH02\ directory on your SQL Server. I

advise you to drop and re-create the destination tables (if they exist already) between recipes to ensure a clean

destination structure.






2-1. Importing Data From a Text File

Problem

You want to import a flat file as quickly and as easily as possible.



Solution

Run the Import/Export Wizard.

When faced with a delimited text or CSV file, the classic way to begin is to try and import it using the Import/

Export Wizard. Since this wizard was covered in Chapter 1, I will be more succinct here. For a more detailed

description, please refer to Recipe 1-2.

1. Ensure that the correct table structure has been created in the destination database. In this example, it is the dbo.Invoices table in CarSales_Staging. See Appendix B for details.

2. In SSMS, right-click the destination database (CarSales_Staging in this example) and select Tasks ➤ Import Data. If the Welcome screen appears, click Next.

3. Select Flat File Source as the Data Source. The dialog box switches to offer all the flat file options. Browse to your source file or enter the full path and file name (C:\SQL2012DIRecipes\CH02\Invoices.Txt).

4. Specify that the column names are in the first row, and that the text qualifier is . Define the Header Row Delimiter as  and select the format as delimited. You should end up with a dialog box like Figure 2-1.






Figure 2-1.  Flat file sources in the Import/Export Wizard

5. Click Columns, which shows you the list of available columns. Here you can change specific column delimiters if you need to.

6. Click Advanced. This lets you define the data type for each column. You can also set the length, precision, or scale (depending on the data type), or leave the defaults, as shown in Figure 2-2.


