8-20. Recognize Characters in an Image (OCR)




Use COM Interop to access the features of Microsoft Office Document Imaging.

■ Note This recipe requires Microsoft Office 2007.

How It Works

The first step is to install Microsoft Office Document Imaging (MODI), which the Microsoft Office installer does not install by default. Run the Office installer, and select Microsoft Office Document Imaging from the Office Tools section, as shown in Figure 8-11.

Figure 8-11. Installing MODI




Once the Office feature has been installed, add a reference to your project for the Microsoft Office Document Imaging 12.0 Type Library entry, available under the COM tab, and import the MODI namespace into your class file. Because we are accessing MODI through COM, the API calls we have to make are a little awkward. The sequence for performing OCR follows:

1. Create a new instance of Document by calling new Document().

2. Load the image that you wish to process by calling the Create method on the Document instance from the previous step, passing in a string that contains the name of the image file. OCR can be performed on PNG, JPG, GIF, and TIFF files.

3. Call the OCR method on the Document instance.

4. Obtain the first element of the Images property from the Document instance, and from that Image instance, get the Layout by reading the Image.Layout property.

The Layout class is what we are trying to obtain: it represents the scanned content, and its members let us get information about the OCR results and access the words that have been scanned. The most important member of Layout is Words, a collection of Word instances, each of which represents a word scanned from the source image. Enumerate this collection to build the processed result. The Word class has two useful members. The most important is Text, which returns the string value of the scanned word. The second is RecognitionConfidence, which returns a value on a scale of 0 to 999 indicating how confident the OCR process was that it recognized the word correctly.
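One way to put RecognitionConfidence to work is to discard low-confidence words before assembling the result. The following is a minimal sketch, not part of the recipe: ScannedWord is a hypothetical stand-in for MODI.Word (so the code runs without Office installed), OcrResultFilter and JoinConfidentWords are illustrative names, and the threshold passed in is an arbitrary choice rather than a value recommended by MODI.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MODI.Word, exposing the two members discussed above.
public sealed class ScannedWord
{
    public ScannedWord(string text, int recognitionConfidence)
    {
        Text = text;
        RecognitionConfidence = recognitionConfidence;
    }

    // The string value of the scanned word.
    public string Text { get; private set; }

    // Confidence on the 0 to 999 scale described in the text.
    public int RecognitionConfidence { get; private set; }
}

public static class OcrResultFilter
{
    // Joins, in scan order, the words whose confidence is at least minConfidence.
    public static string JoinConfidentWords(IEnumerable<ScannedWord> words, int minConfidence)
    {
        return string.Join(" ",
            words.Where(w => w.RecognitionConfidence >= minConfidence)
                 .Select(w => w.Text));
    }
}
```

With real MODI objects, the same filter could be applied by projecting each Word in layout.Words to its Text and RecognitionConfidence values first.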

The Code

The following example loads an image called ocr.GIF (which we have included in the sample code for this chapter) and performs OCR on it. Each word found is printed out, along with the RecognitionConfidence value.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MODI;

namespace Apress.VisualCSharpRecipes.Chapter08
{
    class Recipe08_20
    {
        static void Main(string[] args)
        {
            // Create the new document instance.
            Document myOCRDoc = new Document();

            // Load the sample file (assumed here to be in the working directory).
            myOCRDoc.Create("ocr.GIF");

            // Perform the OCR.
            myOCRDoc.OCR();

            // Get the processed document.
            Image image = (Image)myOCRDoc.Images[0];
            Layout layout = image.Layout;

            // Print out each word that has been found.
            foreach (Word word in layout.Words)
            {
                Console.WriteLine("Word: {0} Confidence: {1}",
                    word.Text, word.RecognitionConfidence);
            }
        }
    }
}
Database Access

In the Microsoft .NET Framework, access to a wide variety of data sources is enabled through a group of classes collectively named Microsoft ADO.NET. Each type of data source is supported through the provision of a data provider. Each data provider contains a set of classes that not only implement a standard set of interfaces (defined in the System.Data namespace), but also provide functionality unique to the data source they support. These classes include representations of connections, commands, parameters, data adapters, and data readers through which you interact with a data source.

Table 9-1 lists the data providers included as standard with the .NET Framework.

Table 9-1. .NET Framework Data Provider Implementations

| Data Provider | Description |
| --- | --- |
| .NET Framework Data Provider for ODBC | Provides connectivity (via the native ODBC Driver Manager) to any data source that implements an ODBC interface. This includes Microsoft SQL Server, Oracle, and Microsoft Access databases. Data provider classes are contained in the System.Data.Odbc namespace and have the prefix Odbc. |
| .NET Framework Data Provider for OLE DB | Provides connectivity (via COM Interop) to any data source that implements an OLE DB interface. This includes Microsoft SQL Server, MSDE, Oracle, and Jet databases. Data provider classes are contained in the System.Data.OleDb namespace and have the prefix OleDb. |
| .NET Framework Data Provider for Oracle | Provides optimized connectivity to Oracle databases via Oracle client software version 8.1.7 or later. Data provider classes are contained in the System.Data.OracleClient namespace and have the prefix Oracle. |
| .NET Framework Data Provider for SQL Server | Provides optimized connectivity to Microsoft SQL Server version 7 and later (including MSDE) by communicating directly with the SQL Server data source, without the need to use ODBC or OLE DB. Data provider classes are contained in the System.Data.SqlClient namespace and have the prefix Sql. |
| .NET Compact Framework Data Provider for SQL Server CE | Provides connectivity to Microsoft SQL Server CE. Data provider classes are contained in the System.Data.SqlServerCe namespace and have the prefix SqlCe. |




■ Tip Where possible, the recipes in this chapter are programmed against the interfaces defined in the System.Data namespace. This approach makes it easier to apply the solutions to any database. Adopting this approach in your own code will make it more portable. However, the data provider classes that implement these interfaces often implement additional functionality specific to their own database. Generally, you must trade off portability against access to proprietary functionality when it comes to database code. Recipe 9-10 describes how you can use the System.Data.Common.DbProviderFactory and associated classes to write code not tied to a specific database implementation.
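To illustrate what programming against the System.Data interfaces looks like, here is a minimal sketch. PortableDataAccess and CreateTextCommand are hypothetical names introduced for illustration, and the sketch assumes the .NET Framework, where SqlConnection lives in System.Data.SqlClient. The helper itself refers only to IDbConnection and IDbCommand, so any provider's connection object could be passed to it.

```csharp
using System.Data;

public static class PortableDataAccess
{
    // Builds a text command using only the provider-neutral interfaces.
    // The concrete command type is chosen by whichever connection is passed in.
    public static IDbCommand CreateTextCommand(IDbConnection con, string sql)
    {
        IDbCommand cmd = con.CreateCommand();
        cmd.CommandType = CommandType.Text;
        cmd.CommandText = sql;
        return cmd;
    }
}
```

Passing a SqlConnection yields a SqlCommand and passing an OleDbConnection yields an OleDbCommand, yet the calling code never names either type. The cost is that provider-specific members are only reachable by casting back to the concrete class.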

This chapter describes some of the most commonly used aspects of ADO.NET. The recipes in this chapter describe how to do the following:

• Create, configure, open, and close database connections (recipe 9-1)

• Employ connection pooling to improve the performance and scalability of applications that use database connections (recipe 9-2)

• Create and securely store database connection strings (recipes 9-3 and 9-4)

• Execute SQL commands and stored procedures, and use parameters to improve their flexibility (recipes 9-5 and 9-6)

• Process the results returned by database queries as either a set of rows or as XML (recipes 9-7 and 9-8)

• Execute database operations asynchronously, allowing your main code to continue with other tasks while the database operation executes in the background (recipe 9-9)

• Write generic ADO.NET code that can be configured to work against any relational database for which a data provider is available (recipe 9-10)

• Discover all instances of SQL Server 2000 and SQL Server 2005 available on a network (recipe 9-11)

• Create an in-memory cache and programmatically create a DataSet (recipes 9-12 and 9-13)

• Perform LINQ database queries using a DataSet, and use entity types (recipes 9-14 and 9-15)

• Compare the results of LINQ queries (recipe 9-16)




■ Note Unless otherwise stated, the recipes in this chapter have been written to use SQL Server 2008 Express Edition running on the local machine and the Northwind sample database provided by Microsoft. To run the examples against your own database, ensure the Northwind sample is installed and update the recipe’s connection string to contain the name of your server instead of .\sqlexpress. You can obtain the script to set up the Northwind database from the Microsoft web site. On that site, search for the file named SQL2000SampleDb.msi to find links to where the file is available for download. The download includes a Readme file with instructions on how to run the installation script.

9-1. Connect to a Database


Problem

You need to open a connection to a database.

Solution

Create a connection object appropriate to the type of database to which you need to connect. All connection objects implement the System.Data.IDbConnection interface. Configure the connection object by setting its ConnectionString property. Open the connection by calling the connection object’s Open method.

How It Works

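The first step in database access is to open a connection to the database. The IDbConnection interface represents a database connection, and each data provider includes a unique implementation. Here is the list of IDbConnection implementations for the five standard data providers:

• System.Data.Odbc.OdbcConnection

• System.Data.OleDb.OleDbConnection

• System.Data.OracleClient.OracleConnection

• System.Data.SqlClient.SqlConnection

• System.Data.SqlServerCe.SqlCeConnection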




You configure a connection object using a connection string. A connection string is a set of semicolon-separated name/value pairs. You can supply a connection string either as a constructor argument or by setting a connection object’s ConnectionString property before opening the connection. Each connection class implementation requires that you provide different information in the connection string. Refer to the ConnectionString property documentation for each implementation to see the values you can specify. Possible settings include the following:

• The name of the target database server

• The name of the database to open initially

• Connection timeout values

• Connection-pooling behavior (see recipe 9-2)

• Authentication mechanisms to use when connecting to secured databases, including provision of a username and password if needed
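The name/value structure of a connection string can be explored without a database by using the provider-neutral System.Data.Common.DbConnectionStringBuilder class. The sketch below is illustrative only: ConnectionStringSketch and its two methods are hypothetical names, and the keys shown (Data Source, Database, Integrated Security) are the ones used for SQL Server in this recipe. Other providers expect different keys.

```csharp
using System;
using System.Data.Common;

public static class ConnectionStringSketch
{
    // Assembles a connection string from individual name/value pairs.
    public static string BuildNorthwindConnectionString(string server)
    {
        DbConnectionStringBuilder builder = new DbConnectionStringBuilder();
        builder["Data Source"] = server;           // target database server
        builder["Database"] = "Northwind";         // database to open initially
        builder["Integrated Security"] = "SSPI";   // integrated Windows security
        return builder.ConnectionString;
    }

    // Parses a connection string and returns one setting, or null if it is absent.
    // Key lookup is case-insensitive.
    public static string ReadSetting(string connectionString, string key)
    {
        DbConnectionStringBuilder builder = new DbConnectionStringBuilder();
        builder.ConnectionString = connectionString;

        object value;
        return builder.TryGetValue(key, out value) ? Convert.ToString(value) : null;
    }
}
```

A builder also takes care of quoting values that contain semicolons or other special characters, which is easy to get wrong when concatenating strings by hand.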

Once configured, call the connection object’s Open method to open the connection to the database. You can then use the connection object to execute commands against the data source (discussed in recipe 9-5). The properties of a connection object also allow you to retrieve information about the state of a connection and the settings used to open the connection. When you’re finished with a connection, you should always call its Close method to free the underlying database connection and system resources. IDbConnection extends System.IDisposable, meaning that each connection class implements the Dispose method. Dispose automatically calls Close, making the using statement a very clean and efficient way of using connection objects in your code.
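The relationship between using, Dispose, and Close can be demonstrated without a database. In the sketch below, DemoConnection is a hypothetical stand-in for a provider connection class: it only tracks whether it is open, and its Dispose method calls Close in the way the text describes for real connection classes. UsingSketch and its method names are likewise illustrative.

```csharp
using System;

// Hypothetical stand-in for a connection class; it holds no real resources.
public sealed class DemoConnection : IDisposable
{
    public bool IsOpen { get; private set; }

    public void Open() { IsOpen = true; }

    public void Close() { IsOpen = false; }

    // Mirrors the behavior described above: Dispose calls Close.
    public void Dispose() { Close(); }
}

public static class UsingSketch
{
    // Opens a connection inside a using block and returns it afterward so the
    // caller can observe that it was closed without an explicit Close call.
    public static DemoConnection OpenThenLeaveUsingBlock()
    {
        DemoConnection observed = null;
        using (DemoConnection con = new DemoConnection())
        {
            con.Open();
            observed = con;
        }
        return observed;
    }

    // Same as above, but an exception escapes the using block.
    // Dispose still runs before the exception reaches the catch clause.
    public static DemoConnection ThrowInsideUsingBlock()
    {
        DemoConnection observed = null;
        try
        {
            using (DemoConnection con = new DemoConnection())
            {
                con.Open();
                observed = con;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the sketch can return the connection for inspection.
        }
        return observed;
    }
}
```

In both methods the returned object reports IsOpen as false, which is the guarantee that makes using preferable to a hand-written Close call that an exception could skip.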

The Code

The following example demonstrates how to use both the SqlConnection and OleDbConnection classes to open a connection to a Microsoft SQL Server Express database running on the local machine that uses integrated Windows security:









using System.Data.SqlClient;

namespace Apress.VisualCSharpRecipes.Chapter09
{
    class Recipe09_01
    {
        public static void SqlConnectionExample()
        {
            // Create an empty SqlConnection object.
            using (SqlConnection con = new SqlConnection())
            {
                // Configure the SqlConnection object's connection string.
                con.ConnectionString =
                    @"Data Source=.\sqlexpress;" +  // local SQL Server instance
                    "Database=Northwind;" +         // the sample Northwind DB
                    "Integrated Security=SSPI";     // integrated Windows security

                // Open the database connection.
                con.Open();
            }
        }
    }
}

