11 Will the real Mr. Smith please stand up?


Personally identifiable data
What is this thing called personally identifiable data? In a nutshell, personally identifiable data is information that uniquely describes an individual. It’s the data captured in
the database that designates you as a purchaser of an order. It’s the data that can be
used to contact you or identify your location. It’s the information that you provide to
the bank teller who grants you access to funds in an account. It’s the information that
provides the means to obtain a license to drive.
An extensive, all-inclusive list of the data elements that are categorized as personally identifiable data can be challenging to obtain. It can vary from country to country
depending on the privacy laws that have been established by governing bodies and
industry regulators. There are, though, some data elements that are generally recognized. Here are a few examples:

Federal identification number
Driver’s license number
Date of birth
Individual’s full name, especially if unique, rare, or well known
Financial account numbers, such as credit card or banking account numbers
Fingerprints (and other biometric information)
DNA molecule information

To the person whose identity this data validates, it’s their passport to maneuver
through society. To the person fraudulently claiming the personally identifiable data
as their own, it can grant them access to another person’s funds or the ability to commit crimes under a stolen identity, presenting legal issues for the valid and innocent
holder of the identity.
The responsibility for a database that contains personally identifiable data is one
that exceeds the typical concerns of the latest backup or tuning the latest resource-gorging query. A DBA must take on the roles of a guardian, defender, and superhero.

Today’s superhero: the DBA
As a child I thoroughly enjoyed waking early on a Saturday morning, grabbing my
bowl of brightly colored rings floating in a sea of milk, and watching “The Justice
League.” This was a cartoon series that presented a dream team of superheroes collaborating to fight the forces of evil in the world. If the Justice League was assembled
today, I believe that it’d have a slightly different lineup. It would be one that would
include a superhero who represents the protection of the innocent from the diabolical activities of those who misuse personally identifiable data. This superhero would
be none other than the DBA.

Our superpowers
Every superhero possesses superpowers. Some may be able to bend steel with their bare
hands whereas others can run as fast as a lightning bolt. The DBA is no exception. As a
DBA, you truly do have superpowers that you must develop and maintain to protect personally
identifiable data. These superpowers are data conservation and ambassadorship.
DATA CONSERVATION

The first superpower that you possess is a deep understanding of the power and value
of data. Although other roles in a business may value data as something that’s consumed for the benefit of a project or periodic task, as a DBA you value data as a living
entity. It’s something to be conserved, nurtured, and protected. You’re in a natural
position to be concerned about your customer’s personally identifiable data and are
quick to question why it needs to be stored and how it’s going to be used. You’re literally the last line of defense when it comes to consumer data advocacy.
AMBASSADORSHIP

The second superpower that you possess is the magic combination of regulatory compliance and technical acumen. When you’re knowledgeable about governmental laws
in regard to data, industry requirements with which your business must comply, and
general information security best practices, you are indeed an asset. In a business that
has a department that focuses on compliance issues, you’re a partner in determining
the most effective and efficient solutions to protect the business from noncompliance.
For the IT department, you can interpret the compliance requirements into actionable tasks so that they can be successfully implemented.

Tools of the trade
Superpowers are nice, but there are times when a simple tool will suffice. Our favorite
superhero would never use his laser beam eyes to cook a can of pork and beans when
a microwave will do the job. The following are a handful of tools that are available to
you as a DBA.
RETENTION POLICIES

It’s easy for a business to claim the need for data to be perpetually available online.
The “You never know when you need it” anthem can be heard in boardrooms around
the world and that hypnotic rhythm of convenient on-demand access to data is hard to
resist. But this level of availability does come with a price. The cost of hardware and
personnel to support the volume of data increases over time. In addition, you must
take into account legal requirements, such as the Sarbanes–Oxley Act of 2002, that
define archival and destruction requirements for personally identifiable data in an
effort to protect customers and businesses.
If your business doesn’t have a retention policy in place, lead a collaborative effort
to develop a policy. Once the policy is in place, become familiar with it and participate
in its enforcement.
ROLE-BASED PERMISSIONS

Shakespeare once wrote, “All the world’s a stage, and all the men and women merely
players.” There are times when I wonder if he had databases in mind. Considering
that he lived before the word data was even a part of the common lexicon, I guess
that’s unlikely. Nonetheless, this statement certainly applies to anyone who desires
access to data. You have players that write and read data and you have players that simply read data. There are data elements you wish to disclose to all and data elements
you wish to disclose to a select few. Applying these levels of access individually can be
challenging.
Role-based permissions provide the ability to define data access levels to a given
collection of users, known as a role, in a simplified, consistent, and reliable manner. To
learn more about implementing security for SQL Server databases, refer to
http://msdn.microsoft.com/en-us/library/bb510418.aspx.
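
As a minimal sketch of the idea, the following T-SQL creates a role, grants it read access once, and then adds a user to it. The role name CustomerReader, the user ReportUser, and the table dbo.Customer are hypothetical names chosen for illustration, and the user is assumed to exist already in the database.

```sql
-- Hypothetical names: role CustomerReader, user ReportUser, table dbo.Customer.
CREATE ROLE CustomerReader AUTHORIZATION dbo;
GO
-- Grant read-only access once, at the role level.
GRANT SELECT ON OBJECT::dbo.Customer TO CustomerReader;
GO
-- Add an existing database user to the role (SQL Server 2005/2008 syntax).
EXEC sp_addrolemember @rolename = N'CustomerReader', @membername = N'ReportUser';
GO
```

Every user added to the role later inherits the same access, so the permission is defined in one place rather than once per user.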
DATA SEPARATION

Ask the question “Why does my refrigerator have a separate compartment for a
freezer?” and you’ll likely receive a glare of disdain followed by the obvious answer:
“To keep the items in the fridge from freezing.” When it comes to sensitive data and
nonsensitive data in your database, the same is true: through the separation of these
types of data you can more easily manage their access and integrity.
SQL Server offers database object schemas as a security and organizational feature
to database architecture. When you use this feature, you can efficiently manage role-based permissions to all database objects contained within a database object schema.
For example, say a database object schema called Customer contains a collection of
tables, views, and stored procedures. Another database object schema called Product
also contains a collection of tables, views, and stored procedures. Access to all the items
in the Customer database object schema can be granted or denied by managing the
permissions to the database object schema itself rather than all the individual objects.
Consider building a schema for personally identifiable data elements that’s distinct
from the ones used for standard data elements. To learn more about database object
schemas, visit http://msdn.microsoft.com/en-us/library/dd283095.aspx.
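
A separate schema for sensitive data might be sketched as follows. The schema name CustomerPII and the roles PIIReader and GeneralUser are hypothetical, and both roles are assumed to exist already.

```sql
-- Hypothetical names: schema CustomerPII, roles PIIReader and GeneralUser.
CREATE SCHEMA CustomerPII AUTHORIZATION dbo;
GO
-- One grant covers every table and view that lives in the schema.
GRANT SELECT ON SCHEMA::CustomerPII TO PIIReader;
-- Explicitly block a broader role from reading the sensitive schema.
DENY SELECT ON SCHEMA::CustomerPII TO GeneralUser;
GO
```

Objects created in the schema afterward are covered by the same permissions without any further grants.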
OBFUSCATION

There have been times, when asked for my federal identification number, that I’ve been
tempted to respond with the hexadecimal conversion of its value. It would successfully
transmit the requested data, but it wouldn’t be useful to anyone unless the recipient
held the key to its conversion to a recognizable format. This temptation is muted by the
great chance that the federal tax authorities may not appreciate my sense of humor.
Several forms of obfuscation are available to you to help protect personally identifiable data. Here are a few examples:




Encryption—This approach requires a hierarchy of symmetric keys, asymmetric
keys, or passwords to encrypt plain text to an unrecognizable format and
decrypt the encrypted value to its original plain-text value. To learn more about
encryption and how it’s implemented, refer to my book Protecting SQL Server
Data, available through Simple-Talk Publishing (http://www.simple-talk.com/
books/sql-books/).
Hashing—This approach is similar to encryption with the exception that once
the plain text has been hashed, the result can’t be reversed to recover the original value.
Instead, a comparison between the stored hash and the hash of a submitted value
confirms whether the two values match.
This approach is commonly used for storage of passwords in a database. You
can use the HASHBYTES function to hash a plain-text value. Here’s a sample of
how the function is used to obfuscate plain text:
SELECT HASHBYTES('SHA1','Plain Text Value');


Repeating character masking—This approach replaces a predefined number of
characters with a single repeating character. Credit card receipts commonly use
this method to hide the account number by replacing all the numbers, except
the last four digits, with an asterisk. Character masking is accomplished through
the use of the REPLICATE method. The following is a sample of how the REPLICATE method is used to obfuscate plain text:
SELECT REPLICATE('*',6) + RIGHT('0123456789',4);



Encoding—This approach uses a code to represent a plain-text value. This code
would be understood by a finite group of individuals. This technique is often
used to identify classification of diseases in health-care systems. For example,
the coding used to identify a benign colon polyp is 211.3.
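
The masking and hashing samples above can be extended slightly, as in the following sketch. The variable names and test values are hypothetical. The mask assumes the value has at least four characters, and the salt is an added assumption meant to keep identical plain-text values from producing identical hashes. SHA1 is used because it's the strongest algorithm HASHBYTES offers in SQL Server 2008; it's now widely regarded as weak, and the SHA2 algorithms are available from SQL Server 2012 on.

```sql
-- Hypothetical test value; assumes at least four characters.
DECLARE @AccountNumber varchar(19) = '4111111111111111';

-- Mask everything except the last four characters, whatever the length.
SELECT REPLICATE('*', LEN(@AccountNumber) - 4) + RIGHT(@AccountNumber, 4)
       AS MaskedAccount;   -- returns ************1111

-- Salted hash: store the salt next to the hash so the comparison can be repeated.
DECLARE @Salt varbinary(16) = CAST(NEWID() AS varbinary(16));
DECLARE @PlainText nvarchar(128) = N'Plain Text Value';

SELECT @Salt AS Salt,
       HASHBYTES('SHA1', @Salt + CAST(@PlainText AS varbinary(256))) AS SaltedHash;
```

To verify a submitted value later, hash it with the stored salt and compare the result to the stored hash.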

EDUCATION AND EVANGELISM

Among all the tools we’ve discussed there is one that’s a matter of passion and awareness. It’s a tool that doesn’t require a computer, a policy, or a single digit of code. It’s
the education and evangelism of data conservation.
Each of us has a digital footprint that grows on a daily basis. With each transaction,
purchase, or install of a cell phone application, your personally identifiable data is
being transmitted and stored. You can participate in educating the general public in
how to protect themselves. You can raise awareness among other data professionals
whether they’re DBAs, analysts, or data-hungry executives. You can help your
employer by creating and participating in training for employees who handle personally identifiable data on a daily basis. As a result, your influence is expanded in a way
that raises the confidence of your customers and creates an increasingly data-sensitive
environment.

Summary
There are many dimensions to your responsibilities as a DBA: optimizing the performance of a database, constructing a properly normalized data structure, or efficiently
and reliably moving data from one system to another. But in your daily tasks, don’t
lose sight of the fact that you’re a superhero in the world of information security.
The superpowers that you possess must be exercised to prevent atrophy. As you
gaze upon your collection of SQL Server manuals, consider expanding the bookshelf
to include books on the subject of data privacy and security. Through your study on
this subject, you’ll build and maintain an intuitive awareness and sensitivity about personally identifiable data.
The tools available to you are more effective when you’ve invested time in their
mastery. Exercise data retention best practices, permit access to personally identifiable
data on a need-to-know basis, and isolate sensitive data from standard data. In times of
storage and transmission, always use a method of obfuscation to ensure that data isn’t
intercepted and used by unintended parties. Most important of all, step away from the
computer and passionately educate.

About the author
John Magnabosco, Data Coach at Defender Direct in Indianapolis, is passionate about the security of sensitive data that we all
store in our databases. He’s the author of Protecting SQL Server
Data, published by Simple Talk Publishing. Additionally, he cofounded IndyPASS and IndyTechFest, and blogs regularly at Simple-Talk.com. In 2009 and 2010, John was honored to receive
the SQL Server MVP designation. When his attention is drawn to
recreation, he enjoys writing, listening to music, and tending to
his chile garden.

12 Build your own SQL Server 2008 performance dashboard
Pawel Potasinski

Have you ever seen any of the fantastic applications for monitoring performance of
database systems? How wonderful your life as a DBA would be if you could sit in
your chair, looking at a big screen full of information showing how your server and
databases are performing and, what’s even more important, how to react when an
issue occurs. This is why powerful tools for monitoring SQL Server databases are
popular and worth their price.
But what if the budget doesn’t allow you to buy any additional software and you
have to rely only on the information that SQL Server itself gives you? The purpose
of this chapter is to give you some ideas on how to use SQL Server features, like
Common Language Runtime (CLR), dynamic management views (DMVs), and SQL
Server Reporting Services (SSRS), to create your own performance dashboard that
will prove helpful in your everyday DBA work.
The approach proposed in this chapter can be implemented in SQL Server 2005
and later (where CLR, DMVs, and SSRS are available).

DMVs as the source of performance-related information
I think that Books Online, the official SQL Server documentation, contains the best
definition of the DMVs: “Dynamic management views and functions return server
state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance.” DMVs are a great source of performance-related information.
I won’t cover the DMVs in this chapter; instead, I encourage you to become
familiar with all of them on your own. To do so, you can run the query from listing
1 to view all the dynamic objects available in your SQL Server instance, check the
SQL Server documentation for more details, and play with them to see the information they provide. For a good reference on DMVs I recommend a series of articles
titled “A DMV a Day” by Glenn Berry (SQL Server MVP and author of chapter 31), available on Glenn’s blog:
http://sqlserverperformance.wordpress.com/2010/05/02/recap-of-april-2010-dmv-a-day-series/.
Listing 1 Query to list dynamic views and functions
SELECT N'sys.' + name AS [name],
       type_desc
FROM sys.system_objects
WHERE name LIKE N'dm[_]%'
ORDER BY name;

Later in this chapter, I use a DMV to present a sample solution for creating a performance dashboard, but you can use more of them in your own custom dashboards if
you desire.

Using SQLCLR to get the performance counter values
As a DBA, you’ve probably used Performance Monitor (perfmon.exe) to measure
hardware and operating system performance. Inside a SQL Server 2008 instance, however, the
sys.dm_os_performance_counters DMV exposes only a subset of those counters: the ones
that belong to SQL Server itself. SQLCLR removes that limitation. In every version from SQL Server
2005 on, you can create a Microsoft .NET–based user-defined scalar function that returns
the value of any counter registered in the OS of the server. Let me
show you how.
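
To see what the DMV does expose, a query along these lines can help. The counter name filter is only an example. The name columns are fixed-width, so trimming them makes the output easier to read.

```sql
-- List SQL Server's own counters that match a name; values come from the DMV.
SELECT RTRIM([object_name])  AS [object_name],
       RTRIM(counter_name)   AS counter_name,
       RTRIM(instance_name)  AS instance_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE N'Page life expectancy%';
```

OS-level categories such as Processor or PhysicalDisk don't appear in this view, which is the gap the CLR function is meant to fill.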
First you have to enable CLR integration in SQL Server. You can do so by setting the
'clr enabled' server configuration option, as shown in the next listing.
Listing 2 T-SQL code for enabling CLR integration in SQL Server 2008

EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

NOTE

Make sure your company or organization policies allow you to enable
CLR integration. In some environments the security policy is rigorous
and .NET-based user-defined objects can’t be used in SQL Server databases.
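
To check whether the option took effect, you can read it back from the sys.configurations catalog view, as in this short sketch:

```sql
-- value is the configured setting; value_in_use is what the instance runs with.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'clr enabled';
```

A value_in_use of 1 indicates that CLR integration is active on the instance.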

Then you can start to develop the .NET code of the function. The easiest way to create a CLR object is to use one of the “big” Visual Studio 2008 or later editions (for
example, Professional Edition) and create a new project from the SQL Server Project
template (see figure 1).

Figure 1 Creating a new project based on the SQL Server Project template in Visual Studio 2008 Professional

NOTE

If you use a version of Visual Studio that doesn’t contain the SQL Server
Project template, you can use the Class Library template instead. If you
do so, you have to reference the assembly that provides the Microsoft.SqlServer.Server
namespace (in the .NET Framework, it ships in System.Data.dll) to be able to use that
namespace in the code and to compile the project into a DLL file that can be
imported to the SQL Server database as a new assembly. You’ll find an
article on how to create a SQLCLR assembly and related objects without
using the SQL Server Project template here: http://msdn.microsoft.com/en-us/library/ms131052.aspx.

In the project, create a user-defined function item (or a class if you’re using the Class
Library template). Use the System.Diagnostics namespace and classes from it to get
the performance counter values. Example code for the function may look like that
shown in the next listing.
Listing 3 C# code of user-defined function that returns value of a performance counter
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Diagnostics;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlDouble ufn_clr_GetPerfCounterValue(
        SqlString CategoryName,
        SqlString CounterName,
        SqlString InstanceName,
        SqlString MachineName
    )
    {
        // A NULL machine name means the local server (".").
        MachineName = MachineName.IsNull ? "." : MachineName.Value;
        PerformanceCounter p = new PerformanceCounter(
            CategoryName.Value,
            CounterName.Value,
            InstanceName.Value,
            MachineName.Value);
        // The first call initializes the counter; the second returns the value.
        float value = p.NextValue();
        System.Threading.Thread.Sleep(100);
        value = p.NextValue();
        return new SqlDouble(value);
    }
}

In listing 3 the ufn_clr_GetPerfCounterValue method of the UserDefinedFunctions
class is defined with the SqlFunction attribute, which describes the type of an object.
The PerformanceCounter class and its NextValue method are used to get the value of
a particular counter.
NOTE

Notice that in this example the NextValue method is called twice. This is
because the first call initializes the counter (for a counter calculated from two
samples, the first read returns 0) and the second one gets the
value. You may perform the following test: remove one call of the NextValue method, deploy the project to the database, use the function, and
verify that the value returned by the function doesn’t represent the value
of the counter.

The assembly containing the code in listing 3 has to be given the UNSAFE permission
set in the SQL Server database. According to security best practices, you should do the
following:

Sign the assembly with a strong name (this can be done in Visual Studio in the project properties window or by using the sn.exe command-line utility).
In SQL Server, create an asymmetric key based on the strong name contained in the assembly’s DLL file.
Create a login for the created asymmetric key and grant the UNSAFE ASSEMBLY permission to the login.

See the next listing for sample code that implements these steps.
Listing 4 T-SQL code for granting UNSAFE permission to the assembly

USE master;
GO
CREATE ASYMMETRIC KEY SQLCLRKey
FROM EXECUTABLE FILE = 'D:\SQLPerformance.dll';
CREATE LOGIN SQLCLRLogin FROM ASYMMETRIC KEY SQLCLRKey;
GRANT UNSAFE ASSEMBLY TO SQLCLRLogin;
GO

Many DBAs prefer the approach of a separate administrator’s database where they can
store their tools and T-SQL code for common use. Let’s assume you’ve created such a
database called DBAToolbox. Put the assembly containing the dbo.ufn_clr_GetPerfCounterValue
function in this database using the code from the following listing
(assuming the DLL file is located on drive D).
Listing 5 T-SQL code for adding the assembly and the function
USE DBAToolbox;
GO
CREATE ASSEMBLY [SQLPerformance]
AUTHORIZATION [dbo]
FROM FILE = 'D:\SQLPerformance.dll'
WITH PERMISSION_SET = UNSAFE;
GO
CREATE FUNCTION [dbo].[ufn_clr_GetPerfCounterValue](
    @CategoryName [nvarchar](4000),
    @CounterName [nvarchar](4000),
    @InstanceName [nvarchar](4000),
    @MachineName [nvarchar](4000)
)
RETURNS [float]
AS
EXTERNAL NAME
    [SQLPerformance].[UserDefinedFunctions].[ufn_clr_GetPerfCounterValue]
GO

NOTE

If you use Visual Studio and the SQL Server Project template, the code
from listing 5 will be executed by Visual Studio during project deployment. If that’s the case, you won’t have to run it yourself.
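
Once the function is deployed, a quick smoke test might look like the following sketch. The counter chosen here is only an example. Passing NULL for the machine name makes the function in listing 3 read from the local server, whereas the instance name must not be NULL: pass an empty string for counters that have no instance.

```sql
USE DBAToolbox;
GO
-- Read total CPU utilization on the local server through the CLR function.
SELECT dbo.ufn_clr_GetPerfCounterValue(
           N'Processor',          -- category
           N'% Processor Time',   -- counter
           N'_Total',             -- instance
           NULL                   -- machine: NULL means the local server
       ) AS CpuPercent;
```

Each call takes at least 100 milliseconds because of the pause between the two NextValue calls in listing 3.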

Sample solution for performance monitoring
For many counters there are some well-known best practices that tell you about the
acceptable values of the counter. Every time the counter goes beyond its “valid” range,
pay attention to the counter because it may be a symptom of a performance problem.
One of the possible ways to evaluate performance counter values against their
acceptable limits is to store the limits for each counter in a table and compare them
against the values of the counters. To store the counter limits in the administrator’s
database, create a table and fill it with your baseline data (see the next listing).
Listing 6 T-SQL code for creating table and generating baseline
USE DBAToolbox;
GO
IF OBJECT_ID('dbo.PerfCounters', 'U') IS NOT NULL
    DROP TABLE dbo.PerfCounters;
GO
CREATE TABLE dbo.PerfCounters (
    PerfCounterID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    Category nvarchar(4000) NOT NULL,
    Counter nvarchar(4000) NOT NULL,
    Instance nvarchar(4000) NOT NULL DEFAULT '',
    IsSQLCounter bit NOT NULL,
    FriendlyName nvarchar(256) NOT NULL,
    IsRatioBased bit NOT NULL,
    IsActive bit NOT NULL,
    BestPractice nvarchar(4000) NULL,
    UpperLimit float NULL,
    LowerLimit float NULL
);
GO
INSERT INTO dbo.PerfCounters (
    Category, Counter, Instance,
    IsSQLCounter, FriendlyName, IsRatioBased, IsActive,
    BestPractice, UpperLimit, LowerLimit
)
VALUES
(
    'Processor', '% Processor Time', '_Total',
    0, 'CPU', 0, 1,
    'Should be less than 80%', 80, NULL
),
(
    'PhysicalDisk', 'Avg. Disk Queue Length', '_Total',
    0, 'Avg. Disk Queue', 0, 1,
    'Should not be permanently greater than 0', 1, NULL
),
(
    'MSSQL$SQL2008R2:Buffer Manager', 'Page life expectancy', '',
    1, 'Page Life Expectancy', 0, 1,
    'Should not drop below 1000', NULL, 1000
),
(
    'MSSQL$SQL2008R2:Databases', 'Percent Log Used', 'AdventureWorks',
    1, 'AdventureWorks database - % log used', 1, 1,
    'Should not reach 90%', 90, NULL
),
(
    'MSSQL$SQL2008R2:Plan Cache', 'Cache hit ratio', '_Total',
    1, 'Cache hit ratio', 1, 1,
    'Should not fall below 90%', NULL, 90
);
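
With the table populated, the comparison of counter values against their limits might be sketched as follows. This is an illustrative query rather than the chapter's own solution, and it covers only the PerfMon counters (IsSQLCounter = 0) by calling the function from listing 5. A NULL limit is never violated, because a comparison with NULL doesn't evaluate to true.

```sql
USE DBAToolbox;
GO
-- Return the active PerfMon counters whose current value is out of range.
SELECT pc.FriendlyName,
       v.CurrentValue,
       pc.BestPractice
FROM dbo.PerfCounters AS pc
CROSS APPLY (
    SELECT dbo.ufn_clr_GetPerfCounterValue(
               pc.Category, pc.Counter, pc.Instance, NULL) AS CurrentValue
) AS v
WHERE pc.IsActive = 1
  AND pc.IsSQLCounter = 0
  AND (v.CurrentValue > pc.UpperLimit OR v.CurrentValue < pc.LowerLimit);
```

SQL Server may evaluate the function more than once per row, and each call pauses for about 100 milliseconds, so for a long counter list it may be preferable to capture the values into a temporary table first.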

Each counter instance is stored in a single row and is described by its category, name,
and optionally by an instance (columns Category, Counter, and Instance). For each
counter there’s a flag, IsSQLCounter, indicating whether the counter comes from the
sys.dm_os_performance_counters DMV (value of 1) or from the PerfMon counters
set (value of 0). Each counter has its friendly name (the FriendlyName column) to
display in the report so you can quickly read the report by the names of the criteria
that are being measured. Moreover, there’s an explanation of the acceptable range for
the counter stored in the BestPractice column. This sentence can show up in the
report every time a warning is raised for a particular counter. The columns UpperLimit and LowerLimit store the upper and lower limits for the acceptable range of values for the particular counter. You can set the limits according to some best practices
or reflecting the baseline that has been established for the system. The IsActive column works as the enable/disable flag for each counter. The IsRatioBased column