17.5 Example: Listening for messages on the bus

Once you make these changes, you can run the service and it’ll start up, authenticate, and start listening for messages coming from the bus. You’ll then need to perform similar actions on the client.

17.5.2 Connecting to the service
In the previous section, we looked at what you had to do to connect a service to the
Service Bus. You had to change the bindings to point to the bus and update the
address. You also had to add some authentication so that the bus knew you were
allowed to use your namespace.
You now need to follow the same steps to change the app.config for the client. You
need to change the client binding so it’s sending messages to the bus. For this example,
you can name your endpoint SBRelayEndpoint, with the same address the service used.

<endpoint
  name="SBRelayEndpoint"
  address="sb://stringreversalinc.servicebus.windows.net/processtring"
  binding="netTcpRelayBinding"
  contract="StringReversalLibrary.Contract.IReverseString"
  behaviorConfiguration="sharedSecretClientCredentials" />


The client is going to have to authenticate to the Service Bus as well—you can configure it to use a shared secret. Use the Maine Reversal issuer from section 17.3.7 of this chapter. Keep in mind that there are two endpoints: one for the ACS service, and one for the Service Bus. They don't share issuers. You can configure the credentials by changing the client's behavior in the app.config file:



<behaviors>
  <endpointBehaviors>
    <behavior name="sharedSecretClientCredentials">
      <transportClientEndpointBehavior credentialType="SharedSecret">
        <clientCredentials>
          <!-- issuer name as provisioned in section 17.3.7 -->
          <sharedSecret issuerName="MaineReversal"
            issuerSecret="ltSsoI5l+8DzLSmvsVOhOmflAsKHBYrGeCR8KtCI1eE=" />
        </clientCredentials>
      </transportClientEndpointBehavior>
    </behavior>
  </endpointBehaviors>
</behaviors>
Now the client can call your service from anywhere and always be able to connect to it.
This makes it easy to provision new customers. In the old, direct method, you had to
reconfigure the firewalls to let the new customer through. Now you can point them to
the Service Bus address and give them the credentials they’ll need.
The binding we used in this example is based on TCP, which is one of the fastest bindings you can use in WCF. The relay version adds the ability to send messages through the bus instead of directly between the endpoints. Other bindings that support the relay are available as well.
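For instance, swapping the example over to an HTTP-based relay would mostly be a binding change on the same endpoint. The following is a sketch, not from the chapter: basicHttpRelayBinding is one of the relay bindings that ships with the AppFabric SDK, the endpoint name is made up, and the address scheme changes from sb:// to https://:

```xml
<endpoint
  name="SBHttpRelayEndpoint"
  address="https://stringreversalinc.servicebus.windows.net/processtring"
  binding="basicHttpRelayBinding"
  contract="StringReversalLibrary.Contract.IReverseString"
  behaviorConfiguration="sharedSecretClientCredentials" />
```

The contract and shared-secret behavior stay the same; only the binding and address scheme change.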
Now that we’ve covered what AppFabric can do today, let’s consider what its future
might hold.

CHAPTER 17  Connecting in the cloud with AppFabric

17.6 The future of AppFabric
We normally don’t talk about futures in a book because it’s entirely possible that priorities will shift and particular features will never ship. The team has been open about
what they consider their next steps, but any of this could change, so don’t base important strategy on unofficial announcements.
With that said, here are a couple of current goals:
Extending OAuth for web identity—The ACS team wants to extend its use of OAuth and SWT to include common web identity platforms. These are identities people commonly have on the web, rather than traditional enterprise identities from Active Directory. It will become possible for a user to log into a site not just with an ACS-provisioned issuer ID or a SAML token, but also with their Google, Yahoo!, and Live ID accounts. ACS accomplishes this by federating with those directories over the updated OAuth protocols. This will also include Facebook and any OpenID provider, which is exciting for people building consumer-centric applications.

Support for all WS-* protocols—Right now you can only protect REST services with ACS; the team wants to extend support to SOAP-based services as well. Once they have support for WS-*, they'll support WS-Federation and WS-Trust, which makes it easier to federate enterprise identities. This change will also bring support for CardSpace, Windows' identity selector for claims-based authentication systems.

17.7 Summary
This chapter gave you a tour of the Windows Azure platform AppFabric. We looked at
how it’s a cousin of Windows Server AppFabric, and how maybe over time they’ll merge.
The first component of AppFabric we looked at was ACS. ACS’s primary concern is
securing REST-based services, and it does this with the OAuth and SWT open standards. ACS makes it easy to federate in other token protocols, like SAML, and makes it
easy to transform tokens from other parties into a form that your application can consume. You can configure ACS with either the ACM command-line application, or the
ACS configuration browser. Both applications use the underlying management REST
service, which you can also use directly if you want.
We looked at how a client first authenticates with ACS to prove who they are, and
then ACS gives them a token to use to gain access to the service. This removes the concern of authentication from the application, which in turn simplifies the codebase,
and makes it easier to adjust to new rules as they evolve.
The second component of AppFabric we looked at was the Service Bus. The Service Bus is a migration of a common enterprise service bus into the cloud. The Service
Bus’s goal is to make it easy for consumers and services to communicate with each
other, no matter where they are or how they’re connected.


We adjusted our example to use a netTCP binding, but also to relay the messages
through the cloud to bypass any firewalls or proxies that might be in the way. You had
to make a few adjustments on the service and the client to make this possible. Both
sides have to authenticate to ACS before they connect to the bus—this is how you
secure your service when it’s connected to the bus.
Although this was a simple chapter designed to help you get a feel for AppFabric,
we hope that you’re comfortable enough with the basics to know when it might make
sense to leverage AppFabric. You should be prepared enough to explore on your own
and maybe dive deep with a dedicated book on the topic.
The next chapter will zoom out from all of this detail and help you use the diagnostics manager to understand what your applications are doing. The diagnostics
manager will help you determine what’s happening, and the service management API
will help you do something about it.


Running a healthy service in the cloud

This chapter covers
- Getting to know the Windows Azure Diagnostics platform
- Using logging to determine what's happening with your service
- Using the service management APIs
- Making your service self-aware with logging and service management

Building an application and deploying it to Azure are just the first steps in a hopefully long application lifecycle. For the rest of its life, the application will be in
operation mode, being cared for by loving IT professionals and support developers. The focus shifts from writing quality code to running the application and
keeping it healthy. Many tools and techniques are out there to help you manage
your infrastructure.
What healthy means can be different for every application. It might be a measure
of how many simultaneous users there are, or how many transactions per second
are processed, or how fast a response can be returned to the service caller. In many
cases, it won’t be just one metric, but a series of metrics. You have to decide what
you’re going to measure to determine a healthy state, and what those measurements
must be to be considered acceptable. You must make sure these metrics are reasonable and actionable. A metric that demands that the site be as fast as possible isn’t
really measurable, and it’s nearly impossible to test for and fix an issue phrased like
that. Better to define the metric as an average response time for a standard request.
To keep your application healthy, you need to instrument and gather diagnostic
data. In this chapter, we’re going to discuss how you perform diagnostics in the cloud
and what tools Azure provides to remediate any issues or under-supply conditions in
your system.

18.1 Diagnostics in the cloud
At some point you might need to debug your code, or you’ll want to judge how
healthy your application is while it’s running in the cloud. We don’t know about you,
but the more experienced we get with writing code, the more we know that our code
is less than perfect. We’ve drastically reduced the amount of debugging we need to do
by using test-driven development (TDD), but we still need to fire up the debugger
once in a while.
Debugging locally with the SDK is easy, but once you move to the cloud you can’t
debug at all; instead, you need to log the behavior of the system. For logging, you can
use either the infrastructure that Azure provides, or you can use your own logging
framework. Logging, like in traditional environments, is going to be your primary
mechanism for collecting information about what’s happening with your application.

18.1.1 Using Azure Diagnostics to find what’s wrong
Logs are handy. They help you find where the problem is, and can act as the flight
data recorder for your system. They come in handy when your system has completely
burned down, fallen over, and sunk into the swamp. They also come in handy when
the worst hasn’t happened, and you just want to know a little bit more about the
behavior of the system as it’s running. You can use logs to analyze how your system is
performing, and to understand better how it’s behaving. This information can be critical when you’re trying to determine when to scale the system, or how to improve the
efficiency of your code.
The drawback with logging is that hindsight is 20/20. It’s obvious, after the crash,
that you should’ve enabled logging or that you should’ve logged a particular segment
of code. As you write your application, it’s important to consider instrumentation as
an aspect of your design.
Logging is much more than just remote debugging, 1980s-style. It’s about gathering a broad set of data at runtime that you can use for a variety of purposes; debugging is one of those purposes.


18.1.2 Challenges with troubleshooting in the cloud
When you’re trying to diagnose a traditional on-premises system, you have easy access
to the machine and the log sources on it. You can usually connect to the machine with
a remote desktop and get your hands on it. You can parse through log files, both those
created by Windows and those created by your application. You can monitor the health
of the system by using Performance Monitor, and tap into any source of information
on the server. During troubleshooting, it’s common to leverage several tools on the
server itself to slice and dice the mountain of data to figure out what’s gone wrong.
You simply can’t do this in the cloud. You can’t log in to the server directly, and you
have no way of running remote analysis tools. But the bigger challenge in the cloud is
the dynamic nature of your infrastructure. On-premises, you have access to a static pool
of servers. You know which server was doing what at all times. In the cloud, you don’t
have this ability. Workloads can be moved around; servers can be created and destroyed
at will. And you aren’t trying to diagnose the application on one server, but across a multitude of servers, collating and connecting information from all the different sources.
The number of servers used in cloud applications can swamp most diagnostic analysis
tools. The sheer amount of data available can cause bottlenecks in your system.
For example, a typical web user, as they browse your website and decide to check
out, can be bounced from instance to instance because of the load balancer. How do
you truly find out the load on your system or the cause for the slow response while
they were checking out of your site? You need access to all the data that’s available on
terrestrial servers and you need the data collated for you.
You also need close control over the diagnostic data producers. You need an easy way
to dial the level of information from debug to critical. While you’re testing your systems, you need all the data, and you need to know that the additional load it places on
the system is acceptable. During production, you want to know only about the most critical issues, and you want to minimize the impact of these issues on system performance.
For all these reasons, the Windows Azure Diagnostics platform sits on top of what is
already available in Windows. The diagnostics team at Microsoft has extended and
plugged in to the existing platform, making it easy for you to learn, and easy to find
the information you need.

18.2 Diagnostics in the cloud is just like normal (almost)
With the challenges of diagnostics at cloud-scale, it’s amazing that the solution is so
simple and elegant. Microsoft chose to keep everything that you’re used to in its place.
Every API, tool, log, and data source is the same way it was, which keeps the data
sources known and well documented. The diagnostics team provides a small process
called MonAgentHost.exe that’s started on your instances.
The MonAgentHost process is started automatically, and it acts as your agent on the
box. It knows how to tap into all the sources, and it knows how to merge the data and
move it to the correct locations so you can analyze it. You can configure the process on
the fly without having to restart the host it's running on. This is critical. You don't want to have to take down a web role instance just to dial up the amount of diagnostic information you're collecting. You can control data collection across all your instances with a simple API. All the moving parts of the process are shown in figure 18.1. Your role instance must be running in full trust to run the diagnostic agent; if the role is running in partial trust, the agent won't be able to start.

Figure 18.1 The MonAgentHost.exe process gathers, buffers, and transfers many different sources of diagnostic data on your behalf. It's the agent we'll be focusing on in this section.
As the developer, you’re always in control of what’s being collected and when it’s
collected. You can communicate with MonAgentHost by submitting a configuration
change to the process. When you submit the change, the process reloads and starts
executing your new commands.

18.2.1 Managing event sources
The local diagnostic agent can find and access any of the normal Windows diagnostic
sources; then it moves and collates the data into Windows Azure storage. The agent
can even handle full memory dumps in the case of an unhandled exception in one of
your processes.
You must configure the agent to have access to a cloud storage account. The agent
will place all your data in this account. Depending on the source of the data, it’ll
either place the information in BLOB storage (if the source is a traditional log file), or
it’ll put the information in a table.
Some information is stored in a table because of the nature of the data collection
activity. Consider when you’re collecting data from Performance Monitor. This data is
usually stored in a special file with the extension .blg. Although this file could be created and stored in BLOB storage, you would have the hurdle of merging several of
these files to make any sense of the data (and the information isn’t easily viewed in
Notepad). You generally want to query that data. For example, you might want to find
out what the CPU and memory pressure on the server were for a given time, when a
particular request failed to process.
Table 18.1 shows what the most common sources of diagnostic information are,
and where the agent stores the data after it’s collected. We’ll discuss how to configure
the sources, logs, and the (tantalizingly named) arbitrary files in later sections.
Table 18.1  Diagnostic data sources

Data source                      Default    Destination   Configuration
Arbitrary files                  Disabled   BLOB          DirectoryConfiguration class
Crash dumps                      Disabled   BLOB          CrashDumps class
Trace logs                       Enabled    Azure table   web.config trace listener
Diagnostic infrastructure logs   Enabled    Azure table   web.config trace listener
IIS failed request logs          Disabled   BLOB          web.config traceFailedRequests
IIS logs                         Enabled    BLOB          web.config trace listener
Performance counters             Disabled   Azure table   PerformanceCounterConfiguration class
Windows event logs               Disabled   Azure table   WindowsEventLogsBufferConfiguration class

The agent doesn’t just take the files and upload them to storage. The agent can also
configure the underlying sources to meet your needs. You can use the agent to start
collecting performance data, and then turn the source off when you don’t need it anymore. You do all this through configuration.
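As an illustration of that kind of configuration change, turning on a processor counter is just another adjustment to the agent's configuration. The following is a sketch, assuming config is a DiagnosticMonitorConfiguration you're about to hand to the agent; the counter path, sample rate, and transfer period are illustrative choices, not values from the chapter:

```
// Sample total CPU use every five seconds, and ship the samples
// to table storage once a minute.
var procTimeConfig = new PerformanceCounterConfiguration
{
    CounterSpecifier = @"\Processor(_Total)\% Processor Time",
    SampleRate = TimeSpan.FromSeconds(5)
};
config.PerformanceCounters.DataSources.Add(procTimeConfig);
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
```

Removing the data source from the configuration and resubmitting it turns the collection back off.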

18.2.2 It’s not just for diagnostics
We’ve been focusing pretty heavily on the debugging or diagnostic nature of the Windows Azure Diagnostics platform. Diagnostics is the primary goal of the platform, but
you should think of it as a pump of information about what your application is doing.
Now that you no longer have to manage infrastructure, you can focus your attention
on managing the application much more than you have in the past.
Consider some of the business possibilities you might need to provide for, and as
you continue to read this chapter, think about how the diagnostic tools can make
some of these scenarios possible.
There are the obvious scenarios of troubleshooting performance and finding out
how to tune the system. The common process is that you drive a load on the system and
monitor all the characteristics of the system to find out how it responds. This is a good
way to find the limits of your code, and to perform A/B tests on your changes. During
an A/B test, you test two possible options to see which leads to the better outcome.
Other scenarios aren’t technical in nature at all. Perhaps your system is a multitenant system and you need to find out how much work each customer does. In a medical imaging system, you’d want to know how many images are being analyzed and
charge a flat fee per image. You could use the diagnostic system to safely log a new image
event, and then once a day move that to Azure storage to feed into your billing system.
Maybe in this same scenario you need a rock-solid audit that tells you exactly who’s
accessed each medical record so you can comply with industry and government regulations. The diagnostic system provides a clean way to handle these scenarios.
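One way such an audit event could ride the trace channel is under its own category. This is a sketch only: the helper method, the "Audit" category name, and the field layout are illustrative, not part of the diagnostics platform:

```
// Hypothetical helper inside your role code: records who touched
// which medical record. "Audit" is an illustrative category name.
private void LogRecordAccess(string userName, int recordId)
{
    System.Diagnostics.Trace.WriteLine(
        string.Format("RecordAccessed;user={0};record={1};at={2:o}",
            userName, recordId, DateTime.UtcNow),
        "Audit");
}
```

Because the entries carry a distinct category, the billing or audit job can filter them out of the table later without wading through debug noise.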
An even more common scenario might be that you want an analysis of the visitors
to your application and their behaviors while they're using your site. Some advanced
e-commerce platforms know how their customers shop. With the mountains of data
collected over the years, they can predict that 80 percent of customers in a certain scenario will complete the purchase. Armed with this data, they can respond to a user’s
behavior and provide a way to increase the likelihood that they’ll make a purchase.
Perhaps this is a timely invitation to a one-on-one chat with a trained customer service
person to help them through the process. The diagnostics engine can help your application monitor the key aspects of the user and the checkout process, providing feedback to the e-commerce system to improve business. This is the twenty-first-century
version of a salesperson in a store asking if they can help you find anything.
To achieve all of these feats of science with the diagnostic agent, you need to learn
how to configure and use it properly.

18.3 Configuring the diagnostic agent
If you’re writing your code in Visual Studio, the default Azure project templates
include code that automatically starts the diagnostic agent, inserts a listener for the
agent in the web.config file, and configures the agent with a default configuration.
You can see this code in the OnStart() method in the WebRole.cs file.
public override bool OnStart()
{
    DiagnosticMonitor.Start("DiagnosticsConnectionString");   // q Starts diagnostic agent
    RoleEnvironment.Changing += RoleEnvironmentChanging;
    return base.OnStart();
}

The agent starts q with the default configuration, all in one line. The line also
points to a connection string in the service configuration that provides access to the
Azure storage account you want the data to be transferred to. If you’re running in the
development fabric on your desktop computer, you can configure it with the well-known development storage connection string UseDevelopmentStorage=true. This string provides all the data necessary to connect with the local instance of development storage.
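That connection string lives in the service configuration file. A minimal sketch of the relevant entry follows; the service and role names are placeholders, and only the Setting element matters here:

```xml
<ServiceConfiguration serviceName="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole">
    <Instances count="1" />
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString"
               value="UseDevelopmentStorage=true" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

When you deploy to the cloud, you swap the value for a real storage account connection string without touching the code.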
You also need to create a trace listener for the diagnostic agent. The trace listener
allows you to write to the Azure trace log in your code. Create a trace listener by adding the following lines in your web.config. If you’re using a standard template, this
code is probably already included.



<system.diagnostics>
  <trace>
    <listeners>
      <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener,
                 Microsoft.WindowsAzure.Diagnostics, Version=1.0.0.0,
                 Culture=neutral, PublicKeyToken=31bf3856ad364e35"
           name="AzureDiagnostics">
        <filter type="" />
      </add>
    </listeners>
  </trace>
</system.diagnostics>
After you’ve set up the trace listener, you can use the trace methods to send information to any trace listeners. When you use them, set a category for the log entry. The
category will help you filter and find the right data later on. You should differentiate
between critical data that you’ll always want and verbose logging data that you’ll want
only when you’re debugging an issue. You can use any string for the trace category you
want, but be careful and stick to a standard set of categories for your project. If the categories vary too much (for example, you have critical, crit, and important), it’ll be too
hard to find the data you’re looking for. To standardize on log levels, you can use the
enumerated type LogLevel in Microsoft.WindowsAzure.Diagnostics. To write to
the trace log, use a line like one of the following:
using System.Diagnostics;

Trace.WriteLine(string.Format("Page loaded on {0}", System.DateTime.Now),
    "Information");
Trace.WriteLine("Failed to connect to database.", "Critical");

Only people who have access to your trace information through the diagnostics API will be able to see the log output. That said, we don't recommend exposing sensitive or personal information in the log. Instead of listing a person's social security number, refer to it in an indirect manner, perhaps by logging the primary key in the customer table. That way, if you need the social security number, you can look it up easily, but it won't be left out in plain text for someone to see.
Another benefit of using trace is that the trace output appears in the dev fabric UI, as shown in figure 18.2.

Figure 18.2 When writing to the trace channel, the entries are stored by the Windows Azure diagnostic trace listener to the Azure log, which can then be gathered and stored in your storage account for analysis. The trace output is also displayed in the dev fabric UI during development.
At a simple level, this is all you need to start the agent and start collecting the most
common data. The basic diagnostic setup is almost done for you out of the box
because there’s so much default configuration that comes with it.

18.3.1 Default configuration
When the diagnostic agent is first started, it has a default configuration. The default
configuration collects the Windows Azure trace, diagnostic infrastructure logs, and IIS
7.0 logs automatically. These are the most common sources you’re likely to care about
in most situations.
When you’re configuring the
G
A
S
agent, you’ll probably follow a comGrab the config
Adjust the config Start the agent
mon flow. You’ll grab the current
Figure 18.3 Use the GAS process to configure and
running configuration (or a default
work with the diagnostic agent. Grab the config,
configuration) from the agent,
adjust the config, and then start the agent.
adjust it to your purposes, and then
Sometimes you’ll grab the default config and
restart the agent. This configuration
sometimes the running config.
workflow is shown in figure 18.3.
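In code, the grab-adjust-start flow might look like the following sketch. It reuses the DiagnosticsConnectionString setting from the earlier template code; the five-minute transfer period and Warning-level filter are arbitrary example choices:

```
using System;
using Microsoft.WindowsAzure.Diagnostics;

public override bool OnStart()
{
    // Grab: start from the agent's default configuration.
    DiagnosticMonitorConfiguration config =
        DiagnosticMonitor.GetDefaultInitialConfiguration();

    // Adjust: transfer trace logs every five minutes,
    // keeping only entries at Warning level or above.
    config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);
    config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;

    // Start: hand the adjusted configuration to the agent.
    DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
    return base.OnStart();
}
```

The same three steps apply whether you start from the default configuration, as here, or grab the running configuration from an agent that's already started.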
By default, the agent buffers about 4 GB of data locally, and ages out data automatically when the limit is reached. You can change these settings if you want, but most
people leave them as is and just transfer the data to storage for long-term keeping.
Although the agent ages out data locally to the role instance, the retention of data
after it’s moved to Azure storage is up to you. You can keep it there forever, dump it
periodically, or download it to a local disk. After it’s been transferred to your account,
the diagnostic agent doesn’t touch your data again. The data will just keep piling up if
you let it.
In the next few sections, we’ll look at some of the common configuration scenarios, including how to filter the log for the data you’re interested in before it’s
uploaded to storage.

18.3.2 Diagnostic host configuration
You can change the configuration for the agent with code that’s running in the role
that’s collecting data, code that’s in another role, or code that’s running outside
Azure (perhaps a management station in your data center).
CHANGING THE CONFIGURATION IN A ROLE

There will be times when you want to change the configuration of the diagnostic
agent from within the role the agent is running in. You’ll most likely want to do this
during an OnStart event, while an instance for your role is starting up. You can
change the configuration at any time, but you’ll probably want to change it during
