UI vs. Render Thread, and BitmapCache Mode
Render thread This is designed to be very lightweight, and it is mainly responsible for
stitching together textures to hand off to the Graphics Processing Unit (GPU). The Render
thread handles simple (double) animations; translate, scale, rotate, and perspective transforms; opacity (but not opacity masks); and rectangular clipping, all of which can be hardware-accelerated via the GPU. Note that for scale transforms, whenever the scale crosses the 50 percent threshold of the original (or previously rasterized) texture size, the visual is re-rasterized
on the UI thread.
Briefly, the initial template expansion and rasterization for an element occurs on the UI thread.
These rendered elements are then cached in memory as bitmaps and handed off to the Render
thread which works with the GPU to draw the frame and add it to the back buffer for display. From
this point, if you don’t make any changes to the cached element, it doesn’t have to be redrawn. Any
changes you might make to the cached element fall into two categories:
A change that can be handled by the GPU. Some examples are rectangular clipping, TranslateTransform, and ScaleTransform.
A change that requires the element to be re-rendered. This can include a color change, non-rectangular clipping, an opacity mask, padding or margin changes, and so on. In this case, the cached bitmap will be deleted, and the element will be re-rendered on the UI thread to generate a fresh bitmap for caching.
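To make the two categories concrete, here is a minimal sketch (it assumes an element named ball whose CacheMode is set to BitmapCache, and whose RenderTransform is a TranslateTransform):

```csharp
// Handled by the GPU: the cached bitmap is simply recomposed at a new
// position, with no re-rendering on the UI thread.
((TranslateTransform)ball.RenderTransform).X += 10;

// Requires a re-render: the cached bitmap is deleted, and the element
// is re-rasterized on the UI thread to generate a fresh bitmap.
ball.Fill = new SolidColorBrush(Colors.Blue);
```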
So, there are at least two threads in the system, plus one or more additional threads that the application itself can choose to create, either indirectly or directly, as summarized in Figure 14-2.
Note Version 7.1 also introduces a third system thread that is intended specifically to
In general, you want to take advantage of the Render thread as much as you can to offload work
from the UI thread. To see where there might be opportunities for optimizing between the UI thread
and the Render thread, you can turn on display of redraw regions and bitmap caching by using the
Application settings, as shown in the following code snippet. Note that the standard Microsoft Visual
Studio project templates generate code for this in the App.xaml.cs, which you can uncomment.
Application.Current.Host.Settings.EnableRedrawRegions = true;
Application.Current.Host.Settings.EnableCacheVisualization = true;
506 Part III Extended Services
Figure 14-2 The Render thread, UI thread, and background threads.
Consider an application that moves a ball around the screen. Depending on how you write the
code, this will result in different rendering behavior. The following sample application (the Bouncing
Ball solution in the sample code) does not take advantage of the Render thread and GPU; instead, it
implements timer-based callbacks on the UI thread. The application also responds to user touch, via
a GestureListener, so even more work is being done on the UI thread. The application provides four
AppBar buttons. With the first three, the user can toggle EnableRedrawRegions, BitmapCache mode,
and EnableCacheVisualization, respectively. This allows you to see the effects of the design choices on
the UI rendering behavior (you’ll examine the purpose of the fourth button later).
private void appbarRedraw_Click(object sender, EventArgs e)
{
    var settings = Application.Current.Host.Settings;
    settings.EnableRedrawRegions = !settings.EnableRedrawRegions;
}

private void appbarCache_Click(object sender, EventArgs e)
{
    if (ball.CacheMode == null)
        ball.CacheMode = new BitmapCache();
    else
        ball.CacheMode = null;
}

private void appbarCacheViz_Click(object sender, EventArgs e)
{
    var settings = Application.Current.Host.Settings;
    settings.EnableCacheVisualization = !settings.EnableCacheVisualization;
}
Figure 14-3 illustrates the application with EnableRedrawRegions turned on, showing which elements are being redrawn with each frame (the colors are arbitrary; they cycle between purple, yellow,
and magenta). As the ball bounces around the screen, the redraw region is a rectangle that expands
or contracts to include all UI elements. Thus, the redraw rectangle varies in size, depending on where
the ball is in relation to the other UI elements.
Figure 14-3 The regions of your application that need to be redrawn can vary over time.
The redrawing highlighted when the EnableRedrawRegions flag is turned on is essentially drawing in software (on the CPU) and does not offload any work to the Render thread and the GPU. In all
cases, the first time a visual is drawn, it will be drawn in software, but the aim generally should be to
draw a visual once on the UI thread, and then hand it off to the Render thread (which in turn, hands
it off to the GPU) for all subsequent drawing. So, what you want to avoid is the situation in which you
can see the same region being drawn repeatedly (as evidenced by the colors changing repeatedly).
The bottom line is that if you see something in your application that’s frequently changing color, it means it’s being frequently redrawn. You should therefore examine your code to see if you can reduce or eliminate that redrawing.
Figure 14-4 demonstrates the application again, this time with EnableCacheVisualization turned
on, which shows the areas of the application that are cached. The un-cached surfaces are rendered
in software and the cached surfaces are passed to the GPU and rendered in hardware. With this flag
turned on, each element/texture that is handed off to the GPU is tinted blue and has a transparency
applied. This way, you can see where textures are overlapping. The darkest shades indicate that multiple textures are lying atop one another.
Note Turning on EnableCacheVisualization degrades performance, so you should not
attempt to measure frame rates while this is active. Note also that the behavior of this
flag on Windows Phone is different from the behavior on desktop Silverlight: on desktop Silverlight, the tinted areas are areas that are not drawn by the GPU; on Silverlight for
Windows Phone, the tinted areas are those that are drawn by the GPU.
Figure 14-4 Cache visualization with BitmapCache mode turned off (on the left) and on (right).
You can specify that an element should have its rendered bitmap cached in XAML, as shown here (the element's other attributes are omitted):

<Ellipse x:Name="ball">
    <Ellipse.CacheMode>
        <BitmapCache/>
    </Ellipse.CacheMode>
</Ellipse>
Or, you can specify that it should be cached in code:
ball.CacheMode = new BitmapCache();
The effect of setting BitmapCache mode is to skip the render phase for the element, which will
have a significant effect on performance. The screenshot on the right in Figure 14-4 shows what
happens if you set the CacheMode on the ball to BitmapCache. The UI thread does considerably less work to display the ball, the frame rate of this application goes up, and the fill rate goes down. The frame
rate, fill rate, and the other performance counters are discussed in more detail in Chapter 8, “Diagnostics and Debugging.”
When you use bitmap caching, you should group cached elements together, following non-cached
elements in the visual tree. Do not interleave cached/non-cached elements. This way, non-cached elements can be included in a single intermediate background texture, which improves performance.
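As an illustrative sketch of this guideline (the element names and content here are invented), a layout that groups cached elements after the non-cached ones might look like this:

```xml
<Grid>
    <StackPanel>
        <!-- Non-cached elements grouped together first: these can be
             combined into a single intermediate background texture. -->
        <TextBlock Text="Score: 0"/>
        <TextBlock Text="Level: 1"/>

        <!-- Cached elements grouped together after the non-cached
             elements; the two kinds are not interleaved. -->
        <Ellipse x:Name="ball" Width="100" Height="100" Fill="Red">
            <Ellipse.CacheMode>
                <BitmapCache/>
            </Ellipse.CacheMode>
        </Ellipse>
        <Rectangle x:Name="paddle" Width="150" Height="20" Fill="Blue">
            <Rectangle.CacheMode>
                <BitmapCache/>
            </Rectangle.CacheMode>
        </Rectangle>
    </StackPanel>
</Grid>
```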
Note There’s a downside to caching: it takes up additional memory, and the more you
cache, the higher your fill rate will be. Therefore, you shouldn’t simply cache everything. Instead, you should profile your application by using EnableRedrawRegions and
EnableCacheVisualization, and look for opportunities to optimize caching. If you do this
right, you should see the frame rate go up, and the FillRate count go down.
The fourth App Bar button is implemented to simulate an operation that needs to do work on the UI thread:

private void appbarBlockUI_Click(object sender, EventArgs e)
{
    // Simulate three seconds of work that blocks the UI thread.
    Thread.Sleep(3000);
}
If you start the ball bouncing and then tap this button, the bouncing will stop for three seconds.
This demonstrates the critical flaw in this application’s design, because so much relies on work being
done on the UI thread.
The previous example relied on user interaction, and a lot of work was done on the UI thread.
Figure 14-5 shows an alternative example of a bouncing ball (the BouncingStoryboard solution in the
sample code); this one involves only minimal user interaction (start and stop App Bar buttons) and
does most of the work on the Render thread. In addition, the animation is fixed and declared in XAML
via a storyboard. This contains two main animations: one that moves the ball from top to bottom
(using the Canvas.Top attached property), and another that moves the ball from left to right (using
Canvas.Left). The top-bottom animation itself contains a third animation, which uses one of the Silverlight EasingFunctions to provide a bouncing motion.
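A storyboard along these lines is sketched below; this is a hedged approximation rather than the actual sample code, and the durations, distances, and easing values are invented:

```xml
<Storyboard x:Name="bounce" RepeatBehavior="Forever">
    <!-- Left-to-right movement: a simple double animation that the
         Render thread can drive without involving the UI thread. -->
    <DoubleAnimation Storyboard.TargetName="ball"
                     Storyboard.TargetProperty="(Canvas.Left)"
                     From="0" To="380" Duration="0:0:4" AutoReverse="True"/>

    <!-- Top-to-bottom movement, with an easing function that provides
         the bouncing motion. -->
    <DoubleAnimation Storyboard.TargetName="ball"
                     Storyboard.TargetProperty="(Canvas.Top)"
                     From="0" To="620" Duration="0:0:4" AutoReverse="True">
        <DoubleAnimation.EasingFunction>
            <BounceEase Bounces="3" EasingMode="EaseOut"/>
        </DoubleAnimation.EasingFunction>
    </DoubleAnimation>
</Storyboard>
```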
The big change here is to move the per-frame manual animation (which was all done on the UI thread) to a double animation implemented via a storyboard, which can all be done by the Render thread.
The App Bar buttons are there to start and stop the storyboard. There is also a button to toggle
redraw regions, and one to block the UI thread. Because this ball is part of a simple animation and
is automatically cached, there’s no button to set CacheMode. Observe also that if you block the UI
thread, in this application, the ball will continue to bounce because the animation is all being done on
the Render thread.
Figure 14-5 Using the Render thread for animations reduces the amount of redrawing.
The following objects will be automatically cached:
The target of any storyboard-driven animation that uses the Render thread (as shown in the previous example).
The target of any plane projection, either static or animated.
All MediaElement objects.
Child items in a ScrollViewer or ListBox.
As discussed in Chapter 8, the three most important performance counters to monitor are the UI
Thread Frame Rate, the Render Thread Frame Rate, and the Fill Rate. If your UI thread is overloaded,
you’ll see the UI Thread Frame Rate drop, which is a sign that you need to offload work from the UI
thread. From the other end, the Fill Rate corresponds to how hard the GPU is working, and as the Fill
Rate exceeds 2, the Render Thread Frame Rate will drop. So, a high Fill Rate is a sign that you need to
minimize your use of UI elements (by reducing the number and/or complexity of your elements, setting BitmapCache mode, avoiding interleaved cached/non-cached elements, and so on).
UI Layout and ListBoxes
The ListBox is an obvious element for which scale has significant performance implications. UI layout
is the most expensive operation performed on the UI thread. There’s nothing to stop you from creating complex layouts with nested Grids, StackPanels, nested ListBoxes, plus complex ValueConverters,
custom controls, and so on. This is bad enough if you only have one of these on your page, but as
soon as you use such a complex element as an item within a ListBox, the scale issues become more
obvious. Plus, of course, there’s nothing to stop you from putting many thousands of these items into
your ListBox. On top of that, you could be sourcing your data from a remote service over the Internet, and the data might include large images or large volumes of redundant metadata. The permutations implied in this kind of model represent a potential recipe for bad performance and an unresponsive UI.
Here are some UI best practice guidelines for optimizing runtime performance and responsiveness
when using ListBoxes:
Avoid using complex item data templates. Most particularly, don’t use nested ListBoxes,
and don’t use UserControls or custom controls. Also, ensure that you have the data template
in a fixed-size container such as a Grid with an explicit Height set on it. As a performance optimization, the ListBox calculates the height of three screens' worth of items (the one currently visible, plus one above and one below), and this doesn't work if your items vary in size.
Avoid using complex converters. If possible, try to perform conversions in the data source
request or as you pull the data into a local cache, before attempting to render it in the UI.
Offload work to background threads. Typical candidates for this include the retrieval,
processing, and caching of item data, leaving only the final data-binding on the UI thread. The
trade-off here is that doing work on a background thread before dispatching to the UI thread
might cause items to load more slowly. However, the advantage is that the UI will remain
responsive. If you do it right, even the slower load might be apparent only on the initial batch
of items, because you can continue working in the background while the user is exercising the UI. As you're doing work on the background thread, be sure to yield control frequently (perhaps with a simple Thread.Sleep) so that the OS can schedule the UI thread more frequently.
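A hedged sketch of this pattern follows; ItemViewModel, LoadItems, and Items are hypothetical names for the item type, the slow data source, and the data-bound collection:

```csharp
using System.Threading;
using System.Windows;

// ...inside the page or view-model class...
private void LoadItemsInBackground()
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        // Retrieval, processing, and caching all happen off the UI thread.
        foreach (ItemViewModel item in LoadItems())
        {
            ItemViewModel current = item;

            // Only the final data-binding step touches the UI thread.
            Deployment.Current.Dispatcher.BeginInvoke(() => Items.Add(current));

            // Yield so that the OS can schedule the UI thread more often.
            Thread.Sleep(1);
        }
    });
}
```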
Virtualize your data if you can. This is particularly relevant if you need to manipulate raw
data and compute the final data for rendering in the UI. The ListBox virtualizes the UI (via the
VirtualizingStackPanel, which is the default items host for the ListBox), such that if you have,
for example, 1,000 items, only three screens' worth of UI elements are created, and then these
are recycled as the user scrolls other items into view. However, the data is not virtualized; the
ListBox property you data-bind is ItemsSource, which is defined as an IEnumerable. This means
that the only way the ListBox knows the size of the list is to enumerate all the items. This information is required so that the ListBox can configure itself correctly; for example, to calculate
the scrollbar indicator. More significantly, it assumes that all of your data items are constructed
completely before handing off to the UI. You can mitigate this by binding to collections that
implement IList, which provides Count, IndexOf, and indexer (this) methods. In your implementation, you can be smart about constructing each of your computed items for the list.
For example, you can defer the composition/computation processing for items until they’re
required for the UI.
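A minimal sketch of such a data-virtualized list is shown below. BuildItem is a hypothetical stand-in for the expensive composition/computation step, and a production version would also need thread safety and cache eviction:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class LazyItemList : IList
{
    private readonly int count;
    private readonly Dictionary<int, object> built = new Dictionary<int, object>();

    public LazyItemList(int count) { this.count = count; }

    // Because Count is known up front, the ListBox can size its
    // scrollbar indicator without enumerating every item.
    public int Count { get { return count; } }

    // Items are composed/computed only when the ListBox asks for them,
    // that is, when they are scrolled into view.
    public object this[int index]
    {
        get
        {
            object item;
            if (!built.TryGetValue(index, out item))
            {
                item = BuildItem(index);
                built[index] = item;
            }
            return item;
        }
        set { throw new NotSupportedException(); }
    }

    private object BuildItem(int index)
    {
        return "Item " + index;   // placeholder for real composition work
    }

    // Remaining IList members for a read-only, fixed-size list.
    public int IndexOf(object value) { return -1; } // not needed for display
    public bool Contains(object value) { return IndexOf(value) >= 0; }
    public bool IsFixedSize { get { return true; } }
    public bool IsReadOnly { get { return true; } }
    public bool IsSynchronized { get { return false; } }
    public object SyncRoot { get { return this; } }
    public IEnumerator GetEnumerator()
    {
        for (int i = 0; i < count; i++) yield return this[i];
    }
    public void CopyTo(Array array, int index)
    {
        for (int i = 0; i < count; i++) array.SetValue(this[i], index + i);
    }
    public int Add(object value) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public void Insert(int index, object value) { throw new NotSupportedException(); }
    public void Remove(object value) { throw new NotSupportedException(); }
    public void RemoveAt(int index) { throw new NotSupportedException(); }
}
```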
Consider your caching strategy. If you’re virtualizing your data, you have the opportunity
to do some intelligent caching. Depending on the context, you might also be able to segment your data into static and dynamic components. You can also consider the standard .NET
technique of WeakReferences.
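For example, a weak-reference cache lets the garbage collector reclaim computed items under memory pressure, rebuilding them on demand. This is a sketch; BuildItem is again a hypothetical factory for the expensive item:

```csharp
using System;
using System.Collections.Generic;

public class WeakItemCache
{
    private readonly Dictionary<int, WeakReference> cache =
        new Dictionary<int, WeakReference>();

    public object GetItem(int index)
    {
        WeakReference wr;
        object item = null;
        if (cache.TryGetValue(index, out wr))
        {
            item = wr.Target;            // null if already collected
        }
        if (item == null)
        {
            item = BuildItem(index);     // rebuild on demand
            cache[index] = new WeakReference(item);
        }
        return item;
    }

    private object BuildItem(int index)
    {
        return "Item " + index;          // placeholder
    }
}
```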
Minimize work while scrolling. First, don’t do anything on the UI thread while the user is
scrolling, because that would make the UI less responsive. Second, don’t do any work for items
that are not visible to the user; this implies waiting for the user to stop scrolling before you can
calculate which items are now visible. You could also segment your data templates according
to whether the user is scrolling; that is, use a simple template if she’s scrolling (perhaps just
text, or a low-resolution image), and a more sophisticated template for items that are visible
when she has stopped.
More UI Performance Tips
Apart from the aforementioned ListBox-specific issues, there are other UI-related performance optimizations that you can consider:
JPG versus PNG It is faster to decode JPG images than PNG images, so you should prefer
JPG over PNG. The only case for which you need to use PNG is if your images have transparency; otherwise, you should default to JPG in the absence of other constraints. Note that the
difference is often very small, so it is not worth spending a lot of effort on this.
Static versus dynamic images It is often faster to render static images than to create
them dynamically at runtime. A complex image could be constructed in XAML, in code, or by
loading an image file. Constructing it in XAML involves more steps, and is the slowest of the
three approaches. Loading from a file is generally the fastest—decoding an image is relatively
cheap, and the benefits mount up if you are reusing the visual multiple times. Of course, if the
visual depends on something at runtime, then you might have no option but to create it in code.
Image scaling It is common to use a fixed-size Image control and then pull in the image
source from a file or resource, which might or might not be the same size as the control.
There’s an obvious performance penalty when scaling images. This is particularly so with
respect to wasted memory for images that are larger than you need, so you should try to
get the source images at the right size in advance. This is another technique that becomes
more critical if you’re adding image items to a ListBox. Furthermore, Windows Phone has a
maximum image size of 2048x2048 pixels. If you use larger images, they will automatically be
sampled at a lower resolution. The algorithm to perform this sampling picks a simple ratio,
so your image can end up with a resolution that is significantly less than 2048x2048. And of
course, it will be slower to render.
Custom decoding If you are scaling images manually, you should use the PictureDecoder
API. You can use this to specify how to decode, and to what target size. Without this, the normal behavior would be to first decode at the image’s native resolution, and then scale down.
There’s a slight performance gain and a significant memory gain in using PictureDecoder.
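A sketch of the API in use follows; the file name and target dimensions are invented, and PictureDecoder lives in the Microsoft.Phone namespace:

```csharp
using System;
using System.IO;
using System.Windows;
using System.Windows.Media.Imaging;
using System.Windows.Resources;
using Microsoft.Phone;

// Decode the JPG directly at the target size, rather than decoding at
// native resolution and then scaling down.
StreamResourceInfo sri = Application.GetResourceStream(
    new Uri("Images/photo.jpg", UriKind.Relative));
using (Stream stream = sri.Stream)
{
    WriteableBitmap bitmap = PictureDecoder.DecodeJpeg(stream, 200, 150);
    image.Source = bitmap;   // 'image' is an Image control on the page
}
```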
Visibility versus Opacity If you need to hide/show UI elements, you have a choice between
using the Visibility property or the Opacity property. An element with Visibility=Collapsed
incurs no cost: the system will not walk the visual tree for the element, and events will not be
propagated for the element. On the other hand, when the time comes to unhide it, setting
Visibility=Visible will incur a heavy cost in creating the element’s tree, so it might be slower
to appear. Using Opacity=0 means that the element’s visual tree is in existence at all times,
which will add to the overall cost of your UI. On the other hand, when the time comes to set
Opacity>0, this will be extremely quick, provided the element is cached. If the element is not
cached, you will pay a penalty. To recap, using Opacity to show/hide a visual that is not cached
is the worst thing you can do, from a performance standpoint. Conversely, the optimal strategy is generally to use Opacity and to enable BitmapCache mode.
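In code, that optimal show/hide pattern looks something like this ('panel' is an arbitrary element name):

```csharp
// Cache the element's rendered bitmap so that opacity changes can be
// handled by the Render thread/GPU without re-rendering.
panel.CacheMode = new BitmapCache();

// Hide: unlike Visibility=Collapsed, the visual tree stays alive...
panel.Opacity = 0;

// ...so showing it again is extremely quick: only the cached texture's
// opacity changes.
panel.Opacity = 1;
```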
Resource versus Content An image embedded as a resource becomes part of your assembly. One side-effect of this is that it is read twice at startup. The reason for this is that the
Windows Phone application platform has to read your assemblies for security purposes to
ensure that this is a valid, certified Windows Phone application. Then, the Common Language
Runtime (CLR) also has to read it (for verification purposes, type resolution, JIT buffering, and
so on). So, if you embed a large number of images into your assembly, they'll all be read twice.
If, instead, you set their build action to be Content, then they’ll simply be added to your XAP,
but not to your assembly. The trade-off is that while embedded resources slow down startup
time, they are faster to load subsequently.
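For example, an image added to the project with its build action set to Content (rather than Resource) can be loaded with a relative URI; the file name here is invented:

```csharp
using System;
using System.Windows.Media.Imaging;

// The image ships in the XAP but is not embedded in the assembly, so it
// does not contribute to the double read of the assembly at startup.
var bitmap = new BitmapImage(new Uri("Images/photo.jpg", UriKind.Relative));
image.Source = bitmap;   // 'image' is an Image control on the page
```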
Desktop versus phone Be very careful about reusing code or controls from desktop Silverlight applications. Even though they might work, they might work very slowly. The Rating
control from the Silverlight toolkit is one example: this has a very large number of UI elements.
Using even one instance of this should give you pause, but you should definitely avoid making
this a part of an item control in a ListBox.
Inline XAML It is very easy to declare complex UI elements in XAML, as opposed to defining the same elements in code directly. You can use the XamlReader class to dynamically load
XAML at runtime. However, you will pay a performance penalty for this; parsing the XAML
and executing the resulting code is always going to be slower than executing code that you’ve
written to do the same work directly.
Panorama versus Pivot Although the Panorama control is an ItemsControl, it is not virtualized: all of the content inside each of the PanoramaItems is rendered on initial load. If you
think about it, this is what enables the Panorama to show part of the next (and sometimes,
previous) items immediately upon load. You should stick to the Metro guidelines, which
suggest that you should not be using a Panorama for heavy work anyway. The Panorama is
intended to be a front-page “magazine” experience. It should be an attractive entry to your
application that encourages the user to explore further. It should not be used for complex
controls or complex lists. The standard themes used on Panorama further encourage this; for
instance, the heading is huge and takes up a lot of space. That's not a reason to fill up the remaining space; it's an encouragement to be minimalist on the Panorama. By contrast, the
Pivot does virtualize its content. Only the first PivotItem is populated on initial load, although
the system does immediately trigger a load for the items to the right and left. So, effectively,
three items are loaded at or shortly after startup. A three-item Panorama will have similar
startup time to a three-item Pivot, but the more items you have, the more the load times
diverge. The Panorama user experience (UX) encourages users to navigate back and forth
between the items, so you can mostly assume that all items need to be available at all times.
This is not true of the Pivot, for which some items might never be seen by the user in any given
session. Thus, if you have, for example, complex lists or animations on an item, you should be
proactive about creating/starting/stopping such elements.
Progress bars The standard library provides a ProgressBar. The Toolkit provides a
PerformanceProgressBar. For determinate scenarios (that is, where you can determine the
percentage progress), use the standard ProgressBar. For indeterminate scenarios, use the
PerformanceProgressBar. The critical difference is that the Toolkit PerformanceProgressBar
does most of its work on the Render thread, whereas the standard ProgressBar uses the UI
thread heavily, and also uses about three times more video memory. You can compare the
difference by running the TestProgressBars solution in the sample code and turning on redraw
regions. Applications built to target version 7.1 can take advantage of the built-in progress
indicator in the SystemTray. Doing this affords performance benefits over using any XAML-based progress bars.
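A sketch of the version 7.1 progress indicator, set from within a page (the indicator text is arbitrary):

```csharp
using Microsoft.Phone.Shell;

// Use the OS-drawn progress indicator in the system tray instead of a
// XAML-based progress bar.
var indicator = new ProgressIndicator
{
    IsIndeterminate = true,   // for indeterminate scenarios
    Text = "Loading...",
    IsVisible = true
};
SystemTray.SetProgressIndicator(this, indicator);   // 'this' is the page
```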
Non-UI Performance Tips
WebClient versus HttpWebRequest In Windows Phone 7, when you make a web request
by using WebClient, regardless of which thread you make it on, the result will always be
returned on the UI thread. In fact, the WebClient class is a wrapper around HttpWebRequest,
whose primary purpose is to simplify making web requests. This has performance implications, especially in data-binding scenarios in which your data-bound items are being sourced
from the web. Conversely, requests made by using HttpWebRequest return on the background
thread, as expected, so you should opt for HttpWebRequest in most cases. On top of that, even
though HttpWebRequest responses are raised on a background thread, it still uses resources
on the thread where it was created. So, the optimal approach for a network-heavy scenario is
to use HttpWebRequest, and to create it on a background thread.
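A hedged sketch of that approach follows; the URL and the Bind method are placeholders:

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;
using System.Windows;

ThreadPool.QueueUserWorkItem(_ =>
{
    // Created on a background thread, so the request's resources and
    // its response callback both stay off the UI thread.
    var request = (HttpWebRequest)WebRequest.Create(
        new Uri("http://example.com/items"));
    request.BeginGetResponse(ar =>
    {
        using (WebResponse response = request.EndGetResponse(ar))
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // Parse and cache here, still off the UI thread.
            string data = reader.ReadToEnd();

            // Dispatch only the final data-binding step to the UI thread.
            Deployment.Current.Dispatcher.BeginInvoke(() => Bind(data));
        }
    }, null);
});
```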
Network calls Avoiding chatty network calls is a well-understood performance optimization technique, but one worth repeating here. If your ListBox items are data-bound to a web
service, you ideally want to pull in a batch of items in one network call and cache them locally
in a list of some kind. You definitely don’t want to be making an individual web method call
for each item. You should also default to filtering on the server, as opposed to bringing down
high volumes of redundant data which you then filter on the client.
Web service data formats Traditional web services typically send data in Simple Object
Access Protocol (SOAP) format. Newer web services, including Windows Communications
Foundation (WCF) Data Services, expose data with the OData protocol, in either XML or JSON
format. OData formats involve significantly less overhead than traditional SOAP formats.
OData JSON, in turn, offers significantly less overhead than OData XML.
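For example, when you hand-roll the request, you can ask an OData service for the JSON representation via the Accept header (the service URL here is a placeholder):

```csharp
using System;
using System.Net;

var request = (HttpWebRequest)WebRequest.Create(
    new Uri("http://example.com/odata/Items"));

// OData services return ATOM/XML by default; JSON carries significantly
// less overhead on the wire.
request.Accept = "application/json";
```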
Static versus dynamic Bing maps If all you need to do in your scenario is show a few static
maps, you should use the Bing web services. Don’t use the standard Bing dynamic maps unless
you need their inherent richer interactive capabilities. If you're not using the Bing Maps control, you save on the control, and you also save on the assembly itself, which is not pulled into memory.
Pages in separate assemblies If your application has pages that are rarely visited, consider
factoring them out to a separate assembly. That way, you save on startup time and memory
usage. It’s important to minimize startup time; first, because slow startup is a bad UX, and
second, because if your startup is too slow, you will fail marketplace ingestion. Factoring pages
out to secondary assemblies means that the additional page(s), and their containing assembly,
are only loaded if and when they’re actually used at runtime. To do this, you would create a
class library project and add the page(s) to that project. Then, add a reference to the class
library from your main project. The syntax of the URI needed for a page in another assembly
is "/<assembly name>;component/<page name>.xaml", which appears as follows in actual code (the assembly and page names shown here are placeholders):

NavigationService.Navigate(
    new Uri("/MyPagesLibrary;component/RarelyUsedPage.xaml", UriKind.Relative));
Minimize constructors Constructors for UI elements as well as handlers for the Loaded
event are executed before the first frame of an application is presented. You can reduce
startup time by reducing the work that you do on constructors and Loaded event handlers.
Wherever it makes sense, you should move work from these methods to later methods. After
the Loaded event, the next most commonly used methods are the OnNavigatedTo override
and the LayoutUpdated event handler. It is generally better to do work in either of these methods than in the constructor or Loaded event handler. However, note that OnNavigatedTo is also