UI vs. Render Thread, and BitmapCache Mode



Render thread  This is designed to be very lightweight, and it is mainly responsible for

stitching together textures to hand off to the Graphics Processing Unit (GPU). The Render

thread handles simple (double) animations, translate, scale, rotate and perspective transforms,

opacity (but not opacity masks), and rectangular clipping—all of which can be hardware-accelerated via the GPU. Note that for scale transforms, whenever the scale crosses the 50 percent threshold of the original (or previously rasterized) texture size, the visual is re-rasterized

on the UI thread.

Briefly, the initial template expansion and rasterization for an element occur on the UI thread.

These rendered elements are then cached in memory as bitmaps and handed off to the Render

thread which works with the GPU to draw the frame and add it to the back buffer for display. From

this point, if you don’t make any changes to the cached element, it doesn’t have to be redrawn. Any

changes you might make to the cached element fall into two categories:



A change that can be handled by the GPU. Some examples are rectangular clipping, Translate

Transform, or ScaleTransform.

A change that requires the element to be re-rendered. This can include a color change, non-rectangular clipping, opacity mask, padding or margin changes, and so on. In this case, the

cached bitmap will be deleted, and the element will be re-rendered on the UI thread to generate a fresh bitmap for caching.

So, there are at least two threads in the system, plus one or more additional threads that the application itself can choose to create, either indirectly or directly, as summarized in Figure 14-2.

Note  Version 7.1 also introduces a third system thread that is intended specifically to

handle input.
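The application-created background threads mentioned above can be sketched as follows. This is a minimal illustration, not code from the sample solutions: the `BackgroundWork` class and its `Run` method are hypothetical names, and the sketch assumes the Silverlight for Windows Phone runtime, where `Deployment.Current.Dispatcher` marshals a call back to the UI thread.

```csharp
using System;
using System.Threading;
using System.Windows;

// Hypothetical helper (not part of the sample code): runs work on a
// thread-pool thread and then reports completion on the UI thread.
public static class BackgroundWork
{
    public static void Run(Action work, Action onCompleted)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            // Runs on a background thread, so the UI thread stays free
            // to process input and per-frame callbacks.
            work();

            // UI elements can be touched only on the UI thread, so the
            // completion callback is dispatched back to it.
            Deployment.Current.Dispatcher.BeginInvoke(onCompleted);
        });
    }
}
```

A call might look like `BackgroundWork.Run(() => Thread.Sleep(3000), () => { /* update the UI here */ });`, where the sleep stands in for real work.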

In general, you want to take advantage of the Render thread as much as you can to offload work

from the UI thread. To see where there might be opportunities for optimizing between the UI thread

and the Render thread, you can turn on display of redraw regions and bitmap caching by using the

Application settings, as shown in the following code snippet. Note that the standard Microsoft Visual

Studio project templates generate code for this in the App.xaml.cs, which you can uncomment.

Application.Current.Host.Settings.EnableRedrawRegions = true;

Application.Current.Host.Settings.EnableCacheVisualization = true;

506  Part III  Extended Services


[Diagram: three parallel lanes labeled Render Thread, UI Thread, and Background Thread(s). Labels in the diagram: Per-Frame Callback, Pass Rastered Images, Create Thread, Do Work, (Report Progress), (Cancel Thread).]

Figure 14-2  The Render thread, UI thread, and background threads.

Consider an application that moves a ball around the screen. Depending on how you write the

code, this will result in different rendering behavior. The following sample application (the Bouncing

Ball solution in the sample code) does not take advantage of the Render thread and GPU; instead, it

implements timer-based callbacks on the UI thread. The application also responds to user touch, via

a GestureListener, so even more work is being done on the UI thread. The application provides four

AppBar buttons. With the first three, the user can toggle EnableRedrawRegions, BitmapCache mode,

and EnableCacheVisualization, respectively. This allows you to see the effects of the design choices on

the UI rendering behavior (you’ll examine the purpose of the fourth button later).

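private void appbarRedraw_Click(object sender, EventArgs e)
{
    // Toggle the redraw-region overlay.
    Application.Current.Host.Settings.EnableRedrawRegions =
        !Application.Current.Host.Settings.EnableRedrawRegions;
}

private void appbarCache_Click(object sender, EventArgs e)
{
    // Toggle bitmap caching on the ball element.
    if (ball.CacheMode == null)
    {
        ball.CacheMode = new BitmapCache();
    }
    else
    {
        ball.CacheMode = null;
    }
}

private void appbarCacheViz_Click(object sender, EventArgs e)
{
    // Toggle the cache-visualization overlay.
    Application.Current.Host.Settings.EnableCacheVisualization =
        !Application.Current.Host.Settings.EnableCacheVisualization;
}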

Figure 14-3 illustrates the application with EnableRedrawRegions turned on, showing which elements are being redrawn with each frame (the colors are arbitrary; they cycle between purple, yellow,

and magenta). As the ball bounces around the screen, the redraw region is a rectangle that expands

or contracts to include all UI elements. Thus, the redraw rectangle varies in size, depending on where

the ball is in relation to the other UI elements.

Figure 14-3  The regions of your application that need to be redrawn can vary over time.

The redrawing highlighted when the EnableRedrawRegions flag is turned on is essentially drawing in software (on the CPU) and does not offload any work to the Render thread and the GPU. In all

cases, the first time a visual is drawn, it will be drawn in software, but the aim generally should be to

draw a visual once on the UI thread, and then hand it off to the Render thread (which in turn, hands

it off to the GPU) for all subsequent drawing. So, what you want to avoid is the situation in which you

can see the same region being drawn repeatedly (as evidenced by the colors changing repeatedly).

The bottom line is that if you see something in your application that’s frequently changing color,

it means it’s being frequently redrawn. You should therefore examine your code to see if you can

reduce this.



Figure 14-4 demonstrates the application again, this time with EnableCacheVisualization turned

on, which shows the areas of the application that are cached. The un-cached surfaces are rendered

in software and the cached surfaces are passed to the GPU and rendered in hardware. With this flag

turned on, each element/texture that is handed off to the GPU is tinted blue and has a transparency

applied. This way, you can see where textures are overlapping. The darkest shades indicate that multiple textures are lying atop one another.

Note  Turning on EnableCacheVisualization degrades performance, so you should not

attempt to measure frame rates while this is active. Note also that the behavior of this

flag on Windows Phone is different from the behavior on desktop Silverlight: on desktop Silverlight, the tinted areas are areas that are not drawn by the GPU; on Silverlight for

Windows Phone, the tinted areas are those that are drawn by the GPU.

Figure 14-4  Cache visualization with BitmapCache mode turned off (on the left) and on (right).

You can specify in XAML that an element should have its rendered bitmap cached by setting its CacheMode attribute, as shown here:

CacheMode="BitmapCache"

Or, you can specify that it should be cached in code:

ball.CacheMode = new BitmapCache();

Chapter 14  Go to Market   509


The effect of setting BitmapCache mode is to skip the render phase for the element, which will

have a significant effect on performance. The screenshot on the right in Figure 14-4 shows what

happens if you set the CacheMode on the ball to BitmapCache. The UI thread works considerably less

to display the ball, the frame rate on this application goes up, and the fill rate goes down. The frame

rate, fill rate, and the other performance counters are discussed in more detail in Chapter 8, “Diagnostics and Debugging.”

When you use bitmap caching, you should group cached elements together, following non-cached

elements in the visual tree. Do not interleave cached/non-cached elements. This way, non-cached elements can be included in a single intermediate background texture, which improves performance.

Note  There’s a downside to caching: it takes up additional memory, and the more you

cache, the higher your fill rate will be. Therefore, you shouldn’t simply cache everything. Instead, you should profile your application by using EnableRedrawRegions and

EnableCacheVisualization, and look for opportunities to optimize caching. If you do this

right, you should see the frame rate go up, and the FillRate count go down.

The fourth App Bar button is implemented to simulate an operation that needs to do work on the

UI thread:

private void appbarBlockUI_Click(object sender, EventArgs e)
{
    // Block the UI thread for three seconds.
    Thread.Sleep(3000);
}


If you start the ball bouncing and then tap this button, the bouncing will stop for three seconds.

This demonstrates the critical flaw in this application’s design, because so much relies on work being

done on the UI thread.

The previous example relied on user interaction, and a lot of work was done on the UI thread.

Figure 14-5 shows an alternative example of a bouncing ball (the BouncingStoryboard solution in the

sample code); this one involves only minimal user interaction (start and stop App Bar buttons) and

does most of the work on the Render thread. In addition, the animation is fixed and declared in XAML

via a storyboard. This contains two main animations: one that moves the ball from top to bottom

(using the Canvas.Top attached property), and another that moves the ball from left to right (using

Canvas.Left). The top-bottom animation itself contains a third animation, which uses one of the Silverlight EasingFunctions to provide a bouncing motion.


        Duration="0:0:12" Storyboard.TargetName="ball"  








        Duration="0:0:12" Storyboard.TargetName="ball"  




The big change here is to move the per-frame manual animation (which was all done on the UI

thread) to a double animation implemented via a storyboard, which can all be done by the Render

thread.
The App Bar buttons are there to start and stop the storyboard. There is also a button to toggle

redraw regions, and one to block the UI thread. Because this ball is part of a simple animation and

is automatically cached, there’s no button to set CacheMode. Observe also that if you block the UI

thread, in this application, the ball will continue to bounce because the animation is all being done on

the Render thread.

Figure 14-5  Using the Render thread for animations reduces the amount of redrawing.
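For readers who prefer to see the storyboard described above expressed in code, the following is a rough sketch of an equivalent built in C#. It is not the sample's XAML: the class name, method name, and the `toTop` and `toLeft` parameters are hypothetical, and the choice of `BounceEase` is an assumption based on the description of a bouncing motion. Only the 12-second duration and the Canvas.Top and Canvas.Left targets come from the text.

```csharp
using System;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media.Animation;

// Hypothetical helper: builds a storyboard with two double animations
// that move an element inside a Canvas.
public static class BounceStoryboardSketch
{
    public static Storyboard Create(UIElement ball, double toTop, double toLeft)
    {
        var duration = new Duration(TimeSpan.FromSeconds(12));

        // Top-to-bottom movement with an easing function for the bounce.
        // BounceEase is an assumed choice of easing function.
        var topAnimation = new DoubleAnimation
        {
            To = toTop,
            Duration = duration,
            EasingFunction = new BounceEase { EasingMode = EasingMode.EaseOut }
        };
        Storyboard.SetTarget(topAnimation, ball);
        Storyboard.SetTargetProperty(topAnimation, new PropertyPath("(Canvas.Top)"));

        // Left-to-right movement at a constant rate.
        var leftAnimation = new DoubleAnimation { To = toLeft, Duration = duration };
        Storyboard.SetTarget(leftAnimation, ball);
        Storyboard.SetTargetProperty(leftAnimation, new PropertyPath("(Canvas.Left)"));

        var storyboard = new Storyboard();
        storyboard.Children.Add(topAnimation);
        storyboard.Children.Add(leftAnimation);
        return storyboard;
    }
}
```

The start and stop App Bar buttons would then call `Begin()` and `Stop()` on the returned storyboard.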

The following objects will be automatically cached:


The target of any storyboard-driven animation that uses the Render thread (as shown in the

BouncingStoryboard solution).


The target of any plane projection, either static or animated.


All MediaElement objects.


Child items in a ScrollViewer or ListBox.



As discussed in Chapter 8, the three most important performance counters to monitor are the UI

Thread Frame Rate, the Render Thread Frame Rate, and the Fill Rate. If your UI thread is overloaded,

you’ll see the UI Thread Frame Rate drop, which is a sign that you need to offload work from the UI

thread. From the other end, the Fill Rate corresponds to how hard the GPU is working, and as the Fill

Rate exceeds 2, the Render Thread Frame Rate will drop. So, a high Fill Rate is a sign that you need to

minimize your use of UI elements (by reducing the number and/or complexity of your elements, setting BitmapCache mode, avoiding interleaved cached/non-cached elements, and so on).
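To watch these counters on screen while you experiment, you can turn on the frame rate counter alongside the two flags shown earlier. The sketch below is one possible arrangement; the `PerformanceOverlays` class name is hypothetical, and restricting the overlay to debugging sessions is a design choice rather than a requirement.

```csharp
using System.Diagnostics;
using System.Windows;

// Hypothetical helper: call once at startup, for example from the
// App constructor in App.xaml.cs.
public static class PerformanceOverlays
{
    public static void Enable()
    {
        // Show the counters only when a debugger is attached, so that
        // they never appear in a released build.
        if (Debugger.IsAttached)
        {
            // Displays the on-screen performance counters, which include
            // the frame rates and the fill rate.
            Application.Current.Host.Settings.EnableFrameRateCounter = true;
        }
    }
}
```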

UI Layout and ListBoxes

The ListBox is an obvious element for which scale has significant performance implications. UI layout

is the most expensive operation performed on the UI thread. There’s nothing to stop you from creating complex layouts with nested Grids, StackPanels, nested ListBoxes, plus complex ValueConverters,

custom controls, and so on. This is bad enough if you only have one of these on your page, but as

soon as you use such a complex element as an item within a ListBox, the scale issues become more

obvious. Plus, of course, there’s nothing to stop you from putting many thousands of these items into

your ListBox. On top of that, you could be sourcing your data from a remote service over the Internet, and the data might include large images or large volumes of redundant metadata. The potential

permutations implied in this kind of model represent a potential recipe for bad performance and an

unresponsive UI.

Here are some UI best practice guidelines for optimizing runtime performance and responsiveness

when using ListBoxes:





Avoid using complex item data templates.  Most particularly, don’t use nested ListBoxes,

and don’t use UserControls or custom controls. Also, ensure that you have the data template

in a fixed-size container such as a Grid with an explicit Height set on it. As a performance optimization, the ListBox calculates the height of three screens’ worth of items (the one currently

visible, plus one above and below), and this doesn’t work if your items vary in size.

Avoid using complex converters.  If possible, try to perform conversions in the data source

request or as you pull the data into a local cache, before attempting to render it in the UI.

Offload work to background threads.  Typical candidates for this include the retrieval,

processing, and caching of item data, leaving only the final data-binding on the UI thread. The

trade-off here is that doing work on a background thread before dispatching to the UI thread

might cause items to load more slowly. However, the advantage is that the UI will remain

responsive. If you do it right, even the slower load might be apparent only on the initial batch

of items, because you can continue working on the background while the user is exercising the

UI. As you’re doing work on the background thread, be sure to yield control frequently (perhaps by a simple Thread.Sleep) so that the OS can schedule the UI thread more frequently.

Virtualize your data if you can.  This is particularly relevant if you need to manipulate raw

data and compute the final data for rendering in the UI. The ListBox virtualizes the UI (via the

VirtualizingStackPanel, which is the default items host for the ListBox), such that if you have,

for example, 1,000 items, only three screens’ worth of UI elements are created, and then these



are recycled as the user scrolls other items into view. However, the data is not virtualized; the

ListBox property you data-bind is ItemsSource, which is defined as an IEnumerable. This means

that the only way the ListBox knows the size of the list is to enumerate all the items. This information is required so that the ListBox can configure itself correctly; for example, to calculate

the scrollbar indicator. More significantly, it assumes that all of your data items are constructed

completely before handing off to the UI. You can mitigate this by binding to collections that

implement IList, which provides the Count, IndexOf, and indexer (this[]) members. In your implementation, you can be smart about constructing each of your computed items for the list.

For example, you can defer the composition/computation processing for items until they’re

required for the UI.



Consider your caching strategy.  If you’re virtualizing your data, you have the opportunity

to do some intelligent caching. Depending on the context, you might also be able to segment your data into static and dynamic components. You can also consider the standard .NET

technique of WeakReferences.

Minimize work while scrolling.  First, don’t do anything on the UI thread while the user is

scrolling, because that would make the UI less responsive. Second, don’t do any work for items

that are not visible to the user; this implies waiting for the user to stop scrolling before you can

calculate which items are now visible. You could also segment your data templates according

to whether the user is scrolling; that is, use a simple template if she’s scrolling (perhaps just

text, or a low-resolution image), and a more sophisticated template for items that are visible

when she has stopped.
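The data virtualization and caching guidelines above can be combined in a small collection class. The following is a sketch under stated assumptions, not code from the sample solutions: `LazyItemList<T>` and its `createItem` delegate are hypothetical names, the list is read-only, and items are held through WeakReference so that the garbage collector can reclaim ones that are no longer referenced by the UI.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical read-only list that builds each item only when it is
// first requested, and caches built items through weak references.
public class LazyItemList<T> : IList where T : class
{
    private readonly int count;
    private readonly Func<int, T> createItem;
    private readonly Dictionary<int, WeakReference> cache =
        new Dictionary<int, WeakReference>();

    public LazyItemList(int count, Func<int, T> createItem)
    {
        this.count = count;
        this.createItem = createItem;
    }

    // The size is known up front, so nothing has to be enumerated.
    public int Count { get { return count; } }

    public object this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException("index");

            WeakReference reference;
            T item = cache.TryGetValue(index, out reference)
                ? reference.Target as T
                : null;
            if (item == null)
            {
                // Not built yet, or already collected: build it now.
                item = createItem(index);
                cache[index] = new WeakReference(item);
            }
            return item;
        }
        set { throw new NotSupportedException(); }
    }

    // Only items that are still cached can be located.
    public int IndexOf(object value)
    {
        if (value == null) return -1;
        foreach (KeyValuePair<int, WeakReference> pair in cache)
        {
            if (ReferenceEquals(pair.Value.Target, value)) return pair.Key;
        }
        return -1;
    }

    public bool Contains(object value) { return IndexOf(value) >= 0; }
    public bool IsReadOnly { get { return true; } }
    public bool IsFixedSize { get { return true; } }
    public bool IsSynchronized { get { return false; } }
    public object SyncRoot { get { return this; } }

    // Enumerating builds every item, so it defeats the deferral.
    public IEnumerator GetEnumerator()
    {
        for (int i = 0; i < count; i++) yield return this[i];
    }

    public void CopyTo(Array array, int index)
    {
        for (int i = 0; i < count; i++) array.SetValue(this[i], index + i);
    }

    public int Add(object value) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public void Insert(int index, object value) { throw new NotSupportedException(); }
    public void Remove(object value) { throw new NotSupportedException(); }
    public void RemoveAt(int index) { throw new NotSupportedException(); }
}
```

Assuming a ListBox named `listBox`, you might bind it with `listBox.ItemsSource = new LazyItemList<string>(1000, i => "Item " + i);`. Whether a particular control asks for items by index or enumerates the whole collection is worth verifying with a breakpoint in `createItem`.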

More UI Performance Tips

Apart from the aforementioned ListBox-specific issues, there are other UI-related performance optimizations that you can consider:




JPG versus PNG  It is faster to decode JPG images than PNG images, so you should prefer

JPG over PNG. The only case for which you need to use PNG is if your images have transparency; otherwise, you should default to JPG in the absence of other constraints. Note that the

difference is often very small, so it is not worth spending a lot of effort on this.

Static versus dynamic images  It is often faster to render static images than to create

them dynamically at runtime. A complex image could be constructed in XAML, in code, or by

loading an image file. Constructing it in XAML involves more steps, and is the slowest of the

three approaches. Loading from a file is generally the fastest—decoding an image is relatively

cheap, and the benefits mount up if you are reusing the visual multiple times. Of course, if the

visual depends on something at runtime, then you might have no option but to create it in


Image scaling  It is common to use a fixed-size Image control and then pull in the image

source from a file or resource, which might or might not be the same size as the control.

There’s an obvious performance penalty when scaling images. This is particularly so with

respect to wasted memory for images that are larger than you need, so you should try to



get the source images at the right size in advance. This is another technique that becomes

more critical if you’re adding image items to a ListBox. Furthermore, Windows Phone has a

maximum image size of 2048x2048 pixels. If you use larger images, they will automatically be

sampled at a lower resolution. The algorithm to perform this sampling picks a simple ratio,

so your image can end up with a resolution that is significantly less than 2048x2048. And of

course, it will be slower to render.






Custom decoding  If you are scaling images manually, you should use the PictureDecoder

API. You can use this to specify how to decode, and to what target size. Without this, the normal behavior would be to first decode at the image’s native resolution, and then scale down.

There’s a slight performance gain and a significant memory gain in using PictureDecoder.

Visibility versus Opacity  If you need to hide/show UI elements, you have a choice between

using the Visibility property or the Opacity property. An element with Visibility=Collapsed

incurs no cost, the system will not walk the visual tree for the element, and events will not be

propagated for the element. On the other hand, when the time comes to unhide it, setting

Visibility=Visible will incur a heavy cost in creating the element’s tree, so it might be slower

to appear. Using Opacity=0 means that the element’s visual tree is in existence at all times,

which will add to the overall cost of your UI. On the other hand, when the time comes to set

Opacity>0, this will be extremely quick, provided the element is cached. If the element is not

cached, you will pay a penalty. To recap, using Opacity to show/hide a visual that is not cached

is the worst thing you can do, from a performance standpoint. Conversely, the optimal strategy is generally to use Opacity and to enable BitmapCache mode.

Resource versus Content  An image embedded as a resource becomes part of your assembly. One side-effect of this is that it is read twice at startup. The reason for this is that the

Windows Phone application platform has to read your assemblies for security purposes to

ensure that this is a valid, certified Windows Phone application. Then, the Common Language

Runtime (CLR) also has to read it (for verification purposes, type resolution, JIT buffering, and

so on). So, if you embed a large number of images into your assembly, they’ll all be read twice.

If, instead, you set their build action to be Content, then they’ll simply be added to your XAP,

but not to your assembly. The trade-off is that while embedded resources slow down startup

time, they are faster to load subsequently.

Desktop versus phone  Be very careful about reusing code or controls from desktop Silverlight applications. Even though they might work, they might work very slowly. The Rating

control from the Silverlight toolkit is one example: this has a very large number of UI elements.

Using even one instance of this should give you pause, but you should definitely avoid making

this a part of an item control in a ListBox.

Inline XAML  It is very easy to declare complex UI elements in XAML, as opposed to defining the same elements in code directly. You can use the XamlReader class to dynamically load

XAML at runtime. However, you will pay a performance penalty for this; parsing the XAML

and executing the resulting code is always going to be slower than executing code that you’ve

written to do the same work directly.





Panorama versus Pivot  Although the Panorama control is an ItemsControl, it is not virtualized: all of the content inside each of the PanoramaItems is rendered on initial load. If you

think about it, this is what enables the Panorama to show part of the next (and sometimes,

previous) items immediately upon load. You should stick to the Metro guidelines, which

suggest that you should not be using a Panorama for heavy work anyway. The Panorama is

intended to be a front-page “magazine” experience. It should be an attractive entry to your

application that encourages the user to explore further. It should not be used for complex

controls or complex lists. The standard themes used on Panorama further encourage this; for

instance, the heading is huge and takes up a lot of space. That’s not a reason to fill up the

remaining space, it’s an encouragement to be minimalist on the Panorama. By contrast, the

Pivot does virtualize its content. Only the first PivotItem is populated on initial load, although

the system does immediately trigger a load for the items to the right and left. So, effectively,

three items are loaded at or shortly after startup. A three-item Panorama will have similar

startup time to a three-item Pivot, but the more items you have, the more the load times

diverge. The Panorama user experience (UX) encourages users to navigate back and forth

between the items, so you can mostly assume that all items need to be available at all times.

This is not true of the Pivot, for which some items might never be seen by the user in any given

session. Thus, if you have, for example, complex lists or animations on an item, you should be

proactive about creating/starting/stopping such elements.

Progress bars  The standard library provides a ProgressBar. The Toolkit provides a

PerformanceProgressBar. For determinate scenarios (that is, where you can determine the

percentage progress), use the standard ProgressBar. For indeterminate scenarios, use the

PerformanceProgressBar. The critical difference is that the Toolkit PerformanceProgressBar

does most of its work on the Render thread, whereas the standard ProgressBar uses the UI

thread heavily, and also uses about three times more video memory. You can compare the

difference by running the TestProgressBars solution in the sample code and turning on redraw

regions. Applications built to target version 7.1 can take advantage of the built-in progress

indicator in the SystemTray. Doing this affords performance benefits over using any XAML-based progress bars.
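Two of the tips above, custom decoding and the system tray progress indicator, come down to a single API call each, and can be sketched as follows. The `UiTipSketches` class and its method and parameter names are hypothetical. The sketch assumes a JPEG source stream and, for the progress indicator, an application that targets version 7.1 and a page whose system tray is visible.

```csharp
using System.IO;
using System.Windows;
using System.Windows.Media.Imaging;
using Microsoft.Phone;
using Microsoft.Phone.Shell;

// Hypothetical helpers illustrating two of the tips above.
public static class UiTipSketches
{
    // Decodes a JPEG with a maximum target size, rather than decoding
    // at the image's native resolution and scaling down afterward.
    public static WriteableBitmap DecodeScaled(
        Stream jpegStream, int maxWidth, int maxHeight)
    {
        return PictureDecoder.DecodeJpeg(jpegStream, maxWidth, maxHeight);
    }

    // Shows the built-in indeterminate progress indicator in the
    // system tray of the given page.
    public static void ShowBusy(DependencyObject page, string text)
    {
        SystemTray.SetProgressIndicator(page, new ProgressIndicator
        {
            IsIndeterminate = true,
            IsVisible = true,
            Text = text
        });
    }
}
```

From a page's code-behind, `UiTipSketches.ShowBusy(this, "Loading")` would show the indicator. Hiding it again means setting `IsVisible` to false on the same ProgressIndicator instance.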

Non-UI Performance Tips


WebClient versus HttpWebRequest  In Windows Phone 7, when you make a web request

by using WebClient, regardless of which thread you make it on, the result will always be

returned on the UI thread. In fact, the WebClient class is a wrapper around HttpWebRequest, and its

primary purpose is to simplify making web requests. This has performance implications, especially in data-binding scenarios in which your data-bound items are being sourced

from the web. Conversely, requests made by using HttpWebRequest return on a background

thread, as expected, so you should opt for HttpWebRequest in most cases. On top of that, even

though HttpWebRequest responses are raised on a background thread, it still uses resources

on the thread where it was created. So, the optimal approach for a network-heavy scenario is

to use HttpWebRequest, and to create it on a background thread.







Network calls  Avoiding chatty network calls is a well understood performance optimization technique, but one worth repeating here. If your ListBox items are data-bound to a web

service, you ideally want to pull in a batch of items in one network call and cache them locally

in a list of some kind. You definitely don’t want to be making an individual web method call

for each item. You should also default to filtering on the server, as opposed to bringing down

high volumes of redundant data which you then filter on the client.

Web service data formats  Traditional web services typically send data in Simple Object

Access Protocol (SOAP) format. Newer web services, including Windows Communication

Foundation (WCF) Data Services, expose data with the OData protocol, in either XML or JSON

format. OData formats involve significantly less overhead than traditional SOAP formats.

OData JSON, in turn, offers significantly less overhead than OData XML.

Static versus dynamic Bing maps  If all you need to do in your scenario is show a few static

maps, you should use the Bing web services. Don’t use the standard Bing dynamic maps unless

you need their inherent richer interactive capabilities. If you’re not using the Bing Maps control, you save on the control, and you also save on the assembly itself, which is not pulled into

your XAP.

Pages in separate assemblies  If your application has pages that are rarely visited, consider

factoring them out to a separate assembly. That way, you save on startup time and memory

usage. It’s important to minimize startup time; first, because slow startup is a bad UX, and

second, because if your startup is too slow, you will fail marketplace ingestion. Factoring pages

out to secondary assemblies means that the additional page(s), and their containing assembly,

are only loaded if and when they’re actually used at runtime. To do this, you would create a

class library project and add the page(s) to that project. Then, add a reference to the class

library from your main project. The syntax of the URI needed for a page in another assembly

is “/AssemblyName;component/PageName.xaml”, which appears as follows in actual code:


NavigationService.Navigate(new Uri(

    "/MyPageLibrary;component/Page2.xaml", UriKind.Relative));


Minimize constructors  Constructors for UI elements as well as handlers for the Loaded

event are executed before the first frame of an application is presented. You can reduce

startup time by reducing the work that you do on constructors and Loaded event handlers.

Wherever it makes sense, you should move work from these methods to later methods. After

the Loaded event, the next most commonly used methods are the OnNavigatedTo override

and the LayoutUpdated event handler. It is generally better to do work in either of these methods than in the constructor or Loaded event handler. However, note that OnNavigatedTo is also
