r7 - 06 Dec 2007 - 15:56:28 - BryanStearns

Busy Developers Guide to Chandler Performance Optimization

Does some action you do in Chandler feel too slow? If so, you should first check if our existing performance tests cover that scenario.

If there is no test, start by writing one; the easiest way is to copy one of the existing tests. This way you and others will be able to run the same test easily, repeatedly, with the same settings. Also, if we add that test to Tinderbox we get automatic monitoring of how the performance changes over time.

Once you find or write a test, run it a few times to get a baseline number. When you make improvements, make sure that the new numbers beat the baseline by clearly more than the normal variation between runs.

To find out what exactly is too slow, you MUST profile the code. It is practically impossible to tell just by reading the code.

See how you can run the performance tests.

Workflow:

  1. Identify slow usage scenario
  2. Find or create a test
  3. Get baseline performance numbers
  4. Get profile
  5. Analyze profile
  6. Create potential fix
  7. Repeat from 4. until satisfied
  8. Verify results
  9. Communicate

Getting Times, Reading Performance Test Output

Whether you are running the test for the first time to get a baseline number or after some changes, you should first close down all other programs to reduce the normal random performance variation. You should also run the test several times to see how much actual variance there is between the runs.

IMPORTANT: Always measure and profile using optimized release builds.

An easy way to get the numbers is to use rt.py, which takes an optional --repeat option to specify how many times to run the performance test; in practice 5 is a reasonably good number of repetitions:

./tools/rt.py -t PerfStampEvent --repeat=5

Note: If you're doing one of the large-data tests (that is, named "PerfLargeDataSomethingOrOther"), you'll need to generate the large repository (which rt.py will automatically tell the large-data tests to restore when it runs them). The PerfImportCalendar test creates this repository, so just run it once to make that happen:

./tools/rt.py -t PerfImportCalendar

rt.py will pretty print the results and calculate standard deviation for you. These will look like this:

PerfStampEvent.py      4.62   4.52   4.76   4.54   5.69  |   4.62 ±   0.49

The numbers before | are the individual runs, the number immediately after | is the median of the values and the last number is the standard deviation.
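
If you want to check a summary line by hand, the two summary values can be recomputed with Python's statistics module (available in Python 3). This is only a sketch: it does not show how rt.py computes its numbers, and the use of the sample standard deviation is an assumption that happens to match the output shown above.

```python
import statistics

# The five run times (seconds) from the PerfStampEvent example above.
runs = [4.62, 4.52, 4.76, 4.54, 5.69]

median = statistics.median(runs)
# Sample standard deviation (n - 1 in the denominator).
# Assumed, not confirmed from the rt.py source.
stdev = statistics.stdev(runs)

print("%6.2f +/- %6.2f" % (median, stdev))
```

Note how the single slow run (5.69) inflates the standard deviation but leaves the median untouched, which is one reason to look at the median rather than the mean of a handful of runs.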

WARNING: The pretty printed results show only the last measured test from each file. See the "catsProfile command line argument" section below for more information.

Below the pretty printed values you will find all of the raw results. You also get these when you run a test without rt.py. They include lines such as the following:

OSAF_QA: Perf_Stamp_as_Event.Note_creation | 13845 | 1.299634
OSAF_QA: Perf_Stamp_as_Event.Change_the_Event_stamp | 13845 | 0.221906
OSAF_QA: Perf_Stamp_as_Event | 13845 | 2.324717

The text after OSAF_QA: names the test, the next field is the revision number, and the last field is the time in seconds. So which line is the actual test result? perf.py has the official list, but it is generally easy to determine; in the example above, the middle line shows the time it took to stamp.
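
If you collect these lines from several runs, a few lines of Python are enough to split them into fields. The sketch below assumes the format is exactly as in the sample lines above; parse_osaf_qa is a hypothetical helper written for this page, not part of the Chandler tools.

```python
def parse_osaf_qa(line):
    """Return (test_name, revision, seconds) for an OSAF_QA line, else None."""
    prefix = "OSAF_QA:"
    if not line.startswith(prefix):
        return None
    # Fields are separated by "|": test name, revision number, time in seconds.
    fields = [f.strip() for f in line[len(prefix):].split("|")]
    if len(fields) != 3:
        return None
    name, revision, seconds = fields
    return name, int(revision), float(seconds)

# Sample lines copied from the output shown above.
sample = """\
OSAF_QA: Perf_Stamp_as_Event.Note_creation | 13845 | 1.299634
OSAF_QA: Perf_Stamp_as_Event.Change_the_Event_stamp | 13845 | 0.221906
OSAF_QA: Perf_Stamp_as_Event | 13845 | 2.324717
"""

results = [r for r in map(parse_osaf_qa, sample.splitlines()) if r is not None]
for name, revision, seconds in results:
    print("%-50s r%d  %.3f s" % (name, revision, seconds))
```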

Getting a Profile

There are several ways to profile Chandler code.

catsProfile command line argument

WARNING: --catsProfile will not work correctly with tests that do several tests in a single file. These include at least the following: PerfLargeDataJumpWeek.py, PerfLargeDataOverlayCalendar.py, PerfLargeDataSwitchCalendar.py, PerfLargeDataSwitchToAllView.py, PerfSwitchToAllView.py and PerfLargeDataSharing.py.

If you add the --catsProfile=<filename> command line argument when you run Chandler, the scripting and testing framework will automatically generate a hotshot profile for you and save it in the file you named. The framework will try its best to skip profiling code that is part of the test framework and which would not be run in a real world scenario.

rt.py can make this even easier. For example, to get a profile of PerfStampEvent.py, run this:

./tools/rt.py -Pt PerfStampEvent

This will create the hotshot profile in ./test_profile/PerfStampEvent.hotshot.

Modifying code to run in hotshot profiler

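If --catsProfile does not work for you, you can also manually wrap the call you want to measure in a hotshot profiler:

# we want to profile foobar()
import hotshot
prof = hotshot.Profile("foobar.prof")
try:
    prof.runcall(foobar)   # runs foobar() with profiling enabled
finally:
    prof.close()           # flush and close the profile log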

There is more documentation about hotshot on the Python website, including interactive samples on how to run and analyze a hotshot profile.

Using quickprofile

Alec Flett wrote a quickprofile module. This is how you can get a profile for a function:

    from util.easyprof import QuickProfile

    @QuickProfile('foobar.prof')
    def foobar():
        pass  # the code you want to profile goes here

Analysing profile

Whatever tool you use, you are looking for code that is too slow.

There are three interesting things per function: cumulative time, individual time, and number of times called. You should first sort by cumulative time.

Look for functions and methods that account for at least 1% of the profile (preferably much more than that), starting with the highest % of course.

Some functions are slow because they are called often. Some others are simply slow. The worst are those that are slow and are called often.
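
To make the three numbers concrete, here is a small sketch of sorting a profile by cumulative time. Note the substitution: hotshot only exists in Python 2, so this example uses the standard cProfile and pstats modules instead, which report the same three values (ncalls is the number of calls, tottime the individual time, cumtime the cumulative time). The functions foobar and slow_helper are made up for illustration.

```python
import cProfile
import io
import pstats

def slow_helper(n):
    # Hypothetical helper: not very fast, and called many times.
    return sum(i * i for i in range(n))

def foobar():
    # Hypothetical function under test.
    return [slow_helper(2000) for _ in range(200)]

profiler = cProfile.Profile()
profiler.runcall(foobar)

out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
# Sort by cumulative time first, then list the ten most expensive entries.
stats.sort_stats("cumulative").print_stats(10)
report = out.getvalue()
print(report)
```

In the report, foobar should appear near the top by cumulative time, while most of the individual time is spent further down in slow_helper and the generator it drives.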

When looking at the profile, think about what you are seeing. Are all the function calls actually needed? In the best case you find code that should not be called at all and you can remove it, or can remove the call to it in the scenario you are profiling. In some cases you will find code that is called too often, for example creating an object in a loop when it could be created once outside the loop.
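
As an illustration of the "created in a loop" case, the sketch below builds the same set on every iteration in one version and once before the loop in the other. The names and data are invented for this example and the timings will vary from machine to machine.

```python
import timeit

WORDS = ["alpha", "beta", "gamma", "delta"] * 250
STOP_WORDS = ["beta", "delta"]

def filter_slow(words):
    kept = []
    for w in words:
        stop = set(STOP_WORDS)   # rebuilt on every iteration
        if w not in stop:
            kept.append(w)
    return kept

def filter_fast(words):
    stop = set(STOP_WORDS)       # built once, outside the loop
    kept = []
    for w in words:
        if w not in stop:
            kept.append(w)
    return kept

# Both versions must return the same result before timing means anything.
assert filter_slow(WORDS) == filter_fast(WORDS)

print("in loop:      %.4f s" % timeit.timeit(lambda: filter_slow(WORDS), number=200))
print("outside loop: %.4f s" % timeit.timeit(lambda: filter_fast(WORDS), number=200))
```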

One typical performance optimization is trading space for speed. In other words, caching the results of slow calculations.
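
A minimal sketch of that trade, using functools.lru_cache from the Python 3 standard library. The function slow_square stands in for an expensive calculation and the counter only exists to show how often the real work runs; neither comes from Chandler.

```python
import functools

call_count = 0

@functools.lru_cache(maxsize=None)
def slow_square(n):
    """Stand-in for an expensive calculation (hypothetical)."""
    global call_count
    call_count += 1   # counts how often the real work is done
    return n * n

# Five requests, but only two distinct arguments.
values = [slow_square(n) for n in [3, 4, 3, 3, 4]]
print(values, call_count)
```

The cache holds on to every result it has seen, so this only pays off when the same inputs recur, and it is only safe when the result depends on nothing but the arguments.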

In most cases, though, there is probably no simple oopsie in the code waiting to be fixed. You will need to think about a different implementation, perhaps using a different algorithm.

You should also read Python Performance Tips.

KCachegrind

By far the easiest way to analyze a profile is to use a visual profile analyzer. Currently we know of only one, and it is available on Linux only: KCachegrind.

To use KCachegrind you first need to convert the hotshot profile to KCachegrind format, and then launch KCachegrind:

$ hotshot2calltree filename -o filename.prof
$ kcachegrind filename.prof

easyprofileanalyzer

The second easiest method is probably to use easyprofileanalyzer, written by Alec Flett. Read the script to see how to use it.

hotshot

Read the manual.

timeit module

Python has a handy timeit module which makes it easy to compare the performance of (usually) small and fast blocks of code. timeit is especially well suited to cases where you have tight loops executing hundreds or thousands of times and need to find the fastest implementation.

An example:

>>> import timeit
>>> t = timeit.Timer('1 > 0 and 1 < 2')
>>> t.timeit()
0.24477005004882812
>>> t = timeit.Timer('0 < 1 < 2')
>>> t.timeit()
0.22036290168762207

Verifying results

Once you have identified slow code and think you have fixed it, there are a few verification steps. Reprofile, and make sure that the function you optimized now takes a smaller percentage of the profile. Then run the performance test to make sure that the wall clock time agrees that the test got faster.
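
One way to put a number on the wall clock comparison is to compare the medians of the two sets of runs against the spread between runs. In this sketch the baseline values are the PerfStampEvent runs from the example output earlier on this page, while the "after" values are made up for illustration. The "larger than the noise" check is a rough rule of thumb assumed here, not an OSAF policy.

```python
import statistics

# Run times in seconds; substitute your own rt.py results.
baseline = [4.62, 4.52, 4.76, 4.54, 5.69]    # example runs from this page
after_fix = [3.91, 3.88, 4.02, 3.95, 3.90]   # made-up numbers for illustration

base_median = statistics.median(baseline)
new_median = statistics.median(after_fix)
gain = base_median - new_median
improvement = gain / base_median * 100.0

# Use the larger of the two standard deviations as a crude noise estimate.
noise = max(statistics.stdev(baseline), statistics.stdev(after_fix))

print("median %.2f s -> %.2f s (%.1f%% less time)"
      % (base_median, new_median, improvement))
print("gain exceeds noise:", gain > noise)
```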

You will of course need to run unit and functional tests before checkin, but since performance optimization is notorious for introducing regressions you should consider playing with the app a bit manually to make sure that everything is still in order. Asking for code review prior to checkin is also highly recommended.

When you check in, mention the speedup you expect. After checkin, monitor the performance numbers to make sure that the expected gains materialized. It would be nice to note the actual gains in the bug.

Communicate

Finally, let other people know what you find, and what you are working on. Letting others know of the problems you find, and how you solved them, will often help them realize similar improvements in the areas they are working on. And of course letting others know what you are working on will avoid duplicate work.

Open Source Applications Foundation
Except where otherwise noted, this site and its content are licensed by OSAF under a Creative Commons License, Attribution Only 3.0.
See list of page contributors for attributions.