Performance Counters: Difference between revisions

From GNU Radio
Jump to navigation Jump to search
(first shot)
 
No edit summary
 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
[[Category:Usage Manual]]
== Introduction ==
== Introduction ==


Line 72: Line 73:
# export: Allow counters to be exported over ControlPort.
# export: Allow counters to be exported over ControlPort.
# clock: sets the type of clock used when calculating work_time ('thread' or 'monotonic').
# clock: sets the type of clock used when calculating work_time ('thread' or 'monotonic').


== Performance Monitor ==
== Performance Monitor ==

Latest revision as of 22:43, 12 March 2019

Introduction

Each block can have a set of Performance Counters that the schedule keeps track of. These counters measure and store information about different performance metrics of their operation. The concept is fairly extensible, but currently, GNU Radio defines the following Performance Counters:

  1. noutput_items: number of items the block can produce.
  2. nproduced: the number of items the block produced.
  3. input_buffers_full: % of how full each input buffer is.
  4. output_buffers_full: % of how full each output buffer is.
  5. work_time: number of CPU ticks during the call to general_work().
  6. work_time_total: Accumulated sum of work_time.

For each Performance Counter except the work_time_total, we can retrieve the instantaneous, average, and variance from the block. Access to these counters is done through a simple set of functions added to every block in the flowgraph:

 float pc_<name>[_<type>]();

In the above, the <name> field is one of the counters in the above list of counters. The optional <type> suffix is either 'avg' to get the average value or 'var' to get the variance. Without a suffix, the function returns the most recent instantaneous value.

We can also reset the Performance Counters back to zero to remove any history of the current average and variance calculations for a particular block.

 void reset_perf_counters();

Compile-time and Run-time Configuration

Because the Performance Counters are calculated during each call to work for every block, they increase the computational cost and memory overhead. The more blocks used, the more impact this may have. So while it turns out after some experimentation that the Performance Counters add very little overhead (less than 1% speed degradation for a 24-block flowgraph), we err on the side of minimizing overhead in the scheduler. To do so, we have added compile-time and run-time configuration of the use of Performance Counters.

Compile-time Config

By default, GNU Radio will build without Performance Counters enabled. To enable Performance Counters, we pass the following flag to cmake:

 -DENABLE_PERFORMANCE_COUNTERS=True

Note that this affects the GNU Radio block class and the scheduler itself. Out-of-tree projects will inherit directly from GNU Radio because of the inheritance with gr::block. Turning on Performance Counters for GNU Radio will require a recompilation of the OOT project but no extra configuration.

Run-time Config

Given the Performance Counters are enabled in GNU Radio at compile-time, we can still control if they are used or not at run-time. For this, we use the GNU Radio preferences file in the section PerfCounters. This section is installed into the gnuradio-runtime.conf file. As usual with the preferences, this section or any of the individual options can be overridden in the user's config.conf file or using a GR_CONF_ environmental variable.

The options for the PerfCounters section are:

  1. on: Turn counters on/off at run-time.
  2. export: Allow counters to be exported over ControlPort.
  3. clock: sets the type of clock used when calculating work_time ('thread' or 'monotonic').

Performance Monitor

See [ControlPort#Performance Monitor] for some details of using a ControlPort-based monitor application, gr-perf-monitorx, for visualizing the counters. This application is particularly useful in learning which blocks are the computationally complex blocks that could use extra optimization or work to improve their performance. It can also be used to understand the current 'health' of the application.