GNU Radio 4.0 Summary of Proposed Features

From GNU Radio

High Level Design Goals

GNU Radio 4.0 seeks to make major changes to the core GNU Radio code in order to achieve the following goals:

  • Modular Runtime Components
  • Improved Support of Heterogeneous Architectures
  • Support for Distributed Architectures

In addition, there are many things we can improve "while we're at it" that aren't related to performance, but rather to the developer and user experience. These include:

  • Separating the Block API from the runtime
  • YAML based block design methodology [1]
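
As a rough illustration of a YAML-based block design methodology, the snippet below sketches what a block descriptor might look like. It is loosely modeled on GRC's existing `.block.yml` files; the field names here are illustrative assumptions, not the final GR 4.0 schema.

```yaml
# Hypothetical block descriptor, loosely modeled on GRC's .block.yml
# format; field names are illustrative, not the final GR 4.0 schema.
id: multiply_const
label: Multiply Const
parameters:
  - id: k
    label: Constant
    dtype: float
    default: 1.0
ports:
  - domain: stream
    id: in
    direction: input
    dtype: float
  - domain: stream
    id: out
    direction: output
    dtype: float
```

The appeal of this approach is that the boilerplate (headers, bindings, GRC integration) can be generated from the one descriptor rather than maintained by hand.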

Main Features

Modular Runtime Components

GNU Radio 3.x uses a fixed runtime that is intended to support operation on GPP-only platforms. Its scheduler, which uses one thread per block (TPB), has been generally effective, but is not suitable for all applications. Rather than try to solve the problem for every potential user, GR 4.0 will provide a modular architecture for the major runtime components so that application-specific versions can be used when appropriate.

The currently proposed modular components are:

  • Scheduler
  • Runtime
  • Custom Buffers
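
To make the idea concrete, here is a toy sketch (plain Python, not GNU Radio code) of what "modular scheduler" means in practice: the flowgraph delegates execution policy to a scheduler object chosen by the application, so a TPB-style scheduler and a single-threaded scheduler are interchangeable. All class names here are invented for illustration.

```python
# Toy illustration (not GNU Radio code) of pluggable runtime components:
# the flowgraph delegates execution policy to a scheduler object.
import threading
from abc import ABC, abstractmethod

class Scheduler(ABC):
    @abstractmethod
    def run(self, blocks):
        """Execute each block's work function to completion."""

class ThreadPerBlockScheduler(Scheduler):
    """Mimics the GR 3.x TPB model: one thread per block."""
    def run(self, blocks):
        threads = [threading.Thread(target=b) for b in blocks]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

class SingleThreadScheduler(Scheduler):
    """Runs all blocks in the calling thread, e.g. for small embedded targets."""
    def run(self, blocks):
        for b in blocks:
            b()

class Flowgraph:
    """Holds blocks; execution policy lives entirely in the scheduler."""
    def __init__(self, scheduler: Scheduler):
        self.scheduler = scheduler
        self.blocks = []

    def add(self, block):
        self.blocks.append(block)

    def start(self):
        self.scheduler.run(self.blocks)
```

The point is that the `Flowgraph` never hard-codes a threading model; swapping `ThreadPerBlockScheduler` for `SingleThreadScheduler` requires no change to any block.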

Heterogeneous Architectures

GNU Radio 3.10 introduced a Custom Buffers feature for streamlined data movement to and from hardware accelerators. GR 4.0 seeks to extend this capability: by not being constrained by the GR 3.x API, more flexible custom buffers can be specified, rather than being locked to the block. For instance, a block might have a CUDA implementation that assumes the data handed to its work() method is already in GPU memory. Depending on the platform, this could be handled more effectively with the data in device memory, in pinned memory, or by using managed memory. By separating the buffer abstraction from the block, one block implementation can be used on different platforms.
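
The separation can be sketched in a few lines of plain Python (again, a toy, not GNU Radio code): the block's work function only sees a buffer interface, so the same block implementation runs unchanged whether the runtime wires in a host-memory buffer or a device-memory one.

```python
# Toy sketch (not GNU Radio code) of separating the buffer abstraction
# from the block: the block sees only read()/write(), so the runtime can
# substitute host, pinned, or device-memory buffer implementations.
from abc import ABC, abstractmethod

class Buffer(ABC):
    @abstractmethod
    def write(self, items): ...
    @abstractmethod
    def read(self): ...

class HostBuffer(Buffer):
    """Plain host-memory buffer (stand-in for the default circular buffer)."""
    def __init__(self):
        self._data = []
    def write(self, items):
        self._data.extend(items)
    def read(self):
        data, self._data = self._data, []
        return data

# A device buffer would also subclass Buffer, copying into pinned or
# managed GPU memory in write(); omitted here since it needs CUDA.

def multiply_const_work(in_buf: Buffer, out_buf: Buffer, k: float):
    """A block's work function: unaware of where the buffers live."""
    out_buf.write([x * k for x in in_buf.read()])
```

Because `multiply_const_work` depends only on the `Buffer` interface, porting the flowgraph to an accelerator means swapping the buffer type, not rewriting the block.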

Scheduler and runtime modularity is also intended to be useful for heterogeneous architectures. For instance, consider a multi-GPU server. The current CPU scheduler with GPU custom buffers can handle a single GPU effectively, but probably cannot adequately utilize multi-GPU resources without a custom scheduling component.

Distributed Architectures

Sometimes it is useful to run a flowgraph across multiple host processors. One example could be a distributed DSP problem where channels of filtered data are sent to different machines for computationally intensive signal processing. This can currently be done manually in GR 3.x by using the ZMQ or Networking blocks and setting up orchestration scripts to control the flow between flowgraphs running on different machines.
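
At its core, the manual GR 3.x pattern boils down to serializing raw samples on one host and deserializing them on another. The sketch below shows that idea with only the Python standard library; a real deployment would use the gr-zeromq blocks and separate machines, whereas a socketpair keeps this example self-contained.

```python
# Minimal stdlib sketch of what the ZMQ/Networking blocks do in GR 3.x:
# serialize raw samples on one end of a socket, deserialize on the other.
# Real deployments use gr-zeromq between flowgraphs on different hosts;
# a socketpair stands in for the network here.
import socket
import struct

def send_samples(sock, samples):
    """Pack float32 samples (GR's 'float' stream type) into a
    length-prefixed frame and send it."""
    payload = struct.pack(f"<{len(samples)}f", *samples)
    sock.sendall(struct.pack("<I", len(samples)) + payload)

def recv_samples(sock):
    """Receive one length-prefixed frame and unpack it back to floats."""
    n = struct.unpack("<I", sock.recv(4))[0]
    payload = b""
    while len(payload) < 4 * n:
        payload += sock.recv(4 * n - len(payload))
    return list(struct.unpack(f"<{n}f", payload))
```

The 4.0 goal described below is for the runtime to generate this serialization (and its orchestration) automatically for any graph edge that crosses a host boundary.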

The goal for 4.0 is to integrate this behavior by means of a modular runtime that can automatically handle the serialization and configuration of graph edges that cross host boundaries.

There are a few main components to this feature:

  1. Serialization of stream and message data
  2. RPC control of the runtime
  3. Custom runtimes that can integrate with orchestration systems such as Kubernetes
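
For the second component, RPC control of the runtime, here is a hedged sketch using Python's stdlib `xmlrpc` (GR 4.0's actual RPC mechanism is not specified here): a runtime exposes start/stop so that an orchestrator on another host can drive it remotely. The `RuntimeControl` class is an invented stand-in for a flowgraph runtime.

```python
# Hedged sketch of RPC control of a runtime using Python's stdlib
# xmlrpc; GR 4.0's real RPC transport is not decided by this example.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

class RuntimeControl:
    """Invented stand-in for a flowgraph runtime with remote control."""
    def __init__(self):
        self.running = False
    def start(self):
        self.running = True
        return "started"
    def stop(self):
        self.running = False
        return "stopped"

def serve(control, port=0):
    """Expose the runtime's control methods over XML-RPC; port 0 picks
    a free ephemeral port. Returns the server so callers can query the
    bound address and shut it down."""
    server = SimpleXMLRPCServer(("127.0.0.1", port), allow_none=True,
                                logRequests=False)
    server.register_instance(control)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

An orchestration script (or a Kubernetes operator, for the third component) would then call `ServerProxy("http://host:port").start()` on each machine's runtime.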

Streamlined Developer Experience

Improve PMT library


New Dependencies


Meson is a powerful build system that uses a Python-like syntax, in contrast to CMake's scripting language.

Originally intended as a placeholder build system (replacing CMake) because it is easier to get things up and running quickly, it has turned out to be quite powerful and less mind-boggling. We should consider sticking with it.
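
To give a flavor of the syntax, here is an illustrative Meson snippet (not GR 4.0's actual build files); the subproject and variable names in the `fallback` are assumptions based on typical wrapdb conventions.

```meson
# Illustrative Meson snippet, not taken from the GR 4.0 tree.
project('gnuradio-example', 'cpp',
        default_options: ['cpp_std=c++17'])

# dependency() with a fallback is how Meson's wrap functionality pulls
# in vendored subprojects when the system package is missing.
cli11_dep = dependency('CLI11', fallback: ['cli11', 'CLI11_dep'])

executable('demo', 'main.cpp', dependencies: [cli11_dep])
```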


Use YAML for preferences and for configuration of plugin components with a public factory method
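
A minimal sketch of that pattern, with invented names (not the GR 4.0 API): plugins register themselves under a string id, and a public factory instantiates whichever one the preferences name. The dict below stands in for a parsed YAML preferences file (the parsing itself would be done with e.g. PyYAML).

```python
# Sketch of configuring plugin components from YAML preferences via a
# public factory method; all names here are illustrative assumptions.

_registry = {}

def register_scheduler(name):
    """Class decorator: expose a scheduler plugin under a config name."""
    def deco(cls):
        _registry[name] = cls
        return cls
    return deco

def make_scheduler(config):
    """Public factory: instantiate the scheduler named in the config."""
    entry = config["scheduler"]
    return _registry[entry["id"]](**entry.get("options", {}))

@register_scheduler("nbt")
class ThreadScheduler:
    """Example plugin; 'nbt' is an illustrative id, not a fixed name."""
    def __init__(self, num_threads=1):
        self.num_threads = num_threads
```

Selecting a different scheduler then means editing one line of YAML rather than recompiling or changing application code.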


Replace Boost.Test for C++ unit tests

Removed Dependencies

  • Boost (going Boost-free is a hard requirement)

Vendorized Dependencies

The following dependencies are vendored as subprojects (using Meson's wrap functionality rather than git submodules):

  • CLI11 (replaces Boost program_options)
  • cppzmq
  • nlohmann-json
  • moodycamel
  • pmtf