Hackfest1310
Hackfest October 2013
This hackfest was a one-day event following GRCon '13.
Note that one day is not much time; in most cases, the work done consisted of discussions and planning on how to proceed.
GNU Radio Companion
Many people hacked on GRC. Among other things, an unofficial roadmap was written to guide future developments. Things hacked on include:
- QT Port
- Inline Python blocks
Hackers: Sebastian, Isaac, Johannes
Covering Coverity Issues
We have a new static code checker at http://coverity.com. Many people have volunteered to fix the issues it has flagged.
Better CRC support in packet headers
The packet header now uses up to 8 bits of CRC.
Hackers: Martin, Matt
Improve fosphor GR integration
Tried and ultimately deferred WX support, then worked on QT integration (some core patches, some gr-fosphor refactoring).
Hackers: Sylvain
GNU Radio buffer support for accelerators
A design was hashed out for allowing GNU Radio block buffer memory to be managed by the blocks themselves, to support zero-copy transfers to and from block accelerator hardware memory. This will be useful for supporting FPGA-, DSP-, and GPU-based devices.
VOLK and embedded
Debugging infrastructure now includes pure asm proto-kernels, and a NEON kernel or two was written and hashed out. A first run at instructions was made to help people get started with cross-compiling GNU Radio with OpenEmbedded. There was also some OpenEmbedded work on cleaning up the GR recipe and kicking off meta-sdr.
Better Support for Burst Modems
I started this work at the Hackfest in June 2013 with help and input from Ben Reynwar, Johnathan Corgan, and Matt Ettus. At the GRCon13 hackfest, I was aided by Jeff Long, Nick Foster, and Kapil Borle. A huge thanks to them for their help.
We've been wanting better burst modem support in GNU Radio for a while. The problem has been that our phase, timing, and frequency synchronizer blocks are great for continuous streaming modems but can have fairly long convergence times. In burst modems, we might only get a packet at a time with long stretches of dead air where the synchronizers lose tracking. When a burst comes in, convergence takes too long to properly acquire the symbols.
I started working on a solution to this at the Hackfest in June 2013. The idea was to create a block that would correlate against a known preamble and use the results to estimate the timing and phase of the symbols at that point. The peak of the correlation gives us the estimate of the optimal timing offset, and the phase of the correlation is the phase estimate of the symbols at that point. We could extend this by correlating twice to get two peaks and taking the phase difference between them over the elapsed time to estimate the current frequency offset as well, assuming the frequency offset is small enough that there is less than a 2pi rotation between peaks. But we'll save that for later.
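The deferred frequency-offset extension amounts to a little arithmetic: given two correlation peaks separated by a known number of samples, the offset is the wrapped phase difference divided by the elapsed time. A minimal sketch (the function name and interface are hypothetical, not part of the block):

```python
import cmath
import math

def freq_offset_from_peaks(peak1, peak2, n_samps_between, samp_rate):
    """Estimate carrier frequency offset (Hz) from the phase difference
    between two correlation peaks, assuming the offset rotates the phase
    by less than 2*pi over the interval between the peaks."""
    dphase = cmath.phase(peak2 * peak1.conjugate())  # wrapped to (-pi, pi]
    dt = n_samps_between / samp_rate                 # seconds between peaks
    return dphase / (2 * math.pi * dt)
```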
The new block we came up with is gr::digital::correlate_and_sync. It takes as constructor arguments the preamble as complex symbols, the pulse shaping filter taps, and the number of samples per symbol. The fourth argument is the number of filters to use in the internal PFB filterbank used to create the correlation vector.
The preamble is a vector of symbols. I expect that the preamble will most likely be binary, like BPSK, but this concept should work fine with higher-order modulations, too. The filter is a set of taps for the pulse shaping filter, like an RRC filter. In the constructor of this block, we take the filter taps, the number of samples per symbol, and the number of PFB filters and pass them to a PFB arbitrary resampler. We then run the preamble symbols through the resampler, which produces the upsampled and pulse-shaped symbols that we are looking for at the receiver. Note that sps does not have to be an integer. The block's task is to correlate all incoming samples against our pulse-shaped preamble. If there is a correlation, we have found the start of a packet and can calculate the timing and phase estimates. The constructor also calculates the correlation threshold value. For this, we take \sum_i(s[i] * s^*[i]), where s[i] is the pulse-shaped preamble symbol at index i and s^* indicates the complex conjugate; this is just the total power in the preamble. We then take 90% of that sum as our threshold. The 90% figure is fairly arbitrary and should probably be replaced with a more robust method.
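The pulse shaping and threshold computation can be sketched in plain Python. This is a simplified illustration assuming an integer sps and direct convolution (the real block uses a PFB arbitrary resampler and supports fractional sps); the helper names are hypothetical:

```python
def shape_preamble(symbols, taps, sps):
    """Upsample the preamble symbols by an integer sps (zero stuffing)
    and apply the pulse-shaping filter by direct convolution."""
    up = []
    for sym in symbols:
        up.append(sym)
        up.extend([0.0] * (sps - 1))
    # full convolution of the upsampled symbols with the filter taps
    shaped = [0.0] * (len(up) + len(taps) - 1)
    for i, x in enumerate(up):
        for j, t in enumerate(taps):
            shaped[i + j] += x * t
    return shaped

def corr_threshold(shaped, fraction=0.9):
    """90% of the preamble's total power, as in the constructor."""
    return fraction * sum(abs(s) ** 2 for s in shaped)
```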
The work function of this block is pretty simple. It has one input stream and one output stream. The input is the complex samples from the receiver. These samples are copied directly onto the output stream. The samples are also filtered against our pulse-shaped preamble vector. The correlation peak is detected by first observing that the correlation peak will be sps samples wide. So we look at sample i and sample (i - sps). If the difference between these two correlation values is greater than the threshold we calculated in the constructor, we declare a correlation. We then home in on the peak by walking through the correlation until the next value is less than the previous value. That value is taken as our peak.
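The detection logic described above can be sketched as follows. This is a simplified, hypothetical rendition of the work function's peak search, operating on precomputed correlation magnitudes:

```python
def find_corr_peak(corr_mag, sps, threshold):
    """Declare a correlation when the magnitude rises by more than
    `threshold` over sps samples, then walk forward until the values
    stop increasing; that sample is taken as the peak.
    Returns the peak index, or None if no correlation is found."""
    for i in range(sps, len(corr_mag)):
        if corr_mag[i] - corr_mag[i - sps] > threshold:
            # walk through the peak until the next value decreases
            k = i
            while k + 1 < len(corr_mag) and corr_mag[k + 1] > corr_mag[k]:
                k += 1
            return k
    return None
```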
With the peak identified, we then calculate the timing and phase estimates. The timing estimate is calculated by taking the center of mass of the three correlation samples around the peak. This provides us with a fractional sample estimate of where the correct sample timing really is. The phase estimate is the argument (arctan) of the correlation value at the peak index.
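The two estimates can be sketched as below. The exact center-of-mass formula in the block may differ; this helper is illustrative:

```python
import cmath
import math

def timing_and_phase(corr, peak):
    """Center of mass over the three samples around the peak gives a
    fractional timing estimate; the angle of the peak correlation value
    is the phase estimate in radians."""
    a, b, c = abs(corr[peak - 1]), abs(corr[peak]), abs(corr[peak + 1])
    time_est = (c - a) / (a + b + c)     # fractional offset in samples
    phase_est = cmath.phase(corr[peak])  # radians
    return time_est, phase_est
```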
The block then takes the timing and phase estimates and produces tags at the correlation peak's sample index, called "time_est" and "phase_est," respectively. The timing estimate is a fractional value of where the estimated peak is beyond where the tag is. The block also produces a "corr_est" tag that contains the estimate of the correlation value. We do not currently use this value, but it seems potentially useful as an indicator of the confidence of our time and phase estimates.
Follow-on processing blocks can now look for these tags and use them to reinitialize their states. Currently, we have programmed the gr::digital::pfb_clock_sync_ccf block to look for and use the "time_est" tags, while the gr::digital::costas_loop looks for and uses the "phase_est" tags.
In the PFB clock sync, the block looks to see if any "time_est" tags are in the current window. If it finds one, it extracts the timing offset estimate. The calculation done here converts this fractional sample estimate to the closest filter arm that represents that offset. Knowing the right arm of the filterbank to use gives us a small error value that we can then quickly lock onto and track. If no tags are found, the block behaves as normal and simply keeps tracking the timing error of the incoming samples.
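The conversion from a fractional-sample estimate to a filterbank arm can be illustrated as follows; the rounding and wrapping details of the real block may differ:

```python
def nearest_filter_arm(time_est, nfilters):
    """Map a fractional-sample timing estimate in [-0.5, 0.5) to the
    nearest arm of an nfilters-arm polyphase filterbank, wrapping
    negative offsets around to the top of the bank."""
    return int(round(time_est * nfilters)) % nfilters
```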
The Costas loop looks for "phase_est" tags and extracts the phase estimate from there. Because the phase is calculated directly as the phase in radians, which is what the Costas loop tracks, we simply take the value from the tag and set the current phase of the loop to it. The loop's frequency is not affected and is allowed to continue running as usual. Again, if no "phase_est" tags are found, the Costas loop will simply keep tracking both phase and frequency of the incoming samples.
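The reseeding behavior can be sketched with a toy tracker. This is purely illustrative, not the actual Costas loop implementation:

```python
class CostasSketch:
    """Toy phase tracker: on a "phase_est" tag the loop phase is set
    directly to the tag value while the frequency state is untouched,
    mirroring the reseeding behavior described for the Costas loop."""
    def __init__(self):
        self.phase = 0.0
        self.freq = 0.0

    def handle_tag(self, key, value):
        if key == "phase_est":
            self.phase = value  # reseed the phase only

    def advance(self):
        self.phase += self.freq  # free-running rotation between tags
```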
We created an example script that generates burst packets with a known preamble, passes the samples through a channel model where we can change the noise, frequency offset, and timing offset, and then feeds them into the receiver. The receiver includes the correlate_and_sync, pfb_clock_sync_ccf, and costas_loop_cc blocks along with a few stages of plotting to see the data streams. This simulation is checked in as gr-digital/examples/demod/test_corr_and_sync.grc. I have since created a UHD transmitter and receiver called uhd_corr_and_sync_tx/rx.grc in the same directory to allow easy over-the-air tests.
The simulation test looks at the output of the correlate_and_sync block to see the correlations and make sure they are behaving properly. We also look at the resulting time- and phase/frequency-synchronized samples both in time and as a constellation. For the time domain plot, we trigger off the "time_est" tag to make it easier to see and understand what's going on. The constellation plot is only marginally useful. Because we have bursts, many samples are noise around 0, so those show up prominently. Also, when we first receive a packet, there is a time before the tags are used where the timing and phase are still off from the previous stretch of dead air. This makes the constellation look worse than it is. I like to use the drop-down menu of the constellation plot (opened with the middle mouse button) to set the data stream's transparency property to "high." This dims areas where only a few samples occur and makes the real constellation points more densely shaded.
We have now worked out the major issues in getting this to work, which partially involved getting the math correct, though more time was spent on handling issues of tags through rate changing blocks and blocks with delays. I was able to do an over-the-air test with great success. Below is a screenshot of the receiver side of the OTA experiment with the bursts being easily observed.
[[File:ota_corr_and_sync-scaled.png|]]
Lessons Learned
Debugging this was a great thing for GNU Radio. We learned a lot, fixed a few things, and produced some great new tools for interacting with signals and the tag streams.
First, we learned how to handle tags through blocks that have a dynamic relative rate. Most blocks are synchronous and therefore have a static relative rate, so we can easily move the location of tags through these rate-changing blocks based on that value. But a clock sync block does not have a fixed relative rate. It will be approximately 1/sps, but it can shift to one side or the other depending on the timing offset and how the algorithm converges, so the block sometimes produces more or fewer samples than expected. To handle this, we have added the ability to update the rate over time as the block works. For a block like a clock sync, we call "enable_update_rate(true)" in the constructor. The scheduler uses this flag to recalculate the block's relative_rate as nitems_written/nitems_read. This is the long-term rate of the block and makes sure that tags don't improperly walk away as the rate changes. This turned out to be our single biggest problem: the Costas loop block was getting tags improperly delayed by the clock sync, so the phase adjustment was occurring on the wrong sample.
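The tag-placement rule can be illustrated with a small helper. This is hypothetical; the scheduler's actual bookkeeping is more involved:

```python
def propagate_tag_offset(in_offset, nitems_read, nitems_written):
    """Place a tag on the output stream using the block's long-term
    relative rate, nitems_written / nitems_read, as the scheduler does
    for blocks that enable rate updates."""
    rate = nitems_written / nitems_read
    return int(round(in_offset * rate))
```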
Second, we learned how to handle tags being propagated through a block with a delay, like a gr::blocks::delay or a FIR filter. A block can now tell the scheduler how many samples of delay it imposes on the data stream using the "declare_sample_delay" call. The scheduler uses this to move tags appropriately. So we can now say on a FIR filter that we want the tags to propagate and come out located at the group delay of the filter (like (N-1)/2 for a symmetric N-tap FIR filter). This does not set the delay of a block; it only tells the scheduler what that delay is. For most blocks, the delay is 0, and it defaults as such. For a filter block, we don't actually know what delay a user might want, so it is a settable value. There is now a box to enter this value in the GRC properties boxes for the FIR filter blocks (fir, interp_fir, and fft filters).
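As an illustration of how a declared sample delay shifts a tag: the sign convention here assumes the tag moves later in the stream to stay aligned with the delayed data, and the helper is illustrative only, not the scheduler's code:

```python
def tag_offset_through_fir(in_offset, ntaps, declared_delay=None):
    """Shift a tag across a FIR filter by the declared sample delay.
    By default, use the group delay (ntaps - 1) // 2 of a symmetric
    FIR filter, as suggested in the text."""
    if declared_delay is None:
        declared_delay = (ntaps - 1) // 2
    return in_offset + declared_delay
```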
Third, we have added some new features to the QTGUI time sinks. We can now see any stream tags coming into the sink; they are displayed as markers on the plot as "key: value" at the sample they are associated with. We also now have a fully-featured triggering system in the sinks. We can trigger off a rising or falling edge crossing a level, or we can trigger when we see a tag's key. These are very useful features for debugging and exploring how a system is working. They will be part of the 3.7.2 release.