
Arduino TNC in Rust, Part Three

Bradley Gannon

2026-04-11

TL;DR: I explored several potential receiver solutions and built a modest prototype version in Python. The Nano’s hardware constraints rule out more complicated approaches, but the one I selected—which is based on counting zero-crossings—appears to work acceptably well for realistic signals.

Goal

I want to build a system that takes AFSK samples from an attached radio and converts them into AX.25 frames. Whenever the system receives a complete and valid frame, it must forward it to the host over the serial link. To do this, the receiver has to first identify whether the mark or space tones (1200 Hz and 2200 Hz, respectively) are present in the input and decide between them for each sample. The receiver must then use this new binary signal to perform clock recovery, which is the process of determining where the symbol boundaries are. This allows the receiver to sample the binary signal at the right moments to produce a new stream of marks and spaces at the symbol rate (1200 Bd) instead of the audio sample rate (often 44100 Hz).

The frames are sent with NRZI encoding, which means that a symbol change indicates a zero bit, and the absence of a symbol change indicates a one bit. Applying this rule converts the symbol list into an unaligned bitstream. Then the receiver scans the bitstream for a flag byte (0x7e) and aligns its byte stream accordingly. Subsequent bytes are recorded, taking bit stuffing into account, until another flag appears. The recorded bytes form an unvalidated HDLC frame, which the receiver can validate by checking the CRC. Finally, if the CRC is valid, then the receiver extracts the AX.25 frame and sends it to the host.
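The bit-level steps above can be sketched in a few short functions. This is only an illustration under some assumptions of mine: the function names are hypothetical, symbols can be any comparable values, bytes are packed least significant bit first (the usual HDLC order), and the frame check uses the usual CRC-16/X.25 parameters.

```python
def nrzi_decode(symbols):
    """No change between consecutive symbols -> 1, a change -> 0."""
    return [1 if cur == prev else 0 for prev, cur in zip(symbols, symbols[1:])]


def destuff(bits):
    """Drop the bit that follows every run of five 1s.

    Assumes `bits` is the payload between two flags, where that bit is
    always a stuffed 0 inserted by the transmitter.
    """
    out, ones = [], 0
    for b in bits:
        if ones == 5:
            ones = 0  # skip the stuffed bit
            continue
        out.append(b)
        ones = ones + 1 if b else 0
    return out


def bits_to_bytes(bits):
    """Pack bits into bytes, least significant bit first; drop leftovers."""
    usable = len(bits) - len(bits) % 8
    return bytes(
        sum(bit << i for i, bit in enumerate(bits[n:n + 8]))
        for n in range(0, usable, 8)
    )


def crc16_x25(data):
    """CRC-16/X.25: reflected polynomial 0x8408, init and final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF
```

For example, `bits_to_bytes([0, 1, 1, 1, 1, 1, 1, 0])` gives `b"\x7e"`, and `crc16_x25(b"123456789")` gives `0x906e`, the standard check value for these CRC parameters.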

Probable Solution

The Nano runs at 16 MHz, so a ~40 kHz sample rate leaves about 400 clock cycles per sample. Most relevant instructions take 1–3 cycles, so we’re looking at maybe 200 instructions per sample. And that’s actually somewhat more than the practical maximum, since some cycles need to be left over for handling the serial connection. This is what I mean when I say “hardware constraints”: most modern approaches that assume essentially infinite compute per sample are out of the question. The Nano also only has 2 KB of RAM, a good portion of which is statically allocated for the receive and transmit message buffers, so I can’t get too clever with memory usage either.
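The budget arithmetic is easy to replay for other sample rates. A minimal sketch, assuming an average of two cycles per instruction (my own rough midpoint of the 1–3 cycle range) and ignoring interrupt and serial overhead; the function names are hypothetical.

```python
CPU_HZ = 16_000_000  # Nano clock


def cycles_per_sample(sample_rate_hz):
    """Clock cycles available between two consecutive samples."""
    return CPU_HZ / sample_rate_hz


def instructions_per_sample(sample_rate_hz, avg_cycles_per_instruction=2.0):
    """Rough instruction budget per sample (assumed average cost)."""
    return cycles_per_sample(sample_rate_hz) / avg_cycles_per_instruction


for rate in (20_000, 40_000):
    print(rate, cycles_per_sample(rate), instructions_per_sample(rate))
```

Halving the sample rate to 20 kHz doubles the budget to about 400 instructions per sample, at the cost of fewer samples per symbol.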

But it won’t help to complain. I set myself this goal with this particular hardware, and I still have no reason to believe it’s impossible or even especially difficult. As far as I can tell, this chip would have been considered impressive when APRS was new.[1] I can loosen the constraint somewhat by reducing the sample rate if needed, although this is in tension with receiver quality.

The plan is as follows. Sample the received audio at the highest practical rate (probably at least 20 kHz) and, given the 1200 Bd symbol rate, compute the number of samples per symbol. In this case, we’ll land somewhere between about 15 and 40. As samples come in, monitor their sign and record a zero crossing whenever the sign changes.[2] Maintain a list of indices where zero crossings have occurred within the current symbol period, pruning old ones and incrementing the rest on every sample. To reduce the effects of high-frequency noise, disregard zero crossings that happen too soon after the previous one, where “too soon” means within some fraction of the space tone’s period.
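The bookkeeping in this plan can be sketched as a small class. This is an illustration, not the firmware: the class name, the `holdoff` parameter (in samples), and the midpoint of 512 (half of a 10-bit ADC range) are all assumptions of mine.

```python
class ZeroCrossingTracker:
    """Tracks the ages (in samples) of recent zero crossings, oldest first."""

    def __init__(self, samples_per_symbol, holdoff, midpoint=512):
        self.samples_per_symbol = samples_per_symbol
        self.holdoff = holdoff    # minimum spacing between accepted crossings
        self.midpoint = midpoint  # unsigned ADC value treated as "zero"
        self.positive = True
        self.ages = []

    def push(self, raw):
        """Feed one unsigned ADC sample and return the current age list."""
        # Age every crossing by one sample and prune those outside the window.
        self.ages = [a + 1 for a in self.ages if a + 1 < self.samples_per_symbol]
        positive = raw > self.midpoint
        # Accept a sign change only if the previous crossing is old enough.
        if positive != self.positive and (
            not self.ages or self.ages[-1] >= self.holdoff
        ):
            self.ages.append(0)
        self.positive = positive
        return self.ages
```

For example, with `ZeroCrossingTracker(10, holdoff=2)`, pushing 600, 400, 600, 600, 400 leaves the ages at `[3, 0]`: the flip back to 600 one sample after the first crossing is rejected as too soon.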

Now we have a list of plausible zero crossings. Our goal here is to estimate the number of cycles that the received signal has completed during the current symbol window. If the number of cycles is near 1, then the symbol is probably mark, while if it is near 1.8, then the symbol is probably space.[3] We assume that the time between two adjacent zero crossings represents a half cycle of the signal, so with $Z$ zero crossings in the window, the number of cycles between the first and last one is $(Z - 1) / 2$. But that only accounts for the cycles within that portion of the symbol window. For the extra parts beyond the first and last zero crossing, we extrapolate: we assume that the unseen zero crossings just outside each edge of the window have the same spacing as the first and last pairs inside it, and count the corresponding fraction of a half cycle at each end. Summing these three values gives our final estimate for the number of cycles completed during this symbol period. We can then apply a threshold at the midpoint (about 1.4) to decide which symbol is present in this window.

Here’s the same idea written as an equation:

$$
c = \begin{cases}
\frac{1}{2} \left[ (Z - 1) + \frac{S - z_0}{z_0 - z_1} + \frac{z_{-1}}{z_{-2} - z_{-1}} \right] & \text{if } Z > 1 \\
\frac{1}{2} & \text{if } Z = 1 \\
0 & \text{if } Z = 0
\end{cases}
$$

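where $c$ is the estimated number of cycles in the symbol window, $Z$ is the number of zero crossings in the window, $S$ is the number of samples per symbol, and $z_i$ is the $i^{\text{th}}$ oldest zero crossing, expressed as the number of samples since it occurred. Negative indices count back from the newest, so $z_{-1}$ is the most recent zero crossing.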

Figure: Audacity screenshot showing 37 highlighted samples from an APRS transmission and some other samples on the edges. The signal starts slightly negative, proceeds through a negative peak, then another complete cycle before ending on a positive peak.
Consider this example, which is a tiny excerpt from the WA8LMF test CD. The highlighted region is 37 samples wide, which is one symbol period at this sample rate (44.1 kHz). There are three zero crossings at indices 4, 14, and 23 (counting from the right, zero-indexed). Using the arithmetic described above, this gives an estimated cycle count of $\frac{1}{2} \left[(3 - 1) + \frac{4}{14 - 4} + \frac{37 - 23}{23 - 14}\right] \approx 2.0$. This is greater than 1.4, so we can conclude that the symbol is most likely space. The estimate is slightly high because the frequency changed near the beginning of the symbol period, but the relevant zero crossing is outside the symbol window. A slight improvement to this technique would be to remember all zero crossings within the window plus one more and use it to get a more accurate phase delta for the beginning of the window.
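The arithmetic in this example can be checked with a few lines of Python. A minimal sketch with hypothetical function names; `ages` holds the zero-crossing indices counted back from the newest sample, oldest (largest) first, and the 1.4 threshold is the one from the text.

```python
def estimate_cycles(ages, samples_per_symbol):
    """Estimate how many cycles the signal completed in one symbol window."""
    if len(ages) == 0:
        return 0.0
    if len(ages) == 1:
        return 0.5
    inner = len(ages) - 1  # half cycles between first and last crossing
    # Extrapolated partial half cycles at the old and new edges of the window.
    leading = (samples_per_symbol - ages[0]) / (ages[0] - ages[1])
    trailing = ages[-1] / (ages[-2] - ages[-1])
    return (inner + leading + trailing) / 2


def classify_cycles(cycles, threshold=1.4):
    """Fewer cycles than the threshold means mark, otherwise space."""
    return "mark" if cycles < threshold else "space"


cycles = estimate_cycles([23, 14, 4], 37)
print(round(cycles, 2), classify_cycles(cycles))  # 1.98 space
```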

This gives a symbol estimate for every sample, but to get the actual message we have to perform clock recovery and reduce it to one symbol per symbol period. We can recover the clock by observing when the symbol estimate changes and assuming that moment to be the boundary between two symbol periods. We then use this alignment to produce a list of symbols at the symbol rate rather than the sample rate, by copying the estimated symbol at the midpoint of each symbol period into a new list. Applying NRZI decoding to this list of symbols gives a list of bits, which we can align and decode using more or less the same logic that I described in the Goal section.
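The clock recovery step can be sketched as a phase counter that restarts on every accepted transition and emits a symbol half a period later. This is a simplified illustration with hypothetical names. The `guard` parameter, which ignores flips that follow too closely after the last accepted one, is my own assumption about how to reject glitches.

```python
def recover_symbols(per_sample, samples_per_symbol, guard=4):
    """Reduce per-sample symbol estimates to one symbol per symbol period."""
    symbols = []
    if not per_sample:
        return symbols
    current = per_sample[0]
    phase = 0         # position within the current symbol period
    since_change = 0  # samples since the last accepted transition
    for estimate in per_sample:
        phase = (phase + 1) % samples_per_symbol
        since_change += 1
        if estimate != current and since_change > guard:
            # Treat the transition as a symbol boundary and realign.
            current = estimate
            phase = 0
            since_change = 0
        if phase == samples_per_symbol // 2:
            symbols.append(current)  # sample mid-symbol
    return symbols


stream = ["M"] * 10 + ["S"] * 20 + ["M"] * 10
print(recover_symbols(stream, 10))  # ['M', 'S', 'S', 'M']
```

Long runs without transitions rely on the free-running phase counter, so any mismatch between the true and assumed samples per symbol accumulates until the next transition realigns it.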

An important property of this approach is that it doesn’t require much compute. I haven’t done a proper analysis, but essentially the Nano only has to perform a zero crossing check for each sample, update the zero crossing list, and do a few arithmetic operations to get the new cycle estimate. I expect the whole routine to need at most a few dozen instructions. It’s also worth noting that, while I’ve described the algorithm as if we have access to all samples, it works equally well when operating on one sample at a time.

I wrote a Python version of this algorithm up to byte alignment, which I’ve included below. Notably, this version neglects bit stuffing. It seems to work well enough on some real and synthetic test data, but I haven’t done enough experimentation to quantify its performance. I think I could compare it against direwolf with the right test setup. It’s almost certainly a lot worse than direwolf, but then again direwolf can’t run in real time on an 8-bit MCU.[4]

Python prototype:
from enum import Enum, auto
from math import ceil

from scipy.io import wavfile

sample_rate, data = wavfile.read("./input.wav")

SYMBOL_RATE_HZ = 1200
samples_per_symbol = ceil(sample_rate / SYMBOL_RATE_HZ)
CYCLE_THRESHOLD = 1.4


class Symbol(Enum):
    MARK = auto()
    SPACE = auto()


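class State:
    """Per-sample demodulator state: zero-crossing history and symbol clock."""

    def __init__(self, window_size):
        self.window = [0] * window_size  # most recent symbol period of samples
        self.sign = True  # True if the previous sample was positive
        self.zero_crossings = []  # ages in samples, oldest first
        self.estimated_cycles = 0
        self.symbol = Symbol.MARK
        self.symbol_phase = 0  # samples since the last restart, mod window size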
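    def update(self, sample):
        # Slide the sample window forward by one.
        self.window.append(sample)
        self.window.pop(0)
        # Age every remembered crossing by one sample, dropping any that had
        # already reached the window length.
        self.zero_crossings = [
            z + 1 for z in self.zero_crossings if z < len(self.window)
        ]
        # Record a crossing when the sign flips, unless the previous crossing
        # was under an eighth of a symbol ago (probably high-frequency noise).
        sample_sign = sample > 0
        if self.sign != sample_sign and (
            len(self.zero_crossings) == 0
            or self.zero_crossings[-1] >= samples_per_symbol / 8
        ):
            self.zero_crossings.append(0)
        self.sign = sample_sign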
        # Estimate the cycles in the window: half a cycle per gap between
        # crossings, plus extrapolated partial half cycles at both edges.
        if len(self.zero_crossings) >= 2:
            self.estimated_cycles = (
                (len(self.zero_crossings) - 1)
                + (len(self.window) - self.zero_crossings[0])
                / (self.zero_crossings[0] - self.zero_crossings[1])
                + self.zero_crossings[-1]
                / (self.zero_crossings[-2] - self.zero_crossings[-1])
            ) / 2
        elif len(self.zero_crossings) == 1:
            self.estimated_cycles = 0.5
        else:
            self.estimated_cycles = 0

        # Clock recovery: advance the symbol phase and restart it whenever the
        # per-sample estimate flips, ignoring flips while the phase is 4 or less.
        self.symbol_phase = (self.symbol_phase + 1) % len(self.window)
        sample_symbol = (
            Symbol.MARK if self.estimated_cycles < CYCLE_THRESHOLD else Symbol.SPACE
        )
        if sample_symbol != self.symbol and self.symbol_phase > 4:
            self.symbol_phase = 0
            self.symbol = sample_symbol


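state = State(samples_per_symbol)
previous_symbol = Symbol.MARK
buffer = 0  # last eight decoded bits, newest in the least significant bit
byte_phase = 0
for d in data:
    state.update(d)
    # Sample mid-symbol (phase 18 at 44.1 kHz), and only when the cycle
    # estimate is plausible for either tone.
    if (
        state.symbol_phase == samples_per_symbol // 2
        and 0.5 < state.estimated_cycles < 2.5
    ):
        buffer = (buffer << 1) % 256
        # NRZI: no symbol change means a one bit, a change means a zero bit.
        if state.symbol == previous_symbol:
            buffer += 1
        previous_symbol = state.symbol
        byte_phase = (byte_phase + 1) % 8
        if byte_phase == 0:
            print(f"0x{buffer:02x} ", end="")
        # A flag byte realigns the byte boundary.
        if buffer == 0x7e:
            byte_phase = 0
print()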

Other Ideas

Matched Filter

It seems that the optimal approach for this problem is to use a matched filter. I don’t understand the theory for this solution, but in practice it amounts to a bunch of multiplications and additions over the symbol window between the input and some digital oscillators. By convolving the mark and space oscillators over the input, we can compute the energy of the signal with respect to that frequency. I think this is related to the Goertzel algorithm, but I’m not sure. In any case, this would be an attractive option if it didn’t require so much arithmetic. Every sample requires computation over every preceding sample in the symbol window, which gets expensive in this setting. But I’ve already admitted that my understanding is incomplete, so maybe there’s some speed trick I’m missing here.
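To make the cost concrete, here is a minimal sketch of one common form of this idea, a non-coherent correlator: multiply the window by a cosine and a sine at each tone frequency, sum, and compare the resulting energies. The function names are hypothetical, and this is pure Python for illustration rather than anything tuned for the Nano.

```python
import math


def make_tone(freq_hz, sample_rate_hz, count, phase=0.0):
    """Generate `count` samples of a unit-amplitude sine (test helper)."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase)
        for n in range(count)
    ]


def tone_energy(window, freq_hz, sample_rate_hz):
    """Correlate the window against quadrature oscillators at freq_hz."""
    i_sum = q_sum = 0.0
    for n, x in enumerate(window):
        angle = 2 * math.pi * freq_hz * n / sample_rate_hz
        i_sum += x * math.cos(angle)  # in-phase component
        q_sum += x * math.sin(angle)  # quadrature component
    return i_sum ** 2 + q_sum ** 2


def classify(window, sample_rate_hz):
    """Pick whichever tone correlates more strongly with the window."""
    mark = tone_energy(window, 1200, sample_rate_hz)
    space = tone_energy(window, 2200, sample_rate_hz)
    return "mark" if mark > space else "space"


print(classify(make_tone(1200, 44100, 37), 44100))  # mark
print(classify(make_tone(2200, 44100, 37), 44100))  # space
```

Each call to `tone_energy` costs two multiplications and two additions per sample in the window, and doing that for both tones on every incoming sample is where the expense comes from.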

Analog Filtering

I touched on this last time. Instead of doing everything digitally, why not build bandpass filters in the analog domain and feed their outputs to independent ADC inputs? Then the software just has to compare voltages to make the symbol estimates. There’s nothing really wrong with this solution; it just adds part count and assembly complexity while reducing flexibility. Also, most radios suitable for APRS have some amount of emphasis applied, so in this case the receiver would need to apply AGC to the bandpass outputs in order to compare them fairly. But the main thing (for me, anyway) is that the filters need to be pretty sharp because the tone frequencies are only a quarter of a decade apart. And I’m also much more comfortable in the digital domain.

Instantaneous Phase Estimation

This is something I came up with on my own, so I’m naturally biased towards it. The idea is to try to figure out what the transmitter’s phase accumulator value is for each sample. As the samples come in, the phase differences between any two samples should be close to one of two values (one for each of the two tone frequencies), so it should be possible to make a symbol estimate based on that. If you know the amplitude, which you could maybe get from a fast AGC, then you can use acos or asin to get an ambiguous phase that’s either in one half of the cycle or another. The sign of the signal’s slope resolves this ambiguity, so the receiver is left with (hopefully) the correct value for the phase at the current sample. This would in principle be much faster to respond than the other methods because it only considers the current and previous sample.

There are several problems with this solution. First, it requires knowing the signal amplitude, which probably isn’t a dealbreaker but is definitely a failure mode. It also relies on inverse trig functions, which means using a lookup table to keep speed high. But the biggest problem is that it’s probably pretty susceptible to noise. I think you could mitigate some of this with moving averages and hysteresis, but overall it’s a lot of complicated machinery for questionable benefit.
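A rough sketch of the phase estimate makes the moving parts visible. It assumes a clean sine of known amplitude, uses the difference from the previous sample as the slope, and measures phase in cycles. The function names are hypothetical, and a real version would replace `math.asin` with a lookup table.

```python
import math


def instantaneous_phase(sample, previous_sample, amplitude):
    """Phase in cycles, in [0, 1), of a sine with known amplitude."""
    ratio = max(-1.0, min(1.0, sample / amplitude))
    phase = math.asin(ratio)  # ambiguous: covers only the rising half
    if sample < previous_sample:
        phase = math.pi - phase  # falling slope means the other half cycle
    return (phase / (2 * math.pi)) % 1.0


def classify_step(phase_now, phase_prev, sample_rate_hz):
    """Compare the per-sample phase step with the two expected steps."""
    delta = (phase_now - phase_prev) % 1.0
    mark_step = 1200 / sample_rate_hz
    space_step = 2200 / sample_rate_hz
    if abs(delta - mark_step) < abs(delta - space_step):
        return "mark"
    return "space"
```

At 44.1 kHz the two expected steps are only about 0.027 and 0.050 cycles per sample, so a small error in either phase estimate is enough to flip the decision, which is the noise sensitivity described above.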

A Note About Phase

Assuming a phase-continuous signal (which we’ve been doing implicitly this whole time) and a phase accumulator that starts at zero, we would expect a symbol transition to fall on some value $t = \left(m + \tfrac{11}{6} s\right) \bmod 1$, where $m$ and $s$ are the numbers of mark and space symbols transmitted so far, respectively, and $\tfrac{11}{6} \approx 1.83$ is the exact number of space-tone cycles per symbol ($2200 / 1200$). This turns out to only allow $t \in \{0, \tfrac{1}{6}, \tfrac{2}{6}, \tfrac{3}{6}, \tfrac{4}{6}, \tfrac{5}{6}\}$, so at any zero crossing we can say that the symbol period boundary is exactly $t$ away. I haven’t worked out how to use this observation to the receiver’s advantage in clock recovery, or even whether it’s useful at all, but it seems significant to me. If you could find a way to select among those six options, then clock recovery would become easier and more reliable.
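One way to check which transition phases are reachable is to enumerate them with exact rational arithmetic. In this minimal sketch, `boundary_phases` is a hypothetical helper. It takes the space tone's cycles per symbol as a parameter, because the answer depends on whether you use the rounded 1.8 (9/5) or the exact 2200/1200 = 11/6.

```python
from fractions import Fraction


def boundary_phases(space_cycles_per_symbol, max_symbols=12):
    """Enumerate (m + k*s) mod 1 over small symbol counts m and s."""
    phases = set()
    for m in range(max_symbols):
        for s in range(max_symbols):
            phases.add((m + space_cycles_per_symbol * s) % 1)
    return sorted(phases)


print(boundary_phases(Fraction(9, 5)))   # multiples of 1/5
print(boundary_phases(Fraction(11, 6)))  # multiples of 1/6
```

The rounded ratio gives five possible phases and the exact ratio gives six, so the number of candidates a receiver would have to choose among depends on which figure it assumes.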

Next Time

Now that I’ve yapped a bunch about receiver solutions in theory, I can spend next time trying to build it for the Nano. I did begin some refactoring to make the receiver integration easier, but this devolved into improving the KISS state machine to support non-data commands. I’ll probably need to complete that before getting into the receiver implementation. I’m also curious about the receiver’s performance relative to direwolf, which is probably easier to test in Python or similar. So maybe I won’t get to the receiver itself for a while after all.


  1. Wikipedia claims that Bob Bruninga built the first recognizable APRS system in 1984 on a VIC-20, which had comparable RAM capacity to the Nano but only a 1 MHz clock.

  2. The HAL represents the ADC output as an unsigned value, which is fair enough because it’s supposed to represent a non-negative voltage relative to ground, but in this case we have to define the midpoint as “zero” and handle the arithmetic based on that.

  3. For symbol tones at 1200 and 2200 Hz, a symbol rate of 1200 Bd gives 1 and about 1.8 cycles per symbol (2200/1200 = 11/6 ≈ 1.83), respectively.

  4. To be fair, neither can this algorithm (yet). But it’s much more plausible.