Self-Playing Piano, Part Three

TL;DR: I improved my design by simplifying the microcontroller arrangement and moving most of the processing logic to ordinary userspace software. The new controller code communicates directly with the key driver boards over UART-over-USB, which is more resistant to EMI than I2C or SPI. I also learned how to make PCBs on a CNC milling machine, which has reduced my iteration time and cost for board development. I’m disappointed by the apparently small amount of progress I made during this phase, despite working in my spare time all summer. This is mostly due to the many ideas that I tried and discarded during that time, which has hopefully resulted in a better system.

If you just want to see the current design, click here. As always, the project is on GitHub here.

First Board Spin

In the middle of May, I ordered a set of 11-key driver boards from DKRed. They arrived about two weeks later¹, and at first they seemed to be working fine. This didn’t last, and over the next several weeks I fought several design issues of my own creation. The design used a standalone Atmega328P microcontroller with the minimum supporting circuitry, and my intention was to send PWM commands to it over I2C. A 3-pole DIP switch would encode the identity of the particular module and its corresponding position on the keyboard, which would allow it to respond to the correct subset of commands appearing on the I2C bus. In this way, I could build identical circuits and code across all modules and just set the DIP switches.
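To make the addressing idea concrete, here is a minimal sketch of how three switch positions could select a block of keys. The names, the bit order, and the mapping onto MIDI note numbers (with note 21 as the lowest piano key) are assumptions for illustration, not the project's actual firmware.

```rust
// Hypothetical constants: 11 keys per board, 8 boards, lowest key = MIDI note 21.
const KEYS_PER_MODULE: u8 = 11;
const MODULE_COUNT: u8 = 8;
const LOWEST_NOTE: u8 = 21;

/// Read three switch positions as a module ID, treating switch 0 as the
/// least significant bit (an assumed convention).
fn module_id(switches: [bool; 3]) -> u8 {
    switches
        .iter()
        .enumerate()
        .fold(0u8, |id, (bit, &on)| id | ((on as u8) << bit))
}

/// Return the local key index (0..11) if `module` owns `note`, else None.
fn local_key(module: u8, note: u8) -> Option<u8> {
    if module >= MODULE_COUNT {
        return None;
    }
    let first = LOWEST_NOTE + module * KEYS_PER_MODULE;
    if note >= first && note < first + KEYS_PER_MODULE {
        Some(note - first)
    } else {
        None
    }
}
```

Three switches give eight possible IDs, and 8 × 11 = 88 covers a full keyboard, so every module can run the same code and simply ignore notes for which `local_key` returns `None`.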

Power Problems

I designed the solenoids to run at 24 V, but the Atmega328P needs a 5 V supply, so I added a cheap buck converter module to step the 24 V supply down to 5 V. I learned quickly that this wasn’t a stable solution. When a solenoid turned on, the current demand on my bench supply was so high that the output momentarily drooped by several volts. This caused the buck converter’s output to droop as well, resulting in a microcontroller brownout and reset. The reset wiped the controller’s state, so the solenoid appeared not to activate at all except for a brief pulse.

One way to solve this would be to add bulk capacitance in the neighborhood of half a farad to the 24 V line. My rough math suggested that this would be enough to bridge the gap while the supply recovered. This is probably a good idea regardless of any digital electronics concerns because it gives more power to the solenoids in the early PWM cycles, which may improve performance slightly. But in the short term I decided to put off this improvement and simply power the 5 V net with a separate USB supply from mains power. This was easy enough with a commodity USB charger and a cable that breaks out the four lines.
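The usual first-order estimate for this kind of hold-up capacitor treats the load current as constant during the droop, so C ≈ I·Δt/ΔV. The sketch below only evaluates that expression. The function name is hypothetical, and any numbers fed into it are placeholders rather than measurements from this project.

```rust
/// First-order hold-up estimate: the capacitance (in farads) needed to
/// supply `current_a` amps for `duration_s` seconds while the rail sags
/// by at most `allowed_droop_v` volts. Assumes a constant load current.
fn bulk_capacitance_f(current_a: f64, duration_s: f64, allowed_droop_v: f64) -> f64 {
    current_a * duration_s / allowed_droop_v
}
```

As an illustration only, a 5 A load bridged for 0.1 s with 1 V of allowed sag works out to roughly half a farad under this model.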

A bigger problem arose when I started running more solenoids. In the design phase, I didn’t understand that it was a good idea to separate digital and power ground to prevent transient shifts in potential from corrupting the digital circuits. I’d assumed that “ground is ground”, but the high currents through the solenoids violate that assumption.² This meant that my decision to add +24 V and GND copper pours throughout the board (front and back, respectively) was a bad one. The proper way to do it, so I’m told, is to define a single point where all different ground “types” meet (ideally right next to the supply) and keep them separate everywhere else. Also, don’t do something silly like using filled zones to couple different nets together. Anecdotally, my later designs with this approach don’t have nearly the same power-related issues as their predecessor.

This video helped me understand some of the basics, although I don’t claim to understand half of the total content.

Electromagnetic Interference

I2C uses two wires that carry a clock (SCL) and data (SDA). It’s not any more resistant to electromagnetic interference (EMI) than a random pair of unrelated wires, since that’s more or less what it is. Somehow I’d forgotten that solenoids with decently high current and fast switching will generate substantial EMI, particularly in their immediate vicinity (~10 cm). This proved to be a big problem and would often clobber bits.

At first, I tried mitigating this with error detection (i.e., parity bits). This solved the wrong problem. That is, it allowed the receiver to reject messages that were complete but unintelligible, but the real issue was that the EMI was causing messages to be considered incomplete. The receiver expects a certain number of clock pulses during certain I2C states, and if it doesn’t see all of them then the state machines on the transmitter and receiver can lose sync. This usually causes a “stuck bus” where the receiver either won’t respond to commands anymore or won’t even relinquish control of the bus because it’s still waiting for the previous message to be completed. At that point it’s pretty much game over, and parity bits won’t help.

Then I tried using the general call address, which is a special address that all receivers respond to. It’s the “hey everybody!” of I2C. I thought I could use a message for that address as a way to get all modules to briefly stop operating their solenoids and mostly eliminate EMI so further messages could come through unaltered. This suffered from the same issues because it still required a proper I2C frame to make it through. Eventually I broke out of I2C completely and used a separate pin on the Teensy connected directly to a separate pin on the Atmega that I called the “listen pin”. This pin’s only job would be to indicate whether the modules should be running PWM on their solenoids (logic high) or listening for I2C messages (logic low). This approach worked well, but it required a separate line out to every module.

I did briefly consider adding shielding to the I2C lines, but I never got around to it because it seemed like a bandage when a proper “design-level” treatment was needed.

Failure Summary

What a mess. By the time I gave up on this design, it was already almost July. Along the way I encountered lots of other smaller problems, mostly of my own making. For example, did you know that cheap DIP sockets have a low two-digit number of insertion cycles? Well, now I do, after an evening of confusion.

Detour: Routing PCBs at Home

After my first board design failed, I decided to reduce the cost of further board iteration by learning how to make simple boards at home. I’ve written about a different technique before that was somewhat reliable, but I didn’t like that there were chemical by-products, and in any case that process can’t support through-hole components. After some research to confirm that it was feasible, I bought a 3018-PROVer V2 desktop CNC mill from SainSmart. This is a small, inexpensive, entry level mill, which reduced my concerns about the possibility of completely failing to get anything working (or breaking something).

As it turns out, this machine works quite well for milling reasonable PCBs.³ The way it works is delightfully straightforward. You start with a piece of copper-clad FR4, which is just ordinary PCB material with thin copper foil on one (or two) sides. These are plentiful on eBay, although the quality (i.e., flatness and surface uniformity) is questionable. Before using the mill, the entire surface is conductive, so there’s no structure to confine electrical signals. The PCB design from Kicad (or whatever other EDA software you like) is a description of where those structures should be. If we define a small zone around each structure and remove the copper there, then the structure will become electrically isolated from its neighbors. The mill achieves this by spinning a small, sharp piece of metal at high speed and precisely moving it throughout the zone. Repeat for all zones, drill all holes, and finally cut out the PCB from the larger piece of FR4. Sand down the sharp bits and it’s ready to solder.

This is called “isolation routing” because it isolates the parts of the circuit that shouldn’t be electrically connected. Generally it’s best to use a routing bit that’s shaped like the letter V, often called an engraving bit, since the tip is small enough to route the tiny features of the board. I mostly followed this guide, except instead of FlatCAM I used pcb2gcode since FlatCAM seems unmaintained. Kicad even has a CLI that lets you export the kinds of manufacturing files that pcb2gcode needs, which allows me to wrap the whole thing up in a Makefile.

One notable detail with this process is that it’s necessary to mirror the PCB design before routing (at least in the case of the “front” layer, which is the only layer in a single-sided design). The reason is that through hole parts are almost always designed to be soldered on the bottom side of the board, but the mill operates on the top side. To make it all come out right, you have to “unflip” the design in software because you already need to flip the board to mill it.⁴ I couldn’t find a CLI tool to flip G-code, so I wrote a basic one as a part of a larger “G-code toolkit” that I call gctk. This tool also helps with simple translation, and I intend to add more operations as I need them. G-code is complicated, but it’s not so bad if you restrict yourself to a reasonable subset.
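As a rough illustration of what mirroring involves for a restricted G-code subset, the sketch below reflects the X word of a single line across a vertical axis. It is not the gctk implementation. The function name is hypothetical, and it assumes whitespace-separated words and straight-line moves only (arcs would also need their direction and I offsets adjusted, and comments containing X words are not handled).

```rust
/// Mirror one G-code line across the vertical line x = `axis`.
/// Only words of the form `X<number>` are changed; everything else is
/// passed through untouched.
fn mirror_x(line: &str, axis: f64) -> String {
    line.split_whitespace()
        .map(|word| {
            match word.strip_prefix('X').and_then(|v| v.parse::<f64>().ok()) {
                // Reflect the coordinate: x' = 2 * axis - x.
                Some(x) => format!("X{:.4}", 2.0 * axis - x),
                None => word.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Applying this to every line of a file with `axis` set to half the board width keeps the mirrored design inside the same bounding box, which avoids a separate translation step.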

The use of an engraving bit makes precise control over tool height critical for success. The FR4 blanks I bought are of the “1 oz” variety, which means they’re supposed to have one ounce (about 28 g) of copper on every square foot of substrate, giving a thickness of about 0.035 mm. That’s all we need to remove, and in principle removing a little more isn’t a big deal. The problems are that (a) the engraving bit widens as it goes deeper, so we need to control its depth to control routing width, and (b) the material is generally not flat, especially when clamped down. These problems combine in such a way that if you assume the material and machine are in the same plane, you’ll get unusably inconsistent routing.
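The depth sensitivity follows from simple geometry: for an idealized V-shaped bit, the groove width is the tip width plus twice the depth times the tangent of half the included angle. The sketch below encodes that relationship. The function name is hypothetical, and real bits (with runout and wear) will cut somewhat wider.

```rust
/// Approximate groove width (mm) cut by an idealized V-shaped bit.
/// `tip_width_mm` is the flat at the tip, `included_angle_deg` is the
/// full angle of the V, and `depth_mm` is how far the tip is below the
/// copper surface.
fn cut_width_mm(tip_width_mm: f64, included_angle_deg: f64, depth_mm: f64) -> f64 {
    let half_angle = (included_angle_deg / 2.0).to_radians();
    tip_width_mm + 2.0 * depth_mm * half_angle.tan()
}
```

Under this model the width grows linearly with depth, so a height error of a few hundredths of a millimeter across the board turns directly into uneven isolation widths.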

My machine came with a height probe that uses electrical contact between the tool and the probe to tell the controller where the probe is relative to the tool. With some light modification, it’s easy enough to get the engraving bit to make contact directly with the copper surface and get pretty good measurements of the surface height (~0.01 mm precision maybe?). Candle has a mesh probing and leveling feature that helps a lot here (also called a heightmap).⁵ I did briefly try building this into gctk, but Candle’s interactive and integrated approach is much better.
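One common way to apply a probed heightmap is bilinear interpolation between the four nearest probe points, with the result added to each programmed Z value. The sketch below shows that idea under assumed names and a regular probe grid. It is not taken from Candle's source and may differ from what Candle actually does.

```rust
/// A regular grid of probed surface heights (hypothetical layout).
struct HeightMap {
    origin: (f64, f64),  // X/Y of the first probe point, in mm
    spacing: (f64, f64), // distance between probe points, in mm
    cols: usize,         // probe points along X (at least 2)
    rows: usize,         // probe points along Y (at least 2)
    z: Vec<f64>,         // probed heights, row-major, rows * cols entries
}

impl HeightMap {
    /// Bilinearly interpolated surface height at (x, y), clamped to the grid.
    fn height_at(&self, x: f64, y: f64) -> f64 {
        // Convert to fractional grid coordinates and clamp to the probed area.
        let gx = ((x - self.origin.0) / self.spacing.0).clamp(0.0, (self.cols - 1) as f64);
        let gy = ((y - self.origin.1) / self.spacing.1).clamp(0.0, (self.rows - 1) as f64);
        // Lower-left corner of the cell that contains the point.
        let i = (gx.floor() as usize).min(self.cols - 2);
        let j = (gy.floor() as usize).min(self.rows - 2);
        let (tx, ty) = (gx - i as f64, gy - j as f64);
        let at = |c: usize, r: usize| self.z[r * self.cols + c];
        // Interpolate along X on both cell edges, then along Y between them.
        let near = at(i, j) * (1.0 - tx) + at(i + 1, j) * tx;
        let far = at(i, j + 1) * (1.0 - tx) + at(i + 1, j + 1) * tx;
        near * (1.0 - ty) + far * ty
    }
}
```

A leveled toolpath would then use `z + map.height_at(x, y)` in place of each programmed `z`, so the cut depth follows the warped surface instead of an ideal plane.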

FR4 is basically fiberglass, so it’s important to not breathe the dust that the mill produces when cutting it. I wear a mask and use my shopvac with the machine’s “dust shoe” to suck up the dust at least during outline routing, which creates the most chips. I’ve seen some people use oil on the cutting surface to prevent the dust from escaping into the air as much, but I haven’t tried it myself.

Overall, this part of the project was a success. I ran several more board iterations for essentially zero cost and with much faster turnaround than even the most impressive board houses. As usual, this at-home process has worse tolerances than a professional board, but that’s fine for prototyping. I also don’t have to pay for three or four boards in an order when I only need one to validate or reject the design. I’m excited to use this machine for lots of other stuff too. A CNC mill has been on my list for a while, and now I’ve got a cheap one to experiment with.

Current Design

I stumbled around in the idea forest for a while before arriving at the design I’m working with now. I’ve done this too many times to seriously think that this is the final design, but I do think it’s the best one I’ve come up with so far.

Ditch the Teensy, Keep the Arduino

Earlier in the project, I’d been testing my Atmega328P code on an Arduino Nano, which is a development board with several convenient features. I intended to use a standalone microcontroller with the bare-minimum supporting components to reduce cost, but it turns out that Arduino Nanos aren’t much more expensive than their components and have more features than the standalone arrangement, so there isn’t much benefit in avoiding them.

One major convenience that the Nano offers is programming and communication over USB. The board has a USB-to-UART chip connected to the UART lines on the Atmega328P, so serial communication with the USB host is trivial. The USB connection also provides power, so at some point I realized I could probably get rid of all connections to the key modules except for USB and solenoid power/ground. USB is faster and more resistant to EMI than I2C for this application because the USB standard defines a differential pair at the physical level, which gives good common mode rejection. Also, USB cables with shielding and ferrite beads are plentiful and inexpensive. When you combine all of these benefits, it’s clear that USB is at least worth trying.

I did try it, and I was happy with the results. I haven’t seen any system behavior that I can attribute to communication failures like before. The only disadvantage to this approach that I’ve found so far is that there will be eight devices in the full system, so I’ll need to use USB hubs to connect them all to my machine. I also might need to inject power on the buses to power all the Nanos, but it’s unclear whether that will actually be necessary.

This change leaves the Teensy without much to do except being a kind of “smart USB hub” that translates MIDI messages received as a USB device into serial messages sent as a USB host to the key modules. USB host support isn’t available for the Teensy under Rust, and the Teensy’s existence in the system adds some complexity anyway, so for now I’ve decided to remove it. The key modules now connect directly to my laptop, and some host-side code presents a MIDI interface to the OS and sends serial messages to the appropriate modules directly. This could easily be a Raspberry Pi or any other random computer with a USB port.

Rust on Arduino Nano

Up until now, I had been writing AVR assembly or C for the Atmega328P to minimize build complexity on that side of the interface. This choice did achieve low build complexity, but in exchange I had to manage communication logic at the interface between the Teensy (in Rust) and the MCU (in C). After switching to USB as the physical layer, I realized I should probably just use Rust everywhere and define the communication protocol in a common crate dependency for both binaries. With some care, I was able to build serialization and deserialization logic that both sides could import and use, with the correctness of the logic being ensured by library tests.

But eventually I realized that I could go further. In earlier iterations of the communication protocol, I was counting bits and being careful to minimize waste for performance reasons. Every message would tie up the whole bus and stop all PWM operations, so it needed to be snappy. Under USB, with high data rates and independent buses, this was less important and I could allow ergonomics and reliability to drive the design. With this in mind, I converted to postcard, a serde-compatible serialization format designed for embedded devices. Now I didn’t even have to define the protocol myself. As long as both sides use the same structures, serialization and deserialization routines are generated automatically at compile time. I could now move freely in the design space and avoid chasing down annoying bugs related to the communication channel and encoding.

PWM Schedule

Pulse width modulation is more or less the canonical way to achieve variable force with solenoids. The driver circuits generally use MOSFETs because they’re fast, small, inexpensive, and have no moving parts. The only real issue they have is that they aren’t that efficient if you turn them “halfway on”. Excess power dissipation in this region can be destructive, so it’s best to turn FETs all the way on or all the way off.⁶ This doesn’t leave you with much choice in force, though, and in this application we care about being able to vary force with at least a modest dynamic range. It’s especially important to be able to reduce the average current through the solenoids while the keys are pressed so they hold the keys down but don’t overheat. Fortunately, we can use the high switching speed of MOSFETs to rapidly turn them on and off, giving a lower average power delivered to the load. The exact value of this average depends mainly on the duty cycle of the square waveform, which is the fraction of a complete cycle during which the FET is on. If the frequency is high enough, we get the illusion of varying force.

But there’s another problem. If the frequency is below about 20 kHz, the switching becomes audible, which is annoying and distracts the listener from the sound of the piano. If we’re going to use PWM, then we need to achieve ultrasonic switching. Since the beginning of the project, I’ve been up against this problem. Fast microcontrollers can do it no problem, even without using hardware PWM peripherals, but I want to do it on an Atmega328P, since practically everyone who’s played with hobby electronics has one lying around. That chip only runs at 16 MHz (usually), which doesn’t give a lot of cycles with a 50 microsecond waveform. (I went into more detail on this in my previous post on this project.)

I’ve found an approach that sidesteps this problem. All of my solutions so far have relied on the idea of dividing each 50 microsecond period into 128 velocity periods. I call them that because they’re the only moments when a given signal can change from high to low. Selecting among these 128 “off ramps” gives the same 7 bits of precision that MIDI velocities encode.⁷ This approach requires that the microcontroller check the desired states of all of its keys at every step and set them accordingly. That’s a lot of work to do in only a few hundred cycles, so it pushes the design into weird places.
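The cycle budget behind this is easy to work out from the numbers above (16 MHz clock, 20 kHz PWM, 128 slots per period). The constant and function names below are made up for the illustration.

```rust
// Figures taken from the text: 16 MHz clock, 20 kHz PWM, 128 slots per period.
const CPU_HZ: u32 = 16_000_000;
const PWM_HZ: u32 = 20_000;
const SLOTS_PER_PERIOD: u32 = 128;

/// CPU cycles available in one 50 microsecond PWM period.
const fn cycles_per_period() -> u32 {
    CPU_HZ / PWM_HZ
}

/// CPU cycles available per slot if every slot must be serviced.
fn cycles_per_slot() -> f64 {
    cycles_per_period() as f64 / SLOTS_PER_PERIOD as f64
}
```

That comes to 800 cycles per period and 6.25 cycles per slot, which leaves almost no room to inspect eleven key states at every slot boundary.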

Somewhere along the way I observed that rather than viewing the total waveform as containing 128 opportunities for changes, we can view it as containing a maximum of (in this case) 11 changes across all keys. In other words, the current system spends a lot of time checking for state changes unnecessarily. I reframed the code to execute a list of actions, where an action is either “delay by some number of microseconds” or “change this pin’s state to high or low”. Then, to set particular duty cycles for a set of keys, we compute the “schedule” of actions once ahead of time and loop over it until a new one comes in. In the default case of all keys being off, we get the default schedule (set all pins low, then delay 50 microseconds). This approach is much easier on the microcontroller and scales up better with the number of keys. We retain the ability to encode arbitrary PWM duty cycle, and best of all we can serialize this as a heapless vector with postcard and unpack it on the other end with no problems.
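A minimal sketch of the schedule idea might look like the following. The type and function names are hypothetical. Delays are counted in ticks of 1/128 of a period rather than microseconds, and it uses a standard `Vec` where firmware would more likely use a fixed-capacity heapless vector with serde derives. It is an illustration of the technique, not the project's code.

```rust
/// One step of a PWM schedule (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    /// Wait this many ticks, where one tick is 1/128 of the PWM period.
    Delay(u8),
    /// Drive one output pin high (true) or low (false).
    Set { pin: u8, high: bool },
}

const TICKS_PER_PERIOD: u8 = 128;

/// Build one period of actions from per-key duty values (0..=127).
/// A duty of 0 keeps the key off for the whole period.
fn build_schedule(duties: &[u8]) -> Vec<Action> {
    let mut schedule = Vec::new();
    // (off_tick, pin) for every key that is on for part of the period.
    let mut offs: Vec<(u8, u8)> = Vec::new();

    // Start of period: active keys go high, everything else goes low.
    for (pin, &duty) in duties.iter().enumerate() {
        schedule.push(Action::Set { pin: pin as u8, high: duty > 0 });
        if duty > 0 {
            offs.push((duty.min(TICKS_PER_PERIOD - 1), pin as u8));
        }
    }

    // Turn keys off in order of increasing duty, delaying only between changes.
    offs.sort();
    let mut now = 0u8;
    for (off_tick, pin) in offs {
        if off_tick > now {
            schedule.push(Action::Delay(off_tick - now));
            now = off_tick;
        }
        schedule.push(Action::Set { pin, high: false });
    }

    // Idle for whatever remains of the period.
    schedule.push(Action::Delay(TICKS_PER_PERIOD - now));
    schedule
}
```

The executor on the microcontroller then only has to walk this list in a loop. With all duties at zero the result is the default schedule: every pin set low, followed by a single full-period delay.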

Next Steps

I set out in May with the intention of expanding my design to fill the whole keyboard. I thought my design was solid enough to handle that expansion, but it wasn’t. Learning this over the course of several months was hard, and at times I was frustrated, stressed, and discouraged.⁸ Those aren’t the kinds of emotions that I want to feel when working on a hobby project, at least not often. These feelings are what prompted me to make changes to my project “operating system”, as I’ve written about previously.

I’m still glad to have learned so much and come up with what I believe is an even better design. I can’t imagine this will be the final iteration, but my grandfather would say that I’m just “finding all the ways it don’t work.” Eventually I have to run out of places where problems can hide.

Anyway, next time I’ll make another attempt at a fully working key module and maybe try out some improved mechanical approaches for supporting the solenoid assemblies.

¹ I ordered four boards, but I was pleasantly surprised to find eight in the package. I guess it was cheap enough to fit a few extra on the panel that they decided to throw them in.
² Because the resistivity of copper isn’t zero like we can typically assume.
³ “Reasonable” means 0.5 mm trace width, 0.5 mm trace clearance, and 0.8 mm minimum hole size. Kicad makes it easy to set and check these design rules. I’ve heard you can reduce these tolerances somewhat, but in those cases another process may be better.
⁴ The back layer is already flipped in Kicad and probably other EDA packages, so there wouldn’t be any need to flip it again in, say, a double-sided board. But I didn’t try those in this project.
⁵ Just make sure to use the latest HEAD on GitHub, since I think the heightmap feature is broken on the latest release in nixpkgs (1.1). I don’t remember exactly which releases work and don’t at this point.
⁶ Incidentally, in all likelihood several billion MOSFETs operating in this digital regime have collaborated to help you read this.
⁷ In practice, I doubt that any piano mechanism can reliably reproduce seven bits of loudness precision, and in any case my own actuator is even less likely to be able to meet that requirement. But even so, it’s nice to shoot for zero precision loss as a challenge.
⁸ There were a few other side quests that I didn’t write about here because they weren’t interesting enough, but maybe they’ll show up later on.
