
On Vibe-coding Electronics

A project log for Muon Sortes

Using a custom six channel muon detector to create a cosmic oracle experience with a high quality e-ink display.

Allan Binder, 6 hours ago

Context: I'm building an e-ink smart clock that uses 4x Geiger counters to detect muons and other high-energy particles, and uses those detections to give users a cosmic oracle experience and RNG-based games/events. Press a button, ask a question, then a cosmic ray hitting the Earth's upper atmosphere answers.

3-4 million tokens before I ordered a single part
I have been toying around with this idea for a year or two (since watching Alpha-Phoenix's YouTube video of the same idea), so I had a good idea of the shape and user experience I was going for, but very little substance in terms of execution and design. I spent around two weeks, easily 2-3 million Opus 4.6 tokens, and >300k GPT 5.4 tokens reducing my uncertainty as much as possible. That phrasing is important because the models were not in the designer's seat; they helped me visualize and fill in gaps in my understanding far more than they made architectural decisions. I'm sure they could have, but I wouldn't have been able to troubleshoot anything if it wasn't my own design. I think one thing people miss about AI workflows is that, when they are used properly, you learn exactly what you need to know, but not much else. If I were doing this 10 years ago I would likely have needed to learn 50-60% more information that was never used, because I didn't actually know what I needed to know. With the current generation of tools I was able to laser-focus on information that was worth investing in, and basically nothing else. Take it or leave it. I'm sure some people will see that as passé, but experiencing it has been a breath of fresh air after my previous, overly ambitious projects.

The result of this pre-planning phase was a fully designed and simulated breadboard, a high-confidence decision tree, and a broad list of edge cases. One of my proudest accomplishments in this project so far is not hitting any roadblocks that required ordering a new part to continue working. Now that the breadboard prototype is proven, I only have maybe 3 or 4 unused components. Anytime I hit a snag, I already had the right capacitor or resistor or ferrite bead or whatever on hand to just keep moving forward. The entire build process was planned in isolated phases with concrete deliverables at the end of each phase, so if a new problem appeared, it would be clear which general area it came from.

Claude Code as a live lab notebook

During the entire breadboard phase I had a terminal window open with Claude Code running alongside the physical build. Every time I moved a jumper, measured a voltage, swapped a component, or observed a behavior change, I'd relay it to the session to be documented. This was incredibly valuable, and it let me run simulations when I wasn't with the device (taking a step back, remote Claude Code sessions running simulations of my breadboard design to test a hypothesis while I'm out on a run is wild). Another advantage of documenting through Claude Code was being able to export the full breadboard schematic with descriptions of problems, then go to GPT 5.4/5.5 for a second opinion. This was fruitful every time I did it, and the two models work incredibly well together. Multiple times I would give Claude "jumper 56,g-46,a" and it would tell me that 46,a was already taken by something else and ask if I was sure. I would then find that I had wired something wrong an hour earlier, and correct it before needing to power on again.
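The "46,a is already taken" catch can be illustrated with a minimal sketch of the bookkeeping involved. This is not how Claude Code tracks anything internally; it is a hypothetical Python model, assuming jumper entries always follow the "jumper 56,g-46,a" pattern quoted above (row, column, dash, row, column). The `Breadboard` class, its method names, and the "R3 10k" part label are all made up for illustration.

```python
import re

# Assumed notation, modeled on the "jumper 56,g-46,a" entry quoted above:
# "<row>,<column>-<row>,<column>", with columns a-j as on a standard breadboard.
JUMPER_RE = re.compile(r"^jumper\s+(\d+),([a-j])-(\d+),([a-j])$")


class Breadboard:
    """Hypothetical log of which holes are physically occupied, and by what."""

    def __init__(self):
        self.holes = {}  # (row, column) -> label of the part sitting in that hole

    def place(self, label, *holes):
        """Occupy holes with a part. Returns a list of conflicts; if there are
        any, nothing is placed, so the log never silently overwrites a part."""
        conflicts = [(h, self.holes[h]) for h in holes if h in self.holes]
        if not conflicts:
            for h in holes:
                self.holes[h] = label
        return conflicts

    def add_jumper(self, text):
        """Parse a jumper entry and try to place both of its ends."""
        text = text.strip()
        match = JUMPER_RE.match(text)
        if match is None:
            raise ValueError(f"unrecognized jumper entry: {text!r}")
        r1, c1, r2, c2 = match.groups()
        return self.place(text, (int(r1), c1), (int(r2), c2))


board = Breadboard()
board.place("R3 10k", (46, "a"), (50, "a"))        # a part logged an hour earlier
print(board.add_jumper("jumper 56,g-46,a"))        # [((46, 'a'), 'R3 10k')]
```

The sketch only checks physical occupancy of individual holes. A fuller model would also know that holes a-e (and f-j) in the same row are electrically connected, which is what a netlist-level check would need.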

Quick model notes
This project started midway through the Opus 4.6/GPT 5.4 lifecycle, and as both rolled over to 4.7 and 5.5 I switched to those (gradually for 4.7; I'll explain in a second). Opus 4.6 did 95% of the planning phase, with GPT 5.4 for second opinions when it was clear I was hitting a knowledge gap in 4.6, or it was struggling with something that was clearly a limitation of the model. It's hard to describe what that looks like, but if you've worked with these models extensively, you get a feel for different types of mistakes and uncertainties. For planning specifically, I much prefer Anthropic models, if only for their tone. I am capable of doing the technical side of things, but if a model's tone doesn't align with me, or its output isn't structured how I like, it takes mental energy away from the task. This is my number one issue with OpenAI models; on a technical level I am absolutely certain I would have been fine with GPT 5.x alone, but the list-of-lists, bullet-point style of writing feels not-human. Also, on tone and prose: Anthropic models actually feel like they are enjoying the project. They will show "excitement" when I come up with a novel solution, and get terse when I am getting distracted. I have found OpenAI models much more egalitarian and "I-am-a-tool-don't-pretend-I-am-human", which is probably healthier for my mental state, but I don't get the boost of focus or mental energy that comes along with the model telling me I'm the bestest boy for solving a problem it was struggling with.

I am always wary about giving new models access to production code (Sonnet 3.7, iykyk), and the introduction of 4.7 was no exception, especially because 4.7 has a massive prose change from 4.6, which on its surface doesn't inspire confidence. I kept 4.6 active for two weeks or so, occasionally flipping over to 4.7 only to see the overly verbose prose and the "wait this isn't right" outside of thinking tokens (really, that should only ever be in thinking tokens. I suspect Anthropic's post-training restricted its thinking budget to save compute, so that stuff just got moved to the output, but I digress). But after it passed tests and found things 4.6 missed, I switched over fully, and it's actually been better than 4.6 on most technical dimensions. It can certainly trace a signal path better, but I've found it makes mistakes after 250k tokens (all models do), so I have an alert to remind me to compact or make a handoff doc if I hit that. I really can't tell the difference between GPT 5.4 and 5.5, but I used each of them for fewer than 10 prompts. Both were always professional and very, very knowledgeable. OpenAI models feel like a contractor from a different firm is in the office for the day, whereas Anthropic models feel like an old coworker who doesn't like small talk.

Would I have done this without AI?

Honestly, probably not. In hindsight I very clearly had the skillset and capabilities, but the threshold of confidence needed to take this idea and turn it into a physical object was probably too high for me to reach on my own. I don't really have anyone in my life who is capable of troubleshooting or spitballing technical solutions if something goes wrong. Two Geiger tubes would certainly have been reachable with tutorials, but after seeing their low detection rate, and deciding that was a poor user experience, I don't think I would have had the confidence to pursue 4 tubes/six-way coincidence gates. This is also not mentioning anything about programming the firmware for the ESP32 or the e-ink display, which again would have been doable, but would have taken me weeks, not minutes, to learn and execute.
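For readers wondering where "six" comes from with four tubes: four tubes can be paired in C(4,2) = 6 distinct ways. The sketch below is a hypothetical software illustration of pairwise coincidence detection, assuming the six channels are the six tube pairs. The real detector presumably does this with hardware gates; the tube labels, the `coincidences` function, and the 50 µs window are placeholders invented here, not values from the build.

```python
from itertools import combinations

TUBES = ("A", "B", "C", "D")            # hypothetical labels for the four tubes
PAIRS = list(combinations(TUBES, 2))    # the 6 distinct tube pairs


def coincidences(pulses, window_us=50):
    """Find pulses on two tubes that land within window_us of each other.

    pulses: dict mapping tube label -> sorted list of timestamps (microseconds).
    Returns a dict mapping each pair -> list of (t_first_tube, t_second_tube).
    The 50 us default is an arbitrary placeholder, not a measured value.
    """
    hits = {}
    for a, b in PAIRS:
        found = []
        tb = pulses.get(b, [])
        j = 0
        for ta in pulses.get(a, []):
            # Skip pulses on tube b that are too old to match this pulse.
            while j < len(tb) and tb[j] < ta - window_us:
                j += 1
            # Record at most one match per pulse on tube a.
            if j < len(tb) and abs(tb[j] - ta) <= window_us:
                found.append((ta, tb[j]))
        hits[(a, b)] = found
    return hits


example = {"A": [100, 5000], "B": [120, 9000], "C": [5010], "D": []}
result = coincidences(example)
print(len(PAIRS))            # 6
print(result[("A", "B")])    # [(100, 120)]
print(result[("A", "C")])    # [(5000, 5010)]
```

Requiring two tubes to fire together is what separates a particle passing through both from a single tube's background count, and going from one pair to six pairs gives more chances per minute for such an event to register.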

Takeaway

Overall, this was an amazing experience to do AI-assisted, and Claude Code is an incredible tool. I understand a lot of people have mixed feelings about AI in general, but in my sample size of one, I am getting massive, massive value from it on a daily basis, across disparate parts of my life. I would love to hear other people's thoughts. How are you using LLMs in a hardware setting? Do you think I actually did this myself, or should I put an asterisk every time I say "I made this"? Anyway, thanks for reading o7.
