My big fat rookie mistake as a software design engineer

It seems that we often wax nostalgic around the holidays, and significant past events, whether good or bad, come back to us.

In talking with a colleague, I was reminded of my first rookie mistake as a software design engineer. While it felt horrible at the time, it taught me an important lesson early on: test thoroughly, no matter how small a code change might be.

The year was 1983. I was working in the Texas Instruments Home Computer Division. It was my first full-time software engineering job out of college.

I had been developing the Hexbus BIOS for the new TI 99/8 computer for several months. After each major change I made to the TMS9995 assembly language code, I ran through a set of tests with various Hexbus devices, to ensure there were no regressions. The day had come to release the binaries to manufacturing for the production ROMs. There was one last small (one-byte) change that had to be made, to improve the handshake timing on the bus. I made the change and ran through all the device testing. Or so I thought.

Unfortunately, I had inadvertently placed the Hexbus HX-2000 Wafertape drive on a small shelf below my desk, and because it wasn’t right there in my sight among the other Hexbus devices, I neglected to test that one device. Not realizing my omission, I signed off on shipping the binaries for the final Hexbus ROMs. As I returned to my desk, I caught sight of the wafertape drive and suddenly realized I hadn’t tested it in the final round. So I immediately tested it, and sure enough, the one-byte change I had made to the BIOS code had triggered a failure in one of the wafertape I/O operations. “Doh!” (This was about six years before The Simpsons series, and yet I somehow channeled Homer to instinctively say “Doh!”)

I immediately knew why the change had affected the wafertape device, fixed the regression, retested everything (and I mean everything, this time), and contritely went to my supervisor to explain what had happened, corrected binaries in hand. Luckily, we were able to stop the presses in time and get the fix incorporated into the production ROMs. Had production already started, it would have cost us $10,000 to stop and restart the process. My supervisor assured me that I would have been able to take care of that cost over time, through “a payroll deduction plan.”

The lesson?

Test everything, no matter how small the change to the code is. And double-check that you actually did test everything.
