Monday, November 28, 2016

Bug Hunt

Is this going to be a stand-up fight, sir, or another bug hunt?

PFC Hudson, Aliens
If the bugs are seven feet tall, then a bug hunt is a stand-up fight.

Thirteen days. That's how long it took me to hunt and kill this one. A WPF GUI on top of a .NET wrapper on top of a C++ wrapper for CLIPS.

After overhauling the internal representation of the primitive data types in CLIPS, successfully passing the regression tests, and successfully upgrading the CLIPSJNI demo examples, I thought upgrading the .NET examples would be fairly straightforward. But two of the examples were intermittently crashing.

So I went back through past revisions trying to find the point at which the code stopped working. I finally got to the point where the removal of a single unused slot from a struct caused the crashing behavior. Not good at all, since this was now likely a memory corruption issue.

As I delved further into the issue, it became clear that this was a problem related to both running CLIPS in an embedded mode and the CLIPS garbage collection routines. Debugging the issue in Visual Studio was also a huge PITA for a variety of reasons, so when I reached the point at which it was clear that the issue was unrelated to the WPF and .NET code, I created a C++ example that produced the same behavior running on macOS.

That was day thirteen. Once I had the program in Xcode, that's when the magic happened, thanks to these diagnostic tools:

[Screenshot: Xcode diagnostic options, including the Address Sanitizer checkbox]

With the Address Sanitizer enabled, I was able to immediately determine where the initial issue started and, based on thirteen days of scouring the code, quickly implement a fix for it with a half-dozen lines of code. So thanks to the fellow who originally told me about these Xcode diagnostic tools. It looks like there's some similar functionality for Visual Studio, just not as easy to use as checking a box.
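
For readers who haven't hit this class of problem: the post doesn't spell out the actual bug, so what follows is only a generic sketch of how embedding and garbage collection can interact badly. `ToyEngine` and all of its members are hypothetical names invented for illustration and are not part of the CLIPS API. The idea it shows is that a value handed to embedding code can be reclaimed on the engine's next collection pass unless the embedder has explicitly retained it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical toy engine, NOT the CLIPS API: it owns every value it creates
// and reclaims any value that the embedding code has not retained.
class ToyEngine {
public:
    using Handle = std::size_t;

    // Create a value owned by the engine; it starts out unretained.
    Handle create(const std::string& text) {
        Handle h = next_++;
        values_.emplace(h, Entry{text, 0});
        return h;
    }

    // The embedder calls retain() to keep a value alive across collections.
    void retain(Handle h) { ++values_.at(h).retainCount; }

    // The embedder calls release() when it no longer needs the value.
    void release(Handle h) {
        Entry& entry = values_.at(h);
        assert(entry.retainCount > 0);  // releasing more often than retaining is a bug
        --entry.retainCount;
    }

    // Garbage collection pass: free every unretained value and
    // report how many were reclaimed.
    std::size_t collect() {
        std::size_t freed = 0;
        for (auto it = values_.begin(); it != values_.end();) {
            if (it->second.retainCount == 0) {
                it = values_.erase(it);  // erase returns the next valid iterator
                ++freed;
            } else {
                ++it;
            }
        }
        return freed;
    }

    // Safe access: returns nullptr once the value has been reclaimed.
    const std::string* lookup(Handle h) const {
        auto it = values_.find(h);
        return it == values_.end() ? nullptr : &it->second.text;
    }

private:
    struct Entry {
        std::string text;
        int retainCount;
    };

    std::unordered_map<Handle, Entry> values_;
    Handle next_ = 1;
};
```

In this sketch the embedder always goes through `lookup()`, so a reclaimed value shows up as a null pointer. If the embedder instead cached the raw pointer returned by `lookup()` and read through it after `collect()` had erased the entry, that read would be a heap use-after-free, which is the kind of access Address Sanitizer is designed to report at the point where it happens.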
