Bringing Artificial Intelligence Development in a DevOps World
If you didn’t attend N2TUG, you missed out on some great presentations. It was a fantastic conference at the Texan Gaylord resort – basically an arcology completely protected from the elements and the hot Texan weather, with little community pods scattered throughout. It was somehow reminiscent of a slightly connected data center, with a few silos; rather an appropriate venue, in my opinion anyway. What caught my attention was Justin Simmons’ presentation talking about AI. While AI is not a normal thing to talk about at NonStop conferences, HPE has made some large strides, bringing Apollo (massive computing power) to market and making a lot of new things possible. With NonStop being increasingly integrated into the Enterprise, these capabilities are becoming more available to our own applications.
It made me think – what doesn’t – that it might be time to investigate how we can use what we have today to cut down our development costs. Well, my costs, in particular – I’m selfish that way, if you didn’t know. If I can save the cost of a code reviewer, or assign them back to development rather than having them look at my code, so much the better. After all, “my code is always perfect”, n’est-ce pas? Well, it isn’t, and I get really annoyed with myself for even the most trivial bugs, which are often the hardest to find. A missed increment, an unchecked null pointer. So, the thought: what’s out there “today” that I can use? I looked a year ago, when I was first talking to our new COO about it, and there was nothing useful – well, there was an Intel neural network stick that I could plug into a USB port, but I could not find a port anywhere on an NS3 where it would be recognized. It made me sad.
Enter FB Infer. Not the most advanced thing or truly AI, but AI-ish. Infer is a static scanner but seems to have an inference rules engine built into it that does a decent (not perfect) job at catching some common mistakes. It is an Open Source engine on GitHub contributed and maintained by Facebook – don’t judge, please. So, I spun up an Ubuntu VM on my DevOps machine (I called the VM Nanei after the smartest of my Bengals – don’t tell Kira I said that) and installed the Infer binary distribution. That took about 5 minutes. Then I cloned our newest application over to that box using git from our BitBucket server (another 15 seconds). I can’t talk about the application, but it’s very cool and in Java. Then I ran the Infer engine, which took another 10 minutes. Bang. Three unchecked potential null pointer accesses. All three were legitimate, although unlikely to occur. If the system had run into a resource issue, say with connections to the SQL/MP database, we would have had null pointer exceptions that would be very hard to explain (in context). After adding checks in the code, I pulled the new version from BitBucket, reran the Infer engine on the new version, and Infer was now happy with me.
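To make the before-and-after concrete, here is a minimal Java sketch of the kind of pattern involved. It is not the application code (which I can’t show), and it is not Infer output: the `Connection` and `ConnectionPool` classes, the `acquire()` method that returns null when nothing is available, and both lookup methods are hypothetical stand-ins invented for illustration.

```java
public class NullCheckSketch {

    /** Hypothetical stand-in for a database connection. */
    static class Connection {
        String query(String sql) {
            return "result of: " + sql;
        }
    }

    /** Hypothetical pool that returns null when no connection is available. */
    static class ConnectionPool {
        private final boolean exhausted;

        ConnectionPool(boolean exhausted) {
            this.exhausted = exhausted;
        }

        Connection acquire() {
            return exhausted ? null : new Connection();
        }
    }

    /** Before: dereferences the result of acquire() without checking it. */
    static String lookupUnchecked(ConnectionPool pool, String sql) {
        Connection c = pool.acquire();
        return c.query(sql); // throws NullPointerException if acquire() returned null
    }

    /** After: the null case is handled explicitly, with context in the message. */
    static String lookupChecked(ConnectionPool pool, String sql) {
        Connection c = pool.acquire();
        if (c == null) {
            throw new IllegalStateException("No database connection available for: " + sql);
        }
        return c.query(sql);
    }

    public static void main(String[] args) {
        ConnectionPool healthy = new ConnectionPool(false);
        ConnectionPool starved = new ConnectionPool(true);

        System.out.println(lookupChecked(healthy, "SELECT 1"));
        try {
            lookupChecked(starved, "SELECT 1");
        } catch (IllegalStateException e) {
            System.out.println("Handled: " + e.getMessage());
        }
    }
}
```

The unchecked version works fine until the day the pool runs dry, which is exactly why testing rarely trips over it. The checked version fails too, but with a message that tells support what actually went wrong.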
The important thing here is that it would have taken a human at least a week to find these conditions doing a code review. Testing would never have caught the conditions because the situation is nearly impossible to simulate, and I would not have even thought to do it. But on a busy development box, this would have cost our support organization hours if not days to figure out why a NullPointerException happened at those very spots. Now we know, and the checks are in. 5 minutes to fix the code. Total loaded cost: under $500, compared with the $10,000 it could have cost to catch this in a code review. I’m sold – not that I’m advocating getting rid of code reviews, but this is an extra set of artificial eyes. We’re now looking at how to integrate this into our Jenkins Pipelines – for Java it’s easy. C/C++ is more difficult, and TAL, well, I won’t go there, yet. Maybe I’ll end up being a contributor. Or maybe I’ll go have a Sangria on a beach instead.
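As a rough idea of what a pipeline gate could look like, here is a small Java sketch that reads an Infer report and fails when anything is reported. It is not our Jenkins setup. It assumes the report is the JSON file Infer writes to `infer-out/report.json` and that each finding carries a `bug_type` field; check both against the Infer version you run. The `InferGate` class and its `countByType` helper are hypothetical names, and the regex is a shortcut where a real gate would use a proper JSON parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InferGate {

    // Matches "bug_type": "SOME_NAME" pairs in the report text.
    // Assumption: findings carry a bug_type field; a real gate would parse the JSON.
    private static final Pattern BUG_TYPE =
            Pattern.compile("\"bug_type\"\\s*:\\s*\"([^\"]+)\"");

    /** Counts reported findings per bug type in the given report text. */
    static Map<String, Integer> countByType(String reportJson) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = BUG_TYPE.matcher(reportJson);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location of the report; override with the first argument.
        Path report = Path.of(args.length > 0 ? args[0] : "infer-out/report.json");
        Map<String, Integer> counts = countByType(Files.readString(report));
        counts.forEach((type, n) -> System.out.println(type + ": " + n));
        // A non-zero exit status makes the calling pipeline stage fail.
        System.exit(counts.isEmpty() ? 0 : 1);
    }
}
```

Run after the Infer step, a stage like this turns "Infer was happy with me" into a pass/fail signal the pipeline can act on. Failing on any finding is the strictest choice; a team could just as well fail only on selected bug types.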
When we consider the cost of deployment, catching problems before the code is even delivered by a developer to the integration stream is a really interesting capability. Fixing problems found in development is almost always less expensive than a production hit. Now that our community is taking its steps into Edge Computing, this cost difference is even more important. Sure, the Google Play Store can handle updates to your apps on Smart Phones – you know who you are – but when the Edge involves IoT devices, updates are significantly harder. Re-flashing an EPROM because of a bug you could have caught? Who gets fired for that one? I think I’m going to stick with leveraging whatever technology I can use to cut down my own risks. What about you?
Side note: with a little luck, and approval from the review committee, I’ll be showing how this all works at TBC.