When bits were flipped …
When so many of us are working remotely and completely dependent on the networks we connect to, it’s worth a look at how things once were and at the difficulty of getting one NonStop to talk to another … a fun look back at times past.
A long time ago in a city shrouded in fog, I started out as a field software engineer for Tandem. A customer was moving from one data center to another by installing NonStop IIs in their new center and then deinstalling the corresponding NonStops at their old center. We got a call to tell us that their NonStop II could not communicate with their NonStop. Both sides were throwing comms errors, so, they said, we must have done something wrong, either with the comms hardware or with the way we had implemented the bitsync protocol on the NonStop II. Their staff had tested the comms link multiple times by putting a datascope on one side and looping it back at the other side. This was pre-SNAX, when a customer or contractor had to code the communications protocol themselves.
Since I was a bit of a comms guy, I was dispatched after hours to help figure out what was going on. I connected their datascope to one side of the link and watched the proper bitsync frames going across the link. I then went to the other building and did the same. On each side, the outbound data was fine, but the inbound data was total garbage. The customer told me that this was proof that our controllers or software were defective. By this point I had spent around four hours on the problem and was convinced that the fault was in the comms link and not in our systems, but I could not convince the customer.
Cupertino was an hour away and there were plenty of datascopes in the lab, so I reached out to SSG and asked if someone could put one in their car and drive it up to the city. A couple of hours later we had a datascope on either side of the link, each set up to send a repeating U, which is 01010101 in ASCII. At the opposite side, I was seeing a line of asterisks displayed, which corresponds to 10101010 (with the high bit ignored, 0101010 is the ASCII asterisk). In other words, ones and zeroes were being flipped. It was obvious that the wire pair on either the send or receive side was reversed, so that the electrical signals were inverted. Think of putting a battery in backwards. By this time it was 2 AM, so I left everything in place and went home to sleep.
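The arithmetic behind that observation can be checked with a small sketch. It is only an illustration, not anything that ran on the datascopes: `invert_bits` is a hypothetical helper, and dropping the high bit before display is an assumption about how a 7-bit ASCII display would render the byte.

```python
def invert_bits(byte: int, width: int = 8) -> int:
    """Flip every bit in a value of the given width."""
    mask = (1 << width) - 1
    return ~byte & mask


sent = ord("U")                    # 0x55 = 01010101
received = invert_bits(sent)       # 0xAA = 10101010
# Assumption: the display ignores the high bit, as 7-bit ASCII would.
displayed = chr(received & 0x7F)   # 0x2A = '*'

print(f"{sent:08b} -> {received:08b} -> {displayed!r}")
```

An alternating pattern such as U makes a good test character because its inverse is also a printable character, so an inverted line shows up as a clean row of a different symbol rather than as noise.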
The next morning I brought the customer into the data center on one side and showed him what was being sent and what was being received. He again insisted that it had to be our controller. I replied, “But the NonStops aren’t even connected to the link. I have a datascope connected on each side, sending a binary pattern, and that pattern is being flipped when it gets to the other side. The reason your employees said the link was fine is that they were doing a loopback, so the bits were flipped again on the way back, making them look correct.”
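Why the loopback test passed while the end-to-end test failed can be sketched with a toy model. This is not the bitsync protocol or any Tandem software: `inverted_leg` is a hypothetical helper, and it assumes, as the account above describes, that every crossing of the link flips all the bits.

```python
def inverted_leg(data: bytes) -> bytes:
    """Model one crossing of a link whose wire pair is reversed: every bit flips."""
    return bytes(b ^ 0xFF for b in data)


pattern = b"U" * 8

# Loopback test: the pattern crosses the link out and back, so it is flipped twice.
loopback_result = inverted_leg(inverted_leg(pattern))

# End-to-end test: the pattern crosses the link once and arrives flipped.
end_to_end_result = inverted_leg(pattern)

print("loopback looks clean:  ", loopback_result == pattern)
print("end-to-end looks clean:", end_to_end_result == pattern)
```

Two inversions cancel, so a loopback can never reveal this kind of fault; only comparing what one end sends with what the other end receives exposes it.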
You would think the customer would have been contrite, because the problem was his all along and we had solved it, but it didn’t quite work out that way. However, I did receive recognition from my manager for having made a large customer happy. And the city shrouded in fog? San Francisco.