Off-by-one errors are the bane of the programmer. They’re right up there alongside the null reference exception that Tony Hoare infamously calls his “billion dollar mistake”. They’re responsible for everything from data loss to program crashes to security failures.
A wise programmer will know how to avoid off-by-one situations through the judicious use of techniques like “no raw loops”. This won’t always rescue bad code, however. Proper range testing is an integral part of quality code assurance.
Range testing, like “no raw loops”, is not some secret science – the objective is right there in the name. Sadly, it is more common than not for a programmer to test only a single condition (or none at all) to verify the integrity of their code base. With discipline in testing, it is easy to cull away a common class of bugs.
Identify Edge Cases
The key in range testing is to identify all of the edge case values for your algorithm. The simplest scenario is testing a collection that can have 0 or more values: 0, 1, 2, …
In this scenario, our primary edge case is the value 0. To fully exercise off-by-one errors, however, it is important to also test the ranges around that value; in this case, -1 and 1.
Even though it sounds unintuitive, it’s important to test potentially “nonsense values” by evaluating the possible values of the data type that describes the range. If a signed integer type describes your collection length, you’ll want to make sure you test against potentially negative lengths. In many cases, the collection your algorithm is working on will prevent this possibility, but you’d also be surprised how easy it is to run into (can we say “overflow”?). If it’s a possibility, don’t neglect it!
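As a rough sketch of what testing around 0 might look like, here is a small Python example. The helper `take_first` is hypothetical and exists only for illustration; it takes a signed count, so -1 is a value it can actually receive.

```python
def take_first(items, count):
    """Return the first `count` items (hypothetical helper for illustration)."""
    if count < 0:
        # Without this guard, items[:-1] would quietly drop the last element
        # instead of failing, which is exactly the kind of bug we are hunting.
        raise ValueError("count must be non-negative")
    return items[:count]


def test_take_first_around_zero():
    data = ["a", "b", "c"]

    # The edge case itself.
    assert take_first(data, 0) == []

    # One above the edge.
    assert take_first(data, 1) == ["a"]

    # One below the edge: a "nonsense value" that must be rejected.
    try:
        take_first(data, -1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for count=-1")


test_take_first_around_zero()
```

Note what happens if the guard is removed: the -1 case does not crash, it just returns the wrong data. That is why the nonsense value earns its own test.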
Multiple Edge Cases
If your collection has some other “interesting” edge case, you’ll want to similarly identify the values defining and surrounding that edge case for testing.
Let’s say that you’re trying to display only the five most recent elements of a data feed. You’ve now got multiple edge cases that require testing. Your primary edge cases are now 0 and 6, because “something interesting” happens at those values. To better exercise your algorithm, you’d want to test the values: -1, 0, 1, 5, 6, and 7.
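Here is one way the feed example could be tested, as a sketch only. The names `MAX_SHOWN`, `visible_count`, and `most_recent` are made up for this illustration, and the feed length is passed around as a plain signed integer so that -1 can be exercised.

```python
MAX_SHOWN = 5  # assumed display limit from the example above


def visible_count(feed_length):
    """How many elements to display for a feed of the given length."""
    if feed_length < 0:
        raise ValueError("feed length cannot be negative")
    return min(feed_length, MAX_SHOWN)


def most_recent(feed):
    """Return the newest elements, assuming the newest are at the end."""
    count = visible_count(len(feed))
    # Slice from an explicit start index; feed[-count:] would return the
    # whole list when count is 0.
    return feed[len(feed) - count:]


def test_visible_count_edges():
    # Values on and around both edge cases: 0, 1, 5, 6, 7.
    expected = {0: 0, 1: 1, 5: 5, 6: 5, 7: 5}
    for length, shown in expected.items():
        assert visible_count(length) == shown

    # The remaining value, -1, must be rejected.
    try:
        visible_count(-1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for length -1")


def test_most_recent_edges():
    assert most_recent([]) == []
    assert most_recent([1]) == [1]
    assert most_recent([1, 2, 3, 4, 5]) == [1, 2, 3, 4, 5]
    assert most_recent([1, 2, 3, 4, 5, 6]) == [2, 3, 4, 5, 6]
    assert most_recent([1, 2, 3, 4, 5, 6, 7]) == [3, 4, 5, 6, 7]


test_visible_count_edges()
test_most_recent_edges()
```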
For Added Goodness
It’s highly likely by now that your identified edge cases cover most of the standard errors exhibited by your code. If you want some extra sanity checking, sometimes it can be handy to test random points in the ranges between your identified edge cases. If you picture an infinite number line, simply break the line up according to the edge cases we’ve identified and select a random position in each of the remaining ranges for testing:
…, -1, 0, 1, …, 5, 6, 7, …
We can see there are three generic “ranges” here. Just pick a value within each range and add a test for each.
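A sketch of picking one random value per range, reusing the five-item feed example. The function names and the outer bounds of ±1000 are arbitrary assumptions made for this illustration; the seed is fixed so that a failing run can be reproduced.

```python
import random

MAX_SHOWN = 5  # assumed display limit from the feed example


def visible_count(feed_length):
    """How many elements to display for a feed of the given length."""
    if feed_length < 0:
        raise ValueError("feed length cannot be negative")
    return min(feed_length, MAX_SHOWN)


def sample_between_edges(rng):
    """Pick one value from each range left over after the edge-case tests."""
    below = rng.randrange(-1000, -1)   # somewhere in -1000..-2
    middle = rng.randrange(2, 5)       # somewhere in 2..4
    above = rng.randrange(8, 1001)     # somewhere in 8..1000
    return below, middle, above


def test_random_points():
    rng = random.Random(2024)  # fixed seed keeps the test repeatable
    below, middle, above = sample_between_edges(rng)

    # Anything below the lower edge is invalid.
    try:
        visible_count(below)
    except ValueError:
        pass
    else:
        raise AssertionError(f"expected ValueError for length {below}")

    # Between the edges, everything is shown.
    assert visible_count(middle) == middle

    # Beyond the upper edge, the display is capped.
    assert visible_count(above) == MAX_SHOWN


test_random_points()
```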
And That’s a Wrap
There’s no magic behind range testing. All it takes is discipline. But, all too often, these few basic exercises of an algorithm are ignored. Don’t skip them, or you’ll fall afoul of the two hardest things in computer science: cache invalidation, naming things… and off-by-one errors.