Functions that Lie
Usually code is the source of truth. Nothing could be more precise and exhaustive than the code itself. Code is the ultimate source of knowledge how given system or program works. But sometimes it lies to you. That’s not because it is inherently bad or it is cheating, it is because programmers taught it so.
Let’s dive straight in. Here we got a piece of code managing some kind of protected storage. Credentials are stored in a file, so code may look like this:
void MyStorage::LoadFileContents(std::string const& filename)
{
ReadCredentials();
MyFile f(filename, m_Username, m_Password);
f.LoadContents(m_FileContents);
}
bool MyStorage::SaveContents(std::string const& filename)
{
ReadCredentials();
MyFile f(filename, m_Username, m_Password);
f.SaveContents(m_FileContents);
}
Being familiar with the code, the developer can easily make a function that updates used credentials if they are changed on the remote side. One simply updates the file – and that’s it!
But here’s the problem. The file is updated with new credentials, but it seems MyStorage
client fails to re-read them. Looks like it continues to use old credentials. But we’ve clearly seen the file is being re-read. Or is it?
void ReadCredentials()
{
// we already have credentials, no need to read
if (!m_Username.empty())
return;
...
}
Oh, wow! Function name indicates that it always reads credentials from the file, but it turns out instead it only reads them for the first time, and all consecutive calls simply check if there’s anything in the cache. Of course updating the file itself would have no effect if application already read something.
See what has happened? A developer had to spend extra time investigating what was going on only because the function name was bad. I’ll put things boldly: it wasn’t just ‘bad’, it was the worst kind of bad: misleading. That’s the case of a function name lying to you.
One may argue that developers should be familiar with the code they mess up with back to front, they should study all the pieces of documentation and read every line of the code to make sure they know what exact impact their fixes will have on the product.
But I’d disagree. That’s straight up impossible in my opinion. Projects are getting larger, codebases grow way too fast. If something can be done to speed up code analysis and save a bit of precious time, that’s definitely worth doing. The whole development world moves in the direction of lessening the burden of the developers – languages help rapid application development more and more, becoming more strict and safe, helping their users not to shoot themselves in the foot. Thus, developers in my opinion should put their efforts the same way as well. If one can spare another having to read a function to understand what it does, they should do it.
Alright, you say, how would you call this method? I have a few options for you: ReadCredentialsIfEmpty
, EnsureCredentialsLoaded
, even ReadCredentialsIfNeeded
(the style I generally hate since you never know what this wacky ‘IfNeeded’ really means). All these names highlight the fact that the method does not simply read something, but does it conditionally.
Here’s another example. Suppose you have a fairly simple public function that recursively copies one directory to another place. The name’s ‘CopyDirectory’:
void CopyDirectory(std::string const& sourceDir, std::string const& destinationDir);
Okay, a developer uses it to implement some kind of backup of application settings on a USB stick. Couple of days later a QA engineer complains that the application does not work if these settings are restored from the USB stick.
Let’s take a look and see what’s going on… Ah, a file is missing. How could it be? If something goes wrong the whole operation should be reverted — and everything seems fine. Let’s look a little bit deeper:
void CopyDirectory(std::string const& sourceDir, std::string const& destinationDir)
{
...
find_first_file(...);
while (find_next_file(...))
{
if (found_is_dir)
{
CopyDirectory(...);
continue;
}
auto error = CopyFile(...);
Log << "file copied: '" << filename << "', result is: " << error;
...
}
}
Wait, what? It simply ignores all the errors and does something random. The user has no control over what is going on and will never know what they got as a result. Ignoring the errors is a very dangerous strategy that almost always fails. I’m going to cover my concerns about it in depth in my next article…
While I agree there might be cases where we would like to copy everything we can and skip everything we can’t (like trying to save data from a dying disk), that should not be default behavior. And especially such an ignorant function should never be called CopyDirectory
since it copies random stuff.
How can this be fixed? Well, it depends on what is considered a ‘fix’ here. If you’d like to leave everything as is and let the function copy what’s possible and skip what is not, rename it to something like CopyDirectoryIgnoringErrors
or make a struct like this:
struct CopyDirectoryOptions
{
std::string sourceDir;
std::string destinationDir;
bool ignoreErrors = true;
};
And then, of course, change the prototype:
void CopyDirectory(CopyDirectoryOptions const& options);
and add error handling.
And if you’d like the function to handle errors and have the same name, just add appropriate error handling (exceptions or as a return value, whatever).
One may suggest to change nothing and add a little comment in the header. Something like ‘This function ignores all errors’. While this might be appropriate in some environments, I generally think names should be descriptive. Adding a comment is like a dirty hack: the name is bad and one looks for some kind of quick fix.
Let’s add a twist to this lie. Suppose you have a method that downloads components during the installation of your product, and the code looks like this:
int DownloadAndInstall(const char* componentName, uint componentVersion);
void DownloadAndinstallComponents()
{
if (DownloadAndInstall("MyComponent1", 15) != 0))
throw(...);
if (DownloadAndInstall("NewComponent", 187) != 0))
throw(...);
....
}
And everything was fine until one day a customer complained that the build does not work at all. After a quick investigation it turned out they had MyComponent1
version 151 installed. How could this happen? The code clearly indicates it is either build 15 installed or error. Well, not quite.
int DownloadAndInstall(const char* componentName, uint componentVersion)
{
ComponentPtr component = FindComponent(componentName, componentVersion);
if (!component)
component = FindLatestComponentVersion(componentName);
if (!component)
return COMPONENT_NOT_FOUND;
return InstallComponent(component);
}
Ha. Customer service at your service. If something went wrong, let’s try to fix it. Somehow. Having no idea of what was going on. Please never do such things. Don’t code a function to do something that the client did not ask for. This can lead to completely random consequences.
As you can see there’s an ongoing trend on concealing errors, which is a big topic and I’m going to discuss it someday.
The last type of lying name is probably the least harmful: generic names. They are still names, but one can’t extract a bit of useful information from them. Suppose we’re in the same copying function, but in the main cycle we have something like this:
while (find_next_file(...))
{
Check();
if (found_is_dir)
....
}
Check? Check what? It seems there’s no way to find out except by examining the source code…
void Check()
{
if (IsCancelled())
{
throw OperationCanceledException();
}
}
Here even simple CheckCancel
would be way better. At least it describes what type of logic the function handles. You can go with anything descriptive like ThrowIfCanceled
as well.
I’m often asked why I’m so picky about names. And I always answer that names are all about communication. They communicate intentions, they communicate choices, they communicate problems and ways to solve them. Clear communication drastically improves overall performance.