Write once, copy once, refactor once.

This post is about effectively using your time. It is not a defence of taking on huge technical debt to make your life easier for a single day, and you are hopefully smart enough to avoid doing that. It’s just a simple approach to growing a system you’re working on while avoiding duplication of effort and over-engineering.

Write once.

The first time you have to do it, just get it done. You’ll be in the best position to improve your solution once it actually exists, by knowing real things about how well it solves your problem. Plus, at that point, it’s already solving your problem.

You may be tempted to figure out the best solution before writing anything, which in my experience is a huge waste of time. If you haven’t written it yet, figuring out which bells and whistles to add is time-consuming and error-prone.

Copy once.

You got it done, a week passes, and a new problem comes up that feels almost the same. Assuming you can’t just use the thing you wrote before, you can:

  1. Refactor your code to make the original useable in both spots.
  2. Copy your code to the new spot.

Most people want to refactor now, but I think you should copy. Why?

It is very hard to determine patterns from just two examples.

With two instances, statements about the group as a whole are wild guesses. A simple illustration — consider these two numbers: 1, 2, …

The next number might be 4, 3, 1, 57, or anything else. No clear pattern shows up, because two data points only have one relationship, and patterns are formed from the relationships between things. Try three numbers: 1, 2, 3, …

4 is next, and I’m sure you saw it coming. This example is admittedly a little weak, but it clearly illustrates that patterns can leap out at 3 things, while being unclear at 2. With three things, you can have up to three relationships, and similarities between multiple relationships are way easier to think about than similarities between … one.
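To make the numbers example concrete, here is a small sketch. The three candidate rules are ones I picked purely for illustration; any other rules would do. It shows how many of them survive two data points versus three.

```python
# Hypothetical candidate rules for "what comes next?", chosen only for illustration.
candidate_rules = {
    "add one": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square plus one": lambda x: x * x + 1,
}


def rules_that_fit(sequence):
    """Return the names of the rules that explain every consecutive step."""
    steps = list(zip(sequence, sequence[1:]))
    return [
        name
        for name, rule in candidate_rules.items()
        if all(rule(a) == b for a, b in steps)
    ]


print(rules_that_fit([1, 2]))     # all three rules still fit
print(rules_that_fit([1, 2, 3]))  # only "add one" is left
```

With two numbers, every rule fits and you are guessing. The third number rules out two of the three.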

Refactoring after two instances

Let’s say you chose to refactor. You put the shared code somewhere ideal for the two cases you know, and rewrite the tests.

Two weeks later, the problem shows up elsewhere. Is the code useable right away in this new spot? If you’re like me, you gave your abstraction the smallest scope necessary, and you’ll probably have to move it again. You parameterized the minimum amount required, and you may have to add new controls. That was all wasted time and effort.
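Here is roughly what that looks like in code. This is an invented scenario (invoices, receipts and refunds, with function names I made up), not anything from a real codebase. The helper was extracted after two call sites and parameterized only as far as those two needed.

```python
# Hypothetical helper, extracted after only two call sites (invoices and receipts).
# It exposes the one thing those two cases disagreed on: the label.
def format_total(amount, label):
    return f"{label}: ${amount:.2f}"


# Two weeks later a refund shows up. It needs a different currency and a minus
# sign, so the helper has to be reopened and given new controls.
def format_total_reworked(amount, label, currency="$", negative=False):
    sign = "-" if negative else ""
    return f"{label}: {sign}{currency}{amount:.2f}"


print(format_total(12.5, "Invoice total"))
print(format_total_reworked(3, "Refund", currency="€", negative=True))
```

The first extraction wasn’t wrong, it just couldn’t know what the third case would vary.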

So just copy it!

Almost no effort needed. You’ll probably have to refactor in two weeks, but you didn’t waste the time today.

Objections to copying things:

“I might forget to update a copy later!”

“Copying and pasting leads to typos!”

These are both valid points. In the first case, making synchronized changes might lead you to refactor (after all, you’ve already written once and copied once), and in the second case, the risks aren’t as big as they seem. The second case has a nastier form:

“That’s not DRY!”

Not repeating yourself is a well-known rule, and endless repetition is obviously bad, but I caution readers to ensure that they aren’t following DRY blindly. You should consider things like the following:

  • will your code be easier to read with a copy, or with a new abstraction?
  • will the abstraction that makes sense today last beyond tomorrow?
  • will a bad abstraction be harder to remove than repetition is?

As for copies and typos, most people are intelligent and careful enough to deal with a single copy well. The real horror stories occur when you make several copies (3+) of something and they’re all just a little bit different. Don’t make three or more copies: just copy once.

Refactor once.

Back to that choice from earlier: say you’d copied it instead of refactoring. Now it’s two weeks later and you hit a third instance of the problem.

As a quick aside, the “third instance” may just be a need to change your code in both copies. This is a sufficient condition to refactor. Two distant, identical pieces of code needing similar non-trivial changes is a great way to create a bug by messing up minor differences of context. After you’ve written once and copied once, the next step in this approach is always to refactor once.

With three instances, you have a good chance of finding patterns. A quick refactor here is going to yield even better code than a slow, thoughtful one would have with just two instances of the problem. Context rules.
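A minimal sketch of what the third instance buys you. The report-naming functions below are hypothetical, made up for this illustration. With three copies side by side, it’s easy to see what actually varies (the prefix and the extension) and what never does (the date format).

```python
from datetime import date


# Three hypothetical copies that accumulated over a few weeks.
def sales_report_name(day):
    return f"sales_{day.isoformat()}.csv"


def inventory_report_name(day):
    return f"inventory_{day.isoformat()}.csv"


def audit_report_name(day):
    return f"audit_{day.isoformat()}.json"


# Refactor once: parameterize only what the three copies disagree on.
def report_name(prefix, day, extension="csv"):
    return f"{prefix}_{day.isoformat()}.{extension}"


day = date(2024, 1, 31)
print(report_name("sales", day))
print(report_name("audit", day, extension="json"))
```

With only the first two copies, I would have had no reason to make the extension a parameter.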

Before we move on, what exactly do “having three instances of the problem” and “the next step is refactoring it” mean? There are two paths upon discovering you need the code a third time:

  1. Make a third copy just to see how it fits and that it works, then refactor.
  2. Jump straight to refactoring the first two, and just apply the solution.

The first choice takes longer, but is safer: your code never leaves a working state. This is a little bit like long division. In complex situations, or when you’re less experienced, it’s best to write everything down. You’ll know which feels right when you’re in this position.

The Fourth Instance

What if two weeks after your refactor step, you hit the problem yet again, and for some reason you can’t just use your past work? Why not write once, refactor twice? Or write once, copy twice, refactor once?

In my experience, the fourth-instance-that-doesn’t-fit-the-pattern is surprisingly rare. Refactoring at the third instance is usually good enough for a long time, so doing it then minimizes the odds that you’ll have to refactor the same thing twice in a short period.

As for copying twice and refactoring at the fourth instance… this gets back into that “three copies or more” zone, where it gets hard to keep minor differences between your solutions straight. Feel free to try, but I bet you’ll find it to be a serious, bug-causing hassle.

Write once, copy once, refactor once.

  1. It’s faster.
  2. It’s only slightly more error-prone, if at all.
  3. It will probably yield better code in the end.

So do what the title says.
