Most teams realize they are dealing with a legacy repo only when a tiny change turns into an operation. A one-line fix suddenly needs three people, two days of manual checks, and all the usual release shenanigans. The code may look clean. It may even be built with modern tools. That does not matter. The moment a small change requires that much ceremony, you are in the realm of legacy software. At that point, refactoring is the only realistic way to keep shipping features.

The problem, of course, is that you cannot refactor safely unless you know what must not change. That is why behavior is the real contract. Refactoring is restructuring code without changing its observable behavior. If behavior changes unintentionally, you are no longer refactoring; you are changing the system. That is exactly where tests come in. They turn that invisible contract into something you can actually verify.

[Image: "Regret refactoring after ten seconds" meme, captioned: Not sure if code works or if nobody’s touched it since 2012.]

Common Symptoms of Untested Code

Untested code has a smell. You can often sense it before you even open the files. The fear of touching anything, the endless manual testing cycles, the “it worked yesterday” conversations: all of these are early signs.

  • Frequent production incidents. When every release feels like rolling dice, you are probably missing test coverage. Teams that fear deployments are not being careful; they are flying blind.
  • Long release cycles. When each change requires hours or days of manual verification, testing debt has already piled up.
  • Developers are afraid to refactor. When even small code changes make people nervous, that is a clear sign there is no safety net.
  • Copy-paste fixes. Instead of improving shared logic, people duplicate it to avoid breaking something else.
  • Silent coupling between modules. When unrelated components break together, you are seeing dependencies nobody understands well enough.

I have dealt with some of these symptoms myself. I once inherited a networking tool from a network engineer. Everything technically worked, but there was no safe way to add anything new because I did not know what I might break. I was not deep in networking either, especially when it came to network devices. The tool ran on tribal knowledge, manual checks, and behavior nobody had written down.

That experience taught me a simple rule. Start from the outside in. Capture real inputs, real outputs, and visible side effects before touching the internals. Then refactor in small, reversible steps until the code becomes safe to change.

How to Start Refactoring Untested Code

Untested code looks terrifying at first, but it can be tamed with patience and structure. Think of it like stabilizing an old bridge while people are still crossing it. You cannot rebuild it all at once, but you can reinforce it piece by piece. The right mindset is closer to archaeology. You uncover the system layer by layer until its intent becomes visible again.

Step 1: Observe and Capture Behavior

Before touching anything, capture how the system behaves right now. Write down what it does, not what you think it should do. Until proven otherwise, that is the only truth you have. Record inputs, outputs, logs, and metrics. These become your first clues and your temporary validation. Even screenshots, console output, or log samples count at this stage: anything that preserves current behavior.

Then start writing characterization tests. These are tests that describe the current behavior of the system, even if that behavior seems wrong or messy. A characterization test does not prove the code is correct; it proves the code behaves the way it currently does. This step surfaces the hidden contracts your system relies on but never documented. Even if the current behavior is wrong, it’s still your baseline. You can always correct behavior later, but first you must trap it in a repeatable form.
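A characterization test can be as small as the sketch below. Everything in it is hypothetical: `LegacyDiscount` stands in for whatever code you inherited, and the golden values are assumed to have been copied from a first run of `observe()`, not from a specification. It uses plain Java checks instead of a test framework so that it stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical legacy code: a stand-in for the logic you inherited.
class LegacyDiscount {
    static int percentFor(int orderTotal, boolean returningCustomer) {
        int pct = 0;
        if (orderTotal > 100) pct = 10;
        if (returningCustomer) pct += 5;
        if (orderTotal > 1000) pct = 20; // quirk: overwrites the returning-customer bonus
        return pct;
    }
}

class DiscountCharacterization {
    // Runs the legacy code over a fixed set of inputs and records what comes back.
    static Map<String, Integer> observe() {
        Map<String, Integer> seen = new LinkedHashMap<>();
        int[] totals = {50, 100, 101, 1001};
        for (int total : totals) {
            for (boolean returning : new boolean[] {false, true}) {
                seen.put(total + "/" + returning, LegacyDiscount.percentFor(total, returning));
            }
        }
        return seen;
    }

    // Golden values recorded from the current behavior, right or wrong.
    static Map<String, Integer> golden() {
        Map<String, Integer> g = new LinkedHashMap<>();
        g.put("50/false", 0);
        g.put("50/true", 5);
        g.put("100/false", 0);
        g.put("100/true", 5);
        g.put("101/false", 10);
        g.put("101/true", 15);
        g.put("1001/false", 20);
        g.put("1001/true", 20); // looks like a bug; pinned anyway as the baseline
        return g;
    }

    public static void main(String[] args) {
        if (!observe().equals(golden())) {
            throw new AssertionError("behavior drifted: " + observe());
        }
        System.out.println("current behavior matches the recorded baseline");
    }
}
```

The `1001/true` entry is the interesting one. It pins a result that is probably unintended, which is the point: the test fails if a refactor changes it by accident, and you can change the golden value deliberately once the team decides what the behavior should be.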

Step 2: Refactor in Small, Safe Steps

Next, refactor in small, safe steps. Avoid large rewrites or ambitious reorganizations. A big refactor without tests is basically self-sabotage. Resist the temptation to clean everything at once. Instead, find a single file, a small method, or a function that frequently causes pain, and start there. Each change should leave the system slightly better than before. Make commits tiny and reversible.

Step 3: Identify and Exploit Seams

Focus on seams. Seams are where dependencies meet. They are natural points where you can insert control, inject mocks, or observe interactions without rewriting everything. Add logging or dependency injection to make these boundaries visible and controllable. A seam is often just a hard dependency turned into something you control.

Suppose an OrderService sends a confirmation email by instantiating the mailer directly inside the method. That makes the behavior hard to isolate and hard to test.

```java
// Before: no seam, hard dependency inside the class
public class OrderService {
    public void placeOrder(Order order) {
        // business logic to save order
        saveOrder(order);
        // Sometimes the email-sending logic is written directly here,
        // with no EmailSender abstraction at all.
        // Similar code often gets duplicated in multiple places.
        EmailSender emailSender = new SmtpEmailSender();
        emailSender.send(order.getCustomerEmail(), "Order confirmed");
    }

    private void saveOrder(Order order) {
        // persist order
    }
}
```

```java
// After: seam introduced through an interface
public interface EmailSender {
    void send(String to, String message);
}

public class SmtpEmailSender implements EmailSender {
    @Override
    public void send(String to, String message) {
        // real SMTP logic
    }
}

public class OrderService {
    private final EmailSender emailSender;

    public OrderService(EmailSender emailSender) {
        this.emailSender = emailSender;
    }

    public void placeOrder(Order order) {
        saveOrder(order);
        emailSender.send(order.getCustomerEmail(), "Order confirmed");
    }

    private void saveOrder(Order order) {
        // persist order
    }
}
```

EmailSender is the seam. In production, you pass the real SMTP implementation. In a test, you pass a fake and verify that the order flow still triggers the right side effect without sending a real email. The more seams you expose, the easier it becomes to isolate and refactor safely. When you find a stable seam, guard it with tests. Those become your checkpoints for progress.
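To make the test side concrete, here is a minimal sketch of passing a fake through the seam. `RecordingEmailSender`, the one-field `Order` class, and the example address are assumptions made for illustration; `EmailSender` and `OrderService` are repeated from the example above (with package-private visibility) only so that the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the Order type (an assumption for this sketch).
class Order {
    private final String customerEmail;

    Order(String customerEmail) {
        this.customerEmail = customerEmail;
    }

    String getCustomerEmail() {
        return customerEmail;
    }
}

interface EmailSender {
    void send(String to, String message);
}

class OrderService {
    private final EmailSender emailSender;

    OrderService(EmailSender emailSender) {
        this.emailSender = emailSender;
    }

    void placeOrder(Order order) {
        saveOrder(order);
        emailSender.send(order.getCustomerEmail(), "Order confirmed");
    }

    private void saveOrder(Order order) {
        // persistence omitted in this sketch
    }
}

// Test double: records each call instead of talking to an SMTP server.
class RecordingEmailSender implements EmailSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String to, String message) {
        sent.add(to + " | " + message);
    }
}

class OrderServiceSeamDemo {
    public static void main(String[] args) {
        RecordingEmailSender fake = new RecordingEmailSender();
        new OrderService(fake).placeOrder(new Order("ada@example.com"));

        // Exactly one confirmation, addressed to the customer on the order.
        if (fake.sent.size() != 1
                || !fake.sent.get(0).equals("ada@example.com | Order confirmed")) {
            throw new AssertionError("unexpected emails: " + fake.sent);
        }
        System.out.println("placeOrder triggered exactly one confirmation email");
    }
}
```

A hand-written fake like this keeps the check readable and avoids pulling in a mocking library before the code has any other tests. If the project already uses one, a mock of `EmailSender` would do the same job.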

Step 4: Turn Failures into Knowledge

When something breaks, learn from it. It tells you something you didn’t yet understand about the system. Every failure reveals a hidden dependency or assumption that was missing from your mental model. Add tests that capture that knowledge. This transforms every surprise into documentation. Over time, your test suite becomes a living document of how the system truly works.
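As an illustration, suppose a month-end run exposed a surprise in a due-date calculation: invoices issued on 30 and 31 January came out with the same due date. The sketch below pins that discovery. `InvoiceDueDates` and `dueDateOn` are hypothetical names invented for this example; the only real API used is `java.time`, where `Clock.fixed` freezes "now" and `LocalDate.plusMonths` clamps to the last valid day of the target month.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical legacy rule: an invoice is due one month after it is issued.
class InvoiceDueDates {
    private final Clock clock; // the hidden dependency on "now", made explicit

    InvoiceDueDates(Clock clock) {
        this.clock = clock;
    }

    LocalDate dueDate() {
        return LocalDate.now(clock).plusMonths(1);
    }
}

class DueDateRegressionDemo {
    // Helper: computes the due date as if "now" were the given UTC instant.
    static LocalDate dueDateOn(String isoInstant) {
        Clock frozen = Clock.fixed(Instant.parse(isoInstant), ZoneOffset.UTC);
        return new InvoiceDueDates(frozen).dueDate();
    }

    public static void main(String[] args) {
        // The surprise: 31 January has no "same day next month",
        // so plusMonths clamps to the last valid day of February.
        LocalDate due = dueDateOn("2024-01-31T10:00:00Z");
        if (!due.equals(LocalDate.of(2024, 2, 29))) {
            throw new AssertionError("due date changed: " + due);
        }
        // 30 January lands on the same day, which is the knowledge worth recording.
        if (!dueDateOn("2024-01-30T10:00:00Z").equals(due)) {
            throw new AssertionError("30 and 31 January no longer share a due date");
        }
        System.out.println("month-end due dates match the recorded behavior");
    }
}
```

The test does not say whether sharing a due date is right. It records that the system does it, so the next person who touches this code finds out from a failing check instead of from an incident.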

Step 5: Focus on Continuous Improvement

Finally, remember that refactoring is not about perfection. Perfection is a trap. Progress is the goal: making tomorrow’s work easier than today’s. Small, steady improvements accumulate into massive change, just as technical debt once did, only in reverse. The aim is to reach a point where changes feel safe, predictable, and reversible. Once you can make changes with confidence, the rest follows naturally.

Final Thoughts

Untested code is not only messy but also orphaned. Nobody fully owns it. That is why some ten-year-old systems are still productive, while some code written three months ago already feels untouchable. The issue is not age but lost understanding.

We can generate working code faster than we can understand it. If that code ships without tests, without clear seams, and without someone who can defend its behavior, then you have created legacy on day one. It may look modern. Operationally, it behaves like an old system. People avoid it. Every change later needs heroics.

So I do not think refactoring untested code is cleanup work. I think it is ownership recovery. Every characterization test, every dependency you isolate, every boundary you make visible gives the team back the ability to change the system on purpose. That is the real gain. Not prettier methods. Not clever abstractions. Lower fear, lower coordination cost, and better decisions.

Start there. Do not try to make the whole codebase elegant. Make one risky path understandable. Freeze one important behavior. Turn one hidden dependency into something visible. Refactoring untested code is how a team earns back its engineering judgment. Once the code is explainable again, it becomes improvable again.