How to Check That I Didn't Break Anything When Refactoring

How can I check that I didn't break anything when refactoring?

gcov will give you coverage information for your unit tests.

It's difficult to answer your question in an accurate manner without knowing more about the refactorings you plan to perform.

One piece of advice is to proceed in small iterations, rather than refactoring large parts of your code base at once and only then realizing that everything breaks.

Reference: The GNU Coverage Tool - A Brief Tutorial

How to validate that a refactoring is equivalent to the original code

In "TDD By Example" there is a specific section that talks about this. The problem is that you need unit tests in order to refactor, but complicated code is usually not testable. Thus, you want to refactor to make it testable. It is a cycle.

Therefore the best strategy is as follows:

  1. Do tiny refactoring steps. When the steps are small, it is easier for a human to make sure the overall behavior is intact.
  2. Choose only refactorings that increase testability. This is your immediate goal. Don't think about supporting future functionality (or anything fancy like that). Just think about "how can I make it possible for a unit test to test this method/class".
  3. As soon as a method/class becomes testable, write unit tests for it.

Repeating this process will gradually get you to a position where you have tests and thus you can refactor more aggressively. Usually, this process is shorter than one would expect.
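To make the idea of a "refactoring that increases testability" concrete, here is a minimal sketch in Python. The function names (greeting_before, greeting) are hypothetical and not taken from any real code base. The only change is that the hidden dependency on the system clock becomes an optional parameter, so a unit test can control it while existing callers keep working.

```python
import datetime
import unittest


# Before: the clock is read inside the function, so a unit test
# cannot control which branch is taken.
def greeting_before():
    hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


# After one tiny step: the hour becomes an optional parameter.
# Callers that pass nothing get the same behavior as before.
def greeting(hour=None):
    if hour is None:
        hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


# Now that the function is testable, write the tests right away.
class GreetingTest(unittest.TestCase):
    def test_morning(self):
        self.assertEqual(greeting(hour=9), "Good morning")

    def test_noon_counts_as_afternoon(self):
        self.assertEqual(greeting(hour=12), "Good afternoon")
```

Assuming the code is saved in a module, the tests can be run with `python -m unittest`. Once they pass, the next small step can be taken with a safety net in place.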

How do you refactor?

  1. do not refactor anything non-trivial that does not already have unit tests
  2. write unit tests, then refactor
  3. refactor small pieces and re-run the tests frequently
  4. stop refactoring when the code is DRY* clean

* DRY = Don't Repeat Yourself
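One way to follow these steps when no tests exist yet is to pin down the current behavior first and compare the refactored version against it. The sketch below is only an illustration under stated assumptions: the invoice functions and the 10% member discount are invented for the example, and prices are integer cents so that the comparison is exact.

```python
def invoice_total_old(prices_in_cents, is_member):
    # Original version: the summing loop is duplicated in both branches.
    if is_member:
        total = 0
        for price in prices_in_cents:
            total += price
        return total * 90 // 100
    else:
        total = 0
        for price in prices_in_cents:
            total += price
        return total


def invoice_total_new(prices_in_cents, is_member):
    # Refactored version: one summation, the discount is data instead of a branch.
    percent = 90 if is_member else 100
    return sum(prices_in_cents) * percent // 100


def check_equivalent():
    # Run both versions on the same inputs and require identical results.
    cases = [
        ([], False),
        ([], True),
        ([1000, 2000], False),
        ([1000, 2000], True),
        ([1999, 501, 1], True),
    ]
    for prices, is_member in cases:
        old = invoice_total_old(prices, is_member)
        new = invoice_total_new(prices, is_member)
        assert old == new, (prices, is_member, old, new)
    return len(cases)
```

Such a comparison only shows agreement on the inputs that were tried, so the cases should cover every branch of the old code before the old version is deleted.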

Test - Code - Refactor, when should we start a refactoring?

TDD is a great tool to keep you on track/on task. The problem with:

"Write failing Test" -> "Code/Refactor" -> "Write failing Test"

that you propose is that it can easily become:

"Write failing Test" -> "Refactor" -> "Code" -> "Write failing Test"

or even

"Write failing Test" -> "Refactor" -> "Refactor" -> "Refactor" -> "Code" -> "Write failing Test"

which is what you want to avoid. By refactoring at the beginning of implementation, you are indulging in speculative development and not achieving the goal of the coding session. It's easy to head off on tangents and build things you don't necessarily need. If you have the feature working and the tests passing, it's much easier to decide when it is the right time to stop refactoring. And you can stop at any time, because your tests are passing.

Additionally, you don't want to refactor when your tests aren't green.

A couple of other small points:

  1. I think most of the literature has a slightly different definition of what refactoring is. It's not "some changes to the system" or performance enhancements, but specific changes that don't change the behavior but improve the design. If you accept that definition, then performance improvements don't really qualify: they are normal development tasks that need their own acceptance tests. I usually try to frame these as end-user facing stories, where the benefit of doing them is clear. Make sense?

  2. I think you're right that the TDD practice doesn't specifically address the design problems revealed during code reviews. (See reflections and pair programming for other solutions to this.) These tend to be bigger, cross-story issues that build up as "code debt", and you have to take some time to clean them up periodically. This could be a separate project, but I, personally, always like to do this as part of another "real" story. Last time I did this, we identified that we had a problem, but ended up waiting a few weeks until we had a related story before working on it. We followed the TDD practice of implementing the new feature first, even though we knew it was way wrong. But then we really understood what was going on and why it was messy, and we spent longer than usual on the refactor phase. Worked well.

How to make refactoring less destructive?

What you describe is in fact not refactoring.

Refactoring is a disciplined effort to improve the design of the code without changing its functionality, done in small - even simplistic - steps, safeguarded by unit tests, which ensure that the system is functional after each step. Moreover, it is typically done in small increments over a longer period of time, not in one big whoosh.

This is not to be overly zealous about anything, just to clarify the terms :-) There is less chance of misunderstanding and communication problems when we understand the same words the same way.

Of course, if you have the time to do a lot of refactoring at once, all the better! But before you embark on such an effort, you absolutely need to build a good set of unit tests which cover - ideally - all the functionality in the code parts you are about to change.

Since you are talking about a "major requirement change", it is not clear whether what you call "refactoring" is actually implementing new functionality or only improving the design to prepare for the introduction of new functionality. I strongly recommend keeping the two phases separate: refactor first without changing the existing functionality, to make your design more open to extension in the right places, which then allows you to incorporate the desired functional changes more easily.

The Refactoring book linked by @Eric is fundamental; I would add Refactoring to Patterns by Josh Kerievsky, which is about the practical application of refactorings.

What to unit test after refactoring common code into separate class

If your refactored class is covered by existing tests, you are probably ok.

Looking at your options, I would also do number 4. If you did some refactoring, you probably made something more generic than it was before. In that case, you could test the generic functionality in a generic manner. So I would do 4 if your refactored solution is more generic. If it is just moving code around to be DRY, I would probably do 1.
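The numbered options refer to the original question, which is not reproduced here, so the following is only a sketch of what "testing the generic functionality in a generic manner" could look like. It assumes a hypothetical helper, normalize_name, that was extracted from two callers, and tests it directly with a table of inputs instead of going through each caller.

```python
import unittest


def normalize_name(raw):
    """Hypothetical extracted helper: collapse whitespace and title-case a name."""
    return " ".join(raw.split()).title()


class NormalizeNameTest(unittest.TestCase):
    def test_table_of_cases(self):
        # Each row is (input, expected output); subTest reports rows separately.
        cases = [
            ("  ada   lovelace ", "Ada Lovelace"),
            ("GRACE HOPPER", "Grace Hopper"),
            ("", ""),
        ]
        for raw, expected in cases:
            with self.subTest(raw=raw):
                self.assertEqual(normalize_name(raw), expected)
```

The existing tests of the calling classes stay in place and continue to cover the integration; the table-driven test covers the edge cases of the shared code once rather than once per caller.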

Should I go back and fix work when I learn something new/better?

IMO, this depends a bit on a few factors:

  1. Current project or not - Would this mean going back to a previous project to make the change, and opening a can of worms in doing so? Where I work, there is our current project, and while I may learn this and that, I'm not about to go into older projects and create new problems by trying to apply something that I thought would be simple. Another example here would be taking what is in SourceSafe and moving it to Subversion: it may be useful to have a single version control system, but it would likely not be universally seen as a wise investment of time, as most of the developers here are used to using both.

  2. Age of code - Is this code that I worked on recently or would I be spending a great deal of time understanding what I did way back when? While this is similar to the above, my current work project has been going on since June of 2008 so there are things I worked on a year or more ago that really may not be as worthwhile to go back and change.

  3. Scale of change - Is this going to take weeks or just a few hours? This is also something to consider, as little changes can sometimes lead to hours of work fixing new bugs caused by side effects of the change.

Those are a few things that would guide me. I would also try to remember that just because something seems better, it isn't necessarily something to use everywhere. For example, using a drill on a nail isn't a great idea, but if you have just learned how to use a drill and want to use it everywhere, you will run into problems where other tools like a hammer or wrench would be much more useful. Don't forget that there is a right tool for each job.

What criteria are used to determine whether to refactor a piece of code or not?

There are many different situations in which you should refactor. For example, your method is doing a lot of things. If a method is doing a lot of things, it's very difficult to test, so you need to break it down into smaller and simpler methods.

Usually you should make sure that each class is responsible for only one thing, and if it's not, then it's time for refactoring.

Also, if a method has a lot of parameters, then maybe your method is in the wrong class, or maybe it can be improved in some other way.
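As a rough illustration of the "wrong class" case, the sketch below moves a function with a long parameter list onto the class that owns the data. The Address class and both function names are hypothetical, invented only for this example.

```python
from dataclasses import dataclass


# Before: every value the function needs is passed in separately.
def shipping_label_before(name, street, city, postal_code, country):
    return f"{name}\n{street}\n{postal_code} {city}\n{country}"


# After: the values that always travel together become one object,
# and the behavior moves to the class that owns the data.
@dataclass(frozen=True)
class Address:
    name: str
    street: str
    city: str
    postal_code: str
    country: str

    def shipping_label(self):
        return f"{self.name}\n{self.street}\n{self.postal_code} {self.city}\n{self.country}"
```

Callers now pass a single Address around instead of five loose values, and the old function can stay in place as a thin wrapper until every call site has been migrated.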

If you have a lot of if-else conditions, then you should probably apply something like the state or strategy pattern to eliminate the if-else chain.
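A minimal sketch of that idea in Python follows. It uses a dictionary of functions, which is a lightweight stand-in for the strategy pattern rather than the full class-based form; the shape names and function names are assumptions made up for the example.

```python
# Before: every new shape means another branch in the same function.
def area_before(shape, a, b=0):
    if shape == "square":
        return a * a
    elif shape == "rectangle":
        return a * b
    elif shape == "triangle":
        return a * b / 2
    else:
        raise ValueError(f"unknown shape: {shape}")


# After: each branch becomes an entry in a lookup table of strategies.
AREA_STRATEGIES = {
    "square": lambda a, b: a * a,
    "rectangle": lambda a, b: a * b,
    "triangle": lambda a, b: a * b / 2,
}


def area(shape, a, b=0):
    try:
        strategy = AREA_STRATEGIES[shape]
    except KeyError:
        # Keep the same error type as the original for unknown shapes.
        raise ValueError(f"unknown shape: {shape}") from None
    return strategy(a, b)
```

Adding a shape now means adding one table entry, and each strategy can be unit tested on its own. Keeping area_before around until the tests confirm that both versions agree follows the small-steps advice given earlier.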

There are really a lot of cases where you should start refactoring, and the best approach is to first read the book Refactoring by Martin Fowler. In this book he covers a lot of situations, and I would highly recommend it.
