Refactoring Advice

Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Amazing as it may seem, I convinced my boss that right after our release (and 2 weeks of fixing the little bugs that we can fix quickly, but simply couldn't get done before the code freeze), we should spend 2 weeks refactoring our code base. So now I've got 2 weeks, 15 developers (mostly junior, many only 6 or 12 months out of school, but all very smart), 280,000 lines of code, "a full tank of gas, half a pack of cigarettes, it's dark out, and we're wearing sunglasses." :-)
What's the state of the code? What you'd expect. It's not horrible, but there are certainly some areas where it could stand to be cleaned up. It varies from "refactor a method or two" to "this subsystem should be overhauled." (We are considering some overhauls in the upcoming release, so it may make sense to clean up now in preparation for such an activity.)
The good news is, the code is quite reasonably documented (I push my developers very hard to document their code). Every class and method has a javadoc at the minimum. The bad news is, we do not have automated regression testing, just a small QA dept.

So my questions are:

  • What should be the goals of my effort?
  • How much can we accomplish? How much is too much?
  • Should we try to tackle a couple of big jobs, or just focus on the little issues, figuring management will give us time for the big issues when things break in major ways later?
  • Should people work alone, or in groups?
  • What tools do we need?
  • What process/protocol should we use during these two weeks?

I am also currently trying to develop a half-day tutorial to teach people what to do. Any suggestions? We have numerous books on Design Patterns, as well as Anti-Patterns and Martin Fowler's Refactoring, lying around the office.

This should be every developer's dream: being given some time to clean up your code, with no requirement to add new functionality. So let's hear some of those programming fantasies you've been thinking of all these years :-)

--Mark
PS I have some ideas, but I'll post those after some initial comments, since I want external input, and don't want to influence the discussion too much early on.
Carl Trusiak
Sheriff

Joined: Jun 13, 2000
Posts: 3340
I'd definitely hit the known bugs. Second, if you have a profiler, get the bottlenecks. Once these two are out of the way, my priority would be unit tests!!! It's always better to put them in at the start, but I know that's hard to do with a deadline looming (and so many inexperienced personnel). I've had positive experiences with JUnit.
You may be surprised at the bugs unit testing can flush out, and how quickly.
Tools: a good profiler like JProbe and a good unit testing package.
With 15 developers, if you have a good concurrent repository like CVS or PVCS, I'd split them: a group on known bugs, a group on profiling and a group on unit testing.
In my experience, while Management may give you time when things break in small ways, you lose a LOT in their opinion of your group!
If you kept a group with a lot of inexperienced personnel up to date on the Javadocs, your process/protocol is fine; just shift the focus (without losing this, tough I know).
Just my 2 cents
BTW congrats on bringing the project in early!!!
[This message has been edited by Carl Trusiak (edited July 16, 2001).]


I Hope This Helps
Carl Trusiak, SCJP2, SCWCD
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Some random thoughts:
- Don't expect to get too much done, especially if most of the developers don't have any experience with refactoring.
- Don't do big refactorings! Exercise tiny ones first; if you begin to feel comfortable, get to the big ugly parts of the code and apply little tiny refactoring steps on it. You don't have to make it brilliant through the first refactoring; just make it get better and better from hour to hour...
- Always work in pairs! You don't want to break anything, and you want to get a design everybody understands, so a second pair of eyes can help a lot! Also, you will learn a lot from each other... Ah, and switch pairs often. (http://c2.com/cgi/wiki?PairProgramming)
- Try to write unit tests for the parts you want to refactor. You should write them anyway, so you can do it now, as you will need them. Of course, sometimes you will have to refactor before you can write the test, but trying to write the test beforehand will at least tell you in which direction your refactoring should lead...
- Tools you will need: a unit testing framework (JUnit, for example); a version control tool (CVS or something) - integrate often! A good refactoring browser is essential; the best I know of is Instantiations' JFactor (it's absolutely worth its money).
- Communicate, communicate, communicate! Work in an open workspace (http://c2.com/cgi/wiki?OpenWorkspace), do stand up meetings (http://c2.com/cgi/wiki?StandUpMeeting). And don't forget to communicate!
- Learn from the experience and apply what you have learned later on. For example, you might want to continue to refactor as you implement new functionality after the two weeks...
Hope this helps...
[This message has been edited by Ilja Preuss (edited July 17, 2001).]


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Early? What made you think we're finishing it early? We are so far off our original schedule it's not even funny. Our deadline was finally set as end of Q2 (although we're even a few days over that) simply because the company couldn't wait any longer.
--Mark
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
OK, good suggestions. I'm still going to wait for more feedback before I go into my plans, but I want to follow up on some points you guys already raised.
Currently we have a CVS tree. We also use JBuilder and Emacs for our development. We do have some copies of OptimizeIt, which we can use for profiling.
Ilja, those sound like good suggestions.
I've looked into JUnit. Here's the problem I see using it here. We have a fairly complex framework. I think it's reasonably well done, not too tangled, but still complex. Needless to say it is not easy to create a single object in isolation, because that object requires another object, and so forth. (Note, one object doesn't directly require, say, another 5, but it may "chain" through to requiring another 5.) Bottom line, getting started with some basic tests requires a lot of setup and overhead. I suspect, more effort than the two weeks we are allotted.
I'm wondering if it's possible/feasible to put some "unit testing" type code into our code base, i.e. methods at the end of the class, which are normally commented out when not being used.
I've been skeptical of pair programming, but think this may be a case where it would work well, so I'll give it a try.

--Mark
Guy Allard
Ranch Hand

Joined: Nov 24, 2000
Posts: 776
Mark - I am an ancient programmer. Wrote my first code (FORTRAN) in '68.
Have never done pair programming.
But, I think the concept is an excellent one.
Only today's technologies even allow you to think about it. I don't think you can do it at a keypunch or TTY.
For a cost/benefit analysis of the concept see:
http://members.aol.com/humansandt/papers/pairprogrammingcostbene/pairprogrammingcostbene.htm
Best of luck.


------------------
Guy Allard
SCJP2, IBM-XML
Daniel Dunleavy
Ranch Hand

Joined: Mar 13, 2001
Posts: 276
Mark,
In my opinion, tackle any known bugs and do some testing to find others and clean them up.
Work on getting the big ugly parts under control. The payoff will come during production support. You may not have the time at that point to figure out what's going on in all that code.
The smaller pieces can get cleaned up along the way, since it's easier to determine their purpose and they are usually easier to clean up.
Personally I don't think pair programming is good practice. I think it started as a result of everyone coding directly into an IDE and immediately starting to compile and test. I think it's always best to start out with some kind of plan, for the system and for a program. It saves time in the long run. Coding and testing over the last 15 years has really gone downhill. It's as if all the coding is now done "on the fly" and maybe we'll test it if we get the chance. Pair programming does cut down on the errors, but it would cost you less to have everyone step back and map out what is to be done before coding. I do think it's a good idea to talk with others about your ideas to perhaps get a better one. If you have a really heavy section of code, getting someone else's opinion is also warranted.
MHO
Dan
[This message has been edited by Daniel Dunleavy (edited July 17, 2001).]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Mark Herschberg:
We also use JBuilder and Emacs for our development.

Then you might want to take a look at JRefactory: http://jrefactory.sourceforge.net/csrefactory.html
Originally posted by Mark Herschberg:
I've looked into JUnit. Here's the problem I see using it here. We have a fairly complex framework. I think it's reasonably well done, not too tangled, but still complex. Needless to say it is not easy to create a single object in isolation, because that object requires another object, and so forth. (Note, one object doesn't directly require, say, another 5, but it may "chain" through to requiring another 5.) Bottom line, getting started with some basic tests requires a lot of setup and overhead.

Seems to me as if mock objects might help. See http://www.junit.org/articles.htm#MockObjects and http://tammofreese.de/easymock/index.html
Originally posted by Mark Herschberg:
I'm wondering if it's possible/feasible to put some "unit testing" type code into our code base, i.e. methods at the end of the class, which are normally commented out when not being used.

Sometimes it's just the simplest thing you can do to get your tests running. If I had to write them, I would encapsulate these methods in an inner class. And I would think of them as a hefty code smell and always keep an eye open for ways to get rid of them...
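
To make the inner-class suggestion concrete, here is a minimal sketch. The class `PriceCalculator` and its nested `SelfTest` are hypothetical names invented for illustration; nothing like them appears in the thread. The idea is only that the test code lives in a static nested class instead of in commented-out methods, so it always compiles against the current code while staying apart from the production logic.

```java
// Hypothetical production class; the names are illustrative only.
class PriceCalculator {
    private final int taxPercent;

    PriceCalculator(int taxPercent) {
        this.taxPercent = taxPercent;
    }

    // Returns the gross price in cents; integer division rounds the tax down.
    int gross(int netCents) {
        return netCents + (netCents * taxPercent) / 100;
    }

    // Test code kept in a static nested class rather than in commented-out methods.
    // It compiles to its own class file (PriceCalculator$SelfTest.class).
    static class SelfTest {
        // Returns true when every check passes.
        static boolean run() {
            PriceCalculator calc = new PriceCalculator(20);
            return calc.gross(1000) == 1200
                && calc.gross(0) == 0
                && calc.gross(99) == 118; // 99 * 20 / 100 = 19 with integer division
        }

        public static void main(String[] args) {
            System.out.println(run() ? "all checks passed" : "CHECK FAILED");
        }
    }
}
```

Because the nested class ends up in a separate class file, it can probably be left out of a shipped build as long as no production code refers to it, which may matter where footprint is a concern. It remains the code smell described above, so treat it as a stopgap.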
Originally posted by Mark Herschberg:
I've been skeptical of pair programming, but think this may be a case where it would work well, so I'll give it a try.

I have been skeptical, too. Now I am addicted... :-)
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Dan,
I think you underestimate the value of pair programming. I don't know anybody advocating pair programming as a replacement for planning or testing.
Pair programming is a strategy to

  • increase code quality (you just can't plan this in detail)
  • increase discipline (you are less likely to get sloppy about your code or unit testing)
  • increase knowledge of the team (which gets spread as you switch pairs often)
  • increase morale (as pair programming is fun!)
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Originally posted by Ilja Preuss:
Then you might want to take a look at JRefactory: http://jrefactory.sourceforge.net/csrefactory.html

I'm looking at that as well as Xrefactory: http://www.xref-tech.com/xrefactory/main.html
The latter is only $20 per user!


I read the Mock Objects paper. This idea seems obvious. I thought of that, too: that I would create these dummy fixtures I could use in my tests. However, it does not, in and of itself, solve my problem. The interface to the Mock Object may require passing MyClass1. The constructor of MyClass1 may need MyClass2 and MyClass3. Now I need to create 3 Mock Objects for a single unit test! Given that Mock Objects aren't simply isolated stub classes, I would effectively need to maintain a shadow, mock framework, so that any test I could need to do could be run against the mock framework (minus the part being tested). My maintenance work has doubled, since now I must maintain two frameworks. Also, again given that these Mock Objects aren't isolated stub classes, I am likely to make errors while creating them, making my unit tests inaccurate.
The only escape from this morass is the ability to dynamically generate classes. It seems possible, in theory, to use inspection to dynamically generate the necessary Mock Objects, and give them default values from some property file. This would free the developers from having to explicitly maintain two sets of code. Rather, the mock framework gets dynamically regenerated from the current codebase. The developer would need only be sure all new fields and method return types have appropriate default values set in the property file.
EasyMock claims to do something like this, but I wasn't able to follow how "easy" it was.
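
One way the "dynamically generated" idea could look, as a rough sketch only: the JDK's `java.lang.reflect.Proxy` can build an implementation of an interface at run time. This is not EasyMock's API and not anything the posters describe using. The `Account` interface and the `DefaultValueMock` helper are hypothetical, the defaults are hard-coded rather than read from a property file, the sketch uses generics and so assumes a current JDK, and it only works for interfaces, not for concrete classes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical collaborator interface; not from the thread.
interface Account {
    String owner();
    int balance();
    boolean isActive();
}

// Builds throwaway stand-ins for an interface at run time.
class DefaultValueMock {
    // Returns an object implementing 'type' whose methods all return a default value.
    static <T> T of(Class<T> type) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                return defaultFor(method.getReturnType());
            }
        };
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler);
        return type.cast(proxy);
    }

    // Primitive return types need a boxed value of the matching type.
    private static Object defaultFor(Class<?> returnType) {
        if (returnType == boolean.class) return Boolean.FALSE;
        if (returnType == char.class) return Character.valueOf('\0');
        if (returnType == byte.class) return Byte.valueOf((byte) 0);
        if (returnType == short.class) return Short.valueOf((short) 0);
        if (returnType == int.class) return Integer.valueOf(0);
        if (returnType == long.class) return Long.valueOf(0L);
        if (returnType == float.class) return Float.valueOf(0f);
        if (returnType == double.class) return Double.valueOf(0d);
        if (returnType == String.class) return "";
        return null; // void and every other object type
    }
}
```

With this, `DefaultValueMock.of(Account.class)` yields an `Account` without anyone writing or maintaining a mock class by hand. The price is that the code under test has to depend on interfaces, which is the trade-off debated in the rest of this thread.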

--Mark
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4446
    

Hi Mark,
The fact that you don't have automated tests is bad and will make refactoring more difficult. I would probably try to focus on refactoring the parts of the application that I think will be involved most in future iterations. Try to do simple refactorings that will be easy to test. The refactorings that have to do with eliminating duplicates are usually pretty straightforward and easy to understand so you might want to do those first. At the same time, you can also start writing small and simple unit tests.
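
As an illustration of the kind of small duplicate-removing refactoring meant here, consider the sketch below. The classes `LabelsBefore` and `LabelsAfter` are made up for this example. The "after" version applies Extract Method to the repeated null handling and trimming, and the behavior is meant to stay exactly the same.

```java
// Hypothetical example; class and method names are not from the thread.

// Before: both methods repeat the same null-handling and trimming logic.
class LabelsBefore {
    static String nameLabel(String name) {
        String cleaned = (name == null) ? "" : name.trim();
        return "Name: " + cleaned;
    }

    static String cityLabel(String city) {
        String cleaned = (city == null) ? "" : city.trim();
        return "City: " + cleaned;
    }
}

// After: the duplicated lines are extracted into one helper (Extract Method).
class LabelsAfter {
    static String nameLabel(String name) {
        return "Name: " + clean(name);
    }

    static String cityLabel(String city) {
        return "City: " + clean(city);
    }

    private static String clean(String value) {
        return (value == null) ? "" : value.trim();
    }
}
```

A change this small is easy to check even without a test framework: feed both versions the same handful of inputs (including `null`) and compare the results.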

Junilu - [How to Ask Questions] [How to Answer Questions]
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
You may feel that you are in a catch-22 about the likes of JUnit, even using mock objects, but the combination of refactoring and unit testing is so powerful, it can even pull you out of a hole like this.
The trick is to follow these simple steps:

  1. Create an interface which offers the same public methods as one of these "difficult classes".
  2. Edit the "difficult class" so it implements the interface.
  3. Refactor the other classes which need it to use the interface instead of the class.
  4. Write a mock object class which also implements the interface created in step 1.

Now, if any of the classes you have just refactored have no more dependencies on "difficult classes", you can write unit tests for them. Then, secure in your new found unit-test "safety net", you can go back to step 1 for another class.
Soon you will have worked your way up to the top of the hierarchy, and tested all of your "difficult" classes.
Does this make sense?
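
The four steps above can be sketched roughly as follows. Every name here (`RateSource`, `DatabaseRateSource`, `Converter`, `MockRateSource`) is hypothetical and chosen only to illustrate the steps; the real "difficult class" would of course have its own methods.

```java
// All names below are hypothetical; they only illustrate the four steps.

// Step 1: an interface with the same public methods as the "difficult class".
interface RateSource {
    int rateFor(String currency);
}

// Step 2: the existing class now implements the interface.
// Imagine that constructing it for real drags in a chain of framework objects.
class DatabaseRateSource implements RateSource {
    public int rateFor(String currency) {
        throw new UnsupportedOperationException("needs the real framework");
    }
}

// Step 3: the client depends on the interface, not on the concrete class.
class Converter {
    private final RateSource rates;

    Converter(RateSource rates) {
        this.rates = rates;
    }

    int convert(int amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Step 4: a mock that implements the same interface with a canned answer.
class MockRateSource implements RateSource {
    public int rateFor(String currency) {
        return 3;
    }
}
```

A test can now build `new Converter(new MockRateSource())` and check that `convert(10, "EUR")` returns 30, without touching `DatabaseRateSource` or anything it depends on.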


Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
Daniel Dunleavy
Ranch Hand

Joined: Mar 13, 2001
Posts: 276
Ilja
Actually in the "old days" we would create specs which you could code directly from. That was my initial point: these interim steps are no longer done, which has impacted the quality of software.
I have always believed in cross-training, and for that purpose it would be fine (as well as the previous points). But do you advocate that everything which is coded needs to be coded by two people?
Dan
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
Daniel,
I think the idea behind pair programming (from what I've read about it in books and articles) is not that it's a replacement for specs, but rather that it prevents coding errors. That is, doubling your "initial" production cost is, in the long run, cheaper, because towards the end of the project you don't have to spend as much time either

  • Looking for small bugs (oops, I meant !(foo), NOT (foo))
  • Having someone double-check your design. Yes, specs can be clear. Even the structure of the code can be well documented, but it's rare to see all classes specced out 100% such that nothing is left to be thought of. Often, on the fly, developers create classes, and these can be mis-designed. Pair programming helps prevent that (in theory).

However, I do agree with your IDE point. I first coded professionally doing FORTRAN as a batch job on a Cray. I was more careful between compiles, which today are free. (Sidenote: our Make process actually takes about 2 minutes now, and my co-workers complain; I tried pointing out how a slow process is good, but no one got it.) But I don't think we can easily reverse that trend. (Unless we all intentionally slow down our build processes. :-)
While I think the theory is great, I have concerns about pair programming in general, because I think personality conflicts can arise. But for short periods (like this two week project we're doing), it's worth a try.

    --Mark
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Daniel Dunleavy:
Actually in the "old days" we would create specs which you could code directly from. That was my initial point: these interim steps are no longer done, which has impacted the quality of software.

Of course it's a bad idea to drop these specs without any form of substitute. Nevertheless it doesn't mean that it couldn't be worthwhile to try to find a substitute.
Take a look at Martin Fowler's article "Is Design Dead?":
http://martinfowler.com/articles/designDead.html
Originally posted by Daniel Dunleavy:
I have always believed in cross-training, and for that purpose it would be fine (as well as the previous points). But do you advocate that everything which is coded needs to be coded by two people?

Well, I think that if you strived for as much pair programming as you could accomplish, you would possibly experience a significant increase in both code quality and productivity.
Alain Ravet
Greenhorn

Joined: Jul 20, 2001
Posts: 1
Mark,

  • What tools do we need?


Just my 2 euro cents: forget JBuilder for refactoring, jump on IDEA (http://www.intellij.com/idea/).
This tool, together with reading the book, can really make a difference.

Alain Ravet
Brussels, Belgium
Steve Freeman
Greenhorn

Joined: Jul 20, 2001
Posts: 1

Originally posted by Mark Herschberg:
I read the Mock Objects paper. This idea seems obvious. I thought of that, too: that I would create these dummy fixtures I could use in my tests. However, it does not, in and of itself, solve my problem. The interface to the Mock Object may require passing MyClass1. The constructor of MyClass1 may need MyClass2 and MyClass3. Now I need to create 3 Mock Objects for a single unit test! Given that Mock Objects aren't simply isolated stub classes, I would effectively need to maintain a shadow, mock framework, so that any test I could need to do could be run against the mock framework (minus the part being tested). My maintenance work has doubled, since now I must maintain two frameworks. Also, again given that these Mock Objects aren't isolated stub classes, I am likely to make errors while creating them, making my unit tests inaccurate.

Just as it's hard to retrofit unit tests, it's hard to retrofit mock objects. The most useful effects come from using the technique to drive your coding style. I've found difficulty in mocking up to be a useful code smell. Let the technique drive you towards smaller methods and components, and towards passing behaviour around rather than data.
Some points:

  • Keep the mocks really, really dumb. Set any values, rather than trying to construct them.
  • You'll need to refactor your tests to avoid code bloat. Where are you going to put all that stuff? In helper classes for the tests? Hmmm, maybe you could move them into the stubs to which they refer...
  • What's the advantage of setting values in a properties file rather than in test code? The disadvantage of a properties file is that you've broken your test fixture into multiple places.
  • The consistent pattern for unit tests is very effective. I've known people release production code in their first afternoon, because it was so obvious what they were supposed to do.

Our experience is that the technique is obvious to describe, but that it takes a little while until it makes sense in practice. The critical part is deciding what to verify in a test; the rest usually follows from that.
The EasyMock package sounds promising (we're all stuck with 1.2, so we can't use it).
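
A rough sketch of what a "really dumb" mock might look like, under stated assumptions: `Mailer`, `MockMailer` and `Notifier` are invented names, and this is not the API of the mock objects library mentioned in this thread. The test sets the canned return value and the expectation directly in test code, and the mock does nothing except record calls and compare counts.

```java
// Hypothetical names throughout; a sketch, not any library's API.

interface Mailer {
    boolean send(String to, String body);
}

// A deliberately dumb mock: the test sets the canned result and the expectation;
// the mock only records calls and compares.
class MockMailer implements Mailer {
    private boolean resultToReturn = true;
    private int expectedCalls = 0;
    private int actualCalls = 0;
    private String lastRecipient;

    void setResult(boolean result) { this.resultToReturn = result; }
    void setExpectedCalls(int calls) { this.expectedCalls = calls; }

    public boolean send(String to, String body) {
        actualCalls++;
        lastRecipient = to;
        return resultToReturn;
    }

    String lastRecipient() { return lastRecipient; }

    // The test decides what to verify; here, only the number of calls.
    void verify() {
        if (actualCalls != expectedCalls) {
            throw new AssertionError(
                "expected " + expectedCalls + " calls but got " + actualCalls);
        }
    }
}

// Code under test: notifies a user through whatever Mailer it is given.
class Notifier {
    private final Mailer mailer;

    Notifier(Mailer mailer) { this.mailer = mailer; }

    boolean welcome(String user) {
        return mailer.send(user, "Welcome, " + user + "!");
    }
}
```

A test would create a `MockMailer`, call `setExpectedCalls(1)`, run `new Notifier(mock).welcome("ann")`, and finish with `mock.verify()`. Everything the test relies on is visible in the test itself, which is the argument made above against splitting the fixture into a properties file.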
    --Steve
Daniel Dunleavy
Ranch Hand

Joined: Mar 13, 2001
Posts: 276
Mark and Ilja,
Thanks for your input.
In my experience I have noticed I was more willing and open about what I was doing. I would seek others' help/input where I felt necessary. But other people I have met were (I think) afraid to show that they might have a chink in their armor. I guess pair programming would be good for these people since it would force them to let down their guard.
Dan
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
I posted a link to this thread in the extreme-programming mailing list and got the following reply:
I'm not inclined to register so I can participate directly, but I do
have a comment (feel free to post it there if you like).
The OP talks about needing to manage a whole parallel framework if he
does MockObjects... his example is where Class1's constructor needs
Class2, which needs Class3 and Class4, etc. I want to point out that
the "fix" for this is to have the Mock _NOT_ depend on Class2 etc.
Extract the public interface (not the constructors) of Class1 into
Interface1, then except where it's constructed, use and refer to it by
the interface. If the constructor is called inside the code under
test, you'll need to either move it out of there, or use a factory.
If the constructor is called outside the code under test (e.g., the
object is passed in), then your test code can just construct and pass
in the mock instead of the real object.
One of the huge benefits of Mock objects is that they can cut the
dependency chain short... you only need to mock the objects that the
code under test calls _directly_.
-Matthew
azami@speakeasy.net
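
The factory option in the reply above could look roughly like this. All names (`Connection`, `ConnectionFactory`, `Loader` and the two mocks) are hypothetical and exist only for illustration. The point of the sketch is that the code under test never calls a concrete constructor itself, so the test only has to mock the one object that code calls directly.

```java
// Hypothetical names; a sketch of the factory option described above.

interface Connection {
    String fetch(String key);
}

// The factory hides the constructor call, so the code under test
// never has to say "new RealConnection(...)" itself.
interface ConnectionFactory {
    Connection open();
}

class Loader {
    private final ConnectionFactory factory;

    Loader(ConnectionFactory factory) { this.factory = factory; }

    // Code under test: it obtains its collaborator through the factory.
    String loadTitle() {
        Connection c = factory.open();
        return "[" + c.fetch("title") + "]";
    }
}

// Test doubles: only the object that Loader calls directly is mocked.
class MockConnection implements Connection {
    public String fetch(String key) { return "mock-" + key; }
}

class MockConnectionFactory implements ConnectionFactory {
    public Connection open() { return new MockConnection(); }
}
```

Whatever a real `Connection` implementation needs in its constructor never appears in the test, which is how the dependency chain gets cut short. The cost is one extra interface per factory.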
Mark Herschberg
Sheriff

Joined: Dec 04, 2000
Posts: 6037
azami@speakeasy.net wrote:
his example is where Class1's constructor needs Class2, which needs Class3 and Class4, etc. I want to point out that the "fix" for this is to have the Mock _NOT_ depend on Class2 etc.
Extract the public interface (not the constructors) of Class1 into Interface1, then except where it's constructed, use and refer to it by the interface. If the constructor is called inside the code under test, you'll need to either move it out of there, or use a factory. If the constructor is called outside the code under test (e.g., the object is passed in), then your test code can just construct and pass in the mock instead of the real object.

But the cost is not always bearable. I may not want interfaces for everything. I may not want factories for my constructors; and I certainly won't move around where I call my constructors for the sake of testing, since that defeats the purpose of external testing code.
In my particular case, our code runs on mobile devices. They have small memory and low processing power. I don't want the extra memory footprint of more interfaces. I don't want the extra runtime cost of factories.
Even if I were running on a PC, and not a mobile device, I would still question the cost of maintaining parallel class and interface hierarchies.
--Mark
 