Wednesday, January 26, 2005

Query Your Code

Lately, I've been doing a handful of code reviews for clients. I have access to all sorts of sophisticated tools (like OptimizeIt and TogetherJ), but I find that one of the tools I use the most is the *nix "find" utility combined with regular expressions. Sometimes, simplest is best.

If you had to analyze some data that lived in a relational database, would you look at all the records one at a time? Of course not -- you would execute a query that looks for troublesome records. I've been using the same technique against a body of source code. Curious about who is firing constructors on your boundary classes?

find . -name "*.java" -exec grep -H -n "new .*Boundary.*(" {} \;

[Find all files named "*.java" from the current directory down and send each of these files into the grep command, which looks for the regular expression for constructor calls to any class with "Boundary" in the name.]

Or what about this:

find . -name "*.java" -not -regex ".*Db\.java" -exec grep -H -n "new .*Db" {} \;

[Find all non-Db Java files that fire constructors on Db classes. In other words, determine the coupling between the boundary classes and all other classes.]

This technique depends on consistent naming patterns for your files. If you can rely on this, you can query your code base for analysis.
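Once the naming pattern holds, the same kind of query can be ranked, much like adding an ORDER BY to a SQL statement. The following is only a sketch under a few assumptions: the sample tree and its class names (CustomerDb, OrderService, Report, Util) are hypothetical and exist only so the script has something to search, and file paths are assumed to contain no colons. It counts the lines in each non-Db file that construct a Db class and lists the most tightly coupled files first.

```shell
#!/bin/sh
# Hypothetical sample tree, created only so the query has something to search.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf '%s\n' 'class CustomerDb {' '  CustomerDb copy() { return new CustomerDb(); }' '}' > "$demo/src/CustomerDb.java"
printf '%s\n' 'class OrderService {' '  Object a = new CustomerDb();' '  Object b = new OrderDb();' '}' > "$demo/src/OrderService.java"
printf '%s\n' 'class Report {' '  Object c = new CustomerDb();' '}' > "$demo/src/Report.java"
printf '%s\n' 'class Util {' '  Object d = new Object();' '}' > "$demo/src/Util.java"

# grep -H -c prints "path:count", where count is the number of matching lines.
# Drop files with a count of 0, then sort numerically on the count, highest first.
coupling=$(find "$demo" -name "*.java" -not -regex ".*Db\.java" \
    -exec grep -H -c "new .*Db" {} + |
  grep -v ":0$" |
  sort -t: -k2,2 -nr)

printf '%s\n' "$coupling"

# Remove the sample tree.
rm -rf "$demo"
```

Against a real code base, only the find pipeline is needed, with "$demo" replaced by the root of the source tree. Note that grep -c counts matching lines, not individual constructor calls, so two calls on one line count once.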

One of my common themes is to avoid drowning in impressive but unsuitable tools. If you go rabbit hunting, you can use a shotgun, a bazooka, or an atomic bomb. One of these tools is better than the others. Similarly, for some tasks a simple but effective command-line tool beats the expensive ones.

Treat your code as data and apply the same techniques you use for data analysis to your code base.
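To push the data analogy one step further, a rough equivalent of a SQL GROUP BY can be built from grep, sort, and uniq. This is a sketch, not a definitive recipe: the sample files and class names are hypothetical, and it relies on the -o and -h options of grep, which GNU and BSD grep provide but POSIX does not require. It reports how often each Db class is constructed across the code base.

```shell
#!/bin/sh
# Hypothetical sample files, created only so the query has something to search.
demo=$(mktemp -d)
printf '%s\n' 'class OrderService {' '  Object a = new CustomerDb();' '  Object b = new OrderDb();' '}' > "$demo/OrderService.java"
printf '%s\n' 'class Report {' '  Object c = new CustomerDb();' '}' > "$demo/Report.java"

# -o prints only the matched text, -h suppresses file names.
# sort | uniq -c groups identical matches and counts them;
# the final sort puts the most frequently constructed class first.
usage=$(find "$demo" -name "*.java" \
    -exec grep -h -o "new [A-Za-z0-9_]*Db" {} + |
  sort | uniq -c | sort -nr)

printf '%s\n' "$usage"

# Remove the sample files.
rm -rf "$demo"
```

The narrower pattern (identifier characters only between "new " and "Db") is a deliberate choice here: with -o, the looser "new .*Db" from the queries above could swallow everything between two constructor calls on the same line.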
