DRY your code: 2011

2011-12-16

It is a destiny

With every project that I take over from previous developers I have to answer the same question: "Was the developer who wrote this brain-damaged, just sloppy, or does this horrifying spaghetti mess reflect some business logic that is actually needed?"

The worst part is that there is never an easy answer to this. So it always ends up in tedious ongoing refactoring and meticulous analysis.

2011-09-28

Parentheses in Lisp: let's count them.

Parentheses are the first thing any newcomer notices about a Lisp program. They put off many people, who decide not to learn the language just by looking at the syntax. However, if they think that there are too many parentheses, do they actually count them? Let's look at a fragment of Java code:

public static void closeSocket(final Socket socket) {
        try {
            socket.close();
        } catch (IOException e) {
            LOG.error("Error closing socket " + socket, e);
        }
}

Total: 14 braces and parentheses.

Now, below is what equivalent Lisp code could look like (let it be Clojure, as an example):

(defn closeSocket [^Socket socket]
  (try
    (.close socket)
    (catch IOException e
      (LOG/error (str "Error closing socket " socket) e))))

Total: 14 brackets and parentheses.

Now I would say that those who reject a language judging by its syntax do not even look at the syntax closely.

2011-09-22

More on concise code

"How many lines of code have you deleted in one day?"
My answer is "~2150 lines of perfectly working code, and not a single feature broke".

And one more essay by Michael Feathers: The Carrying-Cost of Code: Taking Lean Seriously

2011-09-16

Why programming language design is hard

Reusable code is harder to write because it is used in more than one context. In particular, this is the case with unit-testable code, because you have to write it for use both in tests and in production. But a programming language is probably the extreme case: sooner or later every feature you have will be used in every possible combination.

Quote from Neal Gafter

There's more to my previous post. The quote is actually not about unnecessary code but about unnecessary features, but it is still pertinent (http://www.infoq.com/articles/neal-gafter-on-java):

And when you add something that someone doesn't benefit from, it's actually a negative for them. Even though they don't have to use it, or look at it, or care about it, it makes the system more complicated for them.

2011-09-04

Concise code

This subject has been bothering me for a long time. So I decided to collect my arguments in a single place, here.
I insist that good code is concise code (though not necessarily the other way around, to be precise). The objections to this that I have heard most often so far are:
1. Code is read more times than it is written. Denser code is harder to read.
2. Code is for human beings, and that is why it should be clear. Denser code is harder to read.
3. Copy-paste is not bad, since requirements may change in the future and then the copied parts would naturally diverge. Extra code in the project is just extra code: it sits there and costs nothing.

And all of them ARE WRONG.
Yes, code is read more often than it is written. And that is why short code matters. There are numerous observations that the human brain can manipulate only a limited number of symbols at a time. If you present the whole program in a way that the brain can grasp at once, you get a bonus: you understand what is going on. This is instead of working as a meat-based decompiler: trying to understand pieces of the system, then trying to put the pieces together, then trying to understand how the system works as a whole. By writing a program that fits in your brain you turn the computer into a friend that has an interface compatible with you. It is surprising how many software systems there are that not even one developer understands. Developers working on them just get used to navigating symbol browsers in their IDE without even trying to understand the parts of the code that are not directly related to the task at hand.
Yes, code is read more often than it is written. To an extent. New code almost always needs refactoring as requirements and more appropriate designs become apparent. The less code there is, the better the chances that rewriting it will be feasible, and the better the chances that the design will be sound in the future. In that very future a rewrite is often impossible. So keeping code compact is one of the ways to better software. I have seen developers who were reluctant to rewrite code that was first written yesterday, even though everyone understood that the rewrite would significantly decrease complexity. You see? Even yesterday's code might already be too old for a rewrite.
Every medium or large project I participated in contained many hundreds or even thousands of lines of unnecessary code. Worse, its redundancy is often not apparent, since only some parts of it have textual similarity. You need time to understand that this large chunk of code is completely unnecessary.
And anyone who digs into this code will have to expend effort to understand these parts of it. One has to understand code before changing it, right? Instead of immediately seeing that this part and that part of the system are essentially the same thing with a minor tweak in the middle that could be passed as a parameter, I have to work as an overpaid diff program that scans two large chunks of code for that single dot that is in a different place. And after finally finding it I am still wondering whether it is a bug introduced by sloppy maintenance or a deliberate change in functionality. This is part of the increased support cost due to unnecessary code. Unnecessary code is not just sitting there on a hard drive bothering no one. Instead it constantly sucks the team's time and energy to support it.
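
To make the "minor tweak passed as a parameter" point concrete, here is a minimal sketch in Java. Everything in it is hypothetical and invented for illustration (the PriceReport class, the method names, the 20% tax, amounts kept in cents): two copy-pasted methods differ in a single expression, and the merged version takes that difference as an explicit parameter.

```java
import java.util.List;
import java.util.function.LongUnaryOperator;

// Hypothetical example: all names and numbers are invented for illustration.
final class PriceReport {

    // Before: two copy-pasted methods. A reader has to diff them by eye
    // to find the one expression that differs.
    static long netTotalCopy(List<Long> cents) {
        long total = 0;
        for (long c : cents) {
            if (c < 0) continue;   // skip refunds
            total += c;
        }
        return total;
    }

    static long grossTotalCopy(List<Long> cents) {
        long total = 0;
        for (long c : cents) {
            if (c < 0) continue;   // skip refunds
            total += c + c / 5;    // the "single dot": adds 20% tax
        }
        return total;
    }

    // After: one method. The tweak is visible in the signature.
    static long total(List<Long> cents, LongUnaryOperator adjust) {
        long total = 0;
        for (long c : cents) {
            if (c < 0) continue;   // skip refunds
            total += adjust.applyAsLong(c);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> cents = List.of(1000L, -300L, 2000L);
        System.out.println(total(cents, c -> c));          // net total
        System.out.println(total(cents, c -> c + c / 5));  // gross total
    }
}
```

With the merged version, whether the difference between the two totals is deliberate is no longer a question: it is stated at the call site.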

They say that when requirements change those copy-pasted pieces can be changed separately. The problem is that you never know how those requirements will change. And there is a big chance, say at least 50%, that requirements will change in a way that requires changing both of those pieces. And the larger the project, the higher the chances that the required changes will be cross-cutting. You ain't gonna need it.

Yes, in less dense code it is easier to understand one isolated line of code. But is there a point in understanding the line "i++;" or "}"? Is there a point in understanding separate lines of code if you still do not understand what is going on? You need to understand what the program is doing, if not as a whole then at least a substantial piece of it. Yes, in denser code you probably need several times more mental effort to understand a line, but is that a problem when you need proportionally fewer lines of code to do the same thing? And remember that you will have better chances to understand and maintain the code.

I have one more observation, related to good design. To write concise code you have to understand deeply what your program should do. And this is the only way to do it. Often people who unconsciously equate bloated code with more approachable code imagine some big system written as a 1-megabyte regexp as an example of brief code. No, you cannot do that. At most you could cram what you have into a few percent fewer lines of code, indeed impairing readability. To have better code one should understand it.

I have been working with Java for several years. And there is one thing related to it. Some tools (programming languages) just do not have adequate means for structuring code. Often, trying to implement in Java approaches that reduce code significantly in other languages leads to more bloated code. Java just resists being concise. So it is not just about design.

One last piece, somewhat related to Java, is the use of design patterns. Like any good idea that gets into popular culture (popular programming culture in this case), it becomes a parody of itself. I find this gem by Sarah A. Sheard to be the most memorable text on the subject. People forget that every pattern has not only benefits but also an area of applicability and drawbacks. So one has to ponder whether the code would be better if a pattern were applied. They think instead that the more patterns are used the better. They forget that patterns are not something devised by demigods and given to us mortals to be used unquestioningly. Patterns are extracted from real systems. And one's system might need some pattern that is not in the GoF bible. And the best way to extract relevant patterns is refactoring. Seeing how often patterns are overused is upsetting, since most of them increase the amount of code. How often have you seen a factory used where a plain constructor would suffice? Sigh.

2011-07-10

Logging and bug reports

A useful bug report has all three parts: how to locate or reproduce the problem, what is actually there, and what should be there.
I would extend this to all error reporting in software. Consider, for example, logging. I have many times seen and written code where log messages are written in an ad-hoc way just to help with debugging. Then such a system goes to production and something terrible happens. And then you discover that an error message was written but there is a tiny missing piece of information, say the identifier of the troubled object, that would have helped you diagnose the problem. The developer would not notice this before because he could just start a debugger with a breakpoint and look into the actual state to find all the interesting values. But in production one usually cannot afford this.
So here is the rule I have now: if you write an error log message, put yourself in the position of someone who needs to diagnose or resolve this problem in a production environment. This normally means that an error message is a brief bug report:

1. The error message says where it happened, so that a developer can find the code in question and a sysadmin can find configuration problems. And both may find the data that caused the problem, if there is such a thing.
2. The error message says what is wrong.
3. The error message says what is expected, if that is not obvious from the above. For example, if some value is inappropriate, say also what the allowed values are. Not just "Salary value out of range", but "New salary value 1000 is out of range. Allowed range is [1000000, 5000000).".
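
The salary example can be sketched in Java as follows. This is only an illustration under assumptions: the SalaryValidator class, the employee identifier, and the method names are hypothetical, and the range bounds are taken from the message above. The point is that the message carries all three parts: where (which employee), what is wrong (the offending value), and what is expected (the allowed range).

```java
import java.util.Locale;

// Hypothetical example: class, method, and identifier names are invented.
final class SalaryValidator {
    static final long MIN_SALARY = 1_000_000;  // inclusive lower bound
    static final long MAX_SALARY = 5_000_000;  // exclusive upper bound

    // Builds the message as a brief bug report:
    // where it happened, what is wrong, and what is expected.
    static String describeProblem(String employeeId, long newSalary) {
        return String.format(Locale.ROOT,
            "Employee %s: new salary value %d is out of range. Allowed range is [%d, %d).",
            employeeId, newSalary, MIN_SALARY, MAX_SALARY);
    }

    // Rejects values outside [MIN_SALARY, MAX_SALARY) with the full message.
    static void validate(String employeeId, long newSalary) {
        if (newSalary < MIN_SALARY || newSalary >= MAX_SALARY) {
            throw new IllegalArgumentException(describeProblem(employeeId, newSalary));
        }
    }

    public static void main(String[] args) {
        try {
            validate("E-42", 1000);
        } catch (IllegalArgumentException e) {
            // In a real system this would go to the error log.
            System.out.println(e.getMessage());
        }
    }
}
```

Someone reading this line in a production log knows which record to look at and which values would have been accepted, without attaching a debugger.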

PS This classic text is somewhat on topic.
