This page was originally published as http://www.numeric-quest.com/news/NQ-comments.html.


Comments on Java Programming Style

This file is a standard part of distribution of our Java classes
	Comments    :	on Java programming style
	Author      :	Jan Skibinski, Numeric Quest Inc.
	Version     :	1996.05.06 for Java 1.0
	See         :	footnote

Comments on comments

Ideally, most methods should be short, and therefore it should suffice to place the comments in one place only - at the top of a method. This is what Smalltalk programmers do, and this is the standard in Eiffel.

We agree with both: comments should be integral parts of the methods and not just ornaments floating around. So we place them just below the method names, inside the main curly brackets in the case of methods, and just below the variable declarations.
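
A minimal sketch of this placement follows. The Account class and its members are hypothetical, chosen only for illustration; they are not taken from the nq.* classes.

```java
class Account {

	protected long balance;
	//	Current balance, in cents

	public long balance () {
	//	Current balance, in cents
		return balance;
	}

	public void deposit (long amount) {
	//	Add 'amount' cents to the balance.
		balance = balance + amount;
	}
}
```

Each comment sits directly under the declaration it describes, so the signature and its explanation read as one unit.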

If the comments are well thought out, one should be able to extract good documentation from the source code alone. One such extractor accompanies our nq.net classes; another is a part of our addition to the xcoral program - a Java class hierarchy browser.
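
To give a rough idea of what such extraction involves, here is a naive sketch. It is not the extractor that accompanies the nq.net classes; the class name CommentExtractor and its heuristic are assumptions made for illustration. It recognizes only lines that start with "public", contain "(" and end with "{", and it collects the "//" lines that immediately follow.

```java
import java.util.ArrayList;
import java.util.List;

class CommentExtractor {

	static List<String> extract (String source) {
	//	Public method signatures found in 'source', each followed
	//	by the comment lines placed directly below it.
		List<String> result = new ArrayList<String> ();
		boolean inComment = false;
		for (String line : source.split ("\n")) {
			String text = line.trim ();
			if (inComment && text.startsWith ("//")) {
				// Comment line belonging to the last signature seen
				String remark = text.substring (2).trim ();
				if (remark.length () > 0) {
					result.add ("    " + remark);
				}
			} else if (text.startsWith ("public ")
					&& text.contains ("(") && text.endsWith ("{")) {
				// Signature line: keep it without the opening bracket
				result.add (text.substring (0, text.length () - 1).trim ());
				inComment = true;
			} else {
				// First line of ordinary code ends the comment block
				inComment = false;
			}
		}
		return result;
	}
}
```

Applied to a method written in the style described here, it yields the signature followed by the indented comment text. A real tool would need a proper parser; this heuristic misses signatures split across several lines.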

As was nicely proven by the Eiffel environment, this is the only way to prevent the source and the documentation from losing track of each other -- if the source code and API are kept in human editable forms. The alternative is to follow the Smalltalk approach, and to rely on the tools to enforce the standards.

If programmers know that their comments will be visible through the documentation, they become more responsible for how they use the comments - making them more logical, avoiding cute remarks, etc. That helps in building well constructed applications. Writing good comments is a time consuming task, but we believe that this is time well spent.

There is no place for the remarks deeply embedded in the source code and bearing some vague statements or queries, such as

"Does anyone know how to make it work better?",

or "I am going to improve it later!".

After all, one should be responsible for what one publishes, and secondly, there is a big chance that such vague comments are never read again and no action will be taken later by the original author. So why bother with them at all, in the first place?

A common practice is to define dozens of instance or class variables, without commenting them at all.

Obviously, the variables are as important as the methods, and the mere fact that they just occupy so little space in the source code should not relieve the designers from documenting them well.

We also believe that class comments are very important. Although Smalltalk and Eiffel provide some mechanisms for class comments, this fact is very often ignored -- even by some designers of the basic libraries. Java application designers are no exception here, and that's wrong!

The class functionality should be clearly outlined up front, so that no one is forced to browse through pages and pages of source code to find out what the class in question does. Therefore, we always write the class comments and place them inside the class definition - to stress that those comments really belong to the class being defined.

We think that no class should be published until its designer has extracted the comments and answered "yes" to the question: "Do these comments, together with the class and method signatures, tell the class story by themselves?".

Comments on excessive verbosity

Well written comments should convey information in a clear, precise and terse manner. Excessive verbosity is almost as bad as the lack of any comment at all. Let's demonstrate it with one example from the JDK String class:

    	/**
     	* Compares this String to the specified object.
     	* Returns true if the object is equal to this String; that is,
     	* has the same length and the same characters in the same sequence.
     	* @param anObject	the object to compare this String against
     	* @return 	true if the Strings are equal; false otherwise.
     	*/
	public boolean equals(Object anObject) {
		....
	}

Well, the same can be said in a much simpler manner:

	public boolean equals (Object anObject) {
	//
	//	True if 'anObject' is equal to this string; that is,
	//	has the same length and the same characters 
	//	in the same sequence.
		....
	}

Notice that when comments are placed after a method signature, by the time we get to the comments we already know a few things about the method: its name, its return type, and the names and types of its arguments.

You should always format your comments according to the return type. If a method returns:

boolean
	Start with "Is....?", or "True if ...", or "False if ..."
void
	Start with an imperative sentence, such as: "Connect to Internet Provider.."
any other type or object
	Start with a noun, or adjective + noun, describing the object being returned. For example: "New string with all capital letters", or "Server port name"
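
The three cases might look as follows. The Link class is hypothetical and exists only to carry one method of each kind.

```java
class Link {

	protected boolean open;
	//	Is this link currently open?

	protected String portName;
	//	Server port name

	public Link (String aPortName) {
	//	Create a closed link bound to 'aPortName'.
		portName = aPortName;
		open = false;
	}

	public boolean isOpen () {
	//	True if this link is currently open.
		return open;
	}

	public void connect () {
	//	Connect to Internet Provider.
		open = true;
	}

	public String portName () {
	//	Server port name
		return portName;
	}
}
```

Reading only the first word of each comment already tells whether the method answers a question, performs an action, or hands back an object.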

Comments on naming convention

Briefly, properly chosen names for classes, methods and variables add significantly to the readability of code or interface. Some languages are better than others at achieving this goal, and Smalltalk is probably the best: it promotes creating tokens that can nicely fit into long series of almost English-like sentences.

We would place Java behind Smalltalk and Eiffel in this respect. Nevertheless, you can improve code and interface readability by adopting a few simple rules:

Verb in imperative for void procedures
	The verb can have a noun as complement, as in compareAccounts, or can be qualified by an adverb or adjective, as in deepCopy.
Noun for functions or variables denoting some objects
	The noun may be qualified by an adjective or another noun, as in decayRate or firstItem.
Adjective, or the prefix is, for a boolean query
	If there is no adjective available that would suggest a true or false property, such as greedy, then use the prefix is, as in isOnLine.
	Certain English words, such as empty, can be used either as an adjective or as a verb, which leads to ambiguity. When in doubt, always qualify the word with is.
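
A small sketch applying these rules, one method per rule. The ItemList class and its method names are hypothetical; only the naming pattern matters.

```java
import java.util.Vector;

class ItemList {

	protected Vector<Object> items = new Vector<Object> ();
	//	Stored items, in insertion order

	public void appendItem (Object anItem) {
	//	Add 'anItem' at the end of the list.
	//	(Verb + noun: a void procedure.)
		items.addElement (anItem);
	}

	public Object firstItem () {
	//	Item at the front of the list.
	//	(Adjective + noun: a function returning an object.)
		return items.firstElement ();
	}

	public boolean isEmpty () {
	//	True if the list holds no items.
	//	('empty' alone could be read as a verb, hence the prefix.)
		return items.isEmpty ();
	}
}
```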

There are some words in English, such as get, that tend to be overused in daily usage, and this habit trickles down to programming as well. The result is that almost half of the access routines in many packages begin with get, and thus create two problems:

To illustrate this point, here is a flat list of get-ish methods extracted from java.awt.TextArea. They all could easily be renamed by stripping off the prefix get. The list also shows the return types -- exactly in a format as used by our Java browser for xcoral editor:

One could argue that since there is already an instance variable rows in the class TextArea, we should use getRows () for a method that returns that variable, otherwise there will be a name conflict. Not true! It is perfectly legal in Java to use rows and rows () side by side: one is a protected attribute, the other is a public function returning that attribute. The compiler knows the difference and a human reader should easily see it too!

The only reason for having those two versions is to protect rows from unauthorized modification by outsiders. Otherwise, one single public rows would suffice.
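
The point about rows and rows () can be checked with a few lines. TextGrid below is a hypothetical stand-in, not java.awt.TextArea; it compiles because Java keeps field names and method names in separate namespaces.

```java
class TextGrid {

	protected int rows;
	//	Number of rows

	public TextGrid (int aRowCount) {
	//	Create a grid with 'aRowCount' rows.
		rows = aRowCount;
	}

	public int rows () {
	//	Number of rows
		return rows;
	}
}
```

Outside the package and its subclasses, clients can read the value through rows () but cannot assign to the protected attribute.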

By the way, Eiffel goes even further and downplays the differences between attributes and the functions returning such attributes. To outsiders they look alike, since the signature gives no clue whether the designer decided to implement a feature as a function or just as a variable. Hence the term feature in Eiffel's parlance. Such flexibility is good news for program designers.

Comments on "Design By Contract"

The traditional presentation of an API concentrates mainly on class and method signatures -- adding some extra comments. A very powerful mechanism, 'Design by contract', exists in Eiffel; it not only improves the safety of the code, but also adds tremendous readability to the code documentation.

Missing that in Java, we attempt to simulate it via the interface nq.lang.ByContract. Language and compiler support would be needed to implement the concept to its full extent, so our humble simulation can only give you a taste of the real thing.
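
The definition of nq.lang.ByContract is not reproduced here, so the following is only a generic sketch of how preconditions and postconditions can be simulated in plain Java. The Contract helper, its require and ensure methods, and the Counter class are all hypothetical names, not part of nq.lang.

```java
class Contract {

	static void require (boolean condition, String label) {
	//	Fail if the precondition named 'label' does not hold.
		if (!condition) {
			throw new IllegalArgumentException ("Precondition violated: " + label);
		}
	}

	static void ensure (boolean condition, String label) {
	//	Fail if the postcondition named 'label' does not hold.
		if (!condition) {
			throw new IllegalStateException ("Postcondition violated: " + label);
		}
	}
}

class Counter {

	protected int count;
	//	Current value of the counter

	public int count () {
	//	Current value of the counter
		return count;
	}

	public void increment () {
	//	Increase the count by one.
		count = count + 1;
	}

	public void decrement () {
	//	Decrease the count by one.
	//	require: count > 0
	//	ensure:  count = old count - 1
		Contract.require (count > 0, "count > 0");
		int oldCount = count;
		count = count - 1;
		Contract.ensure (count == oldCount - 1, "count = old count - 1");
	}
}
```

Unlike Eiffel, nothing here is checked by the compiler or inherited automatically; the contract lives in the comments and in explicit run-time checks.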

Comments on format

Notice the standardized placement of the curly brackets, and their obligatory presence after "if", "try", etc; the usage of blank lines (not too many, not too few); the usage of spaces between the method names and the parentheses that follow; avoidance of ornament separator lines, banners, etc. For example, the ornament lines are good if used sparingly; but if one places lines like this:

"------------------------------------------"

before and after every method name, soon they become too important by themselves - thus defeating the purpose of clarity.

This approach might seem too picky, but we believe that it helps threefold:

Headers of all our nq.* classes are short: just a class name, the author's name and a version. All other information, such as copyright information, disclaimers and what-not, can easily be accommodated by the file footers. After all, the purpose of publishing the source code is to get to the bottom of things in the shortest possible way; we do not want to be distracted by a gazillion lines containing completely insignificant information.

Comments on categories

To achieve better readability of the source code, we have organized the class sources into several well separated parts, which we call 'categories'. This is again a common practice in Objective-C, Smalltalk and Eiffel. This feature might be known under different names, depending on the language, but the concept is the same: to group methods with similar functionality under some common umbrella names, in order to increase the readability of sources and interfaces.

Eiffel, and specifically Interactive Software Engineering (Bertrand Meyer's company), goes even further and suggests using some standardized categories all across the standard libraries: "Initialization", "Access", "Basic operations", "Transformation" are just a few examples. Their definitions can be found in the basic books on Eiffel.

Categories are part of the language structure in Eiffel; in Smalltalk they are being enforced by certain tools. Java does not have either one, so the only way to implement them is via comments. We do it this way:

	//	category: Access
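
A whole class organized this way might look like the sketch below. The Tally class is hypothetical; the category names are taken from the Eiffel examples quoted above.

```java
class Tally {
	//	Running total of integer samples.

	protected int total;
	//	Sum of all samples added so far

	protected int count;
	//	Number of samples added so far

	//	category: Initialization

	public Tally () {
	//	Start with an empty tally.
		total = 0;
		count = 0;
	}

	//	category: Access

	public int total () {
	//	Sum of all samples added so far
		return total;
	}

	public boolean isEmpty () {
	//	True if no sample has been added yet.
		return count == 0;
	}

	//	category: Basic operations

	public void add (int sample) {
	//	Include 'sample' in the tally.
		total = total + sample;
		count = count + 1;
	}
}
```

Because the markers are ordinary comments, the compiler ignores them; only a tool that knows the convention can use them to group methods.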

When our class hierarchy browser prepares flat source or interface it blends together all the inherited lists of methods and attributes, but it respects the boundaries of the categories. Here is a sample of programmer interface as we use and recommend.

Here is also a sample of browser list of methods and attributes -- exactly as taken from our class hierarchy browser. Notice the presence of categories and the Eiffel-like style to display the return type information.

Comments on abbreviations and shortenings

There is a sick trend amongst almost every group of people, small or big, to abbreviate almost everything. It looks like abbreviations give them some sense of belonging, an ennoblement of some sort. Good for them, but if they want to share some information with others, they had better have a good justification for using those abbreviations, and if they do - they should have some translations exposed in a few strategic places.

Imagine a proverbial rocket scientist who encounters the abbreviation ISFNI in some document. Very likely, there is no explanation available. Being a nice guy and feeling some sense of guilt for not knowing the meaning of ISFNI, he spends plenty of his valuable time searching the Internet, and finally discovers that ISFNI really means: It Stands For Nothing Important.

Shortening of the plain English words is a case of yet another sickness, typical for some programmers.

What good would it be to use the proper word received, available in any English dictionary, if we could use instead one of our own inventions, such as rcvd, recvd, etc.

This way, our readers first have to find out what rcvd stands for, then memorize it, and if they use our library to develop their own programs - always double check that they did not make a spelling mistake.

Better yet, let us make our method recvd () a part of public interface, or some abstract class. This way more people will be exposed to our invention, and they will also have to copy it to their implementation code.

We, as the originators of this fantastic trap, would also enjoy the same benefits as others when returning a few months later to our own code.

Comments on fame and vanity

Once we master some area of knowledge -- no matter how important it is, we are eager to make a name for ourselves by naming some algorithm, or something else, after ourselves. Or, if we are rather shy, some of our colleagues will do it for us - that gives them some sense of belonging to the same association of mutual adoration.

The heck with the readers - let them bite the bullet and find out what Brown's algorithm is! Chances are that they will never find out - there are a gazillion entries for Brown in all the world's libraries.

Aside from plain plagiarism, quite a few proverbial Brown's algorithms could be questioned on the simple basis that they existed long before Mr Brown was born, but no one bothered to check it out before claiming the authorship. Some such algorithms date back to the 19th or even the 18th century.

On the other hand, even if Brown is really famous for something, we should avoid decorating our code with trivial functionalities named after that famous man -- even if he was the author of those little tricks. What good does it do? Brightens our own code?

To demonstrate the above, here is what we found in one, otherwise good, library. This is the entire class, not just an excerpt:


	public class Goldberg 
		extends Vector {

		public final Object get (int num) {
			return elementAt (num);
		}

		public final void put (Object o) {
			addElement (o);
		}
	}
	//	End of class

Comments on blaming the users

It is not unusual for code producers -- even the most prestigious ones -- to make stupid design mistakes or introduce serious implementation bugs, and then try to blame the users for not knowing how to properly use their code or their product. No apologies to anyone!

Let us not use it as an excuse or a pattern to follow.

We will illustrate the above using a quite neutral example, so that no one will shoot us from behind. The example dates back to the beginnings of Unix, and there are few left who might feel responsible for the bug. But the bug persists to this day, and we fell into its trap several months ago on one of our Linux machines.

The standard Unix copy command cp accepts the switch -R for recursive copying of directory trees. Sounds innocent, doesn't it? Not until you witness a tremendous activity on your machine and discover to your horror that your free disk space vanishes with lightning speed!

So you kill the process, remove the garbage created and finally discover that one of the files in the directory being copied is /dev/null. This device has a variety of uses -- the most popular is to create a file with one zero embedded, an empty file. Well, the bug in the cp -R program activates this mechanism recursively -- producing in effect a humongous file with a gazillion zeros.

A bug like any other. Dangerous, but we can live with it. But a few weeks later, when reading quite unrelated documentation on one of the news readers, we came across some explanation beginning with a line like this:

Some unexperienced Unix administrators...

The poor soul must have felt obliged to defend the old bug, which has not been fixed for decades, and what's worse - put the blame on unexperienced Unix administrators.


What is so different from Sun standard?

Placement of comments

Our comments do not float as separate ornaments - they are parts of the class definition, and of the definitions of methods or attributes. Since you first see the signatures (what is it?) and then the comments (how is it done?), you follow a very natural path of reasoning and acquiring information. The resulting chunks are self-contained and add more fuel to the concept of OOP and reusability.

No formatting information in source code

It is our opinion that by adding HTML information to the comments of the source code, they become obfuscated, and difficult to read and maintain. We understand why Sun started all of this: to help their tools extract the information needed to create the API. We beg to disagree with putting the tools ahead of a human programmer. The tools should be intelligent enough to extract the interface without any extra help - as we have illustrated in the examples above.

Promotion of a good style

No special effort is required to maintain good documentation of the source code, because the documentation is a part of the design. Anyone can afford it, and everyone would benefit if a little more effort was added to the creation of readable source code: the designers, the code maintainers, those who learn and those who teach.

One source for code and documentation

A good class hierarchy browser should be able to extract good documentation from the source code. Even better, the same browser should be able to handle a pure API -- if the source code is not available because of commercial protection. But to make this possible, a consistent style should be adopted -- identical for both code and documentation.


Jan Skibinski, Numeric Quest Inc., Huntsville, Ontario, Canada
jans@numeric-quest.com
http://www.numeric-quest.com