Tuesday, December 23, 2008

My tablet wish-list (software)

I broke down and ordered a tablet -- the Lenovo X61 tablet (with the SXGA+ lcd, which is unfortunately no longer in production, so it's a bit hard to find--I ordered from the Lenovo outlet).

This is the (growing) list of features I'm hoping to have:

* Some form of hand-writing recognition for general text input.

Saturday, November 01, 2008

Treat your mailing lists like reference documents, please.

I desperately needed to find out why the tutorials I've been following for an Eclipse PDE task today kept referencing a startup.jar file that I could not locate.

A couple google searches later turned up this link:

http://dev.eclipse.org/newslists/news.eclipse.platform/msg62159.html

The poster in that thread had the same problem (back in Feb. 2007), and found the answer, but none of the content in that thread makes it trivial to locate the answer again.

The responder (with the answer) simply included a link to another mailing list:

http://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/maillist.html

Notice that that page is *not* constant. Today, it shows the most recent posts as of October 31st, 2008. In order to figure out what had happened to startup.jar, I had to take into account the OP's response ("Ok so this is *very* recent."), the timestamp on the messages (Mon, 12 Feb 2007) and then navigate the mailing list archives to find that time period, and start reading.

Please don't put people through this sort of crap. It's generally not difficult to find permlinks to a given email, or include a quick note with the actual answer. I have the answer now (startup.jar was replaced with org.eclipse.equinox.launcher in 3.3), but there is no way that I can tie that answer to the conversation I've linked to above.

For the purposes of Google:

If you're having this problem:


I'm trying to do some automation, but I'm running into a problem with the 3.3 integration build.


java -cp plugins\org.eclipse.platform_3.2.100.v20070126\startup.jar org.eclipse.core.launcher.Main


doesn't do anything. It doesn't say anything. The only information I'm getting is an exit status of 13.


Then you need to use "java -jar plugins/org.eclipse.equinox.launcher_1.0.0.v20070207.jar" (adjusting the version numbers for your installation).

Tuesday, October 21, 2008

Auto-documenting OSGi CommandProviders

(**Edit:** If you're reading this after OSGi R4.2, then there is almost certainly a better way to accomplish the same thing)

I've been digging into OSGi a bit over the last week or so inorder to
create some Eclipse plugins that will automatically discover
eachother, and I've been generally impressed with the framework on the
whole. The documentation is a bit lacking, but there are some good
blog posts about it. (Specifically Neil Bartlett's introduction to
OSGi
.)

One thing that bugged me is the repetition needed when you implement
the CommandProvider interface to add commands to the OSGi console.
CommandProvider defines one method you must supply:

public String getHelp()

OSGi then uses reflection to extract each of the methods that starts
with an "_", and supplies those methods to the command environment as
new commands. (The "_" is trimmed, and the name of the method becomes
the command name.) General practice is to include the name of the
method in the return value of `getHelp()`, along with a description of
what the method does, eg:

public class SampleCommandProvider implements CommandProvider {

public synchronized void _run(CommandInterpreter ci) {
   // do stuff.
}

public String getHelp() {
   return "\trun - execute a Runnable service";
   }
}

This seems like a pain to maintain, so I took a quick look at
annotations, and propose a new syntax:

public class SampleCommandProvider extends 
   DescriptiveCommandProvider {

   @CmdDescr(description="execute a Runnable service")
   public synchronized void _run(CommandInterpreter ci) {
      // do stuff.
   }
}

Here we've extracted the `getHelp()` method into a new superclass, so
our SampleCommandProvider now extends an abstract class instead of
implementing an interface. It also makes use of an Annotation, which
we need to define:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface CmdDescr {
   String description();
}

Finally, we just need to define the superclass that implements
`getHelp()`:

import java.lang.reflect.Method;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.eclipse.osgi.framework.console.CommandProvider;

public abstract class DescriptiveCommandProvider implements CommandProvider {

   private static final Pattern CMD_PATTERN = Pattern.compile("^_(.*)");
   private String help = null;

   public String getHelp() {
      if (null == help){
         help = buildHelp();
      }
      return help;
   }

   private String buildHelp() {
      StringBuilder helpBuff = new StringBuilder();

      for (Method m : this.getClass().getMethods()){
         if (methodIsCmd(m)){         
            if (0 != helpBuff.length()){
               helpBuff.append("\n");
            }
            helpBuff.append(getDocumentation(m));            
         }
      }
      return helpBuff.toString();
   }

   private boolean methodIsCmd(Method m) {
      return CMD_PATTERN.matcher(m.getName()).matches();
   }

   private String getDocumentation(Method m) {
      StringBuilder methodHelp = new StringBuilder();

      Matcher matcher = CMD_PATTERN.matcher(m.getName());
      if(matcher.matches()){
         methodHelp.append("\t"+matcher.group(1));

         CmdDescr description = m.getAnnotation(CmdDescr.class);

         if (null != description){
            methodHelp.append(" - "+description.description());
         }
      }
      return methodHelp.toString();
   }
}

Note that the actual reflection on the class only happens once -- all
subsequent calls to `getHelp()` use a cached copy of the documentation.

Monday, September 22, 2008

It's called a docking station, Joel :)

The venerable Joel Spolsky asked recently why [someone hasn't made a device][1] that clips to the back of a desk and:


* It's a power strip
* It's a network hub
* It's a USB hub
* You clamp it onto the back of any desk


The idea being that:

This would make it easy to plug in laptops, USB peripherals, and all your rechargers at your desk without crawling around on the floor.


He links to a device that does some of this, and runs ~$150/device. At that price, I think a better solution is a docking station--when you get down to it, I don't want to plug in 4 things every time I sit down even if it doesn't involve crawling under the desk (power, video, usb, ethernet and possibly audio). I think it's unlikely that all the features needed above are really necessary when you just show up for a meeting, or hop over to your coworker's office for a short hacking session. Many conference rooms these days already have tables wired for ethernet / power and svga video to a projector.

[1]: http://www.joelonsoftware.com/items/2008/09/20.html

Tuesday, September 16, 2008

StackOverflow: Endorsing(?) content theft from day one

[Joel Spolsky][8] and [Jeff Atwood][9] just launched the public beta of [Stackoverflow][1] today, with the intent of building a community for high-quality technical questions and answers. I've been using the site for about three weeks now, during the closed beta, and I've noticed a disturbing trend that was outlined in Joel's [announcement post][2] today:


Want to know an easy way to earn reputation? Find a question somewhere with several good, but incomplete, answers. Steal all the answers and write one long, complete, detailed answer which is better than the incomplete ones.


The site presents an interface where much of the functionality is hidden from new users. You can't comment on posts, for example, until you've earned 50 "rep". Voting up takes 15 rep, voting down takes 100 rep, and each downvote you place will cost you one rep. You gain rep by posting questions and answers that other users vote up, or accept. The result is an addictive system that, in theory, prevents ["Griefing"][3] (the system does *NOT* prevent griefing, by the way. It is extremely easy to game.)

Because of this, it is tempting to re-post successful content from other sources, and nothing the creators (Atwood and Spolsky) have incorporated into the site, or the recent announcements, has indicated that this is objectionable. Afterall, good content on Stackoverflow will improve their service, regardless of where it came from, and regardless of whether it is properly credited.

After using stackoverflow for a couple weeks, I think that they have created a useful service, but I also want to call them out for providing an environment that is encouraging plagiarism. Duplication/copying of content *within* stackoverflow does not set easy with me, but I'm willing to accept that the content I create for stackoverflow is public domain, and is free to be copied at will. **However** [these][7] [posts][4] [are][5] [not][6].

[1]: http://www.stackoverflow.com
[2]: http://www.joelonsoftware.com/items/2008/09/15.html
[3]: http://en.wikipedia.org/wiki/Griefer
[4]: http://stackoverflow.com/questions/17512/computer-language-puns-and-jokes#45355
[5]: http://stackoverflow.com/questions/17512/computer-language-puns-and-jokes#48783
[6]: http://stackoverflow.com/questions/35809/why-are-vi-and-emacs-popular#36007
[7]: http://stackoverflow.com/questions/17512/computer-language-puns-and-jokes#47867
[8]: http://www.joelonsoftware.com
[9]: http://www.codinghorror.com

Thursday, September 11, 2008

Breaking away from Visio

The 'proper' way to do user interface design is hotly contested in the OSS software development world, and the discussions usually boil down to three suggestions:

1. "Just write it -- it's not that hard"
2. "Use [glade|qt designer|netbeans|...] -- all the widgets are there"
3. "Just use pencil/pen/whiteboard/etc -- it's faster"

I don't agree with any of these -- (1) is completely unreasonable. Some people may be able to hack out a UI in their favorite language quickly, but when you suddenly need to move half the UI into a new dialog, or out of a dialog and into the main window, *and* change the tabs into a check list, with sample data, you're screwed. Once you finish manhandling your code to account for that change, you'll need to add a component that is shaped like a septahedron with 7 distinct clickable areas and a tooltip that includes the latest stock quotes, and that tooltip needs to be scrollable. The widget set places unreasonable restrictions on the design phase of your project.

Option (2) suffers from the same issues as (1) -- you're constrained by the widgets available, although this is less of an issue because it's generally easier to add images, and the images can look like the widgets you're missing. However, since GUI UI development tools are, well, for development, they require you to do lots of irrelevant (at this stage) tasks, like specifying how objects will behave when they're resized, and dealing with layout managers.

(3) is the closest fit for my needs, and I *do* do lots of paper/whiteboard prototyping, but eventually, I need to show something that looks *real*, and sketches don't cut it. There simply isn't enough resolution there to convey everything that needs to be included in the mock-ups I create.

Hm.. perhaps I should go into what I *do* need, since my needs may be pretty esoteric. If you've made it this far, you're probably seething with anger, or you've got some idea of where I'm coming from. I'm frequently in need of electronic versions of UI prototypes for remote collaboration, "wizard of oz" testing, or for inclusion in presentations and reports. These mock-ups need to look "real" or the is a substantial risk of biasing any experiments, and there is an expectation of polish that can't be reached with hand-drawn interfaces. Since a lot of what I create is to solve novel problems in (at times) esoteric domains, we often need to use a mix of existing and novel widgets.

Generally, we use Visio to create these interfaces. It offers a good balance between vector drawing capabilities and shape templates for common UI widgets / forms / etc. You are also able to import images, which is fairly critical when updating or adding to the UI of an existing tool. (It's easy to take a screenshot, clear out the details with the [Gimp][1], and import as a background layer in Visio.)

Unfortunately, I've been unable to find any OSS tools that can fill this niche as well as visio. There are a few, as recent posts from [slashdot][2] and the old [Joel on Software forums][3] show:

* [DENIM][4]: Lets you sketch out interfaces with a tablet / mouse and create navigable web sites from those sketches. Lacks in the "polish" area.
* [Pencil][5]: Firefox Plugin. Peformance has been poor, in my experience. There are very few widgets (currently) available, and no image import capabilities (this is a huge flaw, IMHO). Pencil could turn into something great, though.
* [DIA][6]: Last release was in March, 2007, but the svn repo does show some activity. DIA lets you create things like network diagrams, UML, and flow charts, much like Visio, however, there are no UI stencils. Instructions for creating new stencils ('shapes') [exist][8], but the svg support for shapes is very limited (no gradients, no rounded rectangles, etc..) and the documentation is even worse.
* [Kivio][7]: Much like dia, with essentially the same failings.
* QT Designer | Glade | etc.: see above comments about GUI development tools.
* [Inkscape][9]: Nominally a vector drawing tool, much like Adobe Illustrator, Inkscape has an active community, good documentation, and it is quite stable. Unfortunately, it is not possible to customize the pallets / shapes available, and there is not much community support to make it a good UI design tool (aside from what can be done with any vector drawing app of this quality).
* [Yahoo! UI Stencils][10] (YUI): Not really a tool, but rather a collection of svg images of common interface widgets.

None of these, on their own, do the job. However, with nothing else looking bright, I've been digging into [Inkscape][9] more over the last few days, and I think I've figured out a workflow that will do.

First off, the [YUI][10] stencils are critical -- but they are not in a format that can be easily imported and used as "widgets". Ideally, Inkscape would let me define custom shapes, complete with constraints on the sub-components of those shapes to influence resize and translation behaviors, but that isn't yet available (to my knowledge). You *can* get around this, somewhat, by using the open dialog as a pallet of sorts:

"If you have a number of small SVG files whose contents you often reuse in other documents, you can conveniently use the Open dialog as a palette. Add the directory with your SVG sources into the bookmarks list so you can open it quickly. Then browse that directory looking at the previews. Once you found the file you need, simply drag it to the canvas and it will be imported into your current document." (From the *Tips and Tricks* tutorial in Inkscape)


This would work reasonably well, if the open dialog were not modal! (Ranting about modal dialog is another post, or two, at least.) Thankfully, you *can* drag from the dialog into an inkscape instance even if they are running on different processes :). Therefore, you can start up two inkscape processes (NOT via the "new document" option on the toolbar or file menu -- you have to actually start up two instances separately or the dialog's modality will still interfere with your work). Once you have the processes going, and two inkscape windows, open the open dialog on one of them, go to the directory with your widgets, minimize the (now useless) inkscape window you opened the dialog from, and rock on with the YUI stencils & whatever other tools you need to hack out your UI in the other inkscape instance.

There are a couple of things to keep in mind:

* Inkscape supports layers, so you can create stub data in a separate layer from the UI structure, and set the background content in another layer, etc.. so you don't have to worry (as much) about grabbing the wrong thing and moving it out of place.
* The drag-and-drop action from the open dialog will include everything in the dragged svg file -- so the YUI stencils (or any custom shapes you make) need to be broken out into separate files. (I've done this for some of the components, and you can download those files here: (Broken out YUI stencils). They are released under a [Creative Commons Attribution 2.5 License][11].

Pencil (or one of the other options) may work better for you -- many people have complained that their clients think an app is nearly finished because the UI looks "real", and there are numerous ways to address that. (eg: NapkinLAF for Swing apps.) I haven't had this problem, and something like NapkinLAF doesn't address the problems I have, which are all related to pre-coding UI design.

[1]: http://www.gimp.org/
[2]: http://ask.slashdot.org/article.pl?sid=05/11/19/2234228
[3]: http://discuss.joelonsoftware.com/default.asp?joel.3.218003.15
[4]: http://dub.washington.edu:2007/denim/
[5]: http://www.evolus.vn/Pencil/Home.html
[6]: http://www.gnome.org/projects/dia/
[7]: http://www.koffice.org/kivio/
[8]: http://www.togaware.com/linux/survivor/Walkthrough_Creating0.html
[9]: http://www.inkscape.org/
[10]: http://developer.yahoo.com/ypatterns/wireframes/
[11]: http://creativecommons.org/licenses/by/2.5/

Friday, September 05, 2008

Wrestling Python

With the launch of the StackOverflow beta I posed a question about python static analysis tools, as I have been playing with python and django recently for some side projects. The responses at Stack Overflow quickly pointed to PyChecker, PyFlakes and PyLint.

Over all, it was a disappointing experience. My experiences are outlined below, and they (more or less) reflect this more extensive review by Doug Hellman.

Here are my first impressions of pyflakes, pychecker and pylint:

* **pychecker**: It crashes frequently, most of the runs I tried resulted in Errors that originated in the pychecker code (eg: AttributeError or IndexError: list index out of range were the most common). For some reason I had to set the DJANGO_SETTINGS_MODULE environment variable before it would even run on any of the app code, and the documentation is very sparse.

* **pyflakes**: 'pyflakes --help' throws a TypeError -- erm... Documentation is also very sparse, and pyflakes is very forgiving (as far as I can tell, it only reports compile errors, warnings, redefinitions, and some concerns about imports--such as unused and wildcards). pyflakes also seems to repeat itself:
> eventlist/views.py:4: 'Http404' imported but unused
> eventlist/views.py:4: 'Http404' imported but unused
> eventlist/views.py:5: 'from eventlist.models import *' used; unable to detect undefined names
> eventlist/views.py:59: 'authenticate' imported but unused
> eventlist/views.py:61: redefinition of unused 'login' from
> line 59
> eventlist/views.py:5: 'from eventlist.models import *' used;
> unable to detect undefined names
> eventlist/views.py:4: 'Http404' imported but unused

* **pylint**: This seems to be the most capable of the tools suggested. It has the best documentation. LogiLab provides a tutorial, pylint has a help screen, and there is a (broken) link to a user manual, which would be extremely helpful. There are some issues with applying pylint to django, since pylint doesn't know about the django classes (such as models.Model). This means that a fair number of otherwise valuable errors are generated about missing class fields. eg:
> `E:105: get_events_by_tag: Class 'Tag' has no 'objects' member`

Parsing these out automatically will be very difficult without some additional knowledge of the classes in use. I'm not sure adding that is feasible, but it does seem likely that pylint is capable of dealing with this in the "right" way. (I probably just need to point it to the django source, but there are no command line params that look likely, and, as mentioned earlier, the user manual is inaccessible.)

For the moment, I'm still looking into pylint -- pychecker and pyflakes need better documentation and they need to become more robust.

Tuesday, August 19, 2008

Patra / Rio

The cityscapes and suburbs of Patra and Rio seem to exist in a state suspended between gentrification and derelict obsolescence. The architecture of this part of Greece is mixed--the older elegant facades have a dusty appearance, particularly when showcased next to the gleaming sheen of new glass and concrete that appears from place to place. However, on the opposite side of the street there is likely to be at least one structure that appears to have been abandoned as soon as the concrete of it's floors and supports dried. These ever-present skeletons must be the result of some period of improvement that was cut short--I have yet to find out the cause. (Could it be a seasonal phenomenon?)

Tuesday, July 22, 2008

Traveling to Patras

Coffe and clouds in Seattle.

A groggy morning in Seattle started with the typical regional sunshine forcing its presence through heavy cloud cover--the first overcast day in nearly a week of clear, scorching weather.

The hills across the Gulf of Patras (looking North)

Seattle to Newark, hustle off the plane, bad coffee, hustle to the next gate, and then encamp for the next 9 hours of travel across the Atlantic. This leg was on a relatively empty 767, but I still can't really sleep on planes. I have become an overnight advocate for noise-canceling headphones though--they make the difference between muffling a sound and turning it off. I found the Athens airport to be considerably less hyper than most US airports (with the exception of Syracuse, which is not much of a surprise), although I think it is universal that no one is particularly happy when at an airport, and that feeling seems to be contagious. It's unfortunate that it is so difficult to visit a place without first visiting an airport there.

The morning view from the balcony

I few hours later I hopped on a tour bus to take everyone from the recent batch of flights off to our hotels in Patras--a wild 4-5 hour ride along the shoulders of "one of the worst roads in Greece" as some one here put it. The careening along coastal roads was punctuated by hairpin turns in seaside villages with roads small enough to make any driver feel claustrophobic, and we were riding in a bus that would seat 30-40, with luggage. It was much like watching a game of Fifteen Puzzle, with a key twist: every tile (car) was self-aware and initially greedy.

We did eventually make it to the Hotel Tzaki almost exactly 24 hours after leaving Seattle.

Friday, July 18, 2008

Cracking down on application clutter (or: my ${HOME} is my castle!)

There was once a time when your home directory was treated as a nearly sacred place, a safe haven where you had near complete control. This trust was only breached for very special reasons: user specific settings and background storage for applications could go in "dot-files"--the hidden files or directories that begin with a "." and therefore don't show up in normal directory listings.

Unfortunately, things began to change. I don't know what kicked it off, but soon there was a Desktop (or desktop) folder. It was glaring--many XFree86 window managers don't even have the *concept* of a desktop, but the defaults environments were (and still are) often set to Desktop Managers. Web browsers took after the DM's, and soon we all had these glaring "Desktop" directories hanging out whether we wanted them or not. I've managed to tolerate this infraction for years, and aside from the occasional frustration (eg: Ecilpse and NetBeans, with their request for a ~/workspace and ~/NetBeansProjects directories).

However, today things changed.


$ ls ~
bin/ PDF/
Desktop/ Pictures/
development/ Public/
documents/ src/
Documents/ shared/
downloads/ Templates/
Mail/ Videos/
Music/ virtualMachines/
myapps/ workspace/


Documents? (Ok, I can sort of understand that one.) Music? Pictures? Templates, PDF, Public, and Videos?! Did I suddenly become a master of multimedia? Keep in mind here, I'm a java hacker on a Linux box--this isn't exactly a fine-tuned rendering/desktop publishing platform. And of course, every one of those directories is empty. Thankfully, I checked before deleting `documents` vs. `Documents` (I've been bitten there before--on a mac due to case conflation--but that's another story).

Why would I want a directory called PDF? I can understand (possibly) wanting to *tag* files with "PDF", but as part of a single-dimensional sorting criteria? (Hey, lets store all my .h files in ~/H/ and all my .cpp files in ~/CPP/! It'll be great!)

Needless to say, I've removed the offending directories, and this time I'm ready:


$ kernel-filesystem-monitor-daemon-cat -v watch ${HOME} |
perl -ne '{
if( /^CREATE/ ) { # only report create events
s|.*URL:\./||g;
if ( !/^\./ ) { # don't report new dot-files
print "$_ created @ ";
print `date`;
}
}
}' > ~/whenCrapWentDown.txt

(newlines and comments introduced to improve clarity -- if you're pasting this into a shell, you'll need to either add \'s or remove newlines.)

KFSMD (kernel-filesystem-monitor-daemon) is an app that does exactly what it's 32-character name says. Whenever a filesystem change occurs, it knows about it. The `-cat` part just tells it to print to stdout, and the hunk of perl does some minor processing, and introduces time stamps.

I'm actually running this in a sticky terminal that's pinned to my E17 desktop, so if/when something starts building an empire in my home directory, I'll be able to compare with what apps are running, and hopefully track it down. (It would be nice to collect the PIDs of the process that actually issued the system call to touch the file system, but this is good enough for now.)

fsWatcher.png

Now we wait...

(This article got me going with KFSMD.)

Tuesday, July 01, 2008

Creating Wizards in Java

A recent project at work required building a multi-step dialog to manage the interface between a user and an expert system (and some fairly advanced NLP to boot). On the surface this looked like a fairly standard Wizard problem -- design a bunch of screens with questions, and then collect the answers as the user proceeded through the dialogs. However, the Wizard APIs I found were either not very mature or (in the case of the Java.net Wizards) it was very difficult to create complex branching behaviors, and those branches were extremely resistant to change. Both things are essentially show-stoppers when working with prototypes that frequently need modification.

In the end, I spent a weekend and a couple evenings building a new Wizard API for Java, called CJWizard. The library is released under the Apache V.2 license, so it should work for just about anything you want to use it for. I would like to know if you're using it, and what you're using it for, just to sate my own curiosity :). The project is hosted on code.google.com, so please submit issues, and feel free to contribute to the project.

CJWizard provides the structure needed to quickly create simple dialogs by implementing an abstract class (WizardPage) for each page of the dialog, and adding them to a PageFactory that can generate pages on-demand, as they are required. This puts the programmer in full control of how the wizard proceeds. The CJWizard architecture also makes it easy to add a wizard to an existing application (either via an additional JDialog, or embedding in some other component), and/or insert custom wrapper widgets around the dialog pages--meaning that you can quickly add customized navigational controls aside from the standard Previous/Next/Finish/Cancel buttons.

Some aspects were taken from the Java.Net wizard API, such as auto-detecting named components, and automatically collecting the values from them, but CJWizard takes a much simpler approach (and in some ways, a less powerful one -- CJWizard does not listen to every key event, only collecting values when the user navigates away from a WizardPage). In most cases, you only need to name widgets prior to adding them to the WizardPage, and their values will be collected in a settings map automatically.

CJWizard was meant to provide a flexible way to generate professional-looking multi-step dialogs very quickly.

Monday, January 28, 2008

Day to day Memoization

Memoization (not **memorization**) is the process of remembering the
results of a computation for use later. (I think of it as "making a
memo" to look back on later.) Memoization is the core to any dynamic
programming implementation, and allows many simple algorithms to run
in linear or polynomial time when they would otherwise take an
exponential number of operations to complete. This is most obvious in
the typical recursive Fibonacci example. Consider the code:

[cc lang="java"]
public class Fib{
public static void main(String[] args){
System.out.println("done: fib of "+args[0]+"="+
fib(Integer.parseInt(args[0])));
}

public static int fib(int n){
int rval = 1;
if (n >= 2){
rval = fib(n - 1) + fib(n - 2);
}
System.out.println("fib("+n+") = "+rval);
return rval;
}
}[/cc]
This is a straight-forward recursive implementation of fib. When run
with `n=4`, we see this:
[cc lang="bash"]
$ javac Fib.java && java Fib 4
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(1) = 1
fib(3) = 3
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(4) = 5
done: fib of 4=5
[/cc]

**9** invocations of `fib(n)`, but only 5 **unique** invocations. Lets
memoize the results, and try this again:

[cc lang="bash"]
$ javac Fib.java && java Fib 4
fib(1) = 1
fib(0) = 1
fib(2) = 2
fib(3) = 3
fib(4) = 5
done: fib of 4=5
[/cc]

**Much** better -- 5 invocations, 5 unique sets of parameters.

Here's the source with memoization:
[cc lang="java"]
public class Fib{
static Map<Integer, Integer> memos = new HashMap(); // new

public static void main(String[] args){
System.out.println("done: fib of "+args[0]+"="+
fib(Integer.parseInt(args[0])));
}

public static int fib(int n){
if (memos.containsKey(n)) // new
return memos.get(n); // new

int rval = 1;
if (n >= 2) {
rval = fib(n - 1) + fib(n - 2);
}
System.out.println("fib("+n+") = "+rval);
memos.put(n, rval); // new
return rval;
}
}[/cc]
Notice that we only needed to add 4 new lines of code in order to
memoize the results. When `fib(n)` is called, it simply checks to see
if it has previously been called with n, and if so, that result is
used again. If the parameter has never been seen before, the method
continues as normal, storing the computed result before returning.
Memoization turns this naive (and exponential) implementation of `fib(n)`
into an efficient (linear) operation.

## Memoization in the real world ##

So, (un?)fortunately we don't spend all day implementing cool new ways
of computing ever increasing entries of the fibinocci sequence -- how
can memoization be put to use? After all, many algorithms are already
implemented in some fairly optimal fashion by the language APIs, and
you'd be a fool not to use those implementations. What opportunity
will you have to memoize functions?

It turns out that you can memoize *anything*, as long as the function
is *pure* with respect to the memos (meaning: the function doesn't depend on any thing that is not used to key the hash of memos). If the function is not pure, then you can still use memoization, but either the memo hash must key on all the state and parameters that can affect the results of the function. On the other hand, if f depends on some state that changes very rarely, then it may make more sense to simply discard all the stored memos each time that aspect of state is altered.

Memoization is extremely handy when you have very common operations that are
fairly expensive. I recently needed to optimize some code that
compares strings based on the case-insensitive stems of the words,
with stopwords removed. So the strings "he wanted an apple" and "he
wants apples" should be equal. ("an" is a stopword, and ignored)

This meant doing many, many calls to a string stemmer, each of which
is a fairly expensive operation. Fortunately, hashing strings as
extremely cheap (on the order of 1/4th the time it took to stem a
string of the same length), and I had plenty of memory to store the
parameters and the results in a `Map`. Adding memos to
the two primary time-hoggers (the stemmer and a tokenizer) cut the
execution time of the application down from 2 hours to just over 7
minutes.

## Summary ##

You can memoize any function that only depends on it's parameters and
constant state (or near-constant state -- just don't forget to discard your
memos when the state changes!). If the function is invoked multiple
times you will probably see a performance improvement.

If you need to memoize a function with multiple arguments, then you
just need to nest Maps, or create a unique key by combining the
parameters in some way.

Memoization is an extremely easy way to improve performance under
certain circumstances, particularly if you have a solid grasp on when
state changes outside of your methods / functions, or program in a
functional style. It can be memory intensive, however. If the
results of your functions are large, or maintain references to large
objects, then memoization may **penalize** performance if you run out of
memory and have to make use of swap space.