Bigger Wrench: February 2010

Tuesday, February 23, 2010

PIM Manifesto

Personal Information Manager Manifesto

A manifesto is a declaration of principles. It may sound grandiose, but this declaration of principles is important to me because it is the distillation of at least 7 years of thinking about what the future I want to create really is.

Principle: The PIM is an expression of how people live their lives.

The things users talk about in real life correspond to the things they manipulate in their PIM. For example, if a person thinks in terms of anniversaries, holidays, and meetings, then those are the things they schedule, rather than something called an event.

The actions users talk about in real life correspond to the actions they take in their PIM. If they think in terms of making and keeping promises, then the PIM makes it possible to make and keep promises.

Principle: The PIM has a very minimalist design.

It displays what the user needs when it is needed rather than overwhelming them with all the information they might need.

Principle: The PIM is designed to use and make available connections between the things the user deals with.

The PIM makes available the connections between the data when the user needs them. It is easy for the user to lay out the communities in their lives the way they think of them. Adding and subtracting people from a community should be easy.

Principle: The PIM allows users to recognize what is available from where they are rather than remembering.

The user can get the information they need when they need it, recognizing that it is available rather than having to remember where it would be.

I will probably add to or simplify this over time but I wanted to get it down in writing and out there.

Saturday, February 13, 2010

Build Tools - An incomplete paradigm - part 2

I am in the midst of making my build system for the saltations project usable by everyday developers, i.e. I am working to make it a turnkey system. In the process I came across this entry in Kent Spillner's blog and nearly split my side laughing.

I don't know if he meant it to be funny, but it was an accurate summation of most of my complaints about Maven and I liked his expression of it:

http://kent.spillner.org/blog/work/2009/11/14/java-build-tools.html

Ah, well, back to work.

Tuesday, February 9, 2010

If a language contains a hole, programmers will fall into it. All languages contain holes.

The above title is a quote from an article on error rates in scientific software.

The article is a straightforward and easy read with fascinating results for error rates in both C and FORTRAN code. This study used both static analysis of the code as well as a runtime comparison of 2 implementations of the same algorithms acting on the same input data with the same parameters. It is arguable that the results may not be limited to those 2 languages.

The error rates were not a surprise; similar error rates have been demonstrated over and over again in typical software. The types of errors were interesting, as was the impact that the sum total of the errors can have. In effect:

"... these 2 experiments suggest that the results of scientific calculations involving significant amounts of software should be treated with the same measure of disbelief as an unconfirmed physical experiment."

That is not a cheap dilemma.

Once we start talking about independent verification of complex software calculations, the cost in man-hours and money goes up drastically. Yet I think it is obvious that the results of this research point strongly to the need for exactly that.

Another interesting result is that there appears to be a clear relationship between the complexity of the language specification and the number of holes for a programmer to fall into. Probably not a surprise, but clearly not something that many language developers pay much attention to. In general, most programming languages produced these days have a much larger number of language rules than preceding languages.

The author makes an argument that program standards adherence in scientific software is laughable. I leave to your imagination and judgment how applicable that conclusion is to your own workplace.

Here is a pointer to the PDF article: THE T-EXPERIMENTS: ERRORS IN SCIENTIFIC SOFTWARE. Enjoy!

Build Tools - An incomplete paradigm - part 1

Dependency Management, Provisioning, and Repositories

I love automated build systems. I like letting the computer do the same thing over and over so I don't have to. In some respects I think we have started to come out of the dark ages of programming, in that more people think in terms of continuous builds and unit testing than don't.

I didn't say that everybody does it, but it is a vocabulary that everybody has and can speak, even if they choose not to use it. Five to seven years ago the common opinion was that automated builds were overkill that only the wealthiest companies could waste time on, and unit testing, while laudable, was something that most people didn't have time for.

Nowadays, when I interview, it is rare for me to encounter a company that does not have an automated build system and some flavor of testing the code.

And given the quality of open source tools available for most of these tasks, it is rare that an enterprise has to bother spending money on the tools.

In one of my previous posts, Ivy vs Maven, I mentioned that I do most of my dependency management using Ivy. It allows me to simply specify a dependency, such as version 7.12 of DB4O, and it will retrieve all the other dependencies that DB4O needs. A few pointers to Ivy-related material are below:

Automation for the people: Manage dependencies with Ivy

http://ant.apache.org/ivy/

It also has an Eclipse plug-in so that the dependencies that you specify in Ivy are used by Eclipse in your projects.

IvyDE plugin. http://ant.apache.org/ivy/ivyde/download.cgi
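
To make the dependency declaration mentioned above a little more concrete, here is a minimal sketch of an ivy.xml file. The project identity (`org.example`, `myapp`) is a placeholder, and the `com.db4o` / `db4o` coordinates are an assumption; check your repository for the coordinates actually published there.

```xml
<ivy-module version="2.0">
    <!-- Placeholder identity for the project being built -->
    <info organisation="org.example" module="myapp"/>
    <dependencies>
        <!-- Assumed coordinates for DB4O 7.12; verify against your repository -->
        <dependency org="com.db4o" name="db4o" rev="7.12"/>
    </dependencies>
</ivy-module>
```

With a declaration along these lines, Ivy reads the metadata published for the module and resolves its transitive dependencies as well.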

In my world, the advantage that Ivy has over Maven is that it's not tied to Maven's project structure the way Maven's dependency manager is. I can use it with Ant easily and powerfully. Maven, on the other hand, has a number of built-in design assumptions, such as "thou shalt only generate one artifact (jar, zip, etc.) per project". If you are able to design your project from scratch and have it fit into the Maven project structure and design assumptions AND you don't have any need to do any coding of extensions to Maven, then I would recommend using Maven. If not, I would recommend Ivy and Ant together!

These are all things that I and others have said before.

There are two areas of automated build systems, though, that often get overlooked. One I consider a solved problem and the other I consider a royal pain in the butt. Those two problems are artifact/metadata repositories and build system provisioning.

Let's talk about the solved one. Artifact/metadata repositories are the storage areas that Maven and Ivy go to for metadata on dependencies as well as for the artifacts themselves. The Apache Ivy project does not maintain any repositories itself, but Ivy comes able to talk to the Maven repositories. The Apache Maven project (or some related group of people) does maintain a repository. Of course, as with many volunteer-run projects, the coverage of metadata and artifacts can be spotty at times. Overall though, it is an incredible gift that these volunteers give us.

Now we get to the seamy underside: many open source projects, for one reason or another, are unable or unwilling to publish their artifacts and metadata in the Maven repositories. For some, like Google, it appears that many of the projects are not published to the Maven repository simply because their build system doesn't mesh well with the Maven toolset, so additional work would be required to post these artifacts and metadata during each release. For others, it is ideological: those projects avoid Maven as a build system and avoid doing anything to support the Maven "ecosystem".

So this means that many artifacts and metadata about those artifacts are not available out of the box when you are using the Maven or Ant+Ivy build system.

In addition, even if all the artifacts are in the Maven-run repositories, there is no guarantee that they will be available. There are many times in a week when the Maven repositories may experience slowdowns.

So I started by describing this as a solved problem. This is why: there is a set of companies out there that have put together repository software, such as Nexus and Artifactory, that acts as a repository as well as a proxy for other repositories.

At home on my build server I am running a copy of Nexus. I use Nexus primarily because, when I first tried out Maven repositories, Nexus was much more mature than Artifactory. I haven't revisited them in a while simply because I haven't run into anything that Nexus can't handle well.

Installing and running Nexus is straightforward (at least on my Ubuntu server). It is already pointed at the key Maven repositories in typical use. All that needs to be done otherwise is to point your Maven or Ivy installation at the Nexus repository rather than at the individual Maven repositories.
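
For Maven, one common way to do this is a mirror entry in settings.xml (typically ~/.m2/settings.xml). This is only a sketch, not a record of any particular setup: the id `nexus` is an arbitrary label, and the URL reuses the host and group path from the Ivy resolver line in this post, so substitute your own.

```xml
<settings>
    <mirrors>
        <mirror>
            <!-- Arbitrary label for this mirror -->
            <id>nexus</id>
            <!-- Route requests for every repository through Nexus -->
            <mirrorOf>*</mirrorOf>
            <!-- Assumed URL; replace with your own Nexus host and group -->
            <url>http://kukri:8081/nexus/content/groups/public</url>
        </mirror>
    </mirrors>
</settings>
```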

Pointing your Ivy installation at Nexus is as simple as adding the following line to the resolvers section of ivy-settings.xml:

<ibiblio name="nexus" m2compatible="true" root="http://kukri:8081/nexus/content/groups/public"/>

This tells Ivy to use the ibiblio resolver (ibiblio was the website that provided the first Maven repository) and to assume that the repository is Maven 2 compatible.
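
For context, here is a sketch of a complete settings file built around that resolver. Making `nexus` the default resolver is an assumption about how you would want lookups routed; a real settings file may chain several resolvers instead.

```xml
<ivysettings>
    <!-- Assumption: send all dependency lookups to the Nexus resolver by default -->
    <settings defaultResolver="nexus"/>
    <resolvers>
        <!-- Host name and path are taken from the resolver line above -->
        <ibiblio name="nexus" m2compatible="true"
                 root="http://kukri:8081/nexus/content/groups/public"/>
    </resolvers>
</ivysettings>
```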

The first time a build requests an artifact, Nexus downloads it to the local Nexus repository. From then on it is available regardless of whether the original Maven repositories are reachable.

I think it is obvious that if you are going to use dependency management as part of the build process in the enterprise, you need something like Nexus so that you are not at the mercy of Internet connectivity and website availability for your builds.

Does it cure everything? No. It is still a minor annoyance to deal with those artifacts that are not managed as part of any Maven repository and have to be manually uploaded to Nexus. It is not the upload process that is annoying; Nexus makes that easy. The real headache is keeping track of the artifacts that you may need to do this with.

Of course, once you have an enterprise artifact repository like Nexus installed you can just back that up.

Overall, in the business/commercial enterprise world, this solution works well as is. In the next posting I will discuss where this set of solutions is inadequate for a real world open source problem.
