Tuesday, February 9, 2010

Build Tools - An incomplete paradigm - part 1

Dependency Management, Provisioning, and Repositories

I love automated build systems. I like letting the computer do the same thing over and over so I don't have to. In some respects I think we have started to come out of the dark ages of programming, in that more people think in terms of continuous builds and unit testing than don't.

I didn't say that everybody does it, but it is a vocabulary that everybody has and can speak, even if they choose not to use it. Five to seven years ago, the common opinion was that automated builds were overkill that only the wealthiest companies could waste time on, and unit testing, while laudable, was something that most people didn't have time for.

Nowadays, when I interview, it is rare for me to encounter a company that does not have an automated build system and some flavor of testing the code.

And given the quality of the open source tools available for most of these tasks, it is rare that an enterprise has to bother spending money on the tools.

In one of my previous posts, Ivy vs Maven, I mentioned that I do most of my dependency management using Ivy. It allows me to simply specify a dependency, such as version 7.12 of DB4O, and it will retrieve all the other dependencies that DB4O needs. A few pointers to Ivy-related material are below:

Automation for the people: Manage dependencies with Ivy


It also has an Eclipse plug-in, so that the dependencies that you specify in Ivy are used by Eclipse in your projects.

IvyDE plugin. http://ant.apache.org/ivy/ivyde/download.cgi
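As a rough illustration of the kind of declaration described above, a minimal ivy.xml asking for version 7.12 of DB4O might look like the sketch below. The organisation and module names (com.example, my-app, com.db4o, db4o) are assumptions made for the example; the real coordinates depend on how the artifact is published in the repository you resolve against.

```xml
<ivy-module version="2.0">
    <!-- Hypothetical identity for the project being built -->
    <info organisation="com.example" module="my-app"/>
    <dependencies>
        <!-- org and name are assumed coordinates; rev is the version from the text -->
        <dependency org="com.db4o" name="db4o" rev="7.12"/>
    </dependencies>
</ivy-module>
```

Ivy reads the metadata published for that module and pulls in its transitive dependencies along with it.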

In my world the advantage that Ivy has over Maven is that it's not tied to Maven's project structure the way Maven's dependency manager is. I can use it with Ant easily and powerfully. Maven, on the other hand, has a number of built-in design assumptions, such as "thou shalt only generate one artifact (jar, zip, etc.) per project". If you are able to design your project from scratch and have it fit into the Maven project structure and design assumptions, AND you don't need to code any extensions to Maven, then I would recommend using Maven. If not, I would recommend Ivy and Ant together!
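For readers who have not seen the Ant and Ivy combination, here is a minimal build.xml sketch. It assumes the Ivy jar is already on Ant's classpath (for example in ANT_HOME/lib) and that an ivy.xml sits next to the build file; the project name, target name and lib directory are placeholders of my choosing.

```xml
<project name="my-app" default="resolve" xmlns:ivy="antlib:org.apache.ivy.ant">
    <!-- Assumes ivy.jar is on Ant's classpath so the ivy: tasks are found -->
    <target name="resolve" description="Fetch the dependencies declared in ivy.xml">
        <!-- Copies resolved artifacts into lib/, named by artifact and revision -->
        <ivy:retrieve pattern="lib/[artifact]-[revision].[ext]"/>
    </target>
</project>
```

Nothing here dictates a directory layout or the number of artifacts the build produces; the rest of the build file is ordinary Ant.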

These are all things that others and I have said before.

There are two areas of automated build systems, though, that often get overlooked. One I consider a solved problem and the other I consider a royal pain in the butt. Those two areas are artifact/metadata repositories and build system provisioning.

Let's talk about the solved one. Artifact/metadata repositories are the storage areas that Maven and Ivy go to for metadata on dependencies and for the artifacts themselves. The Apache Ivy project does not maintain any repositories itself, but Ivy comes able to talk to the Maven repositories. The Apache Maven project (or some related group of people) does maintain a repository. Of course, as with many volunteer-run projects, the coverage of metadata and artifacts can be spotty at times. Overall though, it is an incredible gift that these volunteers give us.

Now we get to the seamy underside: many open source projects, for one reason or another, are unable or unwilling to publish their artifacts and metadata in the Maven repositories. For some, like Google, it appears that many of the projects are not published to the Maven repository simply because their build system doesn't mesh well with the Maven toolset, so additional work would be required to post the artifacts and metadata with each release. For others, it is ideological: those projects avoid Maven as a build system and avoid doing anything to support the Maven "ecosystem".

This means that many artifacts, and the metadata about them, are not available out of the box when you are using a Maven or Ant+Ivy build system.

In addition, even if all the artifacts are in the Maven-run repositories, there is no guarantee that they will be available. There are many times in a week when the Maven repositories may experience slowdowns.

I started by describing this as a solved problem. This is why: there is a set of companies out there that have put together repository software, such as Nexus and Artifactory, that acts as a repository as well as a proxy for other repositories.

At home on my build server I am running a copy of Nexus. I use Nexus primarily because, when I first tried out Maven repositories, Nexus was much more mature than Artifactory. I haven't revisited the comparison in a while, simply because I haven't run into anything that Nexus can't handle well.

Installing and running Nexus is straightforward (at least on my Ubuntu server). It is already pointed at the key Maven repositories in typical use. All that needs to be done otherwise is to point your Maven or Ivy installation at the Nexus repository rather than at the individual Maven repositories.
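On the Maven side, one common way to do this is a mirror entry in the user's settings.xml. The sketch below is an assumption about how such a setup could look, not a copy of my own configuration: the id is arbitrary, and the URL is simply my Nexus public group, so substitute your own host.

```xml
<settings>
  <mirrors>
    <mirror>
      <!-- Arbitrary identifier for this mirror entry -->
      <id>nexus</id>
      <!-- Route requests for every repository through the mirror -->
      <mirrorOf>*</mirrorOf>
      <!-- Assumed Nexus group URL; replace host and port with your own -->
      <url>http://kukri:8081/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```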

Pointing your Ivy installation at Nexus is as simple as adding the following resolver line to the ivy-settings.xml:

<ibiblio name="nexus" m2compatible="true" root="http://kukri:8081/nexus/content/groups/public"/>

This tells Ivy to use the ibiblio resolver (ibiblio was the website that provided the first Maven repository) and to assume that the repository is Maven 2 compatible.
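For context, here is a sketch of a complete settings file built around that resolver. The surrounding elements are the standard Ivy settings structure; making "nexus" the default resolver is my assumption about how you would want it wired up, and the host name is specific to my network.

```xml
<ivysettings>
    <!-- Send all dependency resolution through the resolver named "nexus" -->
    <settings defaultResolver="nexus"/>
    <resolvers>
        <!-- Maven 2 compatible repository served by the local Nexus instance -->
        <ibiblio name="nexus" m2compatible="true"
                 root="http://kukri:8081/nexus/content/groups/public"/>
    </resolvers>
</ivysettings>
```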

After the first time a build runs, the artifacts it needs have been downloaded by Nexus to the local Nexus repository and are available from then on, regardless of whether the original Maven repositories are available.

I think it is obvious that if you are going to use dependency management as part of the build process in the enterprise, you need something like Nexus so that you are not at the mercy of Internet connectivity and website availability for your builds.

Does it cure everything? No. It is still a minor annoyance to deal with those artifacts that are not managed as part of any Maven repository and have to be manually uploaded to Nexus. It is not the upload process that is annoying; Nexus makes that easy. The real headache is keeping track of the artifacts that you may need to do this with.

Of course, once you have an enterprise artifact repository like Nexus installed, you can just back that up.

Overall, in the business/commercial enterprise world, this solution works well as is. In the next posting I will discuss where this set of solutions is inadequate for a real-world open source problem.

1 comment:

  1. Hi Jim,

    You may want to look again at Artifactory today - especially when using Ivy. Artifactory offers a couple of unique Ivy integrations - at present these include: searching inside the content of Ivy modules (equivalent to searching inside POMs), deploying from the UI to any path (unbounded by Maven conventions), Ivy module views and Ivy dependency snippets.