Monday, April 30, 2012

Critical failures of Apache Maven Central

by Richard Vowles

Today's failure of the logback metadata really hit us hard and resulted in significant downtime. What was the issue? It's one that appears to happen relatively frequently: someone releases a new version of their library and the release process blows away the Maven metadata, making it appear to anything checking the metadata that only one version is available, the current one.

As at the time of writing, the Maven metadata for logback-classic looks like this:


<metadata>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <versioning>
    <latest>1.0.2</latest>
    <release>1.0.2</release>
    <versions>
      <version>1.0.2</version>
    </versions>
    <lastUpdated>20120426133004</lastUpdated>
  </versioning>
</metadata>

We are using the version range [0.9.17] for Grails 1.3.7 projects and [1.0.1] for Grails 2.0.3 projects. This is good Maven hygiene: we want the build to fail if these specific versions are unavailable for a good reason (such as no repository being reachable), not for a bad reason like someone breaking the repository. We don't want Maven to choose a different version; we want that specific version.
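For readers less familiar with the syntax, here is a minimal sketch of what such a single-version range looks like in a POM. It assumes the pinned dependency is logback-classic (the coordinates are taken from the metadata above); our actual POMs are not reproduced here.

```xml
<!-- pom.xml fragment (illustrative; assumes logback-classic is the pinned artifact) -->
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <!-- [1.0.1] is a hard requirement: exactly 1.0.1 and nothing else.
       A bare 1.0.1 would only be a soft, preferred version. -->
  <version>[1.0.1]</version>
</dependency>
```

Because the bracketed form is a range, Maven consults maven-metadata.xml to find out which versions exist, which is exactly where a truncated versions list hurts.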

When you use version ranges, you need the metadata, because that is what tells Maven which versions are available. If you don't specify a range, Maven shouldn't need the metadata. However, with Maven 3.0.4 today, that turned out not to be the case for some reason.

What's worse is that when we tried to work around the problem by relaxing the version ranges, it didn't work. It just got worse and weirder. Eventually, thanks to a brainwave of Michael McCallum's, we installed the artifact in our 3rd party repository and moved that repository ahead of Central in Nexus, which allowed us to get back to work.

Now if this were the first time this had happened, it wouldn't ring such alarm bells, but it isn't. It's at least the third (SOLR being the first time we hit it). I'm now in the unenviable position of having to discuss with my team whether direct Apache Maven Central access should be banned. Under such a ban, an artifact that hasn't been taken from Central and put into our 3rd party repo can't be used. We just cannot afford the downtime.
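One way such a ban could be enforced on the client side is a catch-all mirror in each developer's settings.xml, so that every repository request goes through the in-house Nexus. This is only a sketch: the mirror id and URL below are hypothetical placeholders, not our real configuration.

```xml
<!-- ~/.m2/settings.xml fragment (id and url are hypothetical placeholders) -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-nexus</id>
      <!-- "*" routes requests for every repository, Central included, to this mirror -->
      <mirrorOf>*</mirrorOf>
      <url>http://nexus.example.com/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```

Whether Central is then proxied at all, or only the hand-installed 3rd party artifacts are served, becomes a decision made once in Nexus rather than in every build.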

What I don't understand is why there is no automatic repair process for Central. It's a critical resource, which we enormously appreciate, but the value of accessing it directly is much diminished after today's extreme waste of productive, valuable time.

Update: Another one - http://repo1.maven.org/maven2/woodstox/stax2/2.1/


