Tuesday, May 27, 2014

Groovy fun with @CompileStatic

by Richard Vowles

Groovy is one of the few languages that allow both static and dynamic compilation, and among the "big" languages on the JVM (Java, Groovy, Scala, JRuby and Clojure) it is the only one, as far as I am aware.

I now compile statically pretty much everywhere, unless I am doing Grails (we stopped at 2.1.13 because of the crazy buggy-ness of that platform, its dependence on Hibernate and the random magic insanity that happens). Just using Groovy in a basic Spring + AngularJS web framework has been empowering - I've really only been swapping back to non type safe code for tests. I still look at Mockito with pain.

One of the things that has bugged me however is Closures. If I wanted a callback to work, I lost the type safety. Consider this method:

  void listProfiles(User user, boolean favourite, String filter, Closure callback)

So now I am not telling the compiler what types are being passed to the callback. I thought about this problem tonight and remembered that a closure can be coerced to an interface with a single method - so I thought I'd try it:

    interface ListProfileCallback {
       void profile(Profile profile, boolean favourite)
    }

    void listProfiles(User user, boolean favourite, String filter, ListProfileCallback callback)

and sure enough - the type system kicks in and tells me I'm missing a parameter!
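
For anyone reading along in plain Java, here is a rough analogue of the same trick, with a lambda standing in for the Groovy closure. It is only a sketch: the Profile class and the cut-down listProfiles (which takes a list and filters by name prefix) are made up for illustration and are not the real code from my project.

```java
import java.util.ArrayList;
import java.util.List;

public class TypedCallbackDemo {

    // Hypothetical stand-in for the post's Profile type.
    static class Profile {
        final String name;
        Profile(String name) { this.name = name; }
    }

    // Single-method interface: the compiler knows exactly what the callback receives.
    interface ListProfileCallback {
        void profile(Profile profile, boolean favourite);
    }

    // Hypothetical, simplified listProfiles: reports each profile whose name
    // starts with the filter, passing the favourite flag through unchanged.
    static void listProfiles(List<Profile> profiles, boolean favourite, String filter,
                             ListProfileCallback callback) {
        for (Profile p : profiles) {
            if (p.name.startsWith(filter)) {
                callback.profile(p, favourite);
            }
        }
    }

    public static void main(String[] args) {
        List<Profile> profiles = new ArrayList<>();
        profiles.add(new Profile("alice"));
        profiles.add(new Profile("bob"));

        // Two parameters, matching the interface. A one-parameter lambda such as
        // `p -> ...` would be rejected by the compiler.
        listProfiles(profiles, true, "a",
                (p, fav) -> System.out.println(p.name + " favourite=" + fav));
    }
}
```

The point is the same as in the Groovy version: the callback's parameter list is part of a named type, so a mismatch shows up at compile time rather than at runtime.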



Friday, May 23, 2014

Gerrit, +2 Looks Good To Me and self plussing

by Richard Vowles

One of the weird things about Gerrit is that it appears to operate by default in nonsense mode.

What do I mean? Well, Gerrit is a code review tool. But the default setup of the tool allows you to accept your own reviews and submit them directly - you can push for review, +2, and then submit.

I mean, this blows my mind. Why on earth would this be allowed, by default, in a code review tool? I know some projects might see a good reason for it - I found a few discussions about them in Issue 308, which covers this very topic. Two people saying "I do this" seems to be enough to make it the default behaviour.

In my case, it was important that I could set this on the Permissions Project, so that it applies across all our projects. The example from the Gerrit Cookbook - which uses Prolog (that hurt my brain the last time I tried to learn it at University) - only allows you to do it on a per project basis. So I needed a submit_filter rather than a submit_rule.

So to put this somewhere so I don't forget it or lose it, here is my series of commands:

git clone ....
git fetch origin refs/meta/config
git checkout FETCH_HEAD
vi rules.pl
git add rules.pl
git commit -a -m "your message"
git push origin HEAD:refs/meta/config

In rules.pl you put:

submit_filter(In,Out) :-
  In =.. [submit | Ls],
  add_non_author_approval(Ls, R),
  Out =.. [submit | R].

add_non_author_approval(S1, S2) :-
  gerrit:commit_author(A),
  gerrit:commit_label(label('Code-Review', 2), R),
  R \= A, !,
  S2 = [label('Non-Author-Code-Review', ok(R)) | S1].


add_non_author_approval(S1, [label('Non-Author-Code-Review', need(_)) | S1]).

Enjoy.

PS this is about preventing gerrit users from moderating their own commits or reviews and allowing them to submit. Just in case this sentence makes it easier for Google Search to find it.



Saturday, May 17, 2014

Dealing with bugs in open source libraries

by Richard Vowles


One thing you have to do when dealing with open source libraries a lot is sorting out buggy libraries. Or libraries where you hit edge cases that you need to get fixed but you can't get it done in a timely fashion. Sometimes you hit fundamental technical problems where you know your fork won't be included in the upstream version and you just have to make a decision.

As +Robert Watkins points out in the comments, at least you can deal with these bugs. In closed source libraries (yes, some still use them), the problem is much harder to deal with, but I don't discuss that here.

This post is not about how to use Github to contribute back. There are lots of posts that do that. It's about what to do when you have to fix the problem right now.

This problem has particularly hit us in our Grails libraries - people scratch their own itch and then abandon the projects, they never wrote any tests, anything outside the basic functionality is painful, or the projects just don't get updated. The Grails Resources plugin had a problem where it didn't detect the scheme (http vs https) of an incoming request and re-write external javascript resources properly. We had to fix that, and years later it still hadn't been resolved upstream.

But just as often, we hit problems in other libraries - and they take time to resolve. People hit problems in libraries that I make ( +Peter Cummuskey for example) and although he makes a pull request, I may not have time to deal with it or may prefer to fix it another way (which also takes time). But generally the people on the ground just want it fixed. The software is usually free, so people are not under any obligation to take your fix or spend any time on it. That's not much comfort when you need the fix.

The mechanism we have evolved for dealing with this is helped by the fact that we use Apache Maven. Most of our third party (not written in house) libraries are locked up in what are called "composite poms". If they have problems we need to resolve, they immediately go into a composite and the problem is isolated at that point.

The tl;dr of this is simply:

  • put your third party dependencies in a composite-pom that you control
  • fork the dependency
  • fix the dependency, rename the groupId
  • release your own version to your own hosted Nexus or equivalent
  • update the composite-pom to point to your version and release it
  • version ranges will automatically update your projects, problem solved
  • put your fork on the technical debt register
  • when the developer maintaining the main artifact solves the problem, try using their new version, if it works, change your composite, remove technical debt
  • profit (ok, too many steps for Gnomes perhaps)


Composite Maven Projects

Simply put, a Composite Maven Project (or composite for short) is a Maven Artefact that is one file - a pom.xml file that contains a listing of dependencies. We don't use multi-module builds with parent dependencies, we put them in composites and use version ranges.

We usually lock down versions of third party libraries (for example [2.4] for commons-io) - it allows us to ensure that no other version leaks in without the build breaking (which can tend to happen). And yes, we want the build to break if someone starts fiddling with versions.

Composites are brought into projects via version ranges, so changes in them automatically flow down. If we upgrade the version of Jackson in composite-jackson from 1.9.8 to 1.9.9, everyone gets it automatically. If we change to 2.x of Jackson (a big change), we change the major version number of the pom so it doesn't automatically flow down.
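
To make that a bit more concrete, here is a rough sketch of what a composite and a project consuming it might look like. The com.example.composite groupId, the composite's own version number and the [1,2) range are made up for illustration; only the artifact name composite-jackson and the Jackson 1.9.9 version come from the description above.

```xml
<!-- composite-jackson/pom.xml: a pom-only artifact that just lists dependencies -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.composite</groupId>   <!-- hypothetical groupId -->
  <artifactId>composite-jackson</artifactId>
  <version>1.2</version>                     <!-- hypothetical version -->
  <packaging>pom</packaging>

  <dependencies>
    <dependency>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>jackson-mapper-asl</artifactId>
      <version>[1.9.9]</version>             <!-- hard lock: exactly 1.9.9 -->
    </dependency>
  </dependencies>
</project>
```

```xml
<!-- fragment of a consuming project's pom.xml -->
<dependency>
  <groupId>com.example.composite</groupId>   <!-- hypothetical groupId -->
  <artifactId>composite-jackson</artifactId>
  <version>[1,2)</version>                   <!-- any 1.x release of the composite -->
  <type>pom</type>
</dependency>
```

With a range like [1,2), a new 1.x release of the composite is picked up on the next build, while a 2.x release (say, for Jackson 2.x) is not.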

And no, this isn't my idea - as far as I'm aware, I'm 3rd hand for this. It works really well. If you use version ranges extensively, like we do - check out my release-pom plugin (which at release tells you exactly what you were using) and the bounds plugin (which brings up the lower bounds of the minor version ranges, Maven resolves much more quickly when you do this).

Forking

Many of the plugins or libraries we have traditionally used have been hosted in Subversion. This has made forking and keeping track of the changes relatively difficult. You have to check them out, check them in locally and then apply your set of patches on top. If a new version comes out you have to deal with that situation again. Submitting a patch is a time-consuming and painful process with Subversion.

Git, Github and Bitbucket have made this situation easier - forking is encouraged, a patch that fixes a bug can be submitted easily, and you can track your changes against the other repository and rebase against it. Even working with Subversion libraries can be easier - creating a new Github or Bitbucket repository from a Subversion one is easy, although contributing the changes back is more difficult.

However, timeliness is a problem - you need to be able to get your code into a tracked, versioned environment that allows you to reproduce repeatable builds and still deliver your software. And you need to do it now. Or within a reasonable period of time. It creates technical debt, but that can go on the technical debt register.

Maven Repository

Having a hosted Maven repository (we use Sonatype's Nexus) solves many of these issues. It allows us to fork these libraries: we typically rename the groupId to put the fork into our third party hierarchy, and then re-release it to our Nexus. This can be done by hand (for a one off - effectively duplicating the Release plugin) or, for a longer term solution, by taking our top level parent (which contains Release information) and pushing it into the pom.xml so it tags and pushes to our Nexus third party repository directly.

Tying it all together

Having forked the source, released the new artifact to our "third party" repository on our hosted Nexus, we then just have to change the composite. The composite points to the new, fixed artifact and the fixed artifact will, of course, automatically flow into all downstream projects that depend on it.

Attributions


  • +Michael McCallum for introducing me to composites and version range use and making Maven wonderful to use again
  • Post documenting this because I asked +Irina Benediktovich to do it for the logstash logback logger - which uses Jackson 2.x and our Grails apps use 1.9.9. She had to get rid of all that OSGi crap.
  • Photo from Jon Watson @ Flickr 

Thursday, May 15, 2014

Of conferences and things

by Richard Vowles

Planning boards from Illegal Argument Conference
I am notorious about how I feel about Conferences. In fact, I have gotten into trouble with people over being outspoken particularly about face-forward conferences, and with the recent Code Mania, I thought I would take the time to explain my point of view.

Now don't get me wrong - people who organise tech conferences are showing the best side of humanity in my opinion. They almost never get rewarded for the effort they put in and there are always critics, like me, who complain, so they do a job that few people, except those who have organised tech conferences, understand.

What prompted me to post this is that I was listening to a podcast of In Beta on the way home - now that they have stopped talking about Twitter they actually have some interesting topics. But I asked myself - would I rather listen to a podcast on the way home, or would I rather have a person experienced in a topic I was interested in in the car with me on the way home? Would I rather have a dialog or passively listen?

I asked the same question of my wife, who loves Science oriented podcasts - would you prefer to listen to the Naked Scientists or would you prefer to have them in your vehicle as you drive so you could have discussions with them?

Face forward conferences - where the speaker is up front sharing slides and/or code and talking to them - are to me like podcasts you can't fast forward through. Sometimes that works when someone is an expert about a topic you know nothing about, but want to (e.g. distributed logging). However, isn't a video a better use of your time? You can fast forward through the rhetoric, through the justification and get to the meat. Take a post from +Michael Mahemoff - who shared a link to what appears to be a face-forward presentation from Open World Forum on a particular open source monitoring stack (or combination of tools is probably a better phrase) based on Logstash and a few others. Now in itself, this to me is a five minute presentation - less the rhetoric.

Sometimes however, you want the justifications - they help with convincing those in charge of purse strings that something is worth doing.

Given the "watch a video" delivery vs attending "someone delivering a video to you in real life", who goes to these kinds of conferences? There appear to be two types (obviously a generalization) - the corporate employee who has an allocated training budget and the socialiser who primarily goes for the hallway conversations. We could say all sorts of things about people who choose to spend their training budgets on conferences at least 50% filled with content they cannot apply to their jobs, but really, isn't the second type of person better served by a conference where all you actually do is talk in the hallway? Wouldn't a mechanism to allow people who want to talk in the hallway about some particular topic work better with a little more organisation? Oh, that would be an Unconference.

Unconferences tend to allow groups of people who want to talk about similar kinds of topics to get together and talk about those topics. They tend to work well when they are wildly generic around a certain area and solutions to the problems are not well known. How vague is that? Let's go to where they don't tend to work - or they work, but not for a certain kind of person. Unconferences tend to focus on sharing of experiences and showing tidbits of work - this is good for people who haven't solved those problems or are on the way to solving them, but it is terribly dull for those who have. Experts in most fields tend not to attend a session in an unconference on a topic they are expert in because they aren't learning anything, so you get the almost solved or generally interested attending such sessions. Sometimes people who don't perceive themselves as experts, but who have a wealth of background and are passionate (e.g. when talking about testing or CI), still attend alongside many inexperienced people, and those sessions are often gold.

Unconferences can work, but they don't when people are just trying to bolster numbers vs actually focus on a topic or set of topics where people haven't solved those problems yet. The best unconferences actually decide their topics up front based on who is attending (such as Citcon - that's pronounced Kitkon) so they get focus, and that lets people bail if they aren't interested in those topics rather than wasting a day. It also lets people who are interested in those topics really get value out of it. However when the topic is fairly narrow (Citcon is continuous integration and testing, a topic that largely hasn't changed in the last five years) then going to more than two of these conferences can get pretty dull.

But hold on? Aren't those the only two models?

No, not really, you have code camps (1+ days), micro code camps (like the Code Lounge that I run) which usually last half a day to a day, there are all sorts of tech focused events.

I run the Code Lounge like I do because it allows me to suggest topics that I am interested in, list them, garner interest and then schedule them. They are focused, short and to the point. And the only people to attend are those that actually are interested in the topic.

If I were to run a conference again (and I ran two, a face forward and then an unconference), what would it look like? I've talked quite a bit about that with +Mark Derricutt and +Greg Amer on +Illegal Argument when I was still on that podcast. For me it would go something like this:


  • Organiser(s) determine general topics and call for suggested papers
  • People vote on papers and others offer to present based on remuneration, etc
  • Papers once agreed on by Organiser(s) are put out as a schedule - if it's free, people just vote, otherwise they have to stump up some cash - or the supporting companies have to, so that the people producing papers can create videos of their content
  • Rounds happen where Organisers help coach presenters on how best to present their content, material is uploaded and conference attendees are able to see.
  • People are allowed to post questions for discussion during conference, they can be voted on
  • On the day, the presenter gives a 5 minute overview of their presentation and starts going through the questions from top to bottom discussing and taking points from the attendees
  • Ample free corridor and work-together time is to be had

Organisers can always sell on (with a cut to the presenters) the video and Q&A session of the conference.

That's possibly a lot more work, but it would be really interesting to go to such a blended conference.

Saturday, November 2, 2013

Finding well known resources in the JVM

by Richard Vowles

A lot of people don't seem to know this, but it is actually extremely easy to find resources in the JVM without class path scanning.

Quite a few people use a scanning library or implement one themselves (which, given the number and variety of ways you can specify a classpath, is always going to miss something out). But if you have a well known resource, then it's pretty easy to gather up all instances of it in all of the different jars that make up your application, and this is true if you are using Groovy, Java, Scala, Jython or JRuby (or any of the others, as long as you can get access to a class loader).

Since 1.2 of Java, there has been this method in the ClassLoader class:

    public Enumeration<URL> getResources(String name) throws IOException;

So all you need to do is call (note that ClassLoader resource names take no leading slash):

    getClass().getClassLoader().getResources("META-INF/myfiles.properties")

for example and all copies of that will be returned. It may not be the most efficient implementation, but it is the most well supported and it comes built into the JVM, no extra dependencies required.
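
As a minimal sketch of how that might be wrapped up, the class below collects every copy of a named resource into a list. The class name WellKnownResources and the helper findAll are my own names for illustration, and META-INF/myfiles.properties is just the example resource from above; nothing here is specific to any library.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class WellKnownResources {

    // Returns the URL of every copy of the named resource that the given
    // class loader can see. The name is classpath-relative and must not
    // start with a slash.
    public static List<URL> findAll(ClassLoader loader, String name) throws IOException {
        return Collections.list(loader.getResources(name));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = WellKnownResources.class.getClassLoader();
        // Prints one line per jar or directory on the classpath that
        // contains the file, and nothing if none do.
        for (URL url : findAll(loader, "META-INF/myfiles.properties")) {
            System.out.println(url);
        }
    }
}
```

Each URL can then be opened with url.openStream() and loaded into a java.util.Properties, or whatever format your well known resource uses.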

Sunday, August 25, 2013

Groovydoc and Maven

by Richard Vowles


One of the things we discovered this last week, when trying to release to Apache Maven Central, was that javadoc (which is required for every artifact that has code in it) was not being generated for our Groovy artifacts.

Now in the past, we have used GMaven, which generated stubs (albeit poorly) and those caused the javadoc to trigger off, but since we moved to the Eclipse Maven plugin, this no longer worked - it compiles them both together. So we needed to start using actual Groovydoc - and it turned out, on a brief search, that there was no Maven plugin for Groovydoc.

There was however an Ant plugin, but having an Ant script, although nice that it can be dropped back to, is usually somewhat error prone and can lead to difficult configuration problems. That seemed to be the case for the Ant plugin as well, with quite a few people not being able to get it working. One of the good things however that I discovered was that the real work was in fact done outside of the Ant Groovydoc task - it just collected the information and passed it on.

The one thing I learned however, and the reason I am writing this up, is that source paths must be relative. If you pass a path that ends up having the full path to the file in it, Groovydoc will treat it as being in the default package - and you will get a whole lot of classes in your DefaultPackage in the generated documentation. If you encounter this problem when using Ant or Gradle, then this will be the reason - make sure you use paths relative to the directory you run the build script from.

I have sorted this problem out in the Maven build, and it will pick up all source directories that get added and include them - this means generated sources and anything you add with the build helper Maven plugin.

The documentation and source are over on Github, and the artifact is in Maven Central.

Saturday, August 10, 2013

Latest plugins on Apache Maven Central

by Richard Vowles

Blue Train Software Plugins


I've made a couple of plugins lately for the projects that we use at Group Applications at the University of Auckland. I decided they would be better as open source plugins, so I did them on my own time, and they address two aspects of the lifecycle of Maven projects that we build these days.

The Release POM Plugin

The first one, the Release POM plugin, is specifically designed to generate a single pom with all transitive dependencies resolved. All dependencies also add an exclusion clause for all dependencies of their own.

Why? Because sometimes, particularly when you are patching a production artifact, you need to make sure all dependencies stay exactly the same, except for the one you are changing. That is to me what a patch is, and that is very hard to do, as Maven doesn't actually store the versions of the artifacts you use. 

This is exacerbated because we use version ranges, which greatly aids development, bug fixing, feature enhancement and general working within the team, but it also means you need to make sure versions are locked down once you do an actual release intended for production. 

The release pom plugin and its documentation are on Github.

The Karma Runner Plugin


This one is for our Javascript. We use AngularJS, but it can be used with any Javascript library - it just works very well with AngularJS. It requires Node.js, as the Karma Runner project is built on it.

Much of what the Karma Runner Plugin does could be done with a considerable amount of manual setup in a Maven pom. The plugin just assumes you are using a war or (preferably) Servlet 3 JAR setup: it scans your dependencies, unpacks the javascript, re-writes the Karma config file, brings in local developer overrides (e.g. browser setups) and runs your tests. It means that with minimum fuss, and maximum compatibility, we can run our Jasmine tests for AngularJS. In our case, the definition of the plugin is in the parent of every Servlet 3 jar project, and nothing else needs to be configured for an individual project.

If you use Karma in a Maven project lifecycle, it really is a useful plugin.

The Karma Runner plugin is on Github.