Distributed version control systems

Distributed version control systems (DVCSs) have been around for many years and keep growing in popularity. Many projects, however, still use a traditional, centralized version control system (VCS) such as Subversion. Until recently, Subversion was the only VCS I had worked with. It certainly has its flaws and problems, but it has mostly got the job done over the years. Early this spring I started contributing to the JBoss ShrinkWrap project, which uses a DVCS in the form of Git. The more I have worked with Git, the more aware I have become of the problems Subversion imposes. The biggest transition for me has been adopting the new mindset that a DVCS brings. I suddenly realized that my daily work had very often been shaped by the way the VCS worked, rather than by what feels natural to me as a developer. I think this is one of the key benefits of a DVCS, and you become aware of it as soon as you start using one.

While a traditional VCS can be sufficient for many projects, a DVCS brings interesting new dimensions and possibilities to version control.

What is a distributed version control system?
The fundamental idea of a DVCS is that each user keeps a complete, self-contained repository on his or her own computer. There is no need for a central master repository, even though most projects have one, e.g. to allow continuous integration. This gives a DVCS the following characteristics:

* Rapid startup. Install the DVCS of choice and start committing instantly into your local repository.
* As there is no need for a central repository, you can pull individual updates directly from other users. They do not have to be checked in to a central repository first (even if you use one), as they would in Subversion.
* A local repository gives you the flexibility to try out new things without sending them to a central repository, and thereby making them available to others, just to get them under version control. For example, there is no need to create a branch on a central server for this kind of work.
* You can select which updates you wish to apply to your repository.
* Commits can be cherry-picked, which means that you can select individual patches/fixes from other users as you like.
* Your repository is available offline, so you can check in, view project history etc. regardless of your Internet connection status.
* A local repository allows you to check in often, even when your code might not compile, to create checkpoints of your current work, all without interfering with other people’s work.
* You can change history: modify, reorder and squash commits locally as you like before other users get access to your work. In Git this is done by rebasing.
* DVCSs are more fault-tolerant, as many copies of the repository exist. If a central/master repository is used, it should still be backed up.
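
A few of the points above can be tried out in a throwaway repository. The following is only a minimal sketch using Git: the file names, the branch name `experiment` and the identity values are made up for illustration, and nothing in it contacts a server.

```shell
set -eu

# Create a local repository in a temporary directory; no server is involved.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
git config commit.gpgsign false

# Commit straight into the local repository.
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Experiment on a local branch; no branch on a central server is needed.
git checkout -q -b experiment
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix: add fix.txt"
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "WIP: not ready yet"

# Back on the original branch, cherry-pick only the fix
# (the second-newest commit on the experiment branch).
git checkout -q "$main_branch"
git cherry-pick experiment~1 >/dev/null

# History is available offline.
git log --oneline
```

After this, the original branch contains the initial commit and the fix, while the unfinished work stays on the `experiment` branch.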

One of the biggest differences I’ve noticed between Git and Subversion is not listed above: speed. The speed of Git has really blown me away; it feels like comparing a Bugatti Veyron (Git) with an old Beetle (Subversion). A project that takes minutes to check out from a central Subversion repository takes seconds to clone with Git. Once, I actually had to verify that my file system contained all the files Git told me it had downloaded, because it went so incredibly fast! I want to emphasize that Git is not only faster at the initial download/checkout; committing, retrieving history etc. are faster too, since they work against the local repository.

Squashing commits with Git
Being able to change history is something I’ve longed for during all these years of working with Subversion. With a DVCS, it is possible! When working on a new feature, I have usually wanted to commit my progress (as checkpoints, mentioned above), but in a Subversion environment every commit goes straight to the shared repository, which would mess things up for other team members. Git gives me the freedom to do what I’ve wanted all along: commit small, incremental changes to the code base without disturbing other team members. For example, I could add a new method to an interface, commit it, start working on the implementation, commit often, work some more, commit some more, then realize that I need to rethink parts of the implementation, revert a couple of commits, redo the implementation, commit, and so on. All this without disturbing my colleagues working on the same code base. When I am ready to share my work, I don’t necessarily want to bring along all the small commits I made during development, e.g. a commit that only adds Javadoc to a method. With Git I can squash, which means combining several commits into one, e.g. my latest 5 commits, which I can then share with other users. I can even rewrite the commit message, which I think is a very neat feature.

Example: Squash the latest 5 commits on the current branch
$ git rebase -i HEAD~5

This opens the list of commits in your configured editor (often vi, which I assume you are familiar with). The commits are listed oldest first. Leave the first commit as pick and change pick to squash on the remaining lines, so that this:

pick c528b95 Added new interface method
pick 368b3c0 Began stubbing out interface implementation
pick 7508090 More work on implementation
pick 032aab2 Refactored
pick 79f4edb Work done on new feature

becomes:

pick c528b95 Added new interface method
squash 368b3c0 Began stubbing out interface implementation
squash 7508090 More work on implementation
squash 032aab2 Refactored
squash 79f4edb Work done on new feature

On the next screen, Git shows the messages of all the squashed commits:

# This is a combination of 5 commits.
# The first commit's message is:
Added new interface method
# This is the 2nd commit message:
Began stubbing out interface implementation

Delete or comment out the lines you don’t want and write a more suitable commit message:

# This is a combination of 5 commits.
# The first commit's message is:
Finished work on new feature
#Added new interface method
# This is the 2nd commit message:
#Began stubbing out interface implementation

Save and close the editor to execute the squash. This leaves you with a single commit carrying the message you provided. You can now share this single commit with other users, e.g. by pushing it to the master repository (if one is used).
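
The interactive rebase needs an editor, so it is awkward to show as a script. The sketch below reaches the same end result (five checkpoint commits collapsed into one) with a different technique, `git reset --soft` followed by a new commit. It is not the rebase workflow described above, and the repository, file name and commit messages are invented for the example.

```shell
set -eu

# Throwaway repository with one base commit.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
git config commit.gpgsign false

echo "base" > feature.txt
git add feature.txt
git commit -q -m "Base commit"

# Five small checkpoint commits, as during normal development.
for i in 1 2 3 4 5; do
  echo "step $i" >> feature.txt
  git commit -q -am "Checkpoint $i"
done

# Move the branch back five commits while keeping all changes staged...
git reset -q --soft HEAD~5
# ...and record them as a single commit with a new message.
git commit -q -m "Finished work on new feature"

git log --oneline
```

The log now shows the base commit plus one combined commit. As with rebasing, this rewrites history, so it should only be done on commits that have not yet been shared.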

Another interesting aspect of DVCSs is that if you use a master repository, it won’t get hit as often, since you commit locally and squash things together before sending them upstream. This makes DVCSs more attractive from a scalability point of view.

A DVCS does not force you to have a central repository, and every user has their own local repository with full history. Users can work and commit locally before sharing code with others. If you haven’t tried a DVCS yet, do it! It really is as easy as stated earlier: download, install and create your first repository. The concepts of a DVCS may be confusing for a non-DVCS user at first, but there are plenty of tutorials and “cheat sheets” out there which cover both the most basic and the more advanced tasks. You will soon discover many nice features in the DVCS of your choice, making it harder and harder to go back to a traditional VCS. If you have experience with DVCSs, please share it!

Tommy Tynjä

An introduction to Java EE 6

Enterprise Java is really taking a giant leap forward with its latest specification, Java EE 6. What earlier (more or less) required third-party frameworks is now available straight out of the box in Java EE. EJBs, for example, have gone from being cumbersome and complex to easy and lightweight, without compromising on functionality. For the last few years, every single project I’ve worked on has in one way or another incorporated the Spring framework, and especially its dependency injection (IoC) container. One of the best things about Java EE 6, in my opinion, is that it now provides dependency injection straight out of the box through the CDI (Contexts and Dependency Injection) API. With this easy-to-use, standardized and lightweight framework, I can see how many projects could move away from depending on Spring for this reason alone. CDI is not enabled by default; to enable it you need to put a beans.xml file in the META-INF or WEB-INF folder of your module (the file can be empty though). With CDI enabled you can simply inject your dependencies with the javax.inject.Inject annotation:

@Stateless
public class MyArbitraryEnterpriseBean {

   @Inject
   private MyBusinessBean myBusinessBean;
}

Also note that the above POJO is actually a stateless session bean thanks to the @Stateless annotation! No mandatory interfaces or ejb-jar.xml are needed.
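
As mentioned, CDI only becomes active once a beans.xml file is present. The following is a minimal sketch of adding an empty one to a web module. It assumes a Maven-style layout (src/main/webapp/WEB-INF) and uses a temporary directory as a stand-in for the project root; both are assumptions made for the example rather than requirements.

```shell
set -eu

# Hypothetical project root; a temp directory keeps the sketch self-contained.
project=$(mktemp -d)

# Assumed Maven WAR layout. For a JAR module the file would instead
# go into src/main/resources/META-INF.
mkdir -p "$project/src/main/webapp/WEB-INF"

# An empty file is enough to enable CDI for the module.
: > "$project/src/main/webapp/WEB-INF/beans.xml"

find "$project" -name beans.xml
```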

Working with JSF and CDI is just as simple. Imagine that you have the following bean, where the javax.inject.Named annotation gives the CDI bean a name that JSF pages can refer to:

@Named("controller")
public class ControllerBean {

   public void happy() { ... }

   public void sad() { ... }
}

You could then invoke the methods from a JSF page like this:

<h:form id="controllerForm">
      <h:commandButton value=":)" action="#{controller.happy}"/>
      <h:commandButton value=":(" action="#{controller.sad}"/>
</h:form>

Another nice feature of Java EE 6 is that EJBs can now be packaged inside a WAR file. That alone can save you from many packaging headaches, and it is another step towards making Java EE lightweight.

If you are working with servlets, there is good news for you. The notorious web.xml is now optional, and you can declare a servlet as easily as:

@WebServlet("/myservlet") // the URL pattern is only an example
public class MyServlet extends HttpServlet { ... }

To start playing with Java EE 6 using Maven, you can run mvn archetype:generate and select one of the jee6-x archetypes, e.g. jee6-basic-archetype, to get a basic Java EE 6 project structure.

Personally, I believe Java EE 6 is breaking new ground in terms of enterprise Java. Java EE 6 is what J2EE was not, i.e. easy, lightweight, flexible and straightforward, and it has a promising future. Hopefully Java EE will from now on be the natural choice when building applications, rather than depending on a wide selection of third-party frameworks, as has been the case in the past.


Tommy Tynjä