Docker on Mac OS X using CoreOS

Docker is on everybody’s lips these days. It’s an open-source software project that leverages the resource isolation features of the Linux kernel to let independent, so-called containers run within a single Linux instance. This avoids the overhead of virtual machines and hypervisors while still keeping the containers isolated from each other. Docker is therefore a feasible approach for automated and scalable software deployments.

Many developers (including myself) nowadays develop on Mac OS X, which is not Linux. It is possible to use Docker on OS X, but one should be aware of what this implies. As OS X is not based on Linux, it lacks the kernel features needed to run Docker containers natively, so you still need a Linux host somewhere. Docker provides boot2docker, which is essentially a Linux distribution (based on Tiny Core Linux) built specifically for running Docker containers.

If you want a more general Linux VM, for instance because you want to use it for other tasks than just running Docker containers, an alternative to boot2docker is to use CoreOS as your Docker host VM. CoreOS is quite lightweight (although bigger than boot2docker) and comes bundled with Docker. Setting up a fresh CoreOS instance to run with Vagrant is easy:

mkdir ~/coreos
cd ~/coreos
echo 'VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "coreos"
  config.vm.box_url = "http://storage.core-os.net/coreos/amd64-generic/dev-channel/coreos_production_vagrant.box"
  config.vm.network "private_network",
    ip: "192.168.0.100"
end' > Vagrantfile
vagrant up
vagrant ssh
core@localhost ~ $ docker --version
Docker version 0.9.0, build 2b3fdf2

Now you have a CoreOS Linux VM available which you can use as your Docker host.

If you want to share a directory between OS X and your CoreOS host, add the following line to the Vagrantfile, replacing the paths with ones that exist on your system:
config.vm.synced_folder "/Users/tommy/share", "/home/core/share", id: "core", :nfs => true, :mount_options => ['nolock,vers=3,udp']

Happy hacking!

Past and upcoming events

We have been unusually busy at Diabol during the past few months, speaking at various software conferences on a variety of Continuous Delivery related topics. We enjoy sharing our experiences and meeting new and familiar faces to discuss topics that we’re passionate about, such as Continuous Delivery, DevOps and automation.

We have also arranged a Continuous Delivery seminar of our own, which attracted 20 top IT-management professionals from various well known Swedish enterprises. The seminar was a great success, with interesting presentations and good discussions among the attendees.

The next upcoming event where we will be presenting is the first edition of the Continuous Delivery Conference in Bussum, Netherlands on December 4th. Andreas Rehn will present “From dinosaur to unicorn in 12 months: how to push continuous delivery maturity to the next level”.

Past events where we have been presenting lately, together with video recording or presentation material:

If you plan to attend a conference where we’re speaking or just attending, come by and say hi! We look forward to talking to you!

A comment on ‘Continuous Delivery Pipelines: GoCD vs Jenkins’

In Continuous Delivery Pipelines: GoCD vs Jenkins, there are some good points on modeling continuous delivery. But to make the point that Go is better (at CD) than Jenkins, the author chooses to represent Jenkins with the Build Pipeline Plugin and not the Delivery Pipeline Plugin by Diabol, which has superior visualization. I haven’t used Go and I would like to find the time to take a deeper look at it. As far as I gather, it’s a competent tool, quite possibly better at CD than Jenkins. Still, I have to say that it is quite possible to do CD and pipelines in Jenkins; I am doing it as I write.

The post brings up some interesting points, like: do we need pipelines as first class citizens rather than just ‘visual doodles’ (in my tools)? I am a bit skeptical. Pipelines are indeed central in CD, but my thoughts on what’s lacking in Jenkins go in a different direction. I think we need a set of tools to do CD and pipelines.

Pipelines should probably be expressed outside of the (build/CI/deploy) tool because they will span several domains and abstraction levels. There will be a need for different visualizations of the pipeline state and configuration. In fact, what matters at run time (as opposed to design time) is not the pipeline itself, but questions like:

  • Where is my commit?
    • Example: It’s passed commit phase and has been deployed to the first environment where automated tests are running
  • What is the state of this environment?
    • Example: It has version 1.4.123 of app X stemming from commit Y and version 1.3.567 of the VM template stemming from commit Z.

The pipeline will also be subject to continuous improvement. Then, at design time (and debug time), there are questions like:

  • What are the up-stream dependencies of this job?
  • Where is this artifact produced?

Your tools should help you answer questions like that, simply and quickly, by good visualization and design.
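As a rough illustration of the design-time question about up-stream dependencies, here is a minimal Java sketch that models jobs as a graph and walks it backwards. The class name PipelineGraph, its methods and the job names are all made up for this example; a real tool would read the graph from its own job configuration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: the job names and the graph shape are hypothetical.
final class PipelineGraph {

    // Maps each job to the jobs that feed directly into it.
    private final Map<String, Set<String>> directUpstream = new LinkedHashMap<>();

    // Records that 'upstreamJob' triggers or feeds 'downstreamJob'.
    void addEdge(String upstreamJob, String downstreamJob) {
        directUpstream.computeIfAbsent(downstreamJob, k -> new LinkedHashSet<>()).add(upstreamJob);
    }

    private Set<String> direct(String job) {
        return directUpstream.getOrDefault(job, Collections.<String>emptySet());
    }

    // Returns all direct and transitive upstream jobs of 'job'.
    Set<String> upstreamOf(String job) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(direct(job));
        while (!todo.isEmpty()) {
            String current = todo.pop();
            // 'seen' also protects against cycles in a misconfigured graph.
            if (seen.add(current)) {
                todo.addAll(direct(current));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        PipelineGraph graph = new PipelineGraph();
        graph.addEdge("commit", "deploy-ci");
        graph.addEdge("vm-template", "deploy-ci");
        graph.addEdge("deploy-ci", "deploy-test");
        System.out.println(graph.upstreamOf("deploy-test"));
    }
}
```

The same structure, walked in the other direction, would answer the run-time question of how far a given commit has travelled.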

Diabol proudly presents Continuous Delivery seminar

Diabol is proud to arrange a seminar completely dedicated to Continuous Delivery, to be kicked off in less than a week on September 30th in Stockholm. This is an exclusive invite-only event where the top IT-management attendees will learn how Continuous Delivery can help their organization become more efficient in developing and delivering software. Our hand-picked speakers will present how Continuous Delivery and delivery process automation have turned their respective organizations into lean business machines. Instead of dealing with the painful, manual, repetitive tasks commonly associated with a traditional release and deploy process, their employees can now focus on innovation and creating business value.

Event speakers:

  • Stefan Berg, former CIO at Com Hem will present: “From average to top performer in less than a year!”
  • Tomas Riha, Agile Architect at Volvo Group Telematics will present: “From hobby project to Continuous Delivery as a Service for the entire organization”

Make sure you keep visiting this channel for more news on Continuous Delivery!

Diabol now a Cloudbees partner

Diabol is now a Jenkins gold service partner to Cloudbees: Diabol AB partner page Cloudbees.com

Cloudbees, the ‘Jenkins Enterprise company’, is a continuous delivery (CD) leader. They provide solutions that enable IT organizations to respond rapidly to the software delivery needs of the business. Their offerings are powered by Jenkins CI, the world’s most popular open source continuous integration (CI) server. The CloudBees CD Platform provides a range of solutions for use on-premise and in the cloud that meet the security, scalability and manageability needs of enterprises. Their solutions support many of the world’s largest and most business-critical deployments.

Diabol is proud to collaborate with Cloudbees.

Agile Configuration Management – part 1

On June 5 I held a lightning talk on Agile Configuration Management at the Agila Sverige 2014 conference. The 10-minute format does not allow for digging very deep. In this series of blog posts I expand on this topic.

Over the last year I have led a long journey towards agile and devopsy automated configuration management for infrastructure at my client, a medium-sized IT department. It’s part of a larger initiative of moving towards mature continuous delivery. We sit in a team that traditionally has been responsible for maintaining the test environments, but as part of the CD initiative we’ve been pushing to transform this into providing and maintaining a delivery platform for all environments.

The infrastructure part was initiated when we were to set up a new system and had a lot of machines to configure for that. Here was a golden window of opportunity to introduce modern configuration management (CM) automation tools. Note that nobody asked us to do this, it was just the only decent thing to do. Consequently, nobody told us what tools to use and how to do it.

The requirement was thus to configure the servers up to the point where our delivery pipeline, implemented with Jenkins, could deploy the applications, and to maintain them. The main challenge was that we needed to support a large number of Java web applications with slightly different configuration requirements.

Goals

So we set out to find tools and build a framework that would support agile and devopsy CM. We’re building something PaaS-like. More specifically the goals we set up were:

  1. Self service model
    It’s important to not create a new silo. We want the developers to be able to get their work done without involving us. There is no configuration manager or other command or control function. The developers are already doing application CM, it’s just not acknowledged as CM.
  2. Infrastructure as Code
    This means that all configuration for servers is managed and versioned together as code, and the code and only the code can affect the configuration of the infrastructure. When we do this we can apply all the good practices we know well from software development such as unit testing, collaboration, diff, merge, etc.
  3. Short lead times for changes
    Short means minutes to hours rather than weeks. Who wants to wait 5 days rather than 5 minutes to see the effect of a change? Speeding up the feedback cycle is the most important factor for being able to experiment, learn and get things done.

Project phases

Our journey had different phases, each with its own context, goals and challenges.

1. Bootstrap

At the outset we address a few systems and use cases. The environments are addressed one after the other. The goal is to build up knowledge and create drafts for frameworks. We evaluate some, but not all tools. Focus is on getting something simple working. We look at Puppet and Ansible but go for the former, as Ansible is very new and not yet at 1.0. The support systems, such as the Puppet master, are still manually managed.

We use a centralized development model in this phase. There are few committers. We create a svn repository for the Puppet code and the code is all managed together, although we luckily realize already at this point that it must be structured and modularized, inspired by Craig Dunn’s blog post.

2. Scaling up

We address more systems and the production environment. This leads to the framework expanding to handle more variations in use cases. There are more committers now as some phase one early adopters are starting to contribute. It’s a community development model. The code is still shared between all teams, but as outlined below each team deploys independently.

The framework is a moving target and the best way to not become legacy is to keep moving:

  • We increase automation, e.g. the Puppet installations are managed with Ansible.
  • We migrate from svn to git.
  • Hiera is introduced to separate code and data for Puppet.
  • Full pipelines per system are implemented in Jenkins.
    We use the Puppet dynamic environments pattern and keep the Puppet agent daemon stopped. Instead, the Jenkins job uses Ansible to trigger a Puppet agent run, so that the systems can be updated independently.

The Pipeline

As continuous delivery consultants we of course wanted to build a pipeline for the infrastructure changes that we could be proud of.

Steps

  1. Static checks (Parse, Validate syntax, Compile)
  2. Apply to CI (for all systems)
  3. Apply to TEST (for given system)
  4. Dry-run (--noop) in PROD (for given system)
  5. PROD Release notes/request generation (for given system)
  6. Apply in PROD (for given system)

The first two steps are automatic and executed for all systems on each commit. Then the pipeline forks, and the rest of the steps are triggered manually per system.
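To make the split between automatic and manual steps concrete, the six steps can be modelled as data. The following is only a sketch in Java; the type and constant names are invented for illustration and have nothing to do with how the Jenkins jobs are actually configured.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the pipeline steps listed above.
final class InfraPipeline {

    enum Trigger { AUTOMATIC, MANUAL }

    enum Step {
        STATIC_CHECKS(Trigger.AUTOMATIC),
        APPLY_TO_CI(Trigger.AUTOMATIC),
        APPLY_TO_TEST(Trigger.MANUAL),
        DRY_RUN_IN_PROD(Trigger.MANUAL),
        PROD_RELEASE_NOTES(Trigger.MANUAL),
        APPLY_IN_PROD(Trigger.MANUAL);

        final Trigger trigger;

        Step(Trigger trigger) {
            this.trigger = trigger;
        }
    }

    // Returns the steps with the given trigger, in pipeline order.
    static List<Step> stepsWith(Trigger trigger) {
        List<Step> result = new ArrayList<>();
        for (Step step : Step.values()) {
            if (step.trigger == trigger) {
                result.add(step);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("On every commit, for all systems: " + stepsWith(Trigger.AUTOMATIC));
        System.out.println("Manually, per system: " + stepsWith(Trigger.MANUAL));
    }
}
```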

Complexity increases

We’re doing well, but the complexity has increased. There is some coupling in that the code base is monolithic and shared between several teams/systems. There are upsides to this: everyone benefits from improvements and additions, and early on we had to structure the code base rather than end up with a big ball of mud that solves only one use case.

Another form of coupling is that some servers (e.g. load balancers) are shared which forces us to implement blocks in the Jenkins apply jobs so that they do not collide.

There is some unfamiliarity with the development model, so there is some uncertainty about responsibilities: who tests and deploys what, and when? Most developers, including my team, are also largely unfamiliar with how to test infrastructure.

Looking at our pipeline we could tell that something is not quite all right:

[Figure: Puppet Complex Pipeline in Jenkins]

In the next part of this blog series we will see how we addressed these challenges in phase 3: Increase independence and quality.

Feature switches in practice

Feature switches (or feature flags, toggles etc) are a programming technique which has gained a lot of attention through the concepts of Trunk Based Development and Continuous Delivery. Feature switches allow you to shield code that is not yet production ready while it is still committed to mainline in version control. This allows you to work on development tasks on mainline and to continuously integrate your code while avoiding the burdens of branching. Another useful benefit is that you can decide which functionality to run in production by switching functionality on/off. The best thing is that this technique is very easy to implement; you basically just need to start doing it! In this blog post I’ll show you how easy it is to do this in Java.

In my current project we are integrating with a third-party service which our system depends heavily on. While our system will continue to work if that third-party service becomes unavailable, it still means a loss in revenue to the business. Therefore we want to be able to monitor this integration point closely and provide mechanisms to troubleshoot it efficiently. As the communication between these systems is web service based through SOAP, we found it very useful to be able to log the entire payloads sent and received between the two systems. This feature is an ideal candidate for feature switching.

I implemented a feature which allows us to decide at runtime whether we should log every SOAP message sent and received to the file system. The logging happens asynchronously so that it does not affect application throughput too much. The feature is switched off in production by default, but we can turn it on if we need to troubleshoot integration failures.
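The asynchronous part could look roughly like the sketch below, which hands each payload to a single background thread so the calling thread never waits for disk I/O. This is not the code from the project; the class name AsyncXmlLogger and the five-second shutdown timeout are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: appends payloads to a file on a background thread.
final class AsyncXmlLogger implements AutoCloseable {

    // A single worker thread keeps the payloads in submission order.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final Path file;

    AsyncXmlLogger(Path file) {
        this.file = file;
    }

    // Returns immediately; the write happens on the worker thread.
    void logToFile(String xml) {
        writer.execute(() -> {
            try {
                Files.write(file,
                        (xml + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                // A logging failure must not break the business call.
                System.err.println("Could not log payload: " + e.getMessage());
            }
        });
    }

    // Stops accepting work and waits briefly for queued writes to finish.
    @Override
    public void close() {
        writer.shutdown();
        try {
            writer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("soap-payloads", ".log");
        try (AsyncXmlLogger logger = new AsyncXmlLogger(file)) {
            logger.logToFile("<request/>");
            logger.logToFile("<response/>");
        }
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```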

The most basic feature switch to implement would just be a simple if-statement:

boolean xmlLogFeatureIsEnabled = false;
if (xmlLogFeatureIsEnabled) {
	logToFile(xml);
}

But instead of hardcoding the feature switch state, we want it to be dynamically evaluated so we can change the behavior on a running system without the need for restarts or too much manual labor. To be able to do this we use a small framework called Togglz, which allows you to very easily create feature switches which you can then manage at runtime.

First, we create a feature definition enumeration which implements org.togglz.core.Feature:

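import org.togglz.core.Feature;
import org.togglz.core.annotation.Label;
import org.togglz.core.context.FeatureContext;

public enum FeatureDefinition implements Feature {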

    @Label("Log XML to file")
    LOG_XML_TO_FILE;

    public boolean isActive() {
        return FeatureContext.getFeatureManager().isActive(this);
    }
}

Then, we implement org.togglz.core.manager.TogglzConfig which will keep track of the feature states:

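import java.util.concurrent.TimeUnit;
import javax.annotation.Resource;
import javax.enterprise.context.ApplicationScoped;
import javax.sql.DataSource;
import org.togglz.core.Feature;
import org.togglz.core.manager.TogglzConfig;
import org.togglz.core.repository.StateRepository;
import org.togglz.core.repository.cache.CachingStateRepository;
import org.togglz.core.repository.jdbc.JDBCStateRepository;
import org.togglz.core.user.NoOpUserProvider;
import org.togglz.core.user.UserProvider;

@ApplicationScoped
public class FeatureConfiguration implements TogglzConfig {

    // Container-managed datasource in which Togglz stores the feature states
    @Resource
    private DataSource datasource;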

    public Class<? extends Feature> getFeatureClass() {
        return FeatureDefinition.class;
    }

    public StateRepository getStateRepository() {
        return new CachingStateRepository(new JDBCStateRepository(datasource), 10, TimeUnit.MINUTES);
    }

    public UserProvider getUserProvider() {
        return new NoOpUserProvider();
    }
}

We use dependency injection in our project, which allows us to easily inject a datasource into our feature configuration that Togglz can use to store the feature states. We then apply a 10-minute cache for the feature state reload so that Togglz won’t have to look up the state in the database each time a feature state is evaluated. Please note that you might want to implement the configuration a bit more robustly than in the example above. When we want to switch a feature on/off it is merely a matter of updating a database column value.
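To illustrate what a time-based cache in front of the database means for the switch, here is a small stand-alone Java sketch. It is not how Togglz implements its caching repository; CachedSwitch and its constructor parameters are hypothetical names, and the clock is injectable only to make the behavior easy to observe.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical illustration of a switch whose state is re-read at most once per TTL.
final class CachedSwitch {

    private final BooleanSupplier source; // e.g. a database lookup
    private final long ttlNanos;
    private final LongSupplier clock;     // nanosecond clock, injectable for tests

    private boolean loaded;
    private boolean cachedState;
    private long loadedAt;

    CachedSwitch(BooleanSupplier source, long ttl, TimeUnit unit, LongSupplier clock) {
        this.source = source;
        this.ttlNanos = unit.toNanos(ttl);
        this.clock = clock;
    }

    // Consults the source only on first use or when the cached value has expired.
    synchronized boolean isActive() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlNanos) {
            cachedState = source.getAsBoolean();
            loadedAt = now;
            loaded = true;
        }
        return cachedState;
    }

    public static void main(String[] args) {
        CachedSwitch logXml = new CachedSwitch(() -> false, 10, TimeUnit.MINUTES, System::nanoTime);
        System.out.println("LOG_XML_TO_FILE active: " + logXml.isActive());
    }
}
```

The practical consequence of such a cache is that a change in the database may take up to the cache duration before the running system picks it up.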

Finally, we just change the if-statement encapsulating the feature method call to:

if (FeatureDefinition.LOG_XML_TO_FILE.isActive()) {
    logToFile(xml);
}

And that’s it! This is all we need to do to be able to dynamically switch features on/off in a running Java system. This technique is very useful when exercising Continuous Delivery ways of working where each commit is a potential production release. As you can see, feature switches allow you to commit your changes to version control without necessarily exposing them to your end users.

To see this in action, feel free to check out my Togglz example project which uses a simple servlet to demonstrate the behavior.

 

Tommy Tynjä
@tommysdk

Slimmed down immutable infrastructure

Last weekend we had a hackathon at Diabol. The topics were all somehow related to DevOps and Continuous Delivery. My group of four focused on slim microservices with immutable infrastructure. Since we believe in automated delivery pipelines for software development and infrastructure setup, the next natural step would be to merge these two together. Ideally, one would produce a machine image that contains everything needed to run the current application. The servers would be immutable, since we don’t want anyone doing manual changes to a running environment. Rather, the changes should be checked in to version control and a new server would be created based on the automated build pipeline for the infrastructure.

The problem with traditional machine images running on e.g. VMware or Amazon is that they tend to be very large; a couple of gigabytes is not unusual. Images of that size become cumbersome to work with as they take a long time to create and ship over a network. Therefore it is desirable to keep server images as small as possible, especially since you might create and tear down servers ad-hoc for e.g. test purposes in your delivery pipeline. Linux is a very common server operating system, but many Linux distributions are shipped with features that we are very unlikely to ever use on a server, such as C compilers or utility programs. And since we adopt immutable servers, we don’t even need things such as editors, man pages or even ssh!

Docker is an interesting solution for slimmed down infrastructure and full stack machine images which we evaluated during the hackathon. After getting our hands dirty for a couple of hours, we were quite pleased with its capabilities. We’ll definitely keep it on our radar and continue with our evaluation of it.

Since we’re mostly operating in the Java space, I also spent some time looking at how we could save some space in our machine images by slimming down the JVM. Since a delivery pipeline will be triggered several times a day to deploy, test etc, every megabyte saved will increase the pipeline throughput. But why should you slim down the JVM? Well, the JVM also contains features (or libraries) that are highly unlikely to ever be used on a server, such as audio, the AWT and Swing UI frameworks, JavaFX, fonts, cursor images etc. The standard installation of the Java 8 JRE is around 150 MB. Without too much work, I was able to shave off a third of that by removing libraries such as the aforementioned ones, landing a bit under 100 MB while still being able to run our application. Unfortunately the core library of Java, rt.jar, is 66 MB in size, which sets a lower bound for the size of a working JVM (unless you start removing the class files inside it too). Although this practice might not be suitable for production use for technical or even legal reasons, it’s still interesting to see how much we typically install on our servers that will never be used. The much anticipated project Jigsaw, which will introduce modularity to Java SE, has been postponed several times. Hopefully it can be incorporated into Java 9, enabling us to decide which modules we actually want to use for our particular use case.

Our conclusion for the time spent on this topic during the hackathon is that Docker is an interesting alternative to traditional machine image solutions, which not only allows, but also encourages slim servers and immutable infrastructure.

Tommy Tynjä
@tommysdk

Recent blogs about the Delivery Pipeline plugin

The Delivery Pipeline plugin from Diabol is getting some traction. Now over 600 installations. Here’s some recent blogging about it.

The first one is from none other than Mr Jenkins himself, Kohsuke Kawaguchi, and Andrew Phillips, VP of Products for XebiaLabs:

InfoQ: Orchestrating Your Delivery Pipelines with Jenkins

Second is about the first experience with the Jenkins/Hudson Build and Delivery Pipeline plugins:

Oracle SOA / Java blog: The Jenkins Build and Delivery Pipeline plugins

Marcus Philip
@marcus_phi

Test categorization in deployment pipelines

Have you ever gotten tired of waiting for those long running tests in CI to finish so you can get feedback on your latest code change? Chances are that you have. A common problem is that test suites tend to grow too large, making the feedback loop an enemy instead of a companion. This is a problem when building delivery pipelines for Continuous Delivery, but also for more traditional approaches to software development. A solution to this problem is to divide your test suite into separate categories, or stages, where tests are grouped according to similarity or type. The categories can then be arranged so that the quickest tests and those most likely to fail execute first, to enable faster feedback to the developers.

An example of a logical grouping of tests in a deployment pipeline:

Commit stage:
* Unit tests
* Component smoke tests
These tests execute fast and will be executed by the developers before committing changes into version control.

Component tests:
* Component tests
* Integration tests
These tests are to be run in CI and can be further categorized so that e.g. component tests that are most likely to catch failures will execute first, before more thorough testing.

End user tests:
* Functional tests
* User acceptance tests
* Usability/exploratory testing

As development continues, it is important to maintain these test categories so that the feedback loop can be kept as optimal as possible. This might involve moving tests between categories, further splitting up test suites or even grouping categories that might be able to run in parallel.
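The idea of ordered categories with fast feedback can be sketched in a few lines of Java: stages run in the order they were registered and the run stops at the first stage that fails. StagedTestRun and the stage names are hypothetical and only stand in for whatever your build tool or CI server uses to invoke each suite.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical fail-fast runner for test categories.
final class StagedTestRun {

    // Insertion order is the execution order: quickest categories first.
    private final Map<String, BooleanSupplier> stages = new LinkedHashMap<>();

    // Registers a category; the supplier returns true if all its tests passed.
    StagedTestRun stage(String name, BooleanSupplier suite) {
        stages.put(name, suite);
        return this;
    }

    // Runs the stages in order and stops after the first failing one.
    // Returns the names of the stages that were actually executed.
    List<String> run() {
        List<String> executed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> stage : stages.entrySet()) {
            executed.add(stage.getKey());
            if (!stage.getValue().getAsBoolean()) {
                break; // later, slower stages are skipped
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        List<String> executed = new StagedTestRun()
                .stage("commit stage", () -> true)
                .stage("component tests", () -> false) // simulated failure
                .stage("end user tests", () -> true)
                .run();
        System.out.println("Executed: " + executed);
    }
}
```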

How is this done in practice? You’ve probably encountered code bases where all these different kinds of tests (unit, integration and user acceptance tests) are scattered throughout the same test source tree. In the Java world, Maven is a commonly used build tool. Its model supports running unit and integration tests separately out of the box, but it still expects the tests to be in the same structure, differentiated only by a naming convention. This isn’t practical if you have hundreds or thousands of tests for a single component (or Maven module). To have a maintainable test structure and make effective use of test categorization, it is desirable to split the tests into different source trees, for example:

src/test – unit tests
src/test-integration – integration tests
src/test-acceptance – acceptance tests

Gradle is a build tool which makes it easy to take advantage of this kind of test categorization. Changing build tool might not be practically possible for many reasons, but it is fully possible to use Gradle’s capabilities from your existing build tool. You want to use the right tool for the job, right? Gradle is an excellent tool for this kind of job.

Gradle makes use of source sets to define which source code tree is production code and which is e.g. test code. You can easily define your own source sets, which is something you can use to categorize your tests.

Defining the test categories in the example above can be done in your build.gradle such as:

sourceSets {
  main {
    java {
      srcDir 'src/main/java'
    }
    resources {
      srcDir 'src/main/resources'
    }
  }
  test {
    java {
      srcDir 'src/test/java'
    }
    resources {
      srcDir 'src/test/resources'
    }
  }
  integrationTest {
    java {
      srcDir 'src/test-integration/java'
    }
    resources {
      srcDir 'src/test-integration/resources'
    }
    compileClasspath += sourceSets.main.runtimeClasspath
  }
  acceptanceTest {
    java {
      srcDir 'src/test-acceptance/java'
    }
    resources {
      srcDir 'src/test-acceptance/resources'
    }
    compileClasspath += sourceSets.main.runtimeClasspath
  }
}

To be able to run the different test suites, set up a Gradle task for each test category as appropriate for your component, such as:

task integrationTest(type: Test) {
  description = "Runs integration tests"
  testClassesDir = sourceSets.integrationTest.output.classesDir
  classpath += sourceSets.test.runtimeClasspath + sourceSets.integrationTest.runtimeClasspath
  useJUnit()
  testLogging {
    events "passed", "skipped", "failed"
  }
}

task acceptanceTest(type: Test) {
  description = "Runs acceptance tests"
  testClassesDir = sourceSets.acceptanceTest.output.classesDir
  classpath += sourceSets.test.runtimeClasspath + sourceSets.acceptanceTest.runtimeClasspath
  useJUnit()
  testLogging {
    events "passed", "skipped", "failed"
  }
}

test {
  useJUnit()
  testLogging {
    events "passed", "skipped", "failed"
  }
}

Unit tests in src/test will be run by default. To run integration tests located in src/test-integration, invoke the integrationTest task by executing “gradle integrationTest”. To run acceptance tests located in src/test-acceptance, invoke the acceptanceTest task by executing “gradle acceptanceTest”. These commands can then be used to tailor your test suite execution throughout your deployment pipeline.

A full build.gradle example file that shows how to set up test categories as described above can be found on GitHub.

The above example shows how tests can be logically grouped to avoid waiting for that one big test suite to run for hours, just to report a test failure on a simple test case that should have been reported instantly during the test execution phase.


Tommy Tynjä
@tommysdk

We Continuously Deliver!