
Delivery pipelines with GoCD

At my most recent assignment, one of my missions was to set up delivery pipelines for a bunch of services built in Java and some front-end apps. When I started they already had some rudimentary automated builds using Hudson, but we really wanted to start from scratch. We decided to give GoCD a chance because it pretty much satisfied our demands. We wanted something that could:

  • orchestrate a deployment pipeline (Build -> CI -> QA -> Prod)
  • run on-premise
  • deploy to production in a secure way
  • show a pipeline dashboard
  • handle user authentication and authorization
  • support fan-in

GoCD is the open-source “continuous delivery server” backed by ThoughtWorks (also known for Selenium).

GoCD key concepts

A pipeline in GoCD consists of stages, which are executed in sequence. A stage contains jobs, which are executed in parallel. A job contains tasks, which are executed in sequence. The smallest schedulable unit is the job, which is executed by an agent. An agent is a Java process that normally runs on its own separate machine. An idle agent fetches jobs from the GoCD server and executes their tasks. A pipeline is triggered by a “material”, which can be any of the following version control systems:

  • Git
  • Subversion
  • Mercurial
  • Team Foundation Server
  • Perforce

A pipeline can also be triggered by a dependency material, i.e. an upstream pipeline. A pipeline can have more than one material.
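To make the hierarchy concrete, here is a minimal sketch of it as plain Python data classes. This is not GoCD's own model or API; the class names, the `schedule` helper and the sample pipeline are all made up for illustration. It only shows the ordering rules described above: stages in sequence, jobs within a stage independent of each other.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    command: str


@dataclass
class Job:
    name: str
    tasks: List[Task] = field(default_factory=list)    # run in sequence


@dataclass
class Stage:
    name: str
    jobs: List[Job] = field(default_factory=list)      # may run in parallel


@dataclass
class Pipeline:
    name: str
    materials: List[str] = field(default_factory=list)
    stages: List[Stage] = field(default_factory=list)  # run in sequence


def schedule(pipeline: Pipeline) -> List[List[str]]:
    """Return one batch of job names per stage.

    The batches are ordered because stages are sequential; the jobs inside
    a batch are independent and could each go to a different agent.
    """
    return [[job.name for job in stage.jobs] for stage in pipeline.stages]


# Hypothetical pipeline with one VCS material and one dependency material.
example = Pipeline(
    name="example-server",
    materials=["git:example-server", "pipeline:upstream-build"],
    stages=[
        Stage("build", [Job("compile", [Task("mvn package")])]),
        Stage("test", [
            Job("unit", [Task("mvn test")]),
            Job("integration", [Task("mvn verify")]),
        ]),
    ],
)

print(schedule(example))
```

The second batch contains two jobs, which is where having more than one agent pays off.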

Technically you can put the whole delivery pipeline (CI -> QA -> PROD) inside a single pipeline as stages, but there are several reasons why you would want to split it into separate chained pipelines. The most obvious reason is the GoCD concept of environments. A pipeline is the smallest entity that you can place inside an environment. Environments are explained in more detail in the Security section below.

You can logically group pipelines in “pipeline groups”, so in a typical setup for a micro-service you might have a pipeline group containing the following pipelines:

example-server-pl

A pipeline group can be used as a view and as a placeholder for access rights. You can give a specific user or role view, execution or admin rights to all pipelines within a pipeline group.

For the convenience of not repeating yourself, GoCD offers the possibility to create templates which are parameterized workflows that can be reused. Typically you could have templates for:

  • building (maven, static code analysis)
  • deploying to a test environment
  • deploying to a production environment
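As a rough illustration of what “parameterized workflow” means, the sketch below fills placeholders in a shared task list. It is not GoCD code: the `#{name}` placeholder style is an assumption modelled on how parameters are commonly written, and the `fetch-artifact` command, the playbook name and the `instantiate` helper are hypothetical.

```python
import re

# A hypothetical deploy template: the task list is shared,
# only the values inside #{...} differ between pipelines.
DEPLOY_TEMPLATE = [
    "fetch-artifact #{service}.jar",
    "ansible-playbook deploy.yml -l #{env}",
]


def instantiate(template, params):
    """Fill every #{name} placeholder; a missing parameter raises KeyError."""
    def lookup(match):
        return params[match.group(1)]
    return [re.sub(r"#\{(\w+)\}", lookup, line) for line in template]


qa = instantiate(DEPLOY_TEMPLATE, {"service": "example-server", "env": "qa"})
prod = instantiate(DEPLOY_TEMPLATE, {"service": "example-server", "env": "production"})

print(qa)
print(prod)
```

Both pipelines share one definition of the workflow, so a change to the template reaches every pipeline that uses it.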

For more details on concepts in GoCD, see: https://docs.go.cd/current/introduction/concepts_in_go.html

Administration

Installation

The GoCD server and agents are bundled as rpm, deb, Windows and OSX install packages. We decided to install it using Puppet (https://forge.puppet.com/unibet/go) since we already had Puppet in place. One nice feature is that agents auto-upgrade when the GoCD server is upgraded, so in the normal case you only need to care about GoCD server upgrades.

User management

We use the LDAP integration to authenticate users in GoCD. When a user defined in LDAP logs in for the first time, they are automatically registered as a new user in GoCD. If you use role-based authorization, an admin user then needs to assign roles to the new user.

Disk space management

All artifacts created by pipelines are stored on the GoCD server, and sooner or later you will find that the disks are getting full. We have used the tool “GoCD janitor”, which analyses the value stream (the collection of chained upstream and downstream pipelines) and automatically removes artifacts that won’t make it to production.

Security

One of the major concerns when deploying to production is the handling of deployment secrets such as ssh keys. At my client they use Ansible extensively as a deployment tool, so we needed the ability to handle ssh keys on the agents in a secure way. You obviously don’t want to use the same ssh keys in test and production, so GoCD has a feature called environments for this purpose. You can place an agent and a pipeline in an environment (ci, qa, production) so that any time a production pipeline is triggered it will run on an agent in the production environment.
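The effect of environments on job assignment can be sketched in a few lines. This is only an illustration of the rule, not GoCD's scheduler: the `Agent` class, the `eligible_agents` helper and the host names are hypothetical, and I assume one environment per agent to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    hostname: str
    environment: str   # simplification: exactly one environment per agent
    idle: bool = True


def eligible_agents(pipeline_env, agents):
    """Idle agents that share the environment of the triggered pipeline."""
    return [a for a in agents if a.idle and a.environment == pipeline_env]


agents = [
    Agent("ci-agent-1", "ci"),
    Agent("qa-agent-1", "qa"),
    Agent("prod-agent-1", "production"),
    Agent("prod-agent-2", "production", idle=False),
]

print([a.hostname for a in eligible_agents("production", agents)])
```

Since only the production agents hold the production ssh keys, a pipeline in the ci or qa environment never runs on a machine where those keys exist.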

It is also possible to store encrypted secrets within a pipeline configuration using so-called secure variables. A secure variable can be used within a pipeline like an environment variable, the only difference being that it’s encrypted with a shared secret stored on the GoCD server. We have not used this feature much since we solved this issue in other ways. You can also define secure variables on the environment level so that a pipeline running in a specific environment will inherit all secure variables defined in that environment.

Pipelines as code

This was one of the features GoCD was lacking at the very beginning, although there were API endpoints for creating and managing pipelines. Since GoCD version 16.7.0 there is support for “pipelines as code” via the yaml-plugin or the json-plugin. Unfortunately templates are not supported, which can lead to duplicated code in your pipeline configuration repo(s).

For further reading please refer to: https://docs.go.cd/current/advanced_usage/pipelines_as_code.html

Example

Let’s wrap it up with a fully working example where the key concepts explained above are used. In this example we will set up a deployment pipeline (BUILD -> QA -> PROD) for a Dropwizard application. It also sets up a basic example where fan-in is used. In that example you will notice that the downstream pipeline “prod” won’t be triggered unless both “test” and “perf-test” have finished. We use the concept of “pipelines as code” to configure the deployment pipeline using “gocd-plumber”. GoCD-plumber is a tool written in Go (by me, btw) which uses the GoCD API to create pipelines from YAML code. In contrast to the yaml-plugin and json-plugin it does support templates, by merging a pipeline hash over a template hash.
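The idea of merging a pipeline hash over a template hash can be sketched as a recursive dictionary merge. This is not gocd-plumber's actual code (which is written in Go); the `merge_over` function, the key names and the sample values are assumptions made for illustration.

```python
import copy


def merge_over(template, pipeline):
    """Return a new dict where pipeline values win and nested dicts merge."""
    merged = copy.deepcopy(template)
    for key, value in pipeline.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_over(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical template shared by several pipelines.
template = {
    "group": "example",
    "environment_variables": {"JAVA_OPTS": "-Xmx512m", "ENV": "qa"},
    "stages": ["deploy", "smoke_test"],
}

# A pipeline only states what differs from the template.
prod = {
    "name": "example-server-prod",
    "environment_variables": {"ENV": "prod"},
}

print(merge_over(template, prod))
```

In this sketch lists are replaced rather than merged, which is one of several possible choices; the point is only that each pipeline definition stays small.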

Preparations

This example requires Docker, so if you don’t have it installed, please install it first.

  1. git clone https://github.com/dennisgranath/gocd-docker.git
  2. cd gocd-docker
  3. docker-compose up
  4. go to http://127.0.0.1:8153 and log in as ‘admin’ with password ‘badger’
  5. Press play button on the “create-pipelines” pipeline
  6. Press pause button to “unpause” pipelines

 

 

Diabol helps Klarna develop a new platform and become experts in Continuous Delivery

Since its founding in 2005, Klarna has grown strongly and in a very short time become a company with over 1000 employees. To meet the global market’s need for its payment services, Klarna needed to make major changes in technology, organization and processes. Klarna engaged consultants from Diabol to reach its ambitious goals of developing a new service platform and becoming a leader in DevOps and Continuous Delivery.

Challenge

After great success in the Nordic market and several years of strong growth, Klarna needed to develop a new platform for its payment services in order to serve the global market. The new platform had to handle millions of transactions daily and be robust and scalable, while supporting an agile way of working with rapid change in a growing organization. The schedule was very challenging, and in addition to developing all the services, both ways of working and infrastructure had to change to meet the challenges of large scale and short lead times.

Solution

Diabol’s experienced consultants, with expert competence in Java, DevOps and Continuous Delivery, were entrusted with reinforcing the development teams to build the new platform while automating the release process using, among other things, cloud technology from Amazon AWS. Competence in automation and tooling was also built up in an internal support team, whose purpose was to support the development teams with tools and processes so that they could deliver their services quickly, safely and automatically, independently of each other. Diabol had a central role in this team and acted as a coach for Continuous Delivery and DevOps broadly across the development and operations organization.

Result

Klarna was able to go live with the new platform in record time and open up in several large international markets. Autonomous development teams with a strong delivery focus can today deliver changes and new functionality to production on their own, fully automatically, which when needed can be several times a day.

Exhaustive automated tests run continuously on every code change, and test environments in AWS are also set up fully automatically. Some teams even practise so-called “continuous deployment” and deliver code changes to their production environments without any manual intervention whatsoever.

“Diabol has been a key player in achieving our ambitious goals in DevOps and Continuous Delivery.”

– Tobias Palmborg, Manager Engineering Support, Klarna

 

 

Diabol migrates Abdona to AWS and introduces an automated delivery process

Abdona provides business travel management services to a number of public-sector organizations. In connection with a major development effort, they also wanted to review the infrastructure for production and test environments in order to reduce costs and to safely guarantee high quality and short delivery times. Diabol was engaged for an end-to-end commitment to modernize the infrastructure, the development environment, and the test and delivery process.

Challenge

Abdona’s system consists of a classic 3-tier architecture in Java Enterprise, and since its launch 7 years ago only minor updates have been made. Technology and infrastructure had not been updated and had over time become outdated and hard to manage: manually configured servers, substandard documentation and traceability, scant version control, no continuous integration or stable build environment, and manual testing and deployment. Besides these structural problems, the cost of the manually set up hardware for both the test and production environments was unjustifiably high compared with today’s cloud-based alternatives.

Solution

Diabol began by mapping out the problems and, first and foremost, taking control of the code base, which was spread over several version control systems. All code was moved to Atlassian Bitbucket, and a build server with Jenkins was set up to build and test the system in a repeatable way. Nexus was chosen to manage dependencies and archive the artifacts produced by the build server. The infrastructure was migrated to Amazon AWS, both for cost reasons and to be able to take advantage of modern automation tools and the possibilities of dynamic infrastructure. The application tier was moved to EC2 and the database to RDS. Terraform was chosen to automate the provisioning of resources in AWS, and Puppet was introduced for automatic configuration management of servers. A complete delivery pipeline with automatic deployment was implemented in Jenkins.

Result

The migration to Amazon AWS has led to drastically reduced operating costs for Abdona. In addition, they now have a scalable, modern infrastructure, full traceability and an automated delivery chain that guarantees high quality and short lead times. The system can be reconstructed entirely from the code base, and test environments can be created fully automatically when needed.

Top class Continuous Delivery in AWS

Last week Diabol arranged a workshop in Stockholm where we invited Amazon together with Klarna and Volvo Group Telematics that both practise advanced Continuous Delivery in AWS. These companies are in many ways pioneers in this area as there is little in terms of established practices. We want to encourage and facilitate cross company knowledge sharing and take Continuous Delivery to the next level. The participants have very different businesses, processes and architectures but still struggle with similar challenges when building delivery pipelines for AWS. Below follows a short summary of some of the topics covered in the workshop.

Centralization and standardization vs. fully autonomous teams

One of the most interesting discussions among the participants wasn’t even technical but covered the differences in how they are organized and how that affects the work with Continuous Delivery and AWS. Some come from a traditional functional organisation and have placed their delivery teams somewhere in between the development teams and the operations team. The advantage is that they have been able to standardize the delivery platform to a large extent and have a very high level of reuse. They have built custom tools and standardized services that all teams are more or less forced to use. This approach depends on being able to keep at least one step ahead of the dev teams and being able to scale out to many dev teams without increasing headcount. One problem with this approach is that it is hard to build deep AWS knowledge out in the dev teams since they feel detached from the technical implementation.

Others have a very explicit strategy of team autonomy where each team is basically in charge of its complete process all the way to production. In this case each team must have quite deep competence in both AWS and the delivery pipelines and how they are set up. The production awareness is extremely high and you can, for example, visualize each team’s cost of AWS resources. One problem with this approach is a lower level of reusability and difficulties in sharing knowledge and implementation between teams.

Both of these approaches have pros and cons, but in the end I think fewer silos and more team empowerment wins. If you can manage that and still build a common delivery infrastructure that scales, you are in a very good position.

Infrastructure as code

Another topic that was thoroughly covered was different ways to deploy both applications and infrastructure to AWS. CloudFormation (CF) is popular and very powerful but has its shortcomings in some scenarios. One participant felt that CF is too verbose and noisy and has built their own YAML configuration language on top of CF. They have been able to do this since they have strongly standardized their micro-service architecture and the deployment structure that follows from it. The other participant felt the same problem with CF being too noisy and has broken out a large portion of the configuration from the stack templates to Ansible, leaving just the core infrastructure resources in CF. This also allows them to apply different deployment patterns and more advanced orchestration. We also briefly discussed third-party tools, e.g. Terraform, but the general opinion was that they all have a hard time keeping up with features in AWS. On the other hand, if you have infrastructure outside AWS that needs to be managed in conjunction with what you have in AWS, Terraform might be a compelling option. Both participants expressed that they would like to see some kind of execution plan / dry-run feature in CF, much like Terraform has.

Docker on AWS

Use of Docker is growing quickly right now and was, not surprisingly, a hot topic at the workshop. One participant described how they deploy their micro-services in Docker containers with the obvious advantage of being portable and lightweight (compared to baking AMIs). This is however done with stand-alone EC2 instances using a shared common base AMI and not on ECS, an approach that adds redundant infrastructure layers to the stack. They have just started exploring ECS and it looks promising, but questions around how to manage centralized logging, monitoring, disk encryption etc. are still not clear. Docker is a very compelling deployment alternative, but both Docker itself and the surrounding infrastructure need to mature a bit more, e.g. docker push takes an unreasonably long time and easily becomes a bottleneck in your delivery pipelines. Another pain is the need for a private Docker registry, which at this level of continuous delivery needs to be highly available and secure.

What’s missing?

The discussions also identified some feature requests for Amazon to bring home. For example, we discussed security quite a lot and got into the technicalities of IAM roles, accounts, security groups etc. It was expressed that there might be a need for explicit compliance checks and controls as a complement to cruder methods such as penetration testing. You can certainly do this by extracting information from the APIs and processing it according to your specific compliance rules, but it would be nice if there was a higher level of support for this from AWS.
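A home-made compliance check of that kind could look roughly like the sketch below. It makes no AWS calls: the `groups` list is hand-written sample data whose field names (`GroupId`, `IpPermissions`, `IpRanges`, `CidrIp`, `FromPort`, `ToPort`) are assumed to follow the shape of a security group description from the EC2 API, and the rule itself (no ssh from anywhere) and the `open_to_world` helper are just examples.

```python
def open_to_world(groups, port):
    """Return the ids of groups that allow `port` from 0.0.0.0/0."""
    offenders = []
    for group in groups:
        for rule in group.get("IpPermissions", []):
            all_traffic = rule.get("IpProtocol") == "-1"
            low, high = rule.get("FromPort"), rule.get("ToPort")
            in_range = low is not None and high is not None and low <= port <= high
            from_anywhere = any(
                r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
            )
            if (all_traffic or in_range) and from_anywhere:
                offenders.append(group["GroupId"])
                break  # one offending rule is enough for this group
    return offenders


# Hand-written sample data, not fetched from AWS.
groups = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-admin", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]}]},
]

print(open_to_world(groups, 22))
```

A check like this could run as a pipeline step and fail the build when the list is not empty.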

We also discussed canary releasing and A/B testing. Since this is becoming more of a common practice, it would be nice if Amazon could provide more services to support it, e.g. content-based routing and more sophisticated analytics tools.

Next step

All in all I think the workshop was very successful, and the discussions and experience sharing were valuable to all participants. Diabol will continue to push Continuous Delivery maturity in the industry by arranging meetups and workshops and by involving more companies that can contribute to and benefit from this collaboration.

 

Is your delivery pipeline an array or a linked list?

The fundamental data structure of a delivery pipeline and its implications

A delivery pipeline is a system. A system is something that consists of parts that create a complex whole, where the essence lies largely in the interaction between the parts. In a delivery pipeline we can see the activities in it (build, test, deploy, etc.) as the parts, and their input/output as the interactions. There are two fundamental ways to define interactions in order to organize a set of parts into a whole, a system:

  1. Top-level orchestration, aka array
  2. Parts interact directly with other parts, aka linked list

You could also consider sub-levels of organization. This would form a tree. The sub-level of interaction could be defined in the same way as its parents or not.

My question is: Is one approach better than the other for creating delivery pipelines?

I think the number one requirement on a pipeline is maintainability. So better here would mean mainly more maintainable, that is: easier and quicker to create, to reason about, to reuse, to modify, extend and evolve even for a large number of complex pipelines. Let’s review the approaches in the context of delivery pipelines:

1. Top-level orchestration

This means having one config (file) that defines the whole pipeline. It is like an array.

An example config could look like this:

globals:
  scm: commit
  build: number
triggers:
  scm: github org=Diabol repo=delivery-pipeline-plugin.git
stages:
  - name: commit
    tasks:
      - build
      - unit_test
  - name: test
    vars:
      env: test
    tasks:
      - deploy: continue_on_fail=true
      - smoke_test
      - system_test
  - name: prod
    vars:
      env: prod
    tasks:
      - deploy
      - smoke_test

The tasks, like build, are defined (in isolation) elsewhere. Travis, Bamboo and Go do it this way.
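In code, the orchestrator for such a config can be very small. The sketch below is not how any of these tools is implemented; the `TASKS` registry, the `PIPELINE` list and the `run` function are made up, and the tasks only record that they ran.

```python
log = []

# Tasks are defined in isolation: each one only does its own work.
TASKS = {
    "build": lambda env: log.append("build"),
    "unit_test": lambda env: log.append("unit_test"),
    "deploy": lambda env: log.append(f"deploy:{env}"),
    "smoke_test": lambda env: log.append(f"smoke_test:{env}"),
}

# The whole pipeline is one ordered structure, read top to bottom.
PIPELINE = [
    {"name": "commit", "env": None, "tasks": ["build", "unit_test"]},
    {"name": "test", "env": "test", "tasks": ["deploy", "smoke_test"]},
    {"name": "prod", "env": "prod", "tasks": ["deploy", "smoke_test"]},
]


def run(pipeline):
    """The orchestrator owns the ordering; tasks know nothing about it."""
    for stage in pipeline:
        for task_name in stage["tasks"]:
            TASKS[task_name](stage["env"])


run(PIPELINE)
print(log)
```

Note that the same deploy task is reused in two stages without knowing what comes before or after it.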

2. Parts interact directly

This means that as part of the task definition, you have not only the main task itself, but also what should happen (e.g. trigger other jobs) when the task succeeds or fails. It is like a linked list.

An example task config:

name: build
triggers:
  - scm: github org=Diabol repo=delivery-pipeline-plugin.git
steps:
  - mvn: install
post:
  - email: committer
    when: on_fail
  - trigger: deploy_test
    when: on_success

The default way of creating pipelines in Jenkins seems to be this approach: using upstream/downstream relationships between jobs.
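A sketch of the same idea in code, with made-up task names and a made-up `trigger` function (it is not how Jenkins works internally): each task carries its own “next” pointers, and the pipeline only exists as the chain you get by following them.

```python
log = []

# Each task knows what to trigger next, like a node in a linked list.
# The "run" callables are stand-ins that just report success or failure.
TASKS = {
    "build": {"run": lambda: True,
              "on_success": "deploy_test", "on_fail": "email_committer"},
    "deploy_test": {"run": lambda: True,
                    "on_success": "smoke_test", "on_fail": "email_committer"},
    "smoke_test": {"run": lambda: False,
                   "on_success": None, "on_fail": "email_committer"},
    "email_committer": {"run": lambda: True,
                        "on_success": None, "on_fail": None},
}


def trigger(name):
    """Run a task, then follow its own success or failure pointer."""
    while name is not None:
        task = TASKS[name]
        ok = task["run"]()
        log.append((name, ok))
        name = task["on_success"] if ok else task["on_fail"]


trigger("build")
print(log)
```

To see the whole flow you have to walk the chain from the first task; there is no single place where the order is written down.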

Tagging

There is also a supplementary approach to creating order: tagging parts, aka Inversion of Control. In this case, the system materializes bottom-up. You could say that the system behavior is an emergent property. An example config where the tasks are tagged with a stage:

- name: build
  stage: commit
  steps:
    - mvn: install
    ...

- name: integration_test
  stage: commit
  steps:
    - mvn: verify -PIT
  ...

Unless complemented with something, there is no way to order things in this approach. But it’s useful for adding another layer of organization, e.g. for an alternative view.
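A small sketch of what the tags do and do not give you. The `group_by_stage` function and the extra `deploy` task are made up for illustration.

```python
from collections import defaultdict

# Tasks only declare which stage they belong to.
tasks = [
    {"name": "build", "stage": "commit"},
    {"name": "integration_test", "stage": "commit"},
    {"name": "deploy", "stage": "test"},
]


def group_by_stage(tasks):
    """Build the stage view bottom-up from the tags on each task."""
    stages = defaultdict(list)
    for task in tasks:
        stages[task["stage"]].append(task["name"])
    return dict(stages)


view = group_by_stage(tasks)
print(view)
```

The result tells you which tasks belong to which stage, but nothing in the tags says that commit comes before test; that order has to come from somewhere else, for example an explicit list of stage names.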

Comparisons to other systems

Maybe we can shed some light on our question by comparing with how we organize other complex systems around us.

Example A: (Free-market) Economic Systems, aka getting a shirt

1. Top-level organization

Go to the farmer, buy some cotton, hand it to the weaver, get the fabric from there and hand that to the tailor together with your measurements.

2. Parts interact directly

There are some variants.

  1. The farmer sells the cotton to the weaver, who sells the fabric to the tailor, who sews a lot of shirts and sells one that fits.
  2. Buy the shirt from the tailor, who bought the fabric from the weaver, who bought the cotton from the farmer.
  3. The farmer sells the cotton to a merchant who sells it to the weaver. The weaver sells the fabric to a merchant who sells it to the tailor. The tailor sells the shirts to a store. The store sells the shirts.

The variations are basically about different flows of information, pull or push, and about having middlemen or not.

Conclusion

Economic systems tend to be organized the second way. There is an efficient system coordination mechanism through demand and supply with price as the deliberator; ultimately the system is driven by the self-interest of the actors. It’s questionable whether this is a good metaphor for a delivery pipeline. You can consider deploying the artifact as the interest of a deploy job, but what is the deliberating (price) mechanism? And unless we have a common shared value measurement, such as money, how can we optimize globally?

Example B: Assembly line, aka build a car

Software process has historically suffered a lot from broken metaphors of factories and construction, but let’s do it anyway.

1. Top-level organization

The chief engineer designs the assembly line using the blueprints. Each worker knows how to do his task, but does not know what’s happening before or after.

2. Parts interact directly

Well, strictly this is more of an old-style workshop than an assembly line. The lathe worker gets some raw material, makes the cylinders and brings them to the engine assembler, who assembles the engine and hands that over to …, etc.

Conclusion

It seems the assembly line approach has won, but not in its Tayloristic form. I might do the wealth of experience and research on this subject an injustice by oversimplifying here, but to me it seems that two frameworks for achieving the desired quality and cost when using an assembly line have emerged:

  1. The Toyota way: The key to quality and cost goals is that everybody cares and that everybody counts. Everybody is concerned about global quality and looks out for improvements, and everybody has the right to ‘stop the line’ if there is a concern. The management layer underpins this by focusing on the long-term goals such as the global quality vision and the learning organization.
  2. Teams: A multi-functional team follows the product from start to finish. This requires a wider range of skills in a worker so it entails higher labour costs. The benefit is that there is a strong ownership which leads to higher quality and continuous improvements.

The approaches are not mutually exclusive and in software development we can actually see both combined in various agile techniques:

  • Continuous improvement is part of Scrum and Lean for Software methodologies.
  • It’s every team member’s responsibility if a commit fails in a pipeline step.

Conclusion

For parts interacting directly, it seems that unless we have an automatic deliberation mechanism we will need a ‘planned economy’, and that failed, right? And top-level organization needs to be complemented with grass-roots involvement or quality will suffer.

Summary

My take is that the top-level organization is superior, because you need to stress the holistic view. But it needs to be complemented with the possibility for steps to be improved without always having to consider the whole. This is achieved by having the team that uses the pipeline own it and management supporting them by using modern lean and agile management ideas.

Final note

It should be noted that many desirable general features of a system framework that can ease maintenance if rightly used, such as inheritance, aggregation, templating and cloning, are orthogonal to the organizational principle we talk about here. These features can actually be more important for maintainability. But my experience is that the organizational principle puts a cap on the level of complexity you can manage.

Marcus Philip
@marcus_phi