Top class Continuous Delivery in AWS

Last week Diabol arranged a workshop in Stockholm where we invited Amazon together with Klarna and Volvo Group Telematics, who both practise advanced Continuous Delivery in AWS. These companies are in many ways pioneers in this area, as there is little in terms of established practices. We want to encourage and facilitate cross-company knowledge sharing and take Continuous Delivery to the next level. The participants have very different businesses, processes and architectures but still struggle with similar challenges when building delivery pipelines for AWS. Below follows a short summary of some of the topics covered in the workshop.

Centralization and standardization vs. fully autonomous teams

One of the most interesting discussions among the participants wasn’t even technical but covered the differences in how they are organized and how that affects their work with Continuous Delivery and AWS. Some come from a traditional functional organisation and have placed their delivery teams somewhere in between the development teams and the operations team. The advantage is that they have been able to standardize the delivery platform to a large extent and have a very high level of reuse. They have built custom tools and standardized services that all teams are more or less forced to use. This approach depends on staying at least one step ahead of the dev teams and being able to scale out to many dev teams without increasing headcount. One problem with this approach is that it is hard to build deep AWS knowledge in the dev teams, since they feel detached from the technical implementation.

Others have a very explicit strategy of team autonomy, where each team is basically in charge of its complete process all the way to production. In this case each team must have quite deep competence in both AWS and how the delivery pipelines are set up. The production awareness is extremely high and you can, for example, visualize each team’s cost of AWS resources. One problem with this approach is a lower level of reusability and difficulties in sharing knowledge and implementations between teams.
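As a side note, this kind of per-team cost visibility can be built on top of AWS’s own billing data. Below is a minimal sketch using boto3’s Cost Explorer client, assuming resources carry a cost allocation tag such as "team"; the tag key and the time period are assumptions for illustration, not anyone’s actual setup.

```python
# Minimal sketch: summarize one month's AWS cost per team, assuming resources
# are tagged with a cost allocation tag named "team" (the tag key is an assumption).
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2015-11-01", "End": "2015-12-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    team = group["Keys"][0]  # e.g. "team$checkout"
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{team}: ${cost:.2f}")
```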

Both of these approaches have pros and cons, but in the end I think fewer silos and more team empowerment wins. If you can manage that and still build a common delivery infrastructure that scales, you are in a very good position.

Infrastructure as code

Another topic that was thoroughly covered was the different ways to deploy both applications and infrastructure to AWS. CloudFormation is popular and very powerful but has its shortcomings in some scenarios. One participant felt that CF is too verbose and noisy and has built their own YAML configuration language on top of it. They have been able to do this because they have strongly standardized their micro-service architecture and the deployment structure that follows from it. Other participants felt the same pain with CF being too noisy and have moved a large portion of the configuration out of the stack templates into Ansible, leaving just the core infrastructure resources in CF. This also allows them to apply different deployment patterns and more advanced orchestration. We also briefly discussed third-party tools, e.g. Terraform, but the general opinion was that they all have a hard time keeping up with new AWS features. On the other hand, if you have infrastructure outside AWS that needs to be managed in conjunction with what you have in AWS, Terraform might be a compelling option. Both participants expressed that they would like to see some kind of execution plan / dry-run feature in CF, much like Terraform has.
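To illustrate the idea of a thin configuration layer on top of CF, here is a hypothetical sketch (not the participant’s actual tool) that expands a compact YAML service spec into a more verbose CloudFormation template; the spec format, resource choices and defaults are assumptions made purely for illustration.

```python
# Hypothetical sketch of a thin YAML layer on top of CloudFormation: a compact
# service spec is expanded into a full (and much more verbose) CF template.
import json
import yaml  # PyYAML

SPEC = """
service: checkout
instance_type: t2.micro
ami: ami-12345678
min: 2
max: 4
"""

def expand(spec: dict) -> dict:
    """Expand the compact spec into a CloudFormation template."""
    name = spec["service"].capitalize()
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            f"{name}LaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": spec["ami"],
                    "InstanceType": spec["instance_type"],
                },
            },
            f"{name}Asg": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "LaunchConfigurationName": {"Ref": f"{name}LaunchConfig"},
                    "MinSize": str(spec["min"]),
                    "MaxSize": str(spec["max"]),
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                },
            },
        },
    }

template = expand(yaml.safe_load(SPEC))
print(json.dumps(template, indent=2))  # feed this to CloudFormation create-stack
```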

Docker on AWS

Use of Docker is growing quickly right now and was, not surprisingly, a hot topic at the workshop. One participant described how they deploy their micro-services in Docker containers, with the obvious advantage of being portable and lightweight (compared to baking AMIs). This is however done on stand-alone EC2 instances using a shared common base AMI, not on ECS, an approach that adds redundant infrastructure layers to the stack. They have just started exploring ECS and it looks promising, but questions around how to manage centralized logging, monitoring, disk encryption etc. are still open. Docker is a very compelling deployment alternative, but both Docker itself and the surrounding infrastructure need to mature a bit more; for example, docker push takes an unreasonably long time and easily becomes a bottleneck in your delivery pipelines. Another pain point is the need for a private Docker registry, which at this level of continuous delivery needs to be highly available and secure.
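For reference, here is a minimal sketch of what an ECS-based deployment step could look like with boto3: register a new task definition revision pointing at a freshly pushed image and roll a service over to it. The cluster, service and image names are made up for illustration.

```python
# Minimal sketch of deploying a containerized micro-service on ECS with boto3.
# Cluster, service and image names are assumptions for illustration only.
import boto3

ecs = boto3.client("ecs")

# Register a new task definition revision pointing at the new image.
task_def = ecs.register_task_definition(
    family="checkout-service",
    containerDefinitions=[
        {
            "name": "checkout",
            "image": "registry.example.com/checkout:1.4.2",
            "memory": 512,
            "cpu": 256,
            "portMappings": [{"containerPort": 8080, "hostPort": 0}],  # dynamic host port
        }
    ],
)

# Roll the service over to the new revision; ECS replaces tasks gradually.
ecs.update_service(
    cluster="prod",
    service="checkout-service",
    taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
)
```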

What’s missing?

The discussions also identified some feature requests for Amazon to bring home. For example, we discussed security quite a lot and got into the technicalities of IAM roles, accounts, security groups etc. It was expressed that there might be a need for explicit compliance checks and controls as a complement to cruder methods such as penetration testing. You can certainly do this by extracting information from the APIs and processing it according to your specific compliance rules, but it would be nice if there was a higher level of support for this from AWS.
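As an example of such an API-driven check, here is a minimal sketch with boto3 that flags security groups allowing SSH from anywhere; the rule itself is just a placeholder for whatever your own compliance policy dictates.

```python
# Minimal sketch of an API-driven compliance check: flag security groups that
# allow SSH (port 22) from 0.0.0.0/0. The rule is an example, not a full policy.
import boto3

ec2 = boto3.client("ec2")

violations = []
for sg in ec2.describe_security_groups()["SecurityGroups"]:
    for rule in sg.get("IpPermissions", []):
        open_to_world = any(
            ip_range.get("CidrIp") == "0.0.0.0/0" for ip_range in rule.get("IpRanges", [])
        )
        if rule.get("IpProtocol") == "-1":
            covers_ssh = True  # "all traffic" rules cover every port
        else:
            from_port, to_port = rule.get("FromPort"), rule.get("ToPort")
            covers_ssh = from_port is not None and from_port <= 22 <= to_port
        if open_to_world and covers_ssh:
            violations.append(sg["GroupId"])

for group_id in violations:
    print(f"Security group {group_id} allows SSH from 0.0.0.0/0")
```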

We also discussed canary releasing and A/B testing. Since these are becoming common practice, it would be nice if Amazon could provide more services to support them, e.g. content-based routing and more sophisticated analytics tools.
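Until such services exist, one crude way to approximate canary releasing with what AWS already offers is weighted Route 53 record sets that send a small share of traffic to a canary stack. A minimal sketch follows, where the hosted zone id, record name and ELB endpoints are assumptions.

```python
# Minimal sketch of canary releasing via Route 53 weighted record sets:
# ~5% of DNS lookups resolve to the canary stack, the rest to stable.
# Hosted zone id, record name and ELB endpoints are assumptions.
import boto3

route53 = boto3.client("route53")

def weighted_record(identifier: str, endpoint: str, weight: int) -> dict:
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com.",
            "Type": "CNAME",
            "SetIdentifier": identifier,
            "Weight": weight,
            "TTL": 60,
            "ResourceRecords": [{"Value": endpoint}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z1EXAMPLE",
    ChangeBatch={
        "Changes": [
            weighted_record("stable", "stable-elb.eu-west-1.elb.amazonaws.com", 95),
            weighted_record("canary", "canary-elb.eu-west-1.elb.amazonaws.com", 5),
        ]
    },
)
```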

Next step

All in all I think the workshop was very successful, and the discussions and experience sharing were valuable to all participants. Diabol will continue to push Continuous Delivery maturity in the industry by arranging meetups and workshops and by involving more companies that can contribute to and benefit from this collaboration.

 

A kind of Scrum

When I talk to different companies in the software industry about how they work, I often hear expressions like “we use a sort of Scrum” or “we have our own version of Scrum”. Statements like that set off a warning bell for me. Agile and Scrum are buzzwords that most companies like to boast about using, especially since many system developers prefer to work that way. But how much can you tweak Scrum and still get the benefits that we proponents promise?

It is true that “agile” means easy to change, and that agile development is largely based on adapting your approach to the feedback you receive. In all agile methods, the aim is to have frequent and short feedback loops. But it is a common misunderstanding that Agile is so lightweight that you can easily change the methods as you like. It is not the frameworks that are agile; they make you agile if applied properly.

It is easy to introduce scrum meetings each morning. It’s easy to have planning and retrospective meetings. It is easy to put up a Scrum board for all to see. But it’s hard to get all the pieces to work together as a whole, and to get the entire organization to be permeated by the agile values:

  • Deliver often
  • Respect people
  • Respond quickly to changes

Scrum is often implemented in isolation in a development department, and the change is often driven by the developers on the floor. This is perhaps not surprising, since the movement was built by developers, and it implies an inherent shift of power from traditional managers down the organization to the developers.

Such initiatives from below often encounter obstacles and resistance when trying to fit the agile way of working into an organization that is not prepared for it. That’s when you easily begin to stretch the agile values and create “our own variant of Scrum”. What happens is that the transparency of Scrum exposes dysfunctions in the organization, but instead of resolving the root causes, you change the process and make workarounds, and thus continue to hide the root causes. This pattern is so common that it has a name, ScrumBut, and the effect is often that the team doesn’t deliver a “potentially deliverable product” each sprint.

For less mature organizations, the best advice is to adhere strictly to a methodology such as Scrum, to have something to hold on to, and the result can be relatively good anyway. Scrum as described in The Scrum Guide is a very mature approach that has evolved and been refined over two decades. More mature organizations know what effects changes to the process will cause and therefore have more freedom to stretch the methods to their own advantage.

Change can be costly, not least in the form of an altered balance of power, but there is a lot to gain by getting everybody involved and pulling together. A good agile organization is like a Formula 1 car: it drives fast and responds to the slightest input from the driver, but it also has a well-drilled crew in the pits, ready to fix anything that might happen during the race.

If you intend to introduce Scrum in your organization, and unless you are really mature, you would do well to stick to Scrum by the book and be wary of all temptations to deviate. Take them as a signal that some things in the organization are not working optimally, and try to fix the root causes. We are all beginners at first, and imitation can be an effective way to start a change.