How tools have their own workflow

When we develop libraries and tools, we define various means of interacting with them. We do this by thinking about use cases, and so we build in assumptions about how we expect our target users to typically interact with our work. In this sense everything built has an intended way of being used: a workflow.

When we want to achieve a goal, we should be asking ourselves how we want to solve the problem and looking for things which enable that approach. If we are constrained in the tools/libraries we can use, we should focus on how those tools are designed to solve the problem.

The problem I've found is that a lot of teams chase workflows which make no sense when combined with their choice of tools.

For me a great example of this is the GitFlow branching strategy, because I have seen so many teams independently repeat exactly the same mistake. I'm going to take you through a bit of history to explain the issues.

Trunk Based Development

Before git existed, most source control management solutions were used with a variation of 'Trunk Based Development'. This means there was a single branch called 'trunk', typically held on a server. Every developer would check out 'trunk' on to their machine, make their changes and then try to merge them into 'trunk'. At a set point you would then try to create a release from 'trunk'.

Tools to manage merge conflicts between files were fairly primitive, which encouraged developers to check their code in fairly regularly. The issue here is that developers would check in partially completed work, which often left 'trunk' in an unbuildable state.

To get a release out you would have to dedicate time to debugging where the solution was broken and applying hot-fixes, so you could cut out the broken new functionality or solve the new bugs it introduced. This is how I got started with DevSecOps: I was the "build guy" on a team, and each Friday my job was to get a working build out of Hudson & SVN and deploy it to our test rig (reference environment).

There was a second problem: to perform a release you might have to ban other developers from checking in for a short time. Once the release was performed, they could push in their changes. If the release failed when tested, the affected areas of the code base might have changed enough that simple fixes were impossible.

GitFlow (original)

I've noticed GitFlow documentation has evolved over time, so I'm going to start by explaining how it was first defined.

GitFlow builds on trunk based development. We have a 'develop' branch that developers can immediately check into, and this branch operates in a similar way to 'trunk'.

Developers perform regular git pulls, merge their changes into 'develop' and then push the result back into the remotely held repository.

The 'main' branch is an attempt to fix the issues in trunk based development. It introduces the principle that the 'main' branch should always be releasable. When you are ready to perform a release you merge the contents of the 'develop' branch into the 'main' branch. The merge step is a gate to ensure you don't break the 'main' branch.

This allows developers to keep pushing into 'develop' while you have a snapshot of the code from the release, so if the resulting artefact fails, you can apply hot-fixes directly on to the 'main' branch.

For anyone with a background in trunk based development, GitFlow makes a great deal of sense. For new developers this workflow is very straightforward. However, with Git there is another branching strategy...

Feature Branch

The feature branch workflow has all new work performed in a "feature" branch.

The idea is that a developer creates a branch from a point on the main branch and then makes the required changes in their branch. At some point a "Pull Request" is made (GitLab calls this a "merge request").

A pull request typically involves peer review of the changes and 'build verification'. Build verification involves a CI system attempting to build the feature branch, typically running a number of code linters plus unit and integration tests.

The fact that each developer works in a branch means you don't have to freeze check-ins while releasing. The gates (peer review & build verification) mean the 'main' branch will always be releasable.

The feature branch workflow solves many of the flaws with trunk based development, but it does have one flaw.

Developers branch from a point in main. If two developers modify the same lines in a file, one of them will have to rebase their branch on to a newer point in main in order to merge their changes in. This can have a knock-on effect on every single commit within their branch. The mitigation is to keep feature branches small and short-lived.
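To make the flaw concrete, here is a minimal sketch of how two branches end up colliding. It is a deliberate simplification and not how git itself merges (git works on a three-way diff of hunks). The `Change` tuple and the `conflicting_changes` function are names I have made up for illustration.

```python
# A change is (file path, first line, last line), with both line numbers inclusive.
# This shape is an assumption made for the sketch, not a git data structure.
Change = tuple[str, int, int]


def conflicting_changes(mine: list[Change], theirs: list[Change]) -> list[tuple[Change, Change]]:
    """Return every pair of changes that touch overlapping lines in the same file."""
    conflicts = []
    for a in mine:
        for b in theirs:
            same_file = a[0] == b[0]
            # Two inclusive ranges overlap when each starts before the other ends.
            overlap = a[1] <= b[2] and b[1] <= a[2]
            if same_file and overlap:
                conflicts.append((a, b))
    return conflicts


# Two developers branched from the same point in main.
alice = [("app.py", 10, 20), ("README", 1, 2)]
bob = [("app.py", 18, 25), ("README", 5, 6)]

# Only the app.py edits overlap, so whoever merges second has to rebase.
print(conflicting_changes(alice, bob))
```

The longer a branch lives, the more changes pile up in both lists and the more likely an overlap becomes, which is why small, short-lived branches are the mitigation.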

The Tools!

So at this point I've outlined a few different branching strategies for working with source control management. The question we should ask is: do any of the tools support these branching strategies?

Issue Trackers

No matter what development methodology you work under (e.g. agile scrum, kanban, waterfall, etc.), tasks are written down in an issue tracker and assigned to a developer. If we go into the various issue trackers, we notice one button exists on all of them: 'create branch'.

The create branch option has been embedded into all issue trackers and is normally placed near the assignment section of a ticket. This prominent positioning means a lot of developers will start creating branches because it is clearly what you are supposed to do. This creates a feature branch for the task.

Source Control Management

The next question is which options are given prominence in the layout of the source control tools?

A Pull Request menu is provided by all three (Bitbucket, GitHub and GitLab) and placed near the top. This tells us the source control management solutions expect developers to follow the feature branch branching strategy and have focussed on making that process accessible.

At this point we've looked at Bitbucket, Jira, GitHub & GitLab, and all are herding us towards a Feature Branch approach. Now, assuming I am biased towards Feature Branch, let's think about what we would want in order to support a GitFlow way of working.

Branch Tracking

GitFlow has two branches, 'develop' and 'main'. These branches are different from feature branches as they live for the duration of the project. We would expect a means to easily track the points we can merge across from each branch, or possibly a way to indicate their status.

Yet when we look into Bitbucket, GitHub & GitLab, there is nothing. All three SCMs allow you to define a default branch (e.g. 'develop'). All three allow you to build branch rules (e.g. only user X can push to branch Y). Each tool has a 'branches' view which lists the commits ahead of/behind the default branch, but there is no concept of needing to track against multiple branches.
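As a rough illustration of what tracking against multiple branches could look like, here is a small sketch over a toy commit graph. The graph, the branch tips and the function names (`ancestors`, `ahead_behind`, `track_against`) are all invented for this example. On a real repository I would expect `git rev-list --left-right --count main...feature` to give the equivalent pair of numbers for one pair of branches.

```python
def ancestors(parents: dict[str, list[str]], tip: str) -> set[str]:
    """Collect every commit reachable from `tip`, including `tip` itself."""
    seen: set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def ahead_behind(parents: dict[str, list[str]], branch_tip: str, base_tip: str) -> tuple[int, int]:
    """Return (commits only on the branch, commits only on the base)."""
    mine = ancestors(parents, branch_tip)
    theirs = ancestors(parents, base_tip)
    return len(mine - theirs), len(theirs - mine)


def track_against(parents, tips: dict[str, str], branch: str, bases: list[str]) -> dict[str, tuple[int, int]]:
    """Report ahead/behind for one branch against several long-lived branches."""
    return {base: ahead_behind(parents, tips[branch], tips[base]) for base in bases}


# Toy history: a <- b is 'main'; c and d were added on 'develop'; e is a feature commit cut from c.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["c"]}
tips = {"main": "b", "develop": "d", "feature": "e"}

print(track_against(parents, tips, "feature", ["develop", "main"]))
```

The 'branches' views in the tools only ever compute this against the one default branch, whereas a GitFlow team cares about the answer for both 'develop' and 'main'.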

Pre-commit Hooks

Whenever I have worked in a team following trunk based development, we would have a pre-commit hook in place. Typically this was a script which added a small box to the commit screen asking for the Issue Identifier. The script would confirm the issue identifier was valid and append it to the commit message. When bisecting the commit history for a breaking change, or trying to understand why a specific block of code was written in a specific way, this information allowed us to see the issue which drove the change.
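The core of such a hook is small. Below is a minimal sketch of the validation and tagging logic only, not the hook we actually ran. The `PROJ-123` style key format, the `Issue:` trailer, the `known_issues` set (a stand-in for a lookup against the issue tracker) and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical issue key format such as "PROJ-123"; adjust to match your tracker.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def tag_commit_message(message: str, issue_id: str, known_issues: set[str]) -> str:
    """Return the message with the issue identifier appended, or raise ValueError."""
    if not ISSUE_KEY.fullmatch(issue_id):
        raise ValueError(f"malformed issue identifier: {issue_id!r}")
    # known_issues stands in for a real query against the issue tracker.
    if issue_id not in known_issues:
        raise ValueError(f"unknown issue identifier: {issue_id!r}")
    if issue_id in ISSUE_KEY.findall(message):
        return message  # already tagged, leave the message untouched
    return f"{message.rstrip()}\n\nIssue: {issue_id}\n"


known = {"PROJ-123"}
tagged = tag_commit_message("Fix login timeout", "PROJ-123", known)
print(tagged)
```

Rejecting the commit when the identifier is malformed or unknown is what makes the history trustworthy later: every commit you land on while bisecting points at a real ticket.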

I joined a team using Gerrit and they had a similar pre-commit hook. This was because Gerrit will break a feature branch into its constituent commits and perform a pull request for each one. Keeping track of what each commit relates to can get tricky, and including the issue identifier mitigated the problem.

Looking into the tools, Bitbucket provides a 'Hooks' feature which can provide some protection features (e.g. GPG signing). GitHub lacks a 'hooks' capability, but places the functionality we find in Bitbucket under 'branch protection'.

GitLab supports 'Server Hooks' specifically to add this kind of functionality.

What does this mean?

The tools encourage their users to follow a feature branch branching strategy. While the tools don't actively block you from following a GitFlow branching strategy, the only functionality to support GitFlow is 'server hooks', found on GitLab. Any team following GitFlow will be bending features designed for the feature branch branching strategy.

I originally wrote a presentation on this topic in 2016, when I kept consulting with teams which claimed to be following GitFlow and yet feature branches had appeared in their workflow.

Using pull requests as a gate into 'develop' ensures the 'develop' branch will always remain buildable. Since the 'develop' branch is buildable, the 'main' branch becomes a superfluous extra step. Someone has to set up a pull request into 'main', then ask a CI system to release the 'main' branch, and then someone has to set up a pull request to go back into 'develop'.

Inevitably whoever controls this process starts performing releases directly on 'develop', and the 'main' branch soon falls hundreds of commits behind 'develop'. Everyone on the team remains adamant they are following GitFlow and not a Feature Branch branching strategy.

GitFlow (Revised)

When I started writing this blog I was surprised to find that the SCM guides from Atlassian actually describe GitFlow as including feature branches. In fact every description I could find included the concept of feature branches.

Hopefully at this point you now understand that mixing feature branches with the original GitFlow design has created a double set of gates, which are both designed to achieve the same thing. This is additional complexity (technical debt), and the need to merge code from 'develop' into 'main' adds more manual steps to the release process.

Some other great examples:

Using bower/grunt.js for front end dependencies
Using the terraform 'userdata' to define an EC2 configuration
CI to run Chef/Puppet/Ansible tasks
