Using a Kaizen Blitz to improve security automation

In my current work with Pfizer, we are looking to automate as much as possible.

One of the ways we keep our applications secure is by scanning dependencies for vulnerabilities and updating them. We've put steps in place to automate as much of this as possible, but there are still improvements that we want to make.

Last week we automated more of this process during a Kaizen Blitz, a new way of working we adopted to focus on improving our automations.

Kaizen Blitz

The idea of a Kaizen Blitz is to put your current projects on pause and focus solely on a project that can improve your process. This work is time-boxed - that is to say, it has to be completed within a specific amount of time. In our case, that was two days.

A group of us were selected and we had 2 days to work on our automations. We were working on separate issues but were available to each other to put our heads together and draw on each other's skills for our own projects.

One of the security items I've been looking at recently is improving the automation around dependencies. It's a big project, so I decided to break it down and apply the changes to just one repository.

The big plan is to transform how we handle dependencies for all repositories.

So for these 2 days the plan was to move our dependency automations to GitHub Dependabot and adjust the GitHub Actions accordingly.

Let's take a look at the specific improvements we were looking to make by moving from Snyk to GitHub. Then we'll see how it went and whether we achieved it within the 2 days.

Moving from Snyk to GitHub Security

We use Snyk across all our repositories to scan dependencies for vulnerabilities.

Ongoing proactive scanning of your repositories finds vulnerabilities in your dependencies. Your dependencies are checked against a vulnerability database to find known security issues. The aim is to reduce the load on the development team and increase the security of the application.

Whether it's a project under active development or a legacy project, a proactive scan spots vulnerabilities all the same and provides remediation to keep the application secure.

Let's take a look at how we've been setting this up, some of the lingo, and processes. Then we'll look at new features GitHub introduced in late 2020 and how they can improve our security.

Automating dependency updates with Snyk

Snyk is the security service we have used that scans the manifest files for vulnerabilities. Any of the dependencies that have a vulnerability are flagged and if a fix (remediation) is available then this is offered to us.

For our Laravel and Vue projects the manifest files are:

  • package.json
  • package-lock.json
  • composer.json
  • composer.lock

If step one is to discover the vulnerability, the second step is to create a Pull Request (PR) that bumps the version of the package up to a secure version (if a fix is available).

This isn't complicated and can be applied to other setups with different manifest files.
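The two steps above (discover the vulnerability, then propose a bump to a secure version) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Snyk or Dependabot work internally: the `Advisory` record, the `find_remediations` function and the package names are all hypothetical, and versions are assumed to be plain dotted numbers.

```python
from typing import NamedTuple


class Advisory(NamedTuple):
    """Hypothetical advisory record: a package and its first fixed release."""
    package: str
    patched_version: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn "1.2.3" into (1, 2, 3). Assumes plain numeric versions."""
    return tuple(int(part) for part in version.split("."))


def find_remediations(
    installed: dict[str, str], advisories: list[Advisory]
) -> dict[str, str]:
    """Map each vulnerable package to the version a PR should bump it to."""
    bumps: dict[str, str] = {}
    for advisory in advisories:
        current = installed.get(advisory.package)
        if current is None:
            continue  # package is not in our manifest
        patched = parse_version(advisory.patched_version)
        if parse_version(current) >= patched:
            continue  # already on a secure version
        # If several advisories match, keep the highest patched version.
        best = bumps.get(advisory.package)
        if best is None or parse_version(best) < patched:
            bumps[advisory.package] = advisory.patched_version
    return bumps


# Versions as they might be read from a lock file (made-up packages).
installed = {"example-http": "2.4.1", "example-utils": "1.0.5"}
advisories = [
    Advisory("example-http", "2.4.3"),
    Advisory("example-utils", "1.0.5"),
    Advisory("not-installed", "9.9.9"),
]
remediations = find_remediations(installed, advisories)
```

Here only `example-http` needs a PR, since `example-utils` is already on the patched version and `not-installed` is not in the manifest.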

We use a Laravel instance with a cron task scheduled to populate Snyk with our GitHub repositories. Snyk uses its GitHub integration to create PRs with suggested remediations.

This is a robust setup but there were a few workarounds that were not ideal:

Snyk doesn't have full Composer support - currently, vulnerabilities are found but its GitHub integration does not create the PR.

For any PHP project, this is a big issue.

We used GitHub Actions as a workaround and could update our Composer dependencies using an action, but it was far from the integrated system that we wanted.

Another consideration was improving the remediation rate - that is, how many of the discovered vulnerabilities were actually fixed, and how we could measure it.

Snyk has an API so you can get the data and get your insights from there. Ideally, there would be a more integrated organizational overview.
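As a rough illustration of what "remediation rate" means here, the sketch below computes the share of discovered vulnerabilities that have been fixed. It does not call the Snyk API; the `Vulnerability` record and its fields are hypothetical stand-ins for whatever data you pull out.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    """Hypothetical record for one discovered vulnerability."""
    identifier: str
    fixed: bool


def remediation_rate(vulnerabilities: list[Vulnerability]) -> float:
    """Return the fraction of vulnerabilities that have been fixed."""
    if not vulnerabilities:
        return 0.0  # nothing discovered, so nothing to remediate
    fixed = sum(1 for vulnerability in vulnerabilities if vulnerability.fixed)
    return fixed / len(vulnerabilities)


# Made-up sample: three of four vulnerabilities have a merged fix.
sample = [
    Vulnerability("VULN-1", fixed=True),
    Vulnerability("VULN-2", fixed=True),
    Vulnerability("VULN-3", fixed=False),
    Vulnerability("VULN-4", fixed=True),
]
rate = remediation_rate(sample)
```

Tracking this number per repository over time is one way to spot PRs that are getting blocked.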

One reason for measuring this was our concern that some PRs were getting blocked. Often a PR would not provide enough detail and would require a login to the unwieldy User Interface (UI) of the Snyk dashboard to learn about the vulnerability.

On smaller projects, this is not such a problem, but when working with large enterprise teams it is not ideal.

It's all very well having your vulnerabilities discovered, but the whole process from discovery through merging a fix needs to be effective for your security to be strong.

Well, it seems GitHub paid attention to these very snags and had ideas on how to improve security.

Exciting things happening as GitHub moves into Security in 2020

In 2019 GitHub acquired Dependabot and partnered with WhiteSource - two established companies in the security space. GitHub was now looking seriously at security and providing tools natively within its platform, and Dependabot is now free for all.

Let's look at some of the benefits:

  • Supports Composer remediation PRs
  • Actionable security data sits directly in the GitHub UI, which should improve the remediation rate
  • A single GitHub API for repositories and security
  • Security data can be queried through both the GraphQL and REST APIs
  • GitHub Actions
    * The automerge actions that exist for Dependabot provide more granular control, such as only allowing minor and patch version bumps to be automerged.
    * There is a more extensive ecosystem of actions and workflows that we can keep drawing on.
  • Compatibility check
    * GitHub uses the success rates of its remediations across open source projects to determine whether a bump is likely to succeed. It's a powerful way to assess whether a remediation should be merged.
  • Organisational visibility of security

Cons:

  • Not as mature a vulnerability database
  • Using GitHub as a single point of failure

I think the list above shows clearly how GitHub addresses some of the areas that were lacking in the Snyk integration.

It's a low-friction, high-value solution, and we can expect to see further innovation as the team integrates security further into the platform.

GitHub Actions

Security isn't the only thing moving to GitHub. We are also in the process of moving from Travis to GitHub Actions to manage the continuous integration process.

GitHub Actions are super flexible and can be used in several ways. Let's quickly look at how we can use them to auto-merge our Dependabot PRs.

We enforce high test coverage so we can be confident that code should be shipped if the following conditions are met:

  • All tests pass
  • Dependencies are only bumped by minor versions or bug-fix (patch) releases

With this in place, we can proactively and automatically update and deploy security fixes, ensuring secure applications whilst keeping PR overload to a minimum.
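The two conditions above can be expressed as a small decision function. This is a sketch of the logic only, not the GitHub Action we use: `bump_type` and `should_automerge` are hypothetical names, and versions are assumed to follow a strict three-part `major.minor.patch` scheme.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split "1.2.3" into (1, 2, 3). Assumes exactly three numeric parts."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_type(old: str, new: str) -> str:
    """Classify a version change as "major", "minor", "patch" or "none"."""
    old_version, new_version = parse_version(old), parse_version(new)
    if new_version <= old_version:
        return "none"  # same version or a downgrade
    if new_version[0] != old_version[0]:
        return "major"
    if new_version[1] != old_version[1]:
        return "minor"
    return "patch"


def should_automerge(old: str, new: str, tests_passed: bool) -> bool:
    """Merge only when tests pass and the bump is minor or patch."""
    return tests_passed and bump_type(old, new) in {"minor", "patch"}


# A patch bump with green tests merges; a major bump never does.
patch_ok = should_automerge("1.2.3", "1.2.4", tests_passed=True)
major_blocked = should_automerge("1.2.3", "2.0.0", tests_passed=True)
```

Major version bumps are left for a human to review, since they are the ones most likely to contain breaking changes.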

Setting ourselves up for further security improvements

Alongside Dependabot there is also GitHub Advanced Security. This is a paid addition with more thorough tooling.

By taking the steps to use Dependabot we are putting down foundations to roll out these more advanced features into our workflow.

These include:

  • Static semantic code analysis (code scanning)
  • Secret scanning
  • Dependency reviews

We'll write more on these in an upcoming blog. In the meantime, we hope you'll enjoy the benefits of fewer PRs and more secure applications throughout 2021 and beyond.

How did the Kaizen Blitz go?

The move from Snyk to GitHub Dependabot was seamless. The new GitHub Actions were written and live, but not all of them were working.

On the day of the blitz GitHub pushed a change to how GitHub Actions could handle secrets. It had been announced on their blog at the start of the month but was only implemented on the day we started the blitz, and it had some big ramifications for how we had to structure our actions.

Our timing was not great! Unfortunately, alongside the change they also introduced a new bug, which caused us to chase our tails a bit, but by Friday morning the Dependabot Product Manager announced that a fix had been released. Result!

However, we hadn't been the only ones affected, and the actual GitHub website started to falter. The site just was not loading, and there came a point where we had to sit on our hands and wait for GitHub to stabilise.

We worked together in Teams, sharing our frustrations and helping each other see new ways to resolve our problems.

I would say this was the key benefit of working in a Kaizen Blitz: the ability to rally your coworkers together when you needed them. We already work in a collaborative and supportive way, but because we were all briefed on what each of us was doing and had a dedicated Teams room to chat in, it really helped us complete our respective tasks.

When the site was stable again we were able to push our fixes. I'm happy the new setup is now stable and operational.

There are still some issues in getting the automerge GitHub Action to work, and we are watching the relevant Dependabot issue for any available fixes.

Once that final step has been completed we can then look at pushing this change out across the platform.