Organisations
There is a natural human drive towards consolidation over exploration. Inherently we can feel that successes are something to be protected, perhaps to build on, but never to take risks with. This is an easy mindset to develop, but why is it dangerous? At first glance, there are many positive aspects, especially for close-horizon time scales. But across longer periods of time, the negatives significantly outweigh those positives. We find ourselves favouring entrenchment over mobility, and change and innovation become suppressed. So how can we optimise what we have, explore new possibilities, and stay future-focused?
December 1, 2021
Organisations
There is a natural human psychology that favours consolidation over exploration. Inherently we can feel that successes are something to be protected, perhaps to build on, but never to take risks with. This is an easy mindset to develop, but why is it dangerous? At first glance, there are many positive aspects, especially for close-horizon time scales. But across longer periods of time, the negatives significantly outweigh those positives. We find ourselves favouring entrenchment over mobility, and change and innovation become suppressed. So how can we optimise what we have, explore new possibilities, and stay future-focused?
April 1, 2021
Organisations
Our technology growth has taken us on a fantastic journey. Within the last 25 years we have gone from wired phones in our houses to powerful computers strapped to our wrists. We can transfer money to the other side of the world in fractions of a second. Whole autonomous organisations can be constructed from smart contracts that live distributed across the digital world. I can authenticate to my bank through facial recognition. My favourite websites know me better than I know myself, and can act as smart agents, predicting what I would like, where I would want to go, how I might vote… This digital world is one of data and machines, locations, behaviours, machine learning predictions, automated interactions. These abilities give us the power to do great things, but in doing so we have also achieved the power to do immense damage. How do we navigate this new world and maintain integrity? How do we cross ever-more amazing frontiers without losing our ethical direction?
March 31, 2021
Organisations
One of the most common challenges as businesses transform more of their traditional capabilities into digital ones is the breadth and depth of the change itself: core changes to the organisational structure, processes, and culture. The functional components and interactions of these aspects of a large organisation help to define what we mean when we talk about complex systems. But it is also when we look through the lens of complex systems that we can get a different vision of change, seeing it not so much as a disruption as a possible instrument of stability and predictability. The rapidity of change is not something to be feared but something that can be embraced as a stabilising force. Complex systems concepts span a broad horizon as an abstraction of the behaviours of many disparate areas, such as biological, computational and societal systems composed of many parts. Here we look at one small technological part and how understanding more of its behaviours as a complex system affects the ways we can view it and work in digital environments. To do this we’ll start by looking at the humble CI/CD pipeline…
September 20, 2018
Organisations
If anything defines the business landscape in the modern world over the last few years it is the increasing sophistication of technology, the ever-quickening pace, complexity, scale of data, and dropping of costs. The power of the tools now available to organisations is incredible. With one click we can add massive data lakes, machine learning, and personal AI assistants, let alone the day-to-day underlying traditional compute uses we are more familiar with. What also seems clear, however, is that in the majority of cases we aren’t able to keep pace with the availability of technologies. Amazon and Google scale companies, and others whose businesses are essentially reliant on staying at the front edge of technology, are able to harness the power of new functionality; their survival requires it. But what of the rest of the businesses out there, those whose primary driver is not necessarily technological?
January 10, 2018
security
“I’ve got news for Mr. Santayana: we’re doomed to repeat the past no matter what. That’s what it is to be alive.” - Kurt Vonnegut Jr Whether it’s passwords to access external services, API keys, or other forms of credentials, we not only know that our applications need them, but we also know that they are, in reality, highly likely to be exposed beyond the security boundaries we define for them. Most commonly the exposure will come from human error: keys committed to a GitHub repository 1,2, incorrect permissions on an S3 bucket 3,4,5 and so on.
November 14, 2017
Cloud
Practical AWS for Large Organisations
Table of Contents
Overview
  1.1. Service Catalogs
  1.2. Automated Push Security
  1.3. Standardised Support Wrapper Patterns
  1.4. Alignment to Industry Standards
  1.5. Scalable Account Management
Accounts Structure
  2.1. Landing Zone
  Master Organisation Account
  Cross-Account Management Account
  Shared Services Account
  Security and Audit Account
  Billing Account
Pipelined Data Flows and Reactive Architecture
Central Services
  4.1. Base Infrastructure
  4.2. Persistence Infrastructure
  4.3. Service Infrastructure
  Deploy pipeline
  DNS Routing
  Load Balancer Routing
  Compute
Pluggable Compute Solutions
  5.1. Standard Quick Start Products
  5.2. Commercial Marketplace Products
  5.3. Custom In-House Products
CI/CD Pipelines Solutions
Overview
There are a number of components that enable business AWS management at scale. The key is to bring them together in a consistent and coherent way through combining and enhancing AWS templates and professional services solutions. Those that are most useful for scaled management, such as multi-account landing zones and cross-account management, are available but not necessarily fully automated. However, using these as the basis of our scaled strategy allows building on good foundations for future improvement. Looking at the AWS account landscape as a whole we can break it down into a handful of important areas.
November 14, 2017
Cloud
Automated Credential Token with CloudFormation Custom Resource Lambdas The automated token template,
October 10, 2017
Cloud
Developing Cross Organisational Cloud Solutions at Scale with AWS Service Catalog What the problem looks like Large organisations can often develop into isolated fragments of technical development over time. Teams working in one part may not be aware of what others are doing, even in the same building. From a technical standpoint the result is at best sub-optimal, resulting in duplication of work and reinvention of the wheel. Velocity is low as disparate teams build cloud infrastructure foundations again and again before starting on their actual projects.
September 29, 2017
Cloud
Working with credentials within ECS and passing them around is not entirely straightforward. As one way of doing this, this solution bases all environment variable storage in the AWS Parameter Store, then automatically synchronises them with the running tasks in a set of specified ECS clusters and tasks.
August 8, 2017
Cloud
Moving government into the cloud turned out to be all about asking the right questions. The arguments against had been around for many years, and put doubt in the minds of those with more traditional attitudes to IT. Is the cloud secure, is data safe? Many of these questions were the result of the disparity in experiences and conceptual understanding of the change between running in-house servers and running cloud infrastructures. Luckily, as time has moved on, understanding and experience have moved in tandem, and these questions are not as commonplace. As we go from asking whether or not cloud technologies are the ‘right way’ for government we begin to ask more practical questions, such as, what is the ‘good way’ to run government in the cloud?
June 19, 2017
Cloud
This template deploys Elastic Beanstalk into a new VPC, specifically Amazon’s VPC architecture Quick Start VPC [100]. The instances are deployed into private subnets and the Application Load Balancer (ALB) into public subnets. A route is created into the service with the format <service_name>.<domain_name>. The template can utilise all of the standard Beanstalk backends: ruby, python, node, docker, ecs, tomcat, go, php, dotnet. AWS Certificate Manager backed HTTPS can be enabled but this will require an existing MX record in the hosted zone. [101] There are also options to add ElastiCache and some other options. In the case of AWS components such as ElastiCache, some standard environment variables are also created, such as REDIS_URL. [Elastic Beanstalk into a new VPC][1]
June 16, 2017
Cloud
Sometimes we just need a quick static site rather than anything elaborate, for example when we want to set up maintenance pages and Route 53 DNS failover for sites. A maintenance or backup site in particular is likely to be needed when there are availability zone or region problems in AWS. S3 static sites are incredibly easy, but we cannot set SSL certificates on the site, which is not ideal.
May 26, 2017
Organisations
Commercial businesses and governmental departments can be some of the largest and most diverse organisations in any country. No matter the strengths of their internal capability there can never be enough expertise and enough flexibility in staffing to cover all their activities and transformational projects. This is where the benefits of external suppliers and collaborations can come in.
February 16, 2017
Cloud
This post will walk through automating AWS Certificate Manager validation through a Simple Email Service (SES), S3 and Lambda pipeline.
February 11, 2017
DevOps
These principles are an overview of the culture, practices and motivations driving teams working in the WebOps profession.
December 26, 2016
Monitoring
An easy way to cycle EC2 instances where we have an Elasticsearch cluster running. As an example target we have two instances, i-11111111 and i-22222222, both running Elasticsearch as a cluster with replicas set to 2, so that each has a replica of the other’s primary indices. Add one to the auto-scaling group, increasing desired to 3. Wait for the new instance i-333333 to join the cluster.
November 12, 2016
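For the Elasticsearch cycling entry above, the "wait for the new instance to join the cluster" step can be reduced to a check on the cluster health response. This is only a minimal sketch, assuming the health document has already been fetched from the cluster health API and parsed into a dict. The function name `safe_to_remove_node` and the `expected_nodes` parameter are illustrative and do not come from the original post.

```python
def safe_to_remove_node(health: dict, expected_nodes: int) -> bool:
    """Return True when the cluster looks settled enough to retire an old node.

    `health` is assumed to be the parsed JSON of the cluster health API.
    `expected_nodes` is the node count once the new instance has joined
    (3 in the example above). Both names are hypothetical.
    """
    return (
        health.get("status") == "green"
        and health.get("number_of_nodes", 0) >= expected_nodes
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )
```

Polling this check until it passes, and only then removing an old instance, is one conservative way to avoid dropping a node while shards are still moving.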
Organisations
The natural and unmanaged formation of groups and sub-cultures in companies, as in broader society, is a well-established human behaviour, leading to many social and intellectual benefits. These include the diffusion and generation of information and ideas with other people, and the connecting of those who share an interest or view.
November 12, 2016
DevOps
What does it mean to deliver a service? Where does its value live? And, even if we know the answers, what do we actually practice in real life?
November 6, 2016
DevOps
This is the home of the various SaltStack errors and quirks that I come across, just a nice little bucket of frustrations so that I have a quick reference page that is not based on human memory 🙂
August 26, 2016
Multi-Agent Systems
In the field of modern cloud operations, multiple services continuously run on many different platforms, across a broad spectrum of hardware, network, and software environments. Broadly, they can be summarised as having the following properties,
June 12, 2016
Monitoring
Currently I’m using the logging setup of Beaver shipping logs into an ELK stack, and metrics with collectd shipping metrics into a Graphite stack. Now that Elastic have Beats that do both logging and metrics, it’s worth exploring further.
June 12, 2016
Monitoring
Just as you start off on a Monday morning, at 9:01am, there’s a page: that crucial, heavily used site is broken, and users are blocked from working and frustrated. What went wrong?
April 30, 2016
Monitoring
One of the main pressures around response to incidents is simply being overwhelmed with tasks. The outcome of so many demands and so much context-switching can easily be chaos, or poor quality quick-fixes. As with all real-time response, the key thing is to take a step back and triage the incoming requests as they arrive, prioritising those we need to deal with first, and deferring those that we can tackle later.
April 30, 2016
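The triage idea in the entry above maps naturally onto a priority queue. The following is a minimal sketch rather than anything from the original post: the `TriageQueue` class, its method names, and the numeric priority scale are all assumptions made for illustration.

```python
import heapq
import itertools
from typing import List, Tuple


class TriageQueue:
    """Hand out the most urgent request first; equal priorities stay in arrival order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def add(self, priority: int, request: str) -> None:
        # Hypothetical scale: a lower number is more urgent (1 = now, 3 = defer).
        heapq.heappush(self._heap, (priority, next(self._arrival), request))

    def next_request(self) -> str:
        # Pops the lowest priority number, then the earliest arrival.
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

Requests are added as they arrive and pulled one at a time, so deferred work stays in the queue instead of interrupting whatever is currently being handled.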
Cloud
We need to get security updates onto instances on live AWS services. So, what’s the best strategy? If we’re using the Amazon Linux AMI, then security updates are automatically applied on the initial boot of the AMI. So if we cycle our instances, we get a freshly updated EC2 instance.
March 5, 2016
Cloud
The goal here is to implement an instance cycling task, resulting in all current instances being replaced with new instances with no downtime. When working with auto-scaling groups, it’s important to remember that the auto-scaling group is in control! Simply rebooting will most likely spook the scaling group into replacing the downed instance.
March 5, 2016
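One possible shape for the instance cycling task above is to work with the auto-scaling group rather than around it: grow the group by one, wait for the new instance, then retire an old one and shrink back. The sketch below only builds that ordered plan and makes no AWS calls. The function name `cycling_plan` and the step labels are hypothetical, and this is not necessarily the approach taken in the original post.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (action label, argument); labels are illustrative only


def cycling_plan(old_instance_ids: List[str]) -> List[Step]:
    """Build an ordered plan that replaces every instance one at a time."""
    plan: List[Step] = []
    for instance_id in old_instance_ids:
        # Grow the group first so capacity never drops below the starting level.
        plan.append(("increase_desired_capacity", "+1"))
        plan.append(("wait_for_new_instance_in_service", ""))
        # Retire the old instance through the group and lower the desired count
        # with it, so the group does not launch yet another replacement.
        plan.append(("terminate_and_decrement_desired", instance_id))
    return plan
```

Because each old instance is only removed after a new one is in service, the group never runs with fewer instances than it started with.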
Cloud
What are the key concepts that define what we mean by a sustainable service on AWS?
March 5, 2016
DevOps
While we strive to deliver at pace there can be an over-confidence and a positioning of past policies and frameworks as being too rigid, as failing to allow people the freedom to get on with their work.
February 16, 2016
Cloud
Quick bunch of notes on moving data between containers on AWS using RDS. Here we have 2 stacks, each with a container sitting on an EC2 instance. The container is running a simple Rails application connected to RDS specified in the DB_HOST environment variable. There are other shortcut ways to do this but this is the ‘pretty straightforward’ way 🙂
February 7, 2016
DevOps
Opinionated mantra though it is, I believe that the improvements brought by automation, cloud architectures, and intelligent processing of the resultant data streams mean that new opportunities are available to benefit people’s lives through access to knowledge and services.
January 9, 2016
Monitoring
Quick walkthrough of a problem on a 3-node Elasticsearch cluster, first noticed with the generic yellow/red cluster warning. The chain of events causing the problem looks like…
January 9, 2016
DevOps
Basically, a RabbitMQ image that uses confd to capture some environment variables to set itself up. All sorts of queues, bindings, vhosts, users, etc can be set up using this method.
January 2, 2016
DevOps
Operations people, we love to automate. In the case of continuous integration we want to be able to deploy to production, every day, multiple times a day. Our commits trigger building, testing and deploying through a pipeline untouched by human hands. Or, so the theory goes…
January 2, 2016
Monitoring
The slides from a quick review of Sensu. In short, Sensu is good! RabbitMQ is the only point of communication needed between clients and servers. Set up your client-customisable subscription checks on the server. Set up any weird custom checks on your clients. Please, please don’t alert on anything but the essentials. Really, the above ^^^
Sensu slideshow
January 2, 2016
Monitoring
Some slides from an investigation into migrating to using Amazon’s CloudWatch. Quick summary: Create metrics on CloudWatch logging streams and alert on them, e.g. number of 500s in a minute. You get basic free metrics from AWS, and custom metrics are pretty easy to set up. You have access to plenty of AWS-specific metrics and triggers. They are well integrated with other AWS stuff, so you can do more advanced Lambda processing. But is it enough to move away from your custom ELK/Graphite type stack?
CloudWatch slideshow
January 2, 2016
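The "number of 500s in a minute" metric from the summary above can be illustrated locally, independent of any AWS service. This is a minimal sketch, assuming log events have already been parsed into (timestamp, status code) pairs. The function names and the threshold are hypothetical, and the code counts exact 500 responses rather than the whole 5xx range.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def count_500s_per_minute(events: Iterable[Tuple[datetime, int]]) -> Dict[datetime, int]:
    """Count HTTP 500 responses, bucketed by the minute in which they occurred."""
    counts: Counter = Counter()
    for timestamp, status in events:
        if status == 500:
            # Truncate to the start of the minute to form the bucket key.
            counts[timestamp.replace(second=0, microsecond=0)] += 1
    return dict(counts)


def minutes_to_alert(counts: Dict[datetime, int], threshold: int) -> List[datetime]:
    """Return, in time order, the minutes whose count reached the threshold."""
    return sorted(minute for minute, n in counts.items() if n >= threshold)
```

A hosted metric filter with an alarm does the same two things: it turns matching log lines into a per-period count, then compares that count with a threshold.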