Quality, Agility and DevOps – Quo Vadis QA?


It is clear to anyone that traditional QA needs to change to adapt to agile and devops. That is why the Techwell 2015–2016 State of the Software Testing Profession report is so interesting (sorry, you’ll need to register to get a copy).

If you read between the lines in the report, there is a very strong message on how QA is changing – the graphs in the report show that QA is bifurcating into two separate roles – a test automation role and business testing role. The first focuses on technical and coding skills, while the second requires a deep understanding of the business and how business users will be using the system under test.

Another way to look at this is that the split between testing for verification vs. testing for validation is becoming clearer and more distinct – especially for business applications. Verification is the process of checking that a software system meets specifications, while validation is the process of checking that it fulfills its intended business purpose.

Verification can (and should) be done through automation, but validation requires users (or skilled user proxies) in order to ensure that the system under test is accepted by users and the business.

It is interesting that the report ignores UX (user experience) testing, which is different from verification or validation – but can have a large impact on acceptance. I have seen some examples where forward-looking companies are starting to look at UA (user acceptance) and UX (user experience) together from a testing perspective.

This trend is related to what we wrote in our previous post: Requirements should be translated into user and business acceptance tests, first as manual scripts, which over time evolve into automated scripts.

50% of Your IT is Below Average – Is Digital Transformation Even Possible??


There is a lot of interest in embracing digital transformation – I doubt if there is a Fortune 2000 company that isn’t at least investigating digital transformation. Digital transformation is complex and requires that companies take a hard look at their business models and business processes to decide how they will address the changes required for digital transformation.

Digitalization dictates changes to core business processes, not just the front end – making it hard and very risky to implement. It also dictates an unprecedented reliance on IT. Given the traditional failure rate of IT projects (you can find more details on IT project failure by McKinsey and Standish Group), the outlook for digital transformation is glum – and the historical statistics certainly don’t bode well for companies embarking on a digital transformation journey.

Many IT departments have started looking at various types of agile development and devops delivery methodologies as a way to prepare for digitalization. One issue is that most of these efforts focus on customer facing apps – and tend to ignore the business process aspects of digital transformation.

Even more worrisome for large organisations is that both agile and devops tend to rely on a basic underpinning of high performance, highly skilled, small and relatively autonomous teams (aka ninjas) – making them hard to scale.

Automation can help bridge the gap – but we think “highly skilled” is still an issue. No matter how you look at it – especially if you are a large project oriented organization – about 50% of your IT department’s development and delivery teams are BELOW AVERAGE (strictly speaking, below the median). Even though that follows directly from the definition, it is very hard for IT to accept.

This bias towards “illusory superiority” isn’t unique to IT, but a general human tendency and a well documented psychological phenomenon – sometimes called the Lake Wobegon Effect, and closely related to the Dunning-Kruger effect. Just have a look at this classic graph of student self-assessment of their logical reasoning ability:

[Graph: student self-assessment of logical reasoning ability]

There are a number of issues with blindly adopting and scaling agile and devops (no matter what scaling methodology you adopt) when your IT department is at best average. That doesn’t mean agile and devops aren’t an option for large organizations – they most certainly are. However, you can’t assume every team is a group of skilled “ninjas”; you should assume a group of average performers. That means that you must make sure IT implements strong governance, great processes and appropriate metrics to make sure they do agile and devops right from a business perspective and actually provide business value.

We believe there is a recipe for large organizations to adopt agile and devops. Here is our breakdown at a high level:

and from a technical perspective:

  • Create a “lifelike” virtual staging environment as the continuous delivery target system
  • Embrace automated testing, but understand its limitations – that will be QA job 1
  • Translate requirements into user and business acceptance tests, first as manual scripts, which over time evolve into automated scripts – that is QA job 2, and will require QA closer to the business.
    • Formalize acceptance through a domain specific language (DSL) common to IT and the business which is managed by Dev. The tests are owned and managed by business analysts and quality analysts (BQA?)
  • Build user acceptance into the Agile process (in theory with Scrum it should already be there – but in practice it almost never is) by making user testing the primary goal of Sprints and subordinating all other selection criteria to that. We mean actual user testing – within the virtual staging environment from the first point above – not Demos, not Product Owner testing, not QA testing and certainly not Dev doing it for themselves
  • Create and measure user satisfaction KPIs (in staging and production) and business value KPIs (in production)
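
To make the DSL point a little more concrete, here is a minimal sketch in Python. It assumes a Given/When/Then style of sentence, which is our illustration and not something the recipe prescribes; the step wording and the shopping cart scenario are purely hypothetical. The idea is that Dev maintains the step registry, while business and quality analysts own the scripts.

```python
import re

# Registry of sentence patterns and their implementations.
# In this sketch the registry is the part "managed by Dev" (hypothetical design).
STEPS = []

def step(pattern):
    """Register a function as the implementation of one DSL sentence."""
    def decorator(func):
        STEPS.append((re.compile(pattern + r"$"), func))
        return func
    return decorator

@step(r"Given an empty cart")
def empty_cart(ctx):
    ctx["cart"] = []

@step(r"When the user adds (\d+) items?")
def add_items(ctx, count):
    ctx["cart"].extend(["item"] * int(count))

@step(r"Then the cart holds (\d+) items?")
def check_cart(ctx, count):
    assert len(ctx["cart"]) == int(count), "cart size mismatch"

def run_script(script):
    """Run an acceptance script written in the shared language."""
    ctx = {}
    for line in script.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        for pattern, func in STEPS:
            match = pattern.match(line)
            if match:
                func(ctx, *match.groups())
                break
        else:
            # No registered sentence matched this line.
            raise ValueError(f"No step defined for: {line!r}")
    return ctx
```

A script such as "Given an empty cart / When the user adds 2 items / Then the cart holds 2 items" can start life as a manual test and later be executed unchanged by `run_script`, which is the manual-to-automated evolution described above.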

 

Production Assurance for Enterprise CIOs


Digitalization drives your CEO to demand a radical increase in IT delivery coupled with expense reduction. Agile, virtualization and cloud accelerate business feature development. This will devastate your current IT operation because it can’t deliver at that velocity. The only way to prevent collapse is to reinvent your delivery processes to provide production assurance – a methodology enabling agile, lean and rugged enterprise application delivery.

You’ve heard a lot about the benefits of DevOps and how it enables application delivery velocity. Pundits expound on how new and emerging Unicorns like Facebook and Netflix used DevOps tools like Puppet and Chef to trounce their competitors, and that anyone not “DevOps enabled” will suffer the same fate as Blockbuster or Barnes & Noble.

What they don’t tell you is that (unless you are a new startup with no legacy, no products and no revenue), DevOps is a journey. There is no silver bullet for morphing an existing IT application delivery organization into a high velocity DevOps delivery crew. DevOps first-and-foremost requires organization, processes and metrics, all enabled by appropriate delivery architectures and supported by automation.

Many DevOps initiatives start with automation – e.g. continuous integration. Automation is an important part of DevOps and we helped create one of the leading enterprise DevOps automation tools. From our experience, the right place to start is by understanding how delivery excellence supports the business vision and strategy.

Accera has developed the Production Assurance methodology based on lean software and theory of constraints principles. Accera uses Production Assurance to provide a delivery architecture and methodology tailored to your specific business needs. We analyze your current state and implement a robust solution based on:

  1. KPIs for delivery velocity and quality
  2. Organization and process change to enable a DevOps culture
  3. Structured identification and elimination of delivery constraints

This custom-made delivery solution enables business digitalization by enabling your team to manage delivery like a product, not a project. Through incremental improvements to your IT operation, Production Assurance decreases risk and transforms IT into a business asset for your organization.

Why BiModal IT Won’t Work

I have worked in both large companies and small – started my career on mainframe, added distributed and cloud computing and now include mobile. From a development perspective I never really did waterfall, but started in what was then known as iterative development – and has now been re-designated as agile and DevOps.

I also had the good fortune to work early on with great folks like Stephen Boies and John Gould, who taught me that in the end what really matters is the complete system that you deliver to your customers – and a great system means that every component works together and is the right one for the task.

When I first heard about Gartner’s take on “bi-modal” I wasn’t sure what it meant. I actually thought that it was just another way to define system risk governance for systems of record vs. systems of engagement. That makes sense to me – and I’ll get to why in a moment.

But as I read more on the subject I learned that either I misunderstood or somehow the message changed. It has become “slow ‘mode 1,’ responsible for traditional IT services, and fast ‘mode 2,’ which emphasizes agility and speed”, which seems to have been translated to – let your systems of record languish and focus on your systems of engagement. I agree with Jason Bloomberg’s comment that doing that is a recipe for disaster for any enterprise business.

First off – enterprise IT is becoming “multi-modal” not bi-modal – made up of Systems of Record, Systems of Engagement, Systems of Innovation and Systems of Intelligence (or what Gartner calls algorithms):
1. Systems of Innovation – changes are technically simple, have little business risk with the possibility of high reward
2. Systems of Engagement – changes are relatively complex (technically), have medium business risk with the possibility of medium reward
3. Systems of Record – changes are relatively complex (technically), have high risk with the possibility of medium reward
4. Systems of Intelligence – changes are very complex (technically), have high risk with the possibility of high reward

So it seems simple that you should focus on 1 and 4. The problem (and the problem with Bimodal IT) is that all these systems are interconnected and needed together in order to provide a complete business solution to customers. They ALL need to move as fast as possible while still providing “production assurance”. You can’t allow systems of innovation that move so fast that they break systems of record, or systems of record that move so slowly they inhibit systems of innovation.

The only difference should be governance – making sure that there is the right tradeoff between risk and reward – not the methodologies or tools. To put it in DevOps terms, all modalities need to be agile and technically capable of continuous deployment – but from a governance perspective you can’t always do continuous delivery.
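
One way to picture "same tools, different governance" is a small policy table. The sketch below is only an illustration under our own assumptions: the risk and reward labels come from the list above, but the number of approvals per risk level and the names `APPROVALS_BY_RISK` and `release_policy` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Modality:
    name: str
    risk: str    # "low", "medium" or "high", taken from the list above
    reward: str

MODALITIES = {
    "innovation": Modality("Systems of Innovation", "low", "high"),
    "engagement": Modality("Systems of Engagement", "medium", "medium"),
    "record": Modality("Systems of Record", "high", "medium"),
    "intelligence": Modality("Systems of Intelligence", "high", "high"),
}

# Hypothetical governance rule: approvals needed before a production release.
APPROVALS_BY_RISK = {"low": 0, "medium": 1, "high": 2}

def release_policy(key):
    """Return the delivery policy for one modality."""
    modality = MODALITIES[key]
    return {
        # Every modality uses the same tooling and can deploy continuously.
        "continuous_deployment": True,
        # Only the governance around release differs, driven by risk.
        "approvals_before_release": APPROVALS_BY_RISK[modality.risk],
    }
```

The point of the design is that the tooling flag never varies; only the governance value does.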

Companies need to apply enterprise agile and DevOps in a judicious manner for all modalities – and employ smart governance and tools to ensure production assurance.

What DevOps Days Doesn’t Want You to Know: Application Architecture and DevOps Go Together Like Peas and Carrots

We like to think about application delivery as a promotion pipeline from development to production. Each step in the pipeline is usually a separate environment – e.g. a different topology, different configuration and different tools+methods for promotion to the next step.

One school of thought is that DevOps exists to solve the problem of wasted developer time for creating environments. Developers are such a scarce and expensive resource that, on the face of it, it makes sense to subordinate operations to development. Wrong!

Taking that approach you recreate the problem of silos inherent in the machine metaphor – each role with its own specific function. By allowing development to ignore the needs of production you are setting yourself up to create products that are like a “spherical cow” – creating an architecture that is beautiful in theory – but doesn’t work in practice.

A better alternative is the “organic metaphor” where responsibility for the final deliverable is shared across the organization through “design for delivery”. For any new software project, there are many (sometimes conflicting) ways to architect the product. In fact, one of the most important jobs of the architect is to take into consideration conflicting design patterns and possibilities and choose the optimal architecture. Historically, delivery wasn’t a key consideration in software product architecture design. Actually that is true for many products, not just software.

Given today’s focus on delivery, “design for delivery” connects architecture to DevOps, and is the difference between product success and failure. Delivery must be taken into consideration at every stage of product creation.

Design for delivery is a critical part of the DevOps shift left paradigm. We like to add “think right” to the equation because shift left isn’t enough – the whole organisation needs to include the ability to easily deliver the product as part of product design and implementation.

A great illustration of the problem with the machine metaphor is in Jez Humble’s post on “Elisabeth Hendrickson Discusses Agile Testing” where she discusses working at a product company which was suffering a series of quality problems. As a result, they hired a VP of QA who set up a QA division. The net result of this, counterintuitively, was to increase the number of bugs. One of the major causes of this was that developers felt that they were no longer responsible for quality, and instead focused on getting their features into “test” as quickly as they could. Thus they paid less attention to making sure the system was of high quality in the first place, which in turn put more stress on the testers. This created a death spiral of increasingly poor quality, which led to increasing stress on the testers, and so on.

Achieving Design for Delivery

  • Identify delivery pains; organise and develop a feedback loop for operations and development: delivery and operational experts should be pigs in every sprint 0 and should be chickens at scrum meetings – in the real world agile teams have developers, customers and product, but lack a key component – delivery. It doesn’t matter why, but it ensures that whatever the team develops won’t be optimised for delivery – or even worse, it won’t be deliverable.
    • Delivery must be a part of every agile team – the problem is that there just aren’t enough delivery people to get them involved in every sprint for every team. Of course, you can always hire a lot more delivery folks – but that just won’t happen.
    • The solution is to make delivery a pig in every sprint 0 (planning sprint). They will plan and validate the tools and scripts to be built in support of the team. They will also provide important input for time boxing and the order of feature development. They should also be part of every sprint review demo – just like customers. In a way delivery is a customer – they are the customers for the team’s artifacts and their job is to deliver them to the real customers.
  • Design by doing: new functionality is set by business and combined with delivery requirements defined by operations – this is the “theme”. The theme is then broken into smaller viable components (epics and stories in agile), e.g. faster deployment time, automatic deployment of components, simpler restart and recovery, a built-in event console for errors and logs, built-in automatic DB upgrade.
    • The first step for the architect is to introduce one of these epics even if it requires using a kludge, sacrificing architectural integrity and entering technical debt. Over time the architect is able to assess various implications of the technical debt, e.g. excessive response time, excessive memory footprint, dramatically higher maintenance requirements.
    • In the next phase, the architect fixes the most costly debt of the previous epic, i.e. refactors and aligns the architecture, while kludging on another epic. This “Firefighter” mode contains the most damaging aspects of the technical debt while still delivering rapid response. By only fighting the most damaging “fires” we can optimize for delivery without rearchitecting the entire product. The rule is to start with a quick-win, and later fix only the crucial problems.
  • Shift left – think right: Recognise and identify deployment challenges earlier, i.e. during the development phase. Fix as soon as possible taking into consideration deployment and operational processes.
    • Development labs include real deployment use-cases and work with release automation tools that include automation of software, DB and configuration.
    • Build in health & monitoring tools based on KPIs that are valuable for deployment and operations. Build in an event console.
    • Daily build, deploy & stabilise led by development
    • Review and manage, adopt and simplify deployment checklists
    • Deployment checklists as a process
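
The “Firefighter” loop from the design-by-doing point above can be sketched in a few lines of Python. This is a simplified illustration, not a planning tool: it assumes each piece of technical debt has a single estimated cost, and the function name `firefighter_step` and the example debts are hypothetical.

```python
def firefighter_step(debts, new_debt):
    """Simulate one epic in "Firefighter" mode.

    debts: dict mapping a debt description to its estimated cost.
    new_debt: (description, cost) of the kludge introduced by the new epic.
    Returns the debt that was fixed (or None) and the remaining debts.
    """
    remaining = dict(debts)  # do not mutate the caller's dict
    fixed = None
    if remaining:
        # Fight only the most damaging "fire" from previous epics.
        fixed = max(remaining, key=remaining.get)
        del remaining[fixed]
    # The new epic ships quickly, at the price of fresh debt.
    name, cost = new_debt
    remaining[name] = cost
    return fixed, remaining
```

Calling it once per epic keeps the backlog of debt bounded by the cheap items, which is the quick-win-first rule stated above.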

This is the last in our “What DevOps Days Doesn’t Want You to Know” series – hope you enjoyed them.

What DevOps Days Doesn’t Want You to Know: DevOps Isn’t Automation

Continuous deployment and continuous delivery are sometimes used as synonyms – but they are actually quite different.

The difference between the two is business considerations – continuous deployment doesn’t care about the business aspects, only about technical considerations around deployment into a new environment. On the other hand, continuous delivery takes into account business considerations around deployment, and uses them as a gate for deployment decisions into production.

Continuous deployment is the technical capability of being able to automatically move artifacts from one environment to the next and ensure that the new environment is set up and configured properly. That is the focus of most DevOps activity and tools.

Continuous delivery is deployment from staging environments to the production environments – aka release. The difference between deployment and delivery lies in the unique attributes of the production environment. Even though the technical capability for automatic promotion to production may be the same as for continuous deployment – there are considerations for deployment to production that are different from those for any other environment. One example is that production has a unique set of provisioning and configuration needs not used by any other environment; another example is that business gates always exist when promoting to production, and may not exist for any other environment.

Automation is enough for continuous deployment, but continuous delivery is another matter. Production assurance is the link between the technical capabilities of continuous deployment and the business goal of enhanced delivery velocity and availability – aka continuous delivery.
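
The distinction can be expressed as a single promotion check. The sketch below is a minimal illustration under our own assumptions: the function `can_promote`, the environment name `"production"` and the gate names in the usage note are all hypothetical, not part of any specific tool.

```python
def can_promote(target, technical_checks_passed, business_gates=None):
    """Decide whether a build may be promoted into the `target` environment.

    business_gates: dict mapping a gate name to True (approved) or False.
    """
    if not technical_checks_passed:
        return False
    if target != "production":
        # Continuous deployment: technical considerations only.
        return True
    # Continuous delivery: business gates always exist for production,
    # so a release with no gates defined is refused.
    gates = business_gates or {}
    return bool(gates) and all(gates.values())
```

For example, `can_promote("staging", True)` succeeds on technical checks alone, while `can_promote("production", True, {"uat_signoff": True, "release_window": False})` is refused until every gate is approved.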

For long term success, DevOps teams will need to provide the business with more than just DevOps automation; they’ll also need to provide production assurance to lower the overall risk of release, not just increase release velocity. The current DevOps focus is on the automation needed to provision and configure environments, and to deploy application artifacts into those environments. Modern deployment paradigms (virtualization, containers) are making application deployment topologies and configuration more complex, which is causing DevOps automation complexity to increase as well.

The role of assurance in developing application code is clear: no one releases application features that haven’t been thoroughly tested – unit tests, integration tests, regression tests etc. The only app testing requirement driven by DevOps is that application testing at all levels must be automated – otherwise testing becomes a bottleneck and may be circumvented to achieve velocity.

The same cannot be said for DevOps automation, which is essentially infrastructure and configuration as code. Achieving continuous delivery requires confidence that the deployment code will work correctly in production, and that when it breaks the problem will be easy to diagnose and repair. But this just isn’t the case – very little testing is done on release automation code, and what little is done is only at the unit test level. There has been some effort to introduce Test Driven Development for DevOps automation tools – but because of the unique attributes of the production environment, even when adopted that isn’t enough.

Traditional testing falls short for DevOps automation because it is impossible (or at least uneconomical) to test deployment automation in production – companies do not have a “spare” production environment for deployment testing. This means production deployment is exposed to human errors during the coding of the automation (aka bugs) which will only increase along with the adoption of DevOps and new, more complex, deployment architectures (micro-services and containers).

Because of its importance, and the difficulty of testing deployment in a production environment, DevOps needs production assurance – a method to validate production deployment to ensure the correctness of the production topology and configuration, the ability to differentiate between application and deployment bugs, and a feedback loop that links production deployment problems to the artifacts and staging environments that caused the problem. There are five steps to enabling Production Assurance:

  1. DevOps automation based on an explicit deployment model.
  2. Replicable, versioned staging environments.
  3. DevOps automation development using the same development and testing rigor as application code.
  4. Validated production deployment topologies.
  5. A way to quickly differentiate application bugs from deployment bugs and a fast, easy mapping of deployment problems back to the specific staging environments that created them.
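
As a rough illustration of steps 1 and 4, here is a minimal Python sketch that compares an explicit deployment model with what was actually observed in an environment. It assumes both can be represented as a dictionary of components and their settings; the function name `diff_deployment` and the sample data in the usage note are hypothetical.

```python
def diff_deployment(expected, actual):
    """Compare an explicit deployment model with an observed deployment.

    Both arguments map a component name to a dict of its settings.
    Returns a list of human-readable discrepancies (empty if they match).
    """
    problems = []
    for component, settings in expected.items():
        if component not in actual:
            problems.append(f"missing component: {component}")
            continue
        for key, value in settings.items():
            found = actual[component].get(key)
            if found != value:
                problems.append(
                    f"{component}.{key}: expected {value!r}, found {found!r}"
                )
    for component in actual:
        if component not in expected:
            problems.append(f"unexpected component: {component}")
    return problems
```

If the model says `{"web": {"replicas": 3}}` and production reports `{"web": {"replicas": 2}}`, the result is a single discrepancy on `web.replicas`. A non-empty result points at a deployment problem rather than an application bug, which is the kind of quick differentiation step 5 asks for.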

Adopting DevOps without Production Assurance increases the risk of deployment issues stemming from mistaken configuration or insufficient testing. As DevOps matures and deployment architectures become more complex (e.g. containers), automated deployment (aka configuration as code) is becoming more like programming – and configuration issues become a complex interleaving of human error and automation bugs.

Businesses adopt DevOps in order to achieve increased delivery velocity, but it can’t come at the expense of availability. Velocity can be measured in a variety of ways, from a purely technical measure like time-to-null-deploy to more business oriented measures like the ratio between number-of-deliveries per quarter, month, day or even hour and the number of rollbacks (or roll-forwards). Availability is the ratio between uptime and operation time and needs to take into account mean time to detect, triage and repair a problem (triage is the gross ability to assign the problem to Dev or Ops). DevOps is focused on velocity; production assurance is focused on availability.
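
The business oriented velocity measure mentioned above is simple to compute. The following Python sketch is one possible reading of it; the function name `deliveries_per_rollback` and the choice to report infinity when there are no rollbacks are our own assumptions.

```python
def deliveries_per_rollback(deliveries, rollbacks):
    """Ratio of deliveries to rollbacks (or roll-forwards) in one period."""
    if deliveries < 0 or rollbacks < 0:
        raise ValueError("counts must be non-negative")
    if rollbacks == 0:
        # No rollbacks at all: report infinity rather than divide by zero.
        return float("inf") if deliveries else 0.0
    return deliveries / rollbacks
```

The period (quarter, month, day or hour) is whatever the two counts were collected over; a rising ratio means more releases are sticking.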

Production assurance is complementary to DevOps, extending DevOps methodology during and after application production deployment, aka release. DevOps focuses on automation of all environments while production assurance focuses on the unique aspects of the production environment and how it relates to previous environments. Production assurance provides visibility into and assurance of complex deployments in order to minimize incorrect deployment and the time to detect, triage, diagnose and repair deployment issues.

 

What DevOps Days Doesn’t Want You to Know: Your CEO Hates DevOps

For a CEO the bottom line is always whether any new initiative will positively affect the business – increasing revenue or lowering costs. From that perspective there is frustration that agile has not delivered the promised benefits to the business.

Now IT teams are coming back to the business and telling them what they really need is DevOps – which for the CEO is just a way to spend even more to solve the same roadblocks they were promised agile would remove.

For many CEOs IT is perceived as slowing the business down rather than accelerating business agility. C-suite executives feel swindled by IT executives’ Agile/Scrum initiatives. Money was spent, markers and software were purchased, pigs and chickens were coined as new job titles – all to no avail.

All Agile achieved was to accelerate development and to move the bottlenecks of delivery to operations. Recent disruptive changes in the operations arena, e.g. configuration management, release automation, software as a service, virtualization and containers similarly burdened the balance sheet without delivering anticipated agility.

As a result we observe a spectrum of companies: on one extreme a typical enterprise that releases a new version once every several years (for legacy companies) or at best 2-3 times a year. On the other extreme Facebook delivers more than one release a day, Etsy delivers over 10 per day and GitHub delivers over 100 releases a day.

Though everybody focuses on the tools, DevOps is really more about the process. Processes can be categorized through a maturity model which can be both descriptive and prescriptive. Maturity models are used to assess organizational capabilities and prescribe improvement steps – i.e. how to move up the maturity hierarchy.

Our maturity model defines a maturity hierarchy for Dev, a maturity hierarchy for Ops and joint layers where IT is subordinated to business priorities.

Only the synthesis of the silos can shift activities left, communicate business priorities to technical staff (beyond exposure to the product owner) and shrink the time from Aha to Ka-ching.

DevOps is a journey defined by a delivery maturity model used to assess an organization’s current delivery maturity and a high level roadmap. A maturity model provides organizations with a way to assess their current delivery capabilities and a path from aha to ka-ching.

Here is our version of such a maturity model:


Just Do It Level 1 (Dev: Sporadic delivery – Ops: Heroics): Deployment staff works around the clock to ensure successful deployment. Myopically, each release is treated as a one-time sporadic effort. There are no specific application delivery objectives. Success depends on the competence and heroics of the people doing the delivery and not on the use of proven processes.

Doing by Design Level 2 (Dev: Continuous Integration – Ops: Planned): Ops design deployment processes based on lessons from level 1 that are standardized, often by spreadsheet (this is doing-by-design). Deployment status is managed and tracked. There may be partial automation through scripting or provisioning tools – using project management techniques.

Design by Doing Level 3 (Dev: Agile – Ops: Continuous Delivery): Ops automate best practice release processes based on proven release procedures (this is design-by-doing). They also establish appropriate reporting and tracking of release processes and artifacts. Automation exists at the release level, which is used for orchestrating and leveraging automation capabilities of level 2.

Value by Design Level 4 (DevOps: Integrated Continuous Deployment): Ops become an integral part of development methodologies (e.g. Agile development). End-to-end agile delivery processes integrate and orchestrate automation and lifecycle tools enabling Ops to deliver maximum value from product features. This enables continuous product delivery from development to production leveraging the automation of level 3.

Design by Value Level 5 (Continuous Value Delivery): Rigorous value understanding prescribes feature prioritization and version release. Release cycles are driven exclusively by business rationale, unconstrained by technical considerations. Product lifecycle is optimized by combining delivery insights provided by the agile delivery processes of level 4 together with strategic market data.

DevOps and Availability

There is a lot of focus in DevOps on the tools used for delivery automation, especially around configuration management (e.g. Ansible, Puppet, Chef) and Release Automation (CA-Noliosoft, IBM-UrbanCode). The goal of these tools is delivery automation that increases Mean Time Between Failures (MTBF) by reducing failures due to deployment and configuration problems.

That doesn’t completely solve delivery related app availability issues, since availability is defined as the ratio between uptime and operation time (including mean time to detect, mean time to repair and uptime). Just increasing MTBF may not help with Mean Time To Repair (MTTR) and doesn’t address Mean Time To Detect (MTTD) at all. These tools can help with MTTR if you are willing to assume that you fix problems by redeployment. There are cases when that is enough – for example in a containerized environment where you can redeploy only the affected containers.

But if you think about it, DevOps automation can actually increase the time needed to diagnose a problem (the decompose step in IDEA). Before automation Ops had intimate knowledge of every detail associated with the deployment as they manually did the release.  Since the details of deployment are now masked by the tool, they won’t be able to find issues as quickly.

Production Assurance has a complementary focus to DevOps – lowering MTTR by lowering the time to detect, diagnose and repair deployment issues:

  • Shorter time to detect – using the fingerprint from staging, Production Assurance can proactively detect deployment problems.
  • Shorter time to diagnose and repair – Production Assurance provides details on where the deployment issues occurred and maps them to the fingerprint, making them easier to repair.

So combining Configuration Management and Production Assurance increases MTBF, lowers MTTD and MTTR enabling IT to meet the true goal of increased app availability.
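
A small worked example shows why all three numbers matter. The sketch below treats MTBF as the mean uptime between failures and operation time as uptime plus time to detect plus time to repair, following the definition above; that reading, the function name `availability` and every figure (in hours) are assumptions made purely for illustration.

```python
def availability(mtbf, mttd, mttr):
    """Availability as uptime / (uptime + time to detect + time to repair).

    Assumes MTBF is the mean uptime between failures; all values share one unit.
    """
    if mtbf <= 0 or mttd < 0 or mttr < 0:
        raise ValueError("mtbf must be positive; mttd and mttr non-negative")
    return mtbf / (mtbf + mttd + mttr)

# Hypothetical figures, in hours.
baseline = availability(mtbf=100.0, mttd=4.0, mttr=6.0)
# Configuration management alone: failures are rarer, recovery is unchanged.
automation_only = availability(mtbf=200.0, mttd=4.0, mttr=6.0)
# Adding production assurance: detection and repair are also faster.
with_assurance = availability(mtbf=200.0, mttd=1.0, mttr=3.0)
```

With these made-up inputs each step improves on the previous one, so `baseline < automation_only < with_assurance`; the size of each gain depends entirely on the real numbers in your environment.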

A Universal Delivery Process

Even though every delivery process is different, many have similar processes that lead from Dev to Prod. The standard environments are usually Dev, QA, UAT, NFT, Staging, Production. If your organization has adopted agile practices, Dev and QA are usually a single environment.
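
The promotion path can be written down as an ordered list. The Python sketch below is only an illustration: the environment names come from the paragraph above, while the combined `"Dev/QA"` label and the helper names `promotion_path` and `next_environment` are hypothetical.

```python
STANDARD_PATH = ["Dev", "QA", "UAT", "NFT", "Staging", "Production"]

def promotion_path(agile=False):
    """Return the ordered environments from development to production."""
    if agile:
        # Agile teams usually share a single Dev and QA environment.
        return ["Dev/QA"] + STANDARD_PATH[2:]
    return list(STANDARD_PATH)

def next_environment(current, agile=False):
    """Return the environment that follows `current`, or None at the end."""
    path = promotion_path(agile)
    index = path.index(current)  # raises ValueError for an unknown name
    return path[index + 1] if index + 1 < len(path) else None
```

For instance, `next_environment("UAT")` gives `"NFT"`, and `next_environment("Production")` gives `None` because there is nowhere further to promote.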

The following chart describes the promotion path towards production – the stages, tools and processes involved at a very high level:

[Chart: promotion path towards production]

Hopefully this clears up what we meant in our previous post – “Continuous Deployment vs. Continuous Delivery”. BTW, this makes it clear why Continuous Delivery may not be the best term.

PS – for all the technical folks reading this blog, BS isn’t bullshit 🙂

Continuous Deployment vs. Continuous Delivery

Sometimes people confuse continuous delivery and continuous deployment, but as Jez Humble points out, they are actually different. We agree that they are different, but have a slightly modified explanation.

The difference between the two is business considerations – continuous deployment doesn’t care about the business aspects, only about technical considerations around deployment into a new environment. On the other hand, continuous delivery takes into account business considerations around deployment, and uses them as a gate for deployment decisions.

Most business decisions take place in the UAT (User Acceptance Test) environment where the business decides whether the application can be promoted to staging, which means that it has passed the needed tests and gates for business deploy-ability to production.

Staging is a technical environment used to test integration, performance and technical deploy-ability to production.