The Art of Crafting KPIs That Actually Work

Welcome back to our series on managing self-managing teams! 👋 We’ve reached the final instalment, where we’ll dive into the crucial skill of crafting Key Performance Indicators (KPIs) that truly work for your team. Let’s turn those dull metrics into powerful tools for success!

When Good Metrics Go Bad

Ever presented what you thought was a perfect set of KPIs, only to be met with blank stares or confused looks? You’re not alone. Many of us have faced the dreaded “Why are we measuring this again?” moment. So, how do we create KPIs that inspire “Aha!” moments instead of “Uh… what?”

The Essential Elements of Effective KPIs

Before we start, let’s review the key properties our KPIs should have:

  1. Easily Measurable: No complex calculations or long-running batch jobs required.
  2. Team-Focused: Avoid singling out individuals.
  3. Business-Aligned: Clearly linked to company goals.
  4. Actionable: Provides clear direction for improvement.
  5. Motivating: Inspires the team to perform better.

KPIs to Avoid

Just as important as knowing what to measure is knowing what not to measure. Here are some KPIs to steer clear of:

  • Lines of Code: Quantity doesn’t equal quality.
  • Number of Bugs Fixed: Could encourage writing buggy code just to fix it.
  • Hours Worked: We’re after results, not time spent.
  • Story Points: Often arbitrary and not indicative of real progress.

Real-World KPI Success: The Booking Completion Saga

Let me share a story from a company I once worked at. We implemented a KPI around booking completion that became a game-changer. Here’s what made it so effective:

  1. Direct Business Impact: We measured “Incremental Bookings per Day.” This directly showed teams how much they were contributing to the company’s bottom line.
  2. Instant Feedback: The real magic was in the immediacy. As soon as an A/B test was turned on, the numbers started ticking. Our experimentation system was linked to a real-time Kafka feed from the booking website.
  3. Visible Results: We had TVs on office walls displaying dashboards of running experiments. This visibility created a buzz of excitement.
  4. Celebration of Wins: When an experiment showed significant improvement, the Product Owner would take the team out for drinks on the day the experiment run finished. It wasn’t uncommon to see teams celebrating their wins at the local bar in the evenings with a bottle of something and shots on the table.

The excitement was so palpable that one developer even created a Slack bot in his spare time to check experiment results during dinner! He wasn’t going to wait until the next day in the office to see what the users thought about his new feature.

This KPI worked because it connected directly to business impact and provided instant, visible feedback. It almost gamified the process for the engineers, making it thrilling to see in real-time how users responded to new features. The high volume of bookings meant meaningful results appeared quickly, sometimes within minutes.
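To make the arithmetic behind a metric like “Incremental Bookings per Day” concrete, here is a minimal Python sketch. It is not the system described above: the `Arm` class, the function name, and all the numbers are hypothetical, and it deliberately leaves out the statistical significance check a real experimentation platform would apply before calling a win.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Arm:
    """Traffic and bookings recorded for one arm of an A/B test (hypothetical shape)."""
    visitors: int
    bookings: int

    @property
    def conversion_rate(self) -> float:
        # Guard against an arm that has not received any traffic yet.
        return self.bookings / self.visitors if self.visitors else 0.0


def incremental_bookings_per_day(control: Arm, variant: Arm, daily_visitors: int) -> float:
    """Estimate the extra bookings per day if all daily traffic saw the variant."""
    uplift = variant.conversion_rate - control.conversion_rate
    return uplift * daily_visitors


# Made-up numbers, for illustration only.
control = Arm(visitors=10_000, bookings=500)  # 5.0% conversion
variant = Arm(visitors=10_000, bookings=530)  # 5.3% conversion
estimate = incremental_bookings_per_day(control, variant, daily_visitors=40_000)
print(round(estimate))
```

The point of the sketch is that the number is simple enough to recompute every time new booking events arrive, which is what makes near-instant feedback possible.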

The result? A highly motivated team, numerous significant wins, and a culture of continuous improvement and celebration.

Aligning Team Metrics with Business Goals

Your KPIs should create a clear line from daily team activities to high-level business objectives. For example:

  • Business Goal: Increase market share
  • Team KPI: “Feature Adoption Rate” (How quickly users embrace new features)
  • Daily Activity: Developing intuitive UI and smooth user on-boarding
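As a rough illustration of how a “Feature Adoption Rate” could be calculated, here is a small Python sketch. The definition used (share of active users who used the feature at least once in a period) is an assumption, as are the function name and the sample user IDs; your own definition may differ.

```python
from __future__ import annotations


def feature_adoption_rate(feature_users: set[str], active_users: set[str]) -> float:
    """Share of active users who used the feature at least once in the period."""
    if not active_users:
        return 0.0
    # Only count feature users who are also active in the same period.
    return len(feature_users & active_users) / len(active_users)


# Made-up user IDs, for illustration only.
active = {"ana", "ben", "cho", "dev"}
used_new_feature = {"ana", "cho", "zed"}  # "zed" was not active this period
rate = feature_adoption_rate(used_new_feature, active)
print(f"{rate:.0%}")
```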

Regular KPI Reviews

KPIs aren’t set-and-forget metrics. Schedule regular review sessions with your team to ensure your KPIs remain relevant and effective. Make these sessions collaborative and open to change.

The Ethics of KPIs

Remember these important principles:

  1. Never use KPIs as weapons against your team. Using KPIs punitively creates a culture of fear and discourages risk-taking and innovation.
    Example: If a team’s “Time to Value” KPI is lagging, don’t use it to criticise or penalise the team. Instead, use it as a starting point for a constructive discussion about process improvements or resource needs.
  2. Prioritise learning and improvement over hitting arbitrary numbers. Focusing solely on numbers can lead to short-term thinking and missed opportunities for meaningful growth.
    Example: If your “Feature Adoption Rate” isn’t meeting targets, don’t push features that aren’t ready. Instead, dig into why adoption is low. Are you building the right features? Is user education lacking? This approach leads to better products and sustained improvement.
  3. Celebrate the intent and progress behind the metrics, not just the numbers themselves. This approach encourages a growth mindset and values effort and learning, which are crucial for long-term success.
    Example: Even if a new feature doesn’t immediately boost your “Enthusiastic User Ratio”, celebrate the team’s efforts in user research, innovative design, or technical challenges overcome. This keeps the team motivated and focused on continuous improvement.
  4. Regularly review and adjust KPIs to ensure they remain relevant. As your product and market evolve, yesterday’s crucial metric might become irrelevant or even counterproductive.
    Example: If your product has matured, you might shift focus from a “New User Acquisition Rate” KPI to a “User Retention Rate” KPI, reflecting the changing priorities of your business.

By adhering to these principles, you create an environment where KPIs drive positive behaviour, foster learning, and contribute to both team satisfaction and business success. Remember, the goal of KPIs is to improve performance and guide decision-making, not to create pressure or assign blame.

Wrapping Up: The True Value of KPIs

The real power of KPIs lies not in the numbers, but in the conversations they spark, the behaviours they encourage, and the focus they provide. When done right, KPIs serve as a compass, guiding your team through the complex landscape of product development.

Craft KPIs that inspire, illuminate, and drive your team towards excellence. And remember, in high-performing teams, the best KPIs often become obsolete because the team internalises the principles behind them.

What’s the most effective KPI you’ve used? Or the least useful? Share your experiences in the comments below!

P.S. If this post helped you rethink your approach to KPIs, don’t hesitate to share it with your network. Let’s spread the word about better performance indicators!

Metrics That Matter: The Ultimate Guide to Measuring Self-Organising Team Success (Without Driving Everyone Crazy)

Hey there, data-driven dynamos and agile aficionados! 👋 Ready to dive into the wild world of measuring team success? Buckle up, because we’re about to turn those vanity metrics upside down and discover what really matters in the land of self-organising teams!

The Metrics Maze: Don’t Get Lost in the Numbers

Picture this: You’re in a maze of mirrors, each one showing a different metric. Story points completed! Sprint velocity! Lines of code! Number of commits! It’s enough to make your head spin faster than a hard drive from 1995. 💿💫

But here’s the million-dollar question: Which of these actually tell you if your team is succeeding?

Spoiler alert: Probably none of them. 😱

The Great Metrics Showdown

Let’s break down some common metrics and see how they stack up:

1. Sprint Completion / Story Points

The Good: Easy to measure, gives a sense of progress.
The Bad: Can be gamed faster than a speedrunner playing Minecraft.
The Ugly: Focuses on output, not outcome.

2. Meeting Deadlines / Completing Projects

The Good: Aligns with business expectations.
The Bad: Can lead to corner-cutting and technical debt.
The Ugly: Doesn’t account for value delivered.

3. DevOps Metrics (Deployment Frequency, Lead Time, etc.)

The Good: Focuses on flow and efficiency.
The Bad: Can be technical overkill for some teams.
The Ugly: Doesn’t directly measure business impact.

4. Business Metrics / KPIs

The Good: Directly ties to business value.
The Bad: Can be hard to attribute to specific team actions.
The Ugly: Might be too long-term for sprint-by-sprint evaluation.

The Secret Sauce: Metrics That Actually Matter

“Not everything that counts can be counted, and not everything that can be counted counts.” – often attributed to Albert Einstein

Al wasn’t talking about Agile metrics, but he might as well have been. So what should we be measuring? Let’s cook up a recipe for metrics that actually matter:

  1. A Dash of Business Impact: How many users did that new feature attract?
  2. A Sprinkle of Team Health: How’s the team’s morale and collaboration?
  3. A Pinch of Technical Excellence: Is the codebase getting better or turning into spaghetti?
  4. A Dollop of Customer Satisfaction: Are users sending love letters or hate mail?

Mix these together, and you’ve got a metric feast that tells you how your team is really doing!

The Goldilocks Zone of Measurement

Remember Goldilocks? She wanted everything juuuust right. Your metrics should be the same:

  • Not too many: Analysis paralysis is real, folks!
  • Not too few: “Vibes” isn’t a metric (no matter how much we wish it was).
  • Just right: Enough to guide decisions without needing a PhD in statistics.

The Metrics Makeover: Before and After

Let’s give some common metrics a makeover:

Before: Number of Story Points Completed ❌

After: Business Value Delivered per Sprint ✅

Instead of just counting points, assign business value to stories and track that. It’s like turning your backlog into a stock portfolio!
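One way to picture “Business Value Delivered per Sprint” is the short Python sketch below. The `Story` class, the value scale, and the sample stories are all hypothetical; the only idea taken from the text is that each story carries an agreed business value and only finished stories count.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Story:
    title: str
    business_value: int  # agreed with stakeholders; the scale is an assumption
    done: bool


def value_delivered(stories: list[Story]) -> int:
    """Sum the business value of the stories completed in a sprint."""
    return sum(story.business_value for story in stories if story.done)


# Made-up sprint backlog, for illustration only.
sprint = [
    Story("Saved payment methods", business_value=8, done=True),
    Story("Dark mode", business_value=2, done=True),
    Story("Multi-currency checkout", business_value=13, done=False),
]
print(value_delivered(sprint))
```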

Before: Code Commit Frequency ❌

After: Feature Usage and User Engagement ✅

Who cares how often you commit if users aren’t clicking that shiny new button?

Before: Bug Count ❌

After: User-Reported Issues vs. Proactively Fixed Issues ✅

This shows both quality and how well you’re anticipating user needs. Crystal ball coding, anyone?
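As a sketch of how this comparison might be tracked, the Python snippet below computes the share of issues the team found itself. The `"user"` and `"team"` labels are an assumed tagging convention, not something a particular issue tracker provides out of the box.

```python
from __future__ import annotations

from collections import Counter


def proactive_fix_ratio(issue_sources: list[str]) -> float:
    """Share of issues found by the team before any user reported them.

    Each entry is "user" (user-reported) or "team" (found proactively);
    these labels are a hypothetical convention.
    """
    counts = Counter(issue_sources)
    total = counts["user"] + counts["team"]
    return counts["team"] / total if total else 0.0


# Made-up data: three issues caught by the team, one reported by a user.
ratio = proactive_fix_ratio(["team", "user", "team", "team"])
print(f"{ratio:.0%} of issues were caught proactively")
```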

Some of your more technical metrics may be better expressed as SLAs, quality for example: we want to deliver business value without reducing quality.

You can usually glean user engagement from some kind of web analytics (Google Analytics, etc.). Whatever you are using, focus on the core actions users perform in your system. In e-commerce, for example, that is usually a completed booking or the step conversion in your funnel. These can even be near real time.
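To show what step conversion in a funnel means in practice, here is a minimal Python sketch. The step names and counts are invented; in a real setup they would come from whatever analytics tool you use.

```python
from __future__ import annotations


def step_conversion(funnel: list[tuple[str, int]]) -> list[tuple[str, float]]:
    """Conversion rate from each funnel step to the next one."""
    rates = []
    for (name, count), (next_name, next_count) in zip(funnel, funnel[1:]):
        # A step with no visitors converts nobody; avoid dividing by zero.
        rate = next_count / count if count else 0.0
        rates.append((f"{name} -> {next_name}", rate))
    return rates


# Made-up funnel for a booking site, for illustration only.
funnel = [("search", 1000), ("select", 400), ("payment", 200), ("booked", 150)]
rates = step_conversion(funnel)
for step, rate in rates:
    print(f"{step}: {rate:.0%}")
```

Watching the rate for each step, rather than only the end-to-end number, makes it easier to see which part of the journey a change actually affected.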

The Team Metrics Workshop: A Step-by-Step Guide

Want to revolutionise your team’s metrics? Try this workshop:

  1. Metric Brainstorm: Have everyone write down metrics they think matter.
  2. Business Value Voting: Get stakeholders to vote on which metrics tie closest to business goals.
  3. Feasibility Check: Can you actually measure these things without hiring a team of data scientists?
  4. Trial Run: Pick the top 3-5 metrics and try them for a sprint.
  5. Retrospective: Did these metrics help or just add noise?

Repeat until you find your team’s metric sweet spot!

The Metrics Mindset: It’s a Journey, Not a Destination

Here’s the thing about metrics for self-organising teams: They should evolve as your team evolves. What works for a new team might not work for a seasoned one. It’s like updating your wardrobe – what looked good in the 90s probably doesn’t cut it now (unless you’re going for that retro vibe).

The Golden Rules of Team Metrics

  1. Measure what matters, not what’s easy.
  2. If a metric doesn’t drive action, it’s just noise.
  3. Team metrics should be about the team, not individuals.
  4. Metrics should spark conversations, not end them.
  5. When in doubt, ask the team what they think is important.

Wrapping Up: The Metric Mindfulness Movement

Measuring the success of self-organising teams isn’t about finding the perfect metric – it’s about finding the right combination of indicators that help your team improve and deliver value. It’s like being a DJ – you’re mixing different tracks to create the perfect sound for your audience.

Remember, the goal isn’t to hit some arbitrary numbers, it’s to build awesome products, delight users, and have a team that loves coming to work (or logging in) every day. If your metrics are helping with that, you’re on the right track!

So go forth, measure wisely, and may your charts always be up and to the right! 📈

What wild and wacky metrics have you seen in the wild? Got any metric horror stories or success sagas? Share in the comments – let’s start a metric revolution! 🚀

P.S. If this post helped you see metrics in a new light, share it faster than your CI/CD pipeline! Your fellow tech leads will thank you (maybe with actual thank-you metrics)!

The Art of Hands-Off Management: Coaching Self-Organizing Teams Without Turning into a Micromanager

Hey there, tech leads and engineering managers! 👋 Are you ready to level up your leadership game? Today, we’re diving into the delicate art of coaching self-organizing teams without accidentally morphing into the dreaded micromanager. Buckle up, because we’re about to walk the tightrope of hands-off management!

The Micromanager’s Dilemma

Picture this: You’re leading a team of brilliant devs. They’re self-organizing, they’re agile, they’re everything the tech blogs say they should be. But… they’re about to make a decision that makes your eye twitch. Do you:

A) Swoop in like a coding superhero and save the day?
B) Bite your tongue so hard you taste binary?
C) Find a way to guide without grabbing the wheel?

If you chose C, congratulations! You’re ready for the world of coaching self-organizing teams. If you chose A or B, don’t worry – we’ve all been there. Let’s explore how to nail that perfect balance.

The Golden Rule: Ask, Don’t Tell

“The art of leadership is saying no, not saying yes. It is very easy to say yes.” – Tony Blair

Okay, Tony wasn’t talking about tech leadership, but the principle applies. When you’re tempted to give directions, try asking questions instead. It’s like the difference between giving someone a fish and teaching them to fish – except in this case, you’re not even teaching. You’re just asking if they’ve considered using a fishing rod instead of their bare hands.

Example Time!

Let’s say your team is struggling with large, monolithic tasks that are slowing down the sprint. Instead of mandating “No task over 8 hours!”, try this:

You: “Hey team, I noticed our sprint completion rate is lower than usual. Any thoughts on why?”

Team: “Well, we have these huge tasks that only one person can work on…”

You: “Interesting. How might that be affecting our workflow?”

Team: “I guess it leads to a lot of ‘almost done’ stories at the end of the sprint.”

You: “Hmm, what could we do to address that?”

See what you did there? You guided them to the problem and let them find the solution. It’s like inception, but for project management!

The Five Whys: Not Just for Toddlers Anymore

Remember when kids go through that phase of asking “Why?” to everything? Turns out, they might be onto something. The Five Whys technique is a great way to dig into the root of a problem without telling the team what to do.

Here’s how it might go:

  1. Why is our sprint completion rate low?
  2. Why do we have a lot of long-running tasks?
  3. Why are our tasks so big?
  4. Why haven’t we broken them down further?
  5. Why didn’t we realize this was an issue earlier?

By the fifth “why,” you’ve usually hit the root cause. And the best part? The team has discovered it themselves!

When in Doubt, Shu Ha Ri

No, that’s not a new sushi restaurant. Shu Ha Ri is a concept from martial arts that applies beautifully to coaching self-organizing teams:

  • Shu (Follow): The team follows the rules and processes.
  • Ha (Detach): The team starts to break away from rigid adherence.
  • Ri (Fluent): The team creates their own rules and processes.

As a coach, your job is to recognize which stage your team is in and adapt accordingly. New team? Maybe they need more structure (Shu). Experienced team? Let them break some rules (Ha). Rockstar team? Stand back and watch them soar (Ri).

It’s a great way to introduce a process to the team without being overbearing. For example, you can say, “How about we try X my way for a sprint or two, see how you like it, and evolve it from there?”

The KPI Conundrum

“Not everything that can be counted counts, and not everything that counts can be counted.” – often attributed to Albert Einstein

Al knew what he was talking about. When it comes to measuring the success of self-organizing teams, you need a KPI (Key Performance Indicator) that’s:

  • Instantly measurable (because who has time for complex calculations?)
  • Team-focused (no individual call-outs here)
  • Connected to business value (because that’s why we’re all here, right?)

Avoid vanity metrics like lines of code or number of commits. Instead, focus on things like deployment frequency, lead time for changes, or even better – actual business impact metrics.
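For teams that want to start with deployment frequency and lead time for changes, the Python sketch below shows one simple way to derive both from timestamps. The data is invented, and treating lead time as “commit to production deploy” is an assumption; your pipeline may define the start and end points differently.

```python
from __future__ import annotations

from datetime import datetime, timedelta
from statistics import median


def deployment_frequency(deploy_times: list[datetime], days: int) -> float:
    """Average number of deployments per day over a window of `days` days."""
    return len(deploy_times) / days


def median_lead_time(changes: list[tuple[datetime, datetime]]) -> timedelta:
    """Median time from commit to deploy across (committed, deployed) pairs."""
    return median(deployed - committed for committed, deployed in changes)


# Made-up (committed, deployed) timestamps for one week.
changes = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 11, 0)),   # 2 hours
    (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 5, 15, 0)),  # 5 hours
    (datetime(2024, 3, 6, 8, 0), datetime(2024, 3, 7, 10, 0)),   # 26 hours
]
deploys = [deployed for _, deployed in changes]

print(deployment_frequency(deploys, days=7))
print(median_lead_time(changes))
```

The median is used here rather than the mean so that one unusually slow change does not dominate the number.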

Why instantly measurable? It doesn’t necessarily need to be instant, as long as it’s timely. The sooner you know the results, the sooner you can change direction, and if it’s very timely you can even get to the point of gamification, but more on that in another post.

A good KPI sets the course for the team, can settle arguments, and helps them course-correct if they choose the wrong direction.

It’s also good to agree on SLAs for technical metrics (quality, etc.) to make sure we don’t unknowingly make a decision that trades long-term health for short-term gain.

The Coaching Toolkit: Your Swiss Army Knife of Leadership

Here are some tools to keep in your back pocket:

  1. The Silence Technique: Sometimes, the best thing you can say is nothing at all. Let the team fill the void. This will encourage your team to speak up on their own.
  2. The Mirror: Reflect the team’s ideas back to them. It’s like a verbal rubber duck debugging session.
  3. The Hypothetical: “What would happen if…” questions can open up new avenues of thinking.
  4. The Devil’s Advocate: Challenge assumptions, but make it clear you’re playing a role. If you don’t, you may come across as overly negative and unsupportive.
  5. The Celebration: Recognize and celebrate when the team successfully self-organizes and solves problems.

Wrapping Up: The Zen of Hands-Off Management

Coaching self-organizing teams is a bit like being a gardener. You create the right conditions, you nurture, you occasionally prune, but ultimately, you let the plants do their thing. Sometimes you might get an odd-shaped tomato, but hey – it’s organic!

Remember, your goal is to make yourself progressively less necessary. If you’ve done your job right, the team should be able to function beautifully even when you’re on that beach vacation sipping piña coladas.

So go forth, ask questions, embrace the awkward silences, and watch your team bloom!

What’s your secret sauce for coaching self-organizing teams? Have you ever accidentally micromanaged and lived to tell the tale? Share your war stories in the comments – we promise not to judge (much)! 😉

P.S. If you enjoyed this post, don’t forget to smash that like button, ring the bell, and subscribe to our newsletter for more tech leadership gems! (Just kidding, this isn’t YouTube, but do share if you found it helpful!)

Initiating and Nurturing Self-Organizing Teams

In our previous posts, we explored the role of an Engineering Manager and what makes an awesome team. Now, let’s dive into one of the most challenging aspects of building great teams: initiating and nurturing self-organization. This transition can be tricky, especially for teams and managers accustomed to more traditional, hierarchical structures.

The Challenge of Self-Organization

Throughout our lives, we’re often in situations where someone else tells us what to do:

  • As children, our parents tell us what to do
  • In school, teachers tell us what to do
  • In university your lecturer gives you assignments
  • In many traditional workplaces, managers assign tasks and make decisions

I find that some people carry this into their day-to-day working life as well: they look for someone to tell them what to do.

So when we suddenly find ourselves in a self-organizing team, it can feel like being thrown into the deep end of a pool. The freedom can be both exhilarating and overwhelming.

Common Pitfalls in the Transition

When teams first attempt to self-organize, several common issues often arise:

  1. Looking to the Manager: Team members may still expect the manager to make all the decisions.
  2. Deferring to the Product Owner: In Agile teams, there might be a tendency to let the Product Owner drive all aspects of the work.
  3. Decision Paralysis: Without clear direction, teams might struggle to make decisions or take action.
  4. Lack of Structure: Some teams might interpret self-organization as a complete absence of process, leading to chaos.

So, how can we as Engineering Managers help our teams overcome these challenges and truly embrace self-organization?

Strategies for Initiating Self-Organization

  1. Be Quiet in Meetings: As a manager, one of the most powerful things you can do is to be quiet. In team meetings, resist the urge to jump in with solutions or directions. Instead, give the team space to discuss and decide for themselves. You might find the silence uncomfortable at first. That’s okay! Embrace the discomfort and trust your team to fill the void.
  2. Stop Attending Every Meeting: Once your team starts to find its footing, consider stepping back from some meetings entirely, like stand-ups or sprint planning. This sends a clear message that you trust the team to handle things on their own. Tip: Start with less critical meetings and gradually expand to more important ones as the team gains confidence.
  3. Encourage Product Owners to Step Back Too: If your team works with Product Owners, have a conversation with them about the importance of team self-organization. Encourage them to focus on the “what” and “why” of the work, leaving the “how” to the team.
  4. Create Space for the Team: Actively create opportunities for the team to make decisions without leadership figures present. This might feel uncomfortable at first, but it’s a crucial step in fostering true self-organization.

Nurturing Psychological Safety

Remember Google’s number one factor for successful teams? Psychological safety is crucial, especially when asking team members to step up and take more ownership.

Here are some ways to foster psychological safety:

  1. Encourage Risk-Taking: Celebrate when team members try new approaches, regardless of the outcome.
  2. Model Vulnerability: As a manager, admit when you don’t know something or when you’ve made a mistake.
  3. Respond Positively to Questions and Challenges: Thank team members for speaking up, even if you disagree.
  4. Focus on Learning, Not Blame: When things go wrong, focus on what can be learned rather than who’s at fault.

Coaching for Self-Organization

As your team starts to self-organize, your role shifts from directing to coaching. Here are some techniques:

  1. Ask Questions: Instead of providing solutions, ask questions that guide the team to find their own answers.
  2. Provide Context: Ensure the team has the information they need to make informed decisions.
  3. Offer Support: Let the team know you’re available if they need help, but avoid jumping in uninvited.
  4. Reflect Back: Help the team see their own progress and learning as they navigate self-organization.

Patience is Key

Remember, the transition to self-organization is a journey, not a destination. It takes time for teams to develop the skills and confidence to truly self-organize. Be patient, celebrate small wins, and keep reinforcing the team’s autonomy.

Wrapping Up

Initiating and nurturing self-organizing teams is one of the most challenging – and rewarding – aspects of being an Engineering Manager. By stepping back, creating space for the team to make decisions, fostering psychological safety, and shifting to a coaching role, you can help your team embrace self-organization and unlock their full potential.

In our next post, we’ll explore how to coach self-organizing teams without undermining their autonomy – a delicate balance that’s crucial for long-term success.

Have you been part of a transition to self-organization? What challenges did you face, and how did you overcome them? Share your experiences in the comments below!

What Makes an Awesome Team?

In our previous post, we established that the primary job of an Engineering Manager is to build awesome teams. But what exactly makes a team “awesome”? In this post, we’ll dive deep into the characteristics of high-performing engineering teams and explore why self-organization is a key factor in team success.

Google’s Recipe for Awesome Teams

When it comes to understanding team dynamics, few studies have been as influential as Google’s Project Aristotle. After years of research, Google identified five key factors that set successful teams apart. Let’s break these down:

  1. Psychological Safety: Can team members take risks without feeling insecure or embarrassed? This is the foundation of all great teams. When team members feel safe to voice their opinions, admit mistakes, and take risks, innovation flourishes.
  2. Dependability: Can team members count on each other to do high-quality work on time? Reliability builds trust, and trust is essential for smooth collaboration and high performance.
  3. Structure & Clarity: Are goals, roles, and execution plans clear within the team? When everyone understands their role and the team’s objectives, it’s easier to align efforts and make progress.
  4. Meaning of Work: Is the work personally important to team members? Teams perform better when individuals find their work meaningful and aligned with their personal values.
  5. Impact of Work: Do team members believe that their work matters? Understanding how their work contributes to the larger goals of the organization can significantly boost motivation and performance.

These factors provide a solid framework for what makes a team “awesome”. But there’s another crucial element that ties into several of these factors: self-organization.

The Power of Self-Organizing Teams

The concept of self-organizing teams is a cornerstone of many modern software development methodologies, including Agile and Scrum. As the Agile Manifesto states:

“The best architectures, requirements, and designs emerge from self-organizing teams.”

But what does self-organization really mean in practice?

According to the Scrum Guide:

“Self-organizing teams choose how best to accomplish their work, rather than being directed by others outside the team… Development Teams are structured and empowered by the organization to organize and manage their own work.”

Self-organizing teams have the autonomy to decide how to approach their work, distribute tasks, and solve problems. This autonomy directly contributes to several of Google’s success factors:

  • It enhances psychological safety by empowering team members to make decisions.
  • It increases dependability by allowing the team to commit to what they believe they can achieve.
  • It provides structure and clarity by enabling the team to define their own processes and roles.
  • It boosts the meaning and impact of work by giving team members more control over their contributions.

Why Self-Organization Matters

  1. Encourages Ownership: When teams have the power to make decisions, they’re more likely to take ownership of both the process and the outcome.
  2. Increases Engagement: Autonomy is a key factor in job satisfaction. Self-organizing teams tend to be more engaged with their work.
  3. Promotes Responsibility: When team members are involved in decision-making, they’re more likely to take responsibility for the results.
  4. Fosters Innovation: Self-organizing teams can quickly adapt their processes and try new approaches, leading to innovative solutions.
  5. Builds Leadership Skills: In a self-organizing team, leadership is often distributed, giving team members opportunities to develop leadership skills.

The Role of the Engineering Manager in Self-Organizing Teams

You might wonder, “If teams are self-organizing, what’s left for the Engineering Manager to do?” The answer is: plenty!

The shift towards self-organization doesn’t diminish the importance of the Engineering Manager. Instead, it changes the nature of their role. Rather than directing the team’s day-to-day activities, the Engineering Manager becomes a facilitator, coach, and guardian who helps the team maintain its autonomy.

In our next post, we’ll explore how to initiate and nurture self-organizing teams, and discuss the delicate balance of providing guidance without undermining the team’s autonomy.

Wrapping Up

Building an awesome team isn’t just about assembling a group of skilled individuals. It’s about creating an environment where psychological safety, dependability, clarity, meaning, and impact are prioritized. By embracing self-organization, teams can take ownership of their work, leading to higher engagement, more innovation, and ultimately, better results.

What’s your experience with self-organizing teams? Have you seen these principles in action? Share your thoughts in the comments below!

Understanding the Role of an Engineering Manager

In the fast-paced world of software development, the role of an Engineering Manager is crucial yet often misunderstood. As we embark on this series exploring the art of managing self-managing engineering teams, let’s start by demystifying what an Engineering Manager really does and why their role is pivotal in building successful tech organizations.

What IS an Engineering Manager?

If I had to distill the essence of an Engineering Manager’s job into one sentence, it would be this:

An Engineering Manager’s job is to build awesome teams.

At first glance, this might seem overly simplistic, but let’s unpack what building awesome teams really entails:

  1. Hiring the right people: This involves not just finding individuals with the right technical skills, but also those who align with the team’s culture and values.
  2. Onboarding new team members: Ensuring new hires can hit the ground running and quickly become productive members of the team.
  3. Skill set management: Making sure the team has the right mix of skills to execute on their projects and meet organizational goals.
  4. Coaching and mentoring: Helping team members improve themselves, grow in their roles, and reach their full potential.
  5. Career development: Providing opportunities and guidance for team members to progress in their careers.

The Split of Work

In my experience, the work of an Engineering Manager typically breaks down as follows:

  • 30% hiring
  • 70% everything else

While hiring is a significant part of the job, it’s that “everything else” that we’ll be focusing on in this series. It’s in this 70% where the real magic of building and nurturing awesome teams happens.

Why Focus on Building Awesome Teams?

You might wonder, “Why not focus on the technology or the product?” The answer lies in a fundamental truth of software development: great products are built by great teams, not just great individuals.

An awesome team:

  • Collaborates effectively
  • Innovates consistently
  • Delivers high-quality work
  • Adapts to challenges
  • Continuously improves

By focusing on building awesome teams, Engineering Managers create the foundation for all of these positive outcomes.

The Shift Towards Self-Managing Teams

In recent years, there’s been a significant shift towards self-managing or self-organizing teams in the tech industry. This approach, championed by methodologies like Agile and frameworks like Scrum, presents both opportunities and challenges for Engineering Managers.

On one hand, self-managing teams can lead to increased ownership, engagement, and job satisfaction among team members. On the other hand, it requires Engineering Managers to adapt their leadership style and find new ways to guide and support their teams without micromanaging.

In the upcoming posts in this series, we’ll dive deep into how to initiate, nurture, and measure the success of self-managing teams. We’ll explore strategies for coaching these teams without undermining their autonomy, and discuss how to create meaningful KPIs that align with business goals.

Wrapping Up

The role of an Engineering Manager is multifaceted and challenging, but at its core, it’s about people. By focusing on building awesome teams, Engineering Managers set the stage for innovation, productivity, and success.

In our next post, we’ll explore what makes a team truly awesome, drawing insights from Google’s groundbreaking research on team effectiveness. Stay tuned!

What aspects of the Engineering Manager role do you find most challenging or interesting? Share your thoughts in the comments below!

Planning for Change: The Fallacy of Long-Term Roadmaps in Software Development

Introduction

In the world of software development, long-term roadmaps have long been a staple of project management. These carefully crafted plans, often spanning months or even years, aim to provide a clear path forward for product development. They outline features, set deadlines, and allocate resources with precision that would make any project manager proud.

But here’s the deal: by the time you start working on that meticulously planned roadmap, the market is already changing. The technology landscape shifts like quicksand beneath your feet, and customer needs evolve at breakneck speed. This creates a fundamental tension between our desire for orderly planning and the chaotic reality of the tech world.

The Siren Song of Long-Term Roadmaps

Despite this tension, companies are often drawn to long-term planning like moths to a flame. It’s not hard to see why:

  1. Illusion of Control: Long-term roadmaps provide a comforting sense of control in an unpredictable industry. They offer a vision of the future that feels tangible and achievable.
  2. Alignment with Business Goals: These roadmaps allow companies to align their development efforts with broader business objectives, creating a narrative of progress that’s easy to communicate to stakeholders.
  3. Resource Allocation: With a long-term plan in hand, it becomes easier (in theory) to allocate resources, budget for future needs, and make hiring decisions.
  4. Compatibility with Traditional Business Cycles: Many businesses operate on annual or multi-year planning cycles. Long-term roadmaps fit neatly into these established rhythms, making them attractive to executives and boards.

The Problems: When Plans Meet Reality

However, as Mike Tyson famously said, “Everybody has a plan until they get punched in the mouth.” In software development, that punch often comes swiftly and from multiple directions:

  1. Rapid Technological Changes: The tech you planned to use might be outdated by the time you implement it. New frameworks, languages, or methodologies can emerge that render your careful plans obsolete.
  2. Shifting Market Demands: Customer needs and expectations can change dramatically in a short time. The feature that seemed critical six months ago might be irrelevant today.
  3. Disruptive Competitors: In the time it takes to execute your roadmap, a new competitor might enter the market with an innovative solution that changes the game entirely.
  4. Estimation Difficulties: Accurately estimating time and resources for software development is notoriously difficult, especially for work that’s months or years in the future.
  5. Stifled Innovation: Rigid adherence to a long-term plan can blind you to new opportunities and stifle the kind of rapid innovation that’s often necessary in the tech world.

Case Studies: The Perils and Promises of Planning

Let’s look at a couple of real-world examples that illustrate the challenges and opportunities in software development planning:

  1. Waterfall Woes at FirstBank: FirstBank, a large financial institution with over 10 million customers, spent 18 months meticulously planning a comprehensive overhaul of their online banking system. The project, codenamed “Digital Horizon,” was designed to modernize their web-based services and improve customer experience.

     However, by the time they were halfway through development in 2010, the mobile revolution was in full swing. The iPhone and Android smartphones had exploded in popularity, and customers were increasingly demanding mobile banking solutions. Many of FirstBank’s planned desktop-focused features suddenly seemed outdated.

     The bank found itself in a difficult position. They had already invested millions in the project, but continuing as planned would result in a product that was behind the curve at launch. They made the painful decision to scrap significant portions of their work and pivot towards mobile development. This led to delays of over a year and cost overruns exceeding $30 million.

     The “Digital Horizon” project, originally slated to give FirstBank a competitive edge, instead left them playing catch-up in the mobile banking space for years to come.

  2. Agile Triumph at QuickPay: In contrast, QuickPay, a small fintech startup founded in 2012, took an iterative approach to developing their peer-to-peer payment app. Instead of planning out years in advance, they released a minimum viable product (MVP) with basic transfer functionality and rapidly iterated based on user feedback.

     This agile approach allowed QuickPay to pivot quickly when they discovered an unexpected demand. Users were frequently splitting bills at restaurants and bars, and wanted an easy way to divide payments among friends. This wasn’t a feature QuickPay had originally considered as central to their app.

     Within two months, QuickPay had developed and released a “Split Bill” feature. They continued to refine it based on user feedback, adding capabilities like itemized bill splitting and integration with popular restaurant POS systems.

     Within a year, the “Split Bill” feature became QuickPay’s main selling point, setting them apart in a crowded fintech market. It propelled them from 100,000 users to over 5 million, capturing a significant market share from larger, more established payment apps.

     By 2015, QuickPay’s success attracted the attention of major financial institutions. They were acquired by a leading bank for $400 million, a testament to the value created by their customer-focused development approach.

These examples highlight a crucial truth in the fast-paced world of software development: the ability to adapt quickly and respond to user needs often trumps even the most carefully laid long-term plans.

The Inevitability of Change in Software Development

In the world of software development, change isn’t just common – it’s inevitable. Unlike traditional industries where conditions might remain stable for years, the software landscape can transform dramatically in a matter of months or even weeks. This rapid evolution is driven by several factors:

  1. Technological Advancements: New programming languages, frameworks, and tools emerge constantly, often rendering existing solutions obsolete.
  2. Shifting User Expectations: As users interact with various digital products, their expectations for functionality, design, and user experience evolve rapidly.
  3. Market Disruptions: Startups with innovative ideas can quickly disrupt established markets, forcing everyone to adapt.
  4. Regulatory Changes: Especially in fields like fintech or healthcare, new regulations can necessitate significant changes to software systems.

This constant state of flux means that software development requires a fundamentally different approach to planning compared to other industries. While a construction project can often stick closely to initial blueprints, a software project needs to be able to pivot at a moment’s notice.

The key is to embrace change not as a disruption, but as an opportunity for innovation and improvement. This mindset shift is crucial for success in the dynamic world of software development.

Alternative Approaches to Planning

As Elon Musk once said, “You don’t need a plan. Sometimes you just need balls.” Operating without a long-term plan is scary for some people because you feel less in control. Once you realize you have even less control with one, it gets easier. Here are some alternative approaches that embrace the dynamic nature of software development:

  1. Inspect and Adapt Principle: This principle acknowledges that we can’t predict everything, so we need to regularly examine our progress and adjust our approach based on what we learn. Agile methodologies, in particular, heavily rely on this principle, incorporating regular retrospectives and iterative development to ensure teams can pivot quickly when needed.
  2. Extreme Programming (XP) Cycles: XP introduces an interesting approach with its weekly and quarterly cycles. The weekly cycle focuses on short-term planning and execution, where teams plan at the start of each week and deliver working software by the end. The quarterly cycle is used for reflection and longer-term planning, allowing teams to adjust their course based on what they’ve learned over the past quarter. This dual-cycle approach balances the need for immediate action with longer-term strategic thinking.
  3. Rolling Wave Planning: This technique involves detailed planning for the near-term future, with broader, less detailed plans for the longer term. As time progresses, the plan is continuously updated and the detailed planning “wave” rolls forward. This approach acknowledges that we have more accurate information about the near future and allows for flexibility as we move forward.
  4. OKRs (Objectives and Key Results): This goal-setting framework, popularized by Google, focuses on setting ambitious objectives and measurable key results. OKRs are typically set quarterly, allowing for more frequent reassessment and pivoting compared to traditional annual planning. They provide direction without prescribing specific solutions, giving teams the flexibility to determine the best way to achieve the objectives.
  5. “Just Enough” Planning: This concept emphasizes doing only the minimum amount of planning necessary to start making progress. It’s about finding the sweet spot between flying blind and over-planning. The idea is to do just enough planning to provide direction and alignment, but not so much that it becomes a burden or limits adaptability.

The common thread among these approaches is flexibility. They all acknowledge that plans will change and build in mechanisms for adapting to new information or circumstances. By embracing these more dynamic planning methods, software development teams can stay agile in the face of inevitable change and uncertainty.

Balancing Long-Term Vision with Short-Term Flexibility

While embracing change is crucial, it doesn’t mean operating without direction. The key is to balance a clear long-term vision with flexible short-term planning.

The Importance of Long-Term Vision

A compelling long-term vision serves several crucial purposes:

  • It provides a North Star for decision-making
  • It helps align teams and stakeholders around common goals
  • It inspires and motivates team members

Your long-term vision might be something like “Become the go-to platform for peer-to-peer payments” or “Revolutionize online education.” This vision should be ambitious and inspirational, but also clear and focused.

Combining Vision with Flexible Planning

Here’s how you can maintain your long-term vision while embracing flexible short-term planning:

  1. Set Directional OKRs: Use your long-term vision to inform high-level, directional OKRs. These provide guidance without prescribing specific solutions.
  2. Use Adaptive Roadmaps: Instead of detailed feature lists, create roadmaps that focus on problems to be solved or outcomes to be achieved. This allows teams the flexibility to find the best solutions.
  3. Regular Check-ins: Schedule regular sessions to review progress and reassess priorities, at least once a quarter. This allows you to course-correct while still moving towards your long-term vision. Some advice I once got on cadence: if the project is on fire, check in bi-weekly; if it is running smoothly, monthly or even quarterly is enough; and if it is doing “amazing”, check in bi-weekly as well, so you can work out what makes the team so good and perhaps apply it in other areas.
  4. Empower Teams: Give your teams the autonomy to make decisions about how to best achieve the objectives. They’re closest to the work and often best positioned to respond to changes.
  5. Communicate Constantly: Regularly reinforce the long-term vision while acknowledging and explaining changes in short-term plans. This helps maintain alignment and buy-in.

By maintaining a clear long-term vision while embracing flexible short-term planning, you can navigate the ever-changing landscape of software development effectively. You’ll be positioned to seize new opportunities as they arise, while still moving consistently towards your ultimate goals.

Remember, in software development, the ability to adapt is often more valuable than the ability to predict. Embrace change, stay flexible, and keep your eyes on the horizon.

Implementing More Flexible Planning in Your Organization

Transitioning from long-term roadmaps to more adaptive planning isn’t just about adopting new methodologies – it’s a cultural shift. Here are some tips to help you make this transition:

  1. Start Small: Begin with a pilot project or team. This allows you to test and refine your approach before rolling it out organization-wide.
  2. Educate Your Team: Provide training on adaptive planning techniques. Help your team understand the ‘why’ behind the change.
  3. Emphasize Outcomes Over Outputs: Shift focus from feature delivery to achieving business outcomes. This mindset change is crucial for flexible planning.
  4. Shorten Planning Horizons: Instead of annual plans, consider quarterly or even monthly planning cycles.
  5. Embrace Uncertainty: Teach your team that it’s okay not to have all the answers upfront. Uncertainty is a normal part of software development.

Communicating this change to stakeholders is crucial. Here’s how to manage expectations:

  1. Be Transparent: Explain the reasons for the change. Share both the potential benefits and the challenges you anticipate.
  2. Focus on Value Delivery: Show stakeholders how this approach will lead to faster value delivery and better alignment with business needs.
  3. Use Visual Tools: Employ visual roadmaps or boards to show progress and plans. These can be easier for stakeholders to understand than traditional Gantt charts.
  4. Regular Updates: Provide frequent updates on progress and changes. This helps stakeholders feel involved and informed.

Conclusion

Long-term roadmaps, while comforting, often fall short in the fast-paced world of software development. They can lead to wasted resources, missed opportunities, and products that don’t meet user needs.

Instead, embracing more flexible planning approaches allows teams to:

  • Respond quickly to changes in technology and market demands
  • Deliver value to users more frequently
  • Learn and improve continuously

Remember, change in software development is not just inevitable – it’s an opportunity. By adopting more adaptive planning methods, you position your team to seize new opportunities as they arise and create better products for your users.

Call to Action

As you finish reading this post, take a moment to reflect on your current planning processes. Are they truly serving your team and your users? Or are they holding you back?

Here’s what you can do right now:

  1. Review Your Current Process: Identify areas where your planning might be too rigid or disconnected from user needs.
  2. Start a Conversation: Discuss these ideas with your team. Get their input on how you could make your planning more flexible.
  3. Experiment: Choose one small aspect of your planning to make more adaptive. It could be as simple as adding a monthly check-in to reassess priorities.
  4. Measure and Learn: Keep track of how these changes impact your team’s productivity and the value you deliver to users.

Remember, the goal isn’t to eliminate planning altogether, but to make it more responsive to the realities of software development. Start small, learn as you go, and gradually build a more adaptive, resilient planning process.

The future of your software development efforts, and possibly your entire business, may depend on your ability to plan flexibly and embrace change. Are you ready to take the first step?

Are You Focusing on Output Over Outcomes? Rethinking Software Development

As an engineering manager, you’re tasked with building awesome teams. You work tirelessly to help keep projects on track, meet deadlines, and deliver results. But lately, something feels off. Your team is undoubtedly busy, even productive by conventional measures. Yet, you can’t shake the nagging feeling that all this activity isn’t translating into meaningful impact for your business or your users.

If this resonates with you, your team might be caught in a cycle of prioritizing output over outcomes. Let’s explore what this looks like from a leadership perspective and why it’s a critical issue to address.

Signs of the Problem

  1. Measuring Work Alone: Your team’s success is measured primarily by output – story points completed, tickets closed, features shipped. But what about the outcomes? Are you tracking the actual value these activities bring to the business?
  2. Lack of Feedback: Features are developed, shipped to production, and marked as “done.” But then… silence. There’s no mechanism in place to gather feedback on whether these features are successful or even used.
  3. Team Disconnection: Your developers are “shielded” from talking to business people. You might have heard the phrase, “Don’t interrupt our precious engineers’ time.” But this protection comes at a cost – disconnection from the very problems they’re trying to solve.
  4. Deadline-Driven Development: Your team is constantly working towards deadlines handed down from above. The problem? No one on the team understands where these deadlines come from or why they’re important.
  5. Product Overwork: Your Product Managers are spending so much time writing detailed specifications that you’re considering hiring people whose sole job is to create detailed specs for the engineers. This level of detail might seem helpful, but it can stifle creativity and problem-solving.

The Round Corners Saga: A Case Study

Let me share a story that might hit close to home. Your team has been working on a new user interface. In a recent sprint review, you’re surprised to learn that one of your engineers spent five days ensuring text boxes had perfectly rounded corners across all browsers and devices.

Proud of their attention to detail, your developer showcases the feature. But when you speak with the product owner, you discover that the round corners weren’t even a requirement – they were just a default style in the design tool’s mockups.

Five days of a skilled developer’s time, spent on an unintended, unnecessary detail. As a manager, how do you react? More importantly, how do you prevent this from happening again?

Here’s the thing: I don’t blame the engineer. I think it’s a fundamental problem that stems from the way we’re brought up, and more broadly, from Western-based education systems.

Think about it like this:

  • As a child, you have your parents telling you what to do.
  • In school, your teachers tell you what to do.
  • At university, your lecturers tell you what to do.

You spend the first part of your life with people telling you what to do. So it’s only natural that people find it easy to get trapped into finding the next person to tell them what to do. And usually, in dev teams that do Scrum, this becomes their Product Owner or manager.

It’s the manager’s responsibility to coach this out of people.

The Real Costs

This way of working comes with significant costs:

  1. Strategic Misalignment: Your team’s efforts aren’t driving towards key business objectives. You’re busy, but are you moving in the right direction?
  2. Opportunity Cost: Time spent on unnecessary features or misguided efforts is time not spent on innovations that could provide a real competitive advantage.
  3. Talent Retention Risk: Skilled developers often leave when they feel their work lacks purpose or impact. Are you at risk of losing your best people?

Solutions: Leadership for Purpose-Driven Development

As a leader, you have the power to shift your team’s focus from output to outcomes. Here’s how:

  1. Reinforce the ‘Why’: Push your engineers to ask why. Why are we building this feature? Why is it important? Why now?
  2. Redefine Success Metrics: Measure success by outcomes rather than output. Your Product Manager should already be doing this, but they might be hiding the metrics from the engineers. I’ve seen this many times: some Product people think it keeps the engineers from being distracted, but it has a negative effect. If they’re not doing it at all, you have bigger problems, and you probably need to go higher than your team to address them.
  3. Encourage Customer Connection: Break down the barriers between your developers and the users they’re building for. Teach them who they serve, the customer, and, if you can, introduce them to some customers and get them talking.
  4. Promote Learning Loops: Make time to analyze the impact of your work after it’s shipped. What worked? What didn’t? Why? Get your engineers engaged with the analytics data that comes out of their system, get them excited about how many users are using their new feature, and if people aren’t using it, ask why.
  5. Cross-Team Collaboration: And I don’t just mean in formal settings. Have a beer together sometimes. Build real relationships across Engineering, Design, Product, and other business units.

Understanding the “Feature Factory” Phrase

The term “feature factory” vividly illustrates a development process that prioritizes output over outcomes, quantity over quality. Like workers on an assembly line, developers in this environment might be busy but disconnected from the larger purpose of their work.

As a leader, your role is to transform this factory into an innovation studio – a place where each line of code contributes to a larger vision, where your team’s skills are applied to solving real problems, not just checking off feature lists.

Moving Forward

Recognizing that your team may be stuck in this output-focused mindset is a crucial first step. The next is to start changing the conversation at all levels – with your team, with product managers, with stakeholders.

Start asking different questions in your meetings:

  • “How will we measure the success of this feature?”
  • “What problem are we solving for our users?”
  • “If this feature succeeds, what impact will it have on our key business metrics?”

Encourage your team to think beyond the immediate task to the larger purpose. Help them see their work not as isolated features, but as integral parts of a solution that brings real value to users and the business.

Remember, you became a leader to make a difference – to guide your team to create impactful, meaningful work. Don’t let the trap of focusing on output rob you and your team of that opportunity.

Are you ready to lead the change? Your team’s potential for true innovation and impact is waiting to be unleashed.

Measuring Product Health: Beyond Code Quality

In the world of software development, we often focus on code quality as the primary measure of a product’s health. While clean, efficient code with passing tests is crucial, it’s not the only factor that determines the success of a product. As a product engineer, it’s essential to look beyond the code and understand how to measure the overall health of your product. In this post, we’ll explore some key metrics and philosophies that can help you gain a more comprehensive view of your product’s performance and impact.

The “You Build It, You Run It” Philosophy

Before diving into specific metrics, it’s important to understand the philosophy that underpins effective product health measurement. We follow the principle of “You Build It, You Run It.” This approach empowers developers to take ownership of their products not just during development, but also in production. It creates a sense of responsibility and encourages a deeper understanding of how the product performs in real-world conditions.

What Can We Monitor?

When it comes to monitoring product health, there are several areas we usually focus on:

  1. Logs: Application, web server, and system logs
  2. Metrics: Performance indicators and user actions
  3. Application Events: State changes within the application

While all these are important, it’s crucial to understand the difference between logs and metrics, and when to use each.

The Top-Down View: What Does Your Application Do?

One of the most important questions to ask when measuring product health is: “What does my application do?” This top-down approach helps you focus on the core purpose of your product and how it delivers value to users, so that when that value is impacted, you know it is time to act.

Example: E-commerce Website

Let’s consider an e-commerce website. At its core, the primary function of such a site is to facilitate orders. That’s the ultimate goal – to guide users through the funnel to complete a purchase.

So, how do we use this for monitoring? We ask two key questions:

  1. Is the application successfully processing orders?
  2. How often should it be processing orders, and is it meeting that frequency right now?

How to Apply This?

To monitor this effectively, we generally look at 10-minute windows throughout the day (for example, 8:00 to 8:10 AM). For each window, we calculate the average number of orders for that same time slot on the same day of the week over the past four weeks. If the current number falls below this average, it triggers an alert.
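
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not our production system: the function names, the sample numbers, and the `tolerance` parameter are all assumptions added for the example. With the default tolerance of zero it reproduces the rule as stated (alert whenever the current count is below the four-week average).

```python
from statistics import mean


def baseline_for_window(history):
    """Average order count for the same 10-minute slot on the same weekday
    over the previous weeks (four values in the setup described above)."""
    if not history:
        raise ValueError("need at least one historical observation")
    return mean(history)


def should_alert(current_orders, history, tolerance=0.0):
    """Return True when the current window falls below the baseline.

    tolerance is a hypothetical knob that is not part of the original rule:
    0.2 would allow a 20% dip before alerting. The default of 0.0 alerts on
    any value below the average.
    """
    baseline = baseline_for_window(history)
    return current_orders < baseline * (1 - tolerance)


# Orders seen between 8:00 and 8:10 on the last four Mondays (made-up numbers).
last_four_mondays = [120, 110, 130, 120]

print(should_alert(95, last_four_mondays))   # below the average of 120
print(should_alert(125, last_four_mondays))  # above the average
```

In practice you would probably want some tolerance, since a count sitting exactly at its long-run average will dip below it roughly half the time.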

This approach is more nuanced and effective than setting static thresholds. It naturally adapts to the ebb and flow of traffic throughout the day and week, reducing false alarms while still catching significant drops in performance. By using dynamic thresholds based on historical data, you’re less likely to get false positives during normally slow periods, yet you remain sensitive enough to catch meaningful declines in performance.

One of the key advantages of this method is that it avoids the pitfalls of static thresholds. With static thresholds, you often face a dangerous compromise. To avoid constant alerts during off-hours or naturally slow periods, you might set the threshold very low. However, this means you risk missing important issues during busier times. Our dynamic approach solves this problem by adjusting expectations based on historical patterns.

While we typically use 10-minute windows, you can adjust this based on your needs. For systems with lower volume, you might use hourly or even daily windows. This will make you respond to problems more slowly in these cases, but you’ll still catch significant issues. The flexibility allows you to tailor the system to your specific product and business needs.
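
To make the window size adjustable, the bucketing step can take the window length as a parameter. The sketch below is one way to do that, assuming order timestamps are available as `datetime` objects; `window_start` and `count_per_window` are hypothetical helper names, and the approach assumes the window length divides evenly into a day.

```python
from collections import Counter
from datetime import datetime, timedelta


def window_start(ts, window_minutes=10):
    """Round a timestamp down to the start of its window.

    Assumes window_minutes divides evenly into 24 hours
    (10, 60 and 1440 all do).
    """
    window_seconds = window_minutes * 60
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((ts - midnight).total_seconds())
    return midnight + timedelta(seconds=elapsed - elapsed % window_seconds)


def count_per_window(order_times, window_minutes=10):
    """Count orders per window, keyed by the window's start time."""
    return Counter(window_start(ts, window_minutes) for ts in order_times)


# Three made-up order timestamps.
orders = [
    datetime(2024, 1, 1, 8, 1, 0),
    datetime(2024, 1, 1, 8, 7, 30),
    datetime(2024, 1, 1, 8, 12, 0),
]

ten_minute = count_per_window(orders, window_minutes=10)
hourly = count_per_window(orders, window_minutes=60)
```

With 10-minute windows the first two orders land in the 8:00 bucket and the third in the 8:10 bucket; with hourly windows all three share one bucket.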

Another Example: Help Desk Chat System

Let’s apply our core question – “What does this system DO?” – to a different type of application: a help desk chat system. This question is crucial because it forces us to step back from the technical details and focus on the fundamental purpose of the system and the value it delivers to the business and ultimately the customer.

So, what does a help desk chat system do? At its most basic level, it allows communication between support staff and customers. But let’s break that down further:

  1. It enables sending messages
  2. It displays these messages to the participants
  3. It presents a list of ongoing conversations

Now, you might be tempted to say that sending messages is the primary function, and you’d be partly right. But remember, we’re thinking about what the system DOES, not just how it does it.

With this in mind, how might we monitor the health of such a system? While tracking successful message sends is important, it might not tell the whole story, especially if message volume is low. We should also consider monitoring:

  • Successful page loads for the conversation list (Are users able to see their ongoing chats?)
  • Successful loads of the message window (Can users access the core chat interface?)
  • Successful resolution rate (Are chats leading to solved problems?)

By expanding our monitoring beyond just message sending, we get a more comprehensive view of whether the system is truly doing what it’s meant to do: helping customers solve their problems efficiently.
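
As a rough illustration of monitoring several signals at once, the sketch below computes a success rate per signal and skips any signal with too little traffic to judge, which is the low-volume problem mentioned above. The signal names, the 95% threshold, and the minimum sample size are assumptions chosen for the example, not recommendations.

```python
def unhealthy_signals(counts, min_rate=0.95, min_samples=20):
    """Return the names of signals whose success rate is below min_rate.

    counts maps a signal name to a (successes, attempts) pair for the
    current window. Signals with fewer than min_samples attempts are
    skipped because a ratio over a handful of events is too noisy.
    """
    flagged = []
    for name, (successes, attempts) in counts.items():
        if attempts < min_samples:
            continue
        if successes / attempts < min_rate:
            flagged.append(name)
    return flagged


# Made-up counts for one window.
window_counts = {
    "message_send": (3, 4),                # too few messages to judge
    "conversation_list_load": (480, 500),  # 96% success
    "message_window_load": (450, 500),     # 90% success
}

print(unhealthy_signals(window_counts))
```

Here the message-send signal is ignored for lack of volume, while the page-load signals still have enough traffic to show that the message window is failing more often than expected.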

This example illustrates why it’s so important to always start with the question, “What does this system DO?” It guides us towards monitoring metrics that truly reflect the health and effectiveness of our product, rather than just its technical performance.

A 200 OK response is not always OK.

As you consider your own systems, always begin with this fundamental question. It will lead you to insights about what you should be measuring and how you can ensure your product is truly serving its purpose.

The Bottom-Up View: How Does Your Application Work?

While the top-down view focuses on the end result, the bottom-up approach looks at the internal workings of your application. This includes metrics such as:

  • HTTP requests (response time, response code)
  • Database calls (response time, success rate)

Modern systems often collect these metrics automatically through agent-based or auto-instrumented telemetry, reducing the need for custom instrumentation.

Prioritizing Alerts: When to Wake Someone Up at 3 AM

A critical aspect of product health monitoring is knowing when to escalate issues. Ask yourself: Should the Network Operations Center (NOC) call you at 3 AM if a server has 100% CPU usage?

The answer is no – not if there’s no business impact. If your core business functions (like processing orders) are unaffected, it’s better to wait until the next day to address the issue.
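
Expressed as a rule, the decision might look like the sketch below. The function name and the three outcome labels are hypothetical; the point is only that business impact, not the technical alarm itself, decides whether anyone gets woken up.

```python
def escalation(business_impacted, technical_alarm):
    """Decide how loudly to react to an alarm.

    business_impacted: True when a core business metric (for example,
    orders per window) is below its expected level.
    technical_alarm: True when an infrastructure signal such as CPU
    usage is firing.
    """
    if business_impacted:
        return "page"    # wake someone up
    if technical_alarm:
        return "ticket"  # deal with it the next working day
    return "none"


# 100% CPU at 3 AM, but orders are flowing normally.
print(escalation(business_impacted=False, technical_alarm=True))
```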

Using Loss as a Currency for Prioritization

Once you’ve established a health metric for your system and can compare current performance against your 4-week average, you gain a powerful tool: the ability to quantify “loss” during a production incident. This concept of loss can become a valuable currency in your decision-making process, especially when it comes to prioritizing issues and allocating resources.

Imagine your e-commerce platform typically processes 1000 orders per hour during a specific time window, based on your 4-week average. During an incident, this drops to 600 orders. You can now quantify your loss: 400 orders per hour. If you know your average order value, you can even translate this into a monetary figure. This quantification of loss becomes your currency for making critical decisions.
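
The arithmetic in this example is simple enough to sketch directly. The function names and the average order value of 50 are assumptions made up for the illustration; substitute your own figures.

```python
def orders_lost(baseline_per_hour, actual_per_hour, duration_hours=1.0):
    """Orders missed relative to the baseline over the incident duration.

    Performance above the baseline is treated as zero loss, not a
    negative loss.
    """
    return max(baseline_per_hour - actual_per_hour, 0) * duration_hours


def revenue_lost(lost_orders, average_order_value):
    """Translate lost orders into a monetary figure."""
    return lost_orders * average_order_value


lost = orders_lost(baseline_per_hour=1000, actual_per_hour=600)
print(lost)                                          # 400.0 orders per hour
print(revenue_lost(lost, average_order_value=50.0))  # 20000.0, assuming an average order value of 50
```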

With this loss quantified, you can now make more informed decisions about which issues to address first. This is where the concept of “loss as a currency” really comes into play. You can compare the impact of multiple ongoing issues, justify allocating more resources to high-impact problems, and make data-driven decisions about when it’s worth waking up engineers in the middle of the night.
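
Comparing several ongoing issues then becomes a matter of sorting by loss. The sketch below assumes each incident can be described by a baseline rate, a current rate, and a value per unit; the `Incident` class, its field names, and every number in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    baseline_per_hour: float  # expected rate from the historical average
    actual_per_hour: float    # rate observed during the incident
    value_per_unit: float     # assumed monetary value of one order, signup, etc.

    @property
    def loss_per_hour(self):
        """Monetary loss per hour while the incident continues."""
        shortfall = max(self.baseline_per_hour - self.actual_per_hour, 0)
        return shortfall * self.value_per_unit


def prioritize(incidents):
    """Order incidents from the most to the least costly per hour."""
    return sorted(incidents, key=lambda i: i.loss_per_hour, reverse=True)


ongoing = [
    Incident("search degraded", 5000, 4500, 2.0),
    Incident("newsletter signup down", 200, 0, 1.0),
    Incident("checkout errors", 1000, 600, 50.0),
]

for incident in prioritize(ongoing):
    print(incident.name, incident.loss_per_hour)
```

With these made-up figures the checkout problem comes first, even though the newsletter signup is completely down, because the hourly loss is what drives the ordering.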

Reid Hoffman, co-founder of LinkedIn, once said, “You won’t always know which fire to stamp out first. And if you try to put out every fire at once, you’ll only burn yourself out. That’s why entrepreneurs have to learn to let fires burn – and sometimes even very large fires.” This wisdom applies perfectly to our concept of using loss as a currency. Sometimes, you have to ask not which fire you should put out, but which fires you can afford to let burn. Your loss metric gives you a clear way to make these tough decisions.

This approach extends beyond just immediate incident response. You can use it to prioritize your backlog, make architectural decisions, or even guide your product roadmap. When you propose investments in system improvements or additional resources, you can now back these proposals with clear figures showing the potential loss you’re trying to mitigate, albeit with a pinch of crystal-ball gazing about how likely these incidents are to occur again.

By always thinking in terms of potential loss (or gain), you ensure that your team’s efforts are always aligned with what truly matters for your business and your users. You create a direct link between your technical decisions and your business outcomes, ensuring that every action you take is driving towards real, measurable impact.

Remember, the goal isn’t just to have systems that run smoothly from a technical perspective. It’s to have products that consistently deliver value to your users and meet your business objectives. Using loss as a currency helps you maintain this focus, even in the heat of incident response or the complexity of long-term planning.

In the end, this approach transforms the abstract concept of system health into a tangible, quantifiable metric that directly ties to your business’s bottom line.

Conclusion: A New Perspective on Product Health

As we’ve explored throughout this post, measuring product health goes far beyond monitoring code quality or individual system metrics. It requires a holistic approach that starts with a fundamental question: “What does our system DO?” This simple yet powerful query guides us toward understanding the true purpose of our products and how they deliver value to users.

By focusing on core business metrics that reflect this purpose, we can create dynamic monitoring systems that adapt to the natural ebbs and flows of our product usage. This approach, looking at performance in time windows compared to 4-week averages, allows us to catch significant issues without being overwhelmed by false alarms during slow periods.

Perhaps most importantly, we’ve introduced the concept of using “loss” as a currency for prioritization. This approach transforms abstract technical issues into tangible business impacts, allowing us to make informed decisions about where to focus our efforts. As Reid Hoffman wisely noted, we can’t put out every fire at once – we must learn which ones we can let burn. By quantifying the loss associated with each issue, we gain a powerful tool for making these crucial decisions.

This loss-as-currency mindset extends beyond incident response. It can guide our product roadmaps, inform our architectural decisions, and help us justify investments in system improvements. It creates a direct link between our technical work and our business outcomes, ensuring that every action we take drives towards real, measurable impact.

Remember, the ultimate goal isn’t just to have systems that run smoothly from a technical perspective. It’s to have products that consistently deliver value to our users and meet our business objectives.

As you apply these principles to your own systems, always start with that core question: “What does this system DO?” Let the answer guide your metrics, your monitoring, and your decision-making. In doing so, you’ll not only improve your product’s health but also ensure that your engineering efforts are always aligned with what truly matters for your business and your users.

No QA Environment!? Are You F’ING Crazy?

In the world of software development, we’ve long held onto the belief that a separate Quality Assurance (QA) or staging environment is essential for delivering reliable software. But what if I told you that this might not be the case anymore? Let’s explore why some modern development practices are challenging this conventional wisdom and how we can ensure quality without a dedicated QA environment.

Rethinking the Purpose of QA

Traditionally, QA environments have been used for various types of testing:

  • Integration Testing
  • Manual Testing (by developers)
  • Cross-browser Testing
  • Device Testing
  • Acceptance Testing
  • End-to-End Testing

But do we really need a separate environment for all of these? Let’s break it down.

The Pros and Cons of Mocks vs. End-to-End Testing

When we talk about testing, we often debate between using mocks and real systems. Both approaches have their merits and drawbacks.

Cons of Mocks

  • Need frequent updates to match new versions
  • May miss breaking changes that affect your system
  • Can’t guarantee full system compatibility

Cons of Real Systems (QA/Staging)

  • Not truly representative of production
  • Require maintenance
  • May lack proper alerting and monitoring
  • Often have less hardware, resulting in slower performance

As Cindy Sridharan, a testing engineer and blogger, puts it:

“I’m more and more convinced that staging environments are like mocks – at best a pale imitation of the genuine article and the worst form of confirmation bias. It’s still better than having nothing – but ‘works in staging’ is only one step better than ‘works on my machine’.”

Consumer-Driven Contract Testing: A Replacement for End-to-End Testing

Consumer-Driven Contract Testing (CDCT) is more than just a bridge between mocks and real systems – it’s a powerful approach that can effectively replace traditional end-to-end testing. This method allows for “distributed end-to-end tests” without the need for a full QA environment. Let’s explore how this process works in detail.

The CDCT Process

  1. Defining and Recording Pact Contracts
    • Consumers write tests that define their expectations of the provider’s API.
    • These tests generate “pacts” – JSON files that document the interactions between consumers and providers.
    • Pacts include details like HTTP method, path, headers, request body, and expected response.
  2. Using Mocks for Consumer-Side Testing
    • The generated pacts are used to create mock providers.
    • Consumers can now run their tests against these mocks, simulating the provider’s behavior.
    • This allows consumers to develop and test their code without needing the actual provider service.
  3. Publishing Contracts by API Consumers
    • Once generated and tested locally, these pact files are published to a shared location, often called a “Pact Broker”.
    • The Pact Broker serves as a central repository for all contracts in your system.
  4. Verifying Contracts in Provider Pipelines
    • Providers retrieve the relevant pacts from the Pact Broker.
    • They run these contracts against their actual implementation as part of their CI/CD pipeline.
    • This step ensures that the provider can meet all the expectations set by its consumers.
    • If a provider’s changes would break a consumer’s expectations, the pipeline fails, preventing the release of breaking changes.
  5. Continuous Verification
    • As both consumers and providers evolve, the process is repeated.
    • New or updated pacts are published and verified, ensuring ongoing compatibility.
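The steps above can be sketched end to end using only the Python standard library. To be clear, this is not the Pact library's API: the service names, the `/bookings/42` endpoint, and the helper functions are hypothetical, and serialising to JSON merely stands in for publishing to a Pact Broker. The point is to show one contract driving both the consumer's mock and the provider's verification.

```python
import json

# 1. The consumer records what it expects from the provider (a "pact").
pact = {
    "consumer": "booking-ui",   # hypothetical service names
    "provider": "booking-api",
    "interactions": [
        {
            "description": "fetch an existing booking",
            "request": {"method": "GET", "path": "/bookings/42"},
            "response": {"status": 200, "body": {"id": 42, "status": "confirmed"}},
        }
    ],
}


# 2. Consumer side: a mock provider that replays the recorded responses.
def mock_provider(pact, method, path):
    for interaction in pact["interactions"]:
        request = interaction["request"]
        if (request["method"], request["path"]) == (method, path):
            return interaction["response"]
    return {"status": 404, "body": None}


# 3. Publishing: serialising to JSON stands in for uploading to a broker.
published = json.dumps(pact)


# 4. Provider side: replay every interaction against the real handler.
def real_provider(method, path):
    """Stand-in for the provider's actual implementation."""
    if method == "GET" and path == "/bookings/42":
        return {"status": 200,
                "body": {"id": 42, "status": "confirmed", "currency": "EUR"}}
    return {"status": 404, "body": None}


def verify(pact, handler):
    """Return descriptions of failed interactions (empty list means compatible)."""
    failures = []
    for interaction in pact["interactions"]:
        request, expected = interaction["request"], interaction["response"]
        actual = handler(request["method"], request["path"])
        body = actual["body"] or {}
        # Extra provider fields are fine; missing or changed ones are not.
        body_ok = all(body.get(k) == v for k, v in expected["body"].items())
        if actual["status"] != expected["status"] or not body_ok:
            failures.append(interaction["description"])
    return failures


failures = verify(json.loads(published), real_provider)
print("contract verified" if not failures else f"breaking change: {failures}")
```

In a provider pipeline, a non-empty `failures` list would fail the build. Note that the extra `currency` field does not break the contract, because the consumer never said it depends on it.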

How CDCT Replaces End-to-End Testing

Consumer-Driven Contract Testing (CDCT) changes the testing process by enabling teams to conduct testing independently of other systems. This approach allows developers to use mocks for testing, eliminating the need for a fully integrated environment and providing fast feedback early in the development process.

The key advantage of CDCT lies in its solution to the stale mock problem. The same pact contract that generates the mock also publishes a test that verifies the assumptions made in the mock. This test is then run on the backend system, ensuring that the mock remains an accurate representation of the actual service behavior.

As systems grow in complexity, CDCT proves to be more scalable and maintainable than traditional end-to-end testing. It covers the same ground as end-to-end tests but in a more modular way, basing scenarios on real consumer requirements. This approach not only eliminates environment dependencies but also ensures that testing reflects actual use cases, making it a powerful replacement for traditional end-to-end testing in modern development practices.

In my opinion, you need end-to-end tests to verify that a feature works. But we know end-to-end tests are flaky, so Pact is the only viable solution I have found that gives you the best of both worlds.

Dark Launching: Enabling UAT in Production

Dark launching is a powerful technique that allows development teams to conduct User Acceptance Testing (UAT) directly in the production environment, effectively eliminating the need for a separate QA environment for this purpose. Let’s explore how this works and why it’s beneficial.

Dark launching, usually implemented with feature toggles (also called feature flags), involves deploying new features to production in a disabled state. These features can then be selectively enabled for specific users or groups, allowing for controlled testing in the real production environment.
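A minimal sketch of the idea, assuming a hypothetical in-memory `FeatureFlag` class and made-up user names; a real system would normally read flags from a configuration service or a feature-flag product rather than hard-coding them.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    name: str
    enabled_for_everyone: bool = False              # flipped at full launch
    allowed_users: set = field(default_factory=set)  # the dark-launch audience

    def is_enabled(self, user_id):
        return self.enabled_for_everyone or user_id in self.allowed_users


# The new code ships to production, but only selected users can see it.
new_checkout = FeatureFlag("new-checkout", allowed_users={"product-owner", "designer"})


def render_checkout(user_id):
    if new_checkout.is_enabled(user_id):
        return "new checkout"
    return "current checkout"


print(render_checkout("product-owner"))
print(render_checkout("customer-1"))
```

Releasing the feature to everyone is then a flag change (`enabled_for_everyone = True`) rather than a new deployment.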

By leveraging dark launching for UAT, development teams can confidently test new features in the most realistic environment possible – production itself. This approach not only removes the need for a separate QA environment but also provides more accurate testing results and faster time-to-market for new features. It’s a key practice in modern development that supports rapid iteration and high-quality software delivery.

But it takes me a long time to deploy to production, it’s much faster to deploy to QA, right?

Your production deployment should be as fast as your QA deployment; there’s no reason for it not to be. If it is slower, you usually have a CI pipeline that isn’t optimized. Your CI should take less than 10 minutes…

The Ten-Minute Build: A Development Practice from Extreme Programming

Kent Beck, in “Extreme Programming Explained,” introduces the concept of the Ten-Minute Build. This practice emphasizes the importance of being able to automatically build the whole system and run all tests in ten minutes or less. If the build takes longer than ten minutes, everyone stops working and optimizes it until it takes less.
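One lightweight way to keep an eye on this budget is to total the pipeline's stage durations and point at the slowest stage when the budget is exceeded. The sketch below is an assumption about how you might do that: the stage names and timings are invented, and in practice the numbers would come from your CI system.

```python
BUILD_BUDGET_SECONDS = 10 * 60  # the ten-minute budget


def check_build(stage_seconds, budget=BUILD_BUDGET_SECONDS):
    """Return (within_budget, total_seconds, slowest_stage)."""
    total = sum(stage_seconds.values())
    slowest = max(stage_seconds, key=stage_seconds.get)
    return total <= budget, total, slowest


# Hypothetical stage timings in seconds.
stages = {"compile": 95, "unit tests": 210, "contract tests": 120, "package": 260}

ok, total, slowest = check_build(stages)
if ok:
    print(f"Build took {total}s, within budget")
else:
    print(f"Build took {total}s, over budget; start with the '{slowest}' stage")
```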

He also says: “Practices should lower stress. An automated build becomes a stress reliever at crunch-times. ‘Did we make a mistake? Let’s just build and see’.”

But I didn’t write my tests yet, so I don’t want to go to production yet…

Test-First Development: Building Confidence for Production Releases

In the realm of modern software development, Test-First Development practices such as Behavior-Driven Development (BDD) and Acceptance Test-Driven Development (ATDD) have emerged as powerful tools for building confidence in code quality.

At its core, Test-First Development involves writing tests before writing the actual code. This might seem counterintuitive at first, but it offers several advantages. By defining the expected behavior upfront, developers gain a clear understanding of what the code needs to accomplish. This clarity helps in writing more focused, efficient code that directly addresses the requirements.
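As a small illustration of the order of work, the tests below are written first and describe the expected behaviour; the implementation follows and exists only to make them pass. The `order_total` function and its inputs are hypothetical examples, and plain `assert` statements are used to keep the sketch free of any test framework.

```python
# Step 1: write the tests first. They define what "done" means.
def test_sums_price_times_quantity():
    assert order_total([(10.0, 2), (5.0, 1)]) == 25.0


def test_empty_order_costs_nothing():
    assert order_total([]) == 0


# Step 2: write just enough code to make the tests pass.
def order_total(lines):
    """Total of an order given (unit_price, quantity) pairs."""
    return sum(price * quantity for price, quantity in lines)


# Step 3: run the tests; a failure here would raise AssertionError.
test_sums_price_times_quantity()
test_empty_order_costs_nothing()
print("all tests pass")
```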

The power of these Test-First Development practices lies in their ability to instill confidence in the code from the very beginning. As developers write code to pass these predefined tests, they’re essentially building in quality from the ground up. This approach shifts the focus from finding bugs after development to preventing them during development.

Embracing Test-First Development not only enhances your development process but also makes practices like dark launching safe for UAT.

When to Use (and Not Use) Dark Launching

Dark launching is great for:

  • Showing feature progress to designers or Product Owners
  • Allowing stakeholders to use incremental UI changes

However, it’s not suitable for manual testing. Your automated tests should give you confidence in your changes.

Addressing Cross-Browser Testing

Cross-browser testing can be handled through automation tools like Playwright or by using local environments for fine-tuning and inspection.

The Case for Eliminating QA Environments

What I find most commonly is engineers who can’t run their systems locally. If this is the case for you, in order to see your changes, you need to wait for a CI pipeline and deployment to QA. This means your inner loop of development includes CI, and this will slow you down A LOT.

Our goal is to make the inner loop of development fast. QA environments, in my experience, are a crutch that engineers use to support a broken local developer experience. Taking them away forces people to fix the local experience and keep their production pipeline lean and fast, both of which we want.

While it might be tempting to keep a QA environment “just in case,” this can lead to falling back into old habits.

Conclusion

Embracing modern development practices without a QA environment might seem daunting at first, but it can lead to faster, more reliable software delivery. By focusing on practices like consumer-driven contract testing, dark launching, and test-first development, teams can ensure quality without the overhead of maintaining a separate QA environment. Remember, as with any significant change, it requires commitment and a willingness to break old habits. But the rewards – in terms of efficiency, quality, and speed – can be substantial.