Traps to watch out for while using story point estimation

So far we’ve highlighted a few antipatterns of story point estimation that you should be aware of. We’re now concluding this series with a callout to the most common traps to watch out for while using this estimation style. 

Trap 1 – Assigning points to every activity

Many XP teams give points to all their activities – even those unrelated to the core activity of writing code. You’ll see them creating a 1-pointer for a meeting, a 2-pointer for coordinating with another team, and so on. This is problematic because not only are you tracking everything you do, but you are calling it velocity!

The real purpose of velocity, we reiterate, is to show how much your team can do in a given period of time. And this “how much” is essentially how much software you can build. At the end of the iteration, you should be able to sum up all the completed stories and say that the team can do 10 points (for example). You should then progress to discussing possible impediments, overcoming them, and ways to improve velocity. 

Tip: Keep points only for writing code

Velocity must strictly be about software and writing code – never about how many meetings you attended or emails you sent, even if those are important activities. If you give points to all the activities done in your 8 working hours, the velocity will only highlight the fact that you put in 8 hours a day or 40 hours a week. There will be no insights into how much software you’re able to build in that period, thus obfuscating the purpose and value of velocity. 
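To make the tip concrete, here is a minimal sketch in Python (the work items and point values are made up for illustration) of velocity counted the way we recommend – only completed stories, the items that represent code, contribute to the sum.

```python
# Hypothetical completed work items for one iteration; only stories carry points.
completed_items = [
    {"title": "User can reset password", "type": "story", "points": 2},
    {"title": "Paginate search results", "type": "story", "points": 1},
    {"title": "Sprint planning meeting", "type": "task", "points": 1},   # should never be pointed
    {"title": "Coordinate with ops team", "type": "task", "points": 2},  # should never be pointed
]

# Velocity counts only the stories, i.e., the software that was built.
velocity = sum(item["points"] for item in completed_items if item["type"] == "story")
print(velocity)  # 3, not 6 - meetings and coordination don't inflate velocity
```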

Trap 2 – Accommodating QA estimates along with dev estimates

At some point, even mature XP teams find themselves in a scenario where someone engaged in an activity that supports development, most commonly QA, questions how their time requirement is accommodated in the overall estimation. The QA may be looking at a story that’s complex to test even if it’s simple to develop. They know they will need more time than what’s been estimated and raise a flag. They may ask for a larger size for that story, and denying them that could lead to ill will in the team.

In such a case, should the devs do one estimate and the QAs do another and you add it all up? No! That’s the worst way out, because this will instantly kill any relativity you may want to maintain in your backlog. 

Tips: 

Consider only the dev estimates 

The Theory of Constraints advises estimating the bottleneck, which is typically writing code. What a PM or a BA or a QA does is valuable, but it’s only writing code that should figure in your estimation. You can assume everything else will be proportional to that. A story that’s larger to build will naturally be more difficult to analyze or test or deploy into production. So, keep it simple, and estimate only the development. 

Treat exceptions as just that, but investigate trends 

However, your backlog can have stories that need to be treated differently. A story may require complex analysis even if its algorithm or logic is simple to implement. Or there could be a type of story that’s quick to do, but its impact spans several different places and testing will be arduous. Our advice here is not to worry about such exceptions. Size is based on pure complexity, and your velocity will simply reflect that the story took longer even though it was not very complex.

However, if you see many stories that have disproportionate effort compared to complexity, then it needs attention. Ask yourself questions like: Is there an opportunity for automation in the face of multiple manual changes? Can we restructure the code such that we follow the DRY (Don’t Repeat Yourself) principle? Investigate such patterns thoroughly and address them at the right time.

Adjust capacities of the team to meet exceptional challenges

If you are building an application that needs to work across multiple device sizes and browsers, then, potentially, testing can be more complex than development. Some teams may argue that their constraint is testing rather than development and a way to resolve that would be to estimate the testing effort and assume that development will be proportional. 

While this is theoretically the correct response, it might be an unfamiliar and difficult approach for many. Instead, adjust the capacities of roles within the team. In the above example, you could add more testers to take care of the various devices and browsers, so that development complexity remains the constraint – because that is what people are familiar with.

Trap 3 – Spiking with no limits 

Spikes are small experiments you do when you don’t know the answer to something or how to implement something. For example, you’re integrating with a third-party component and you know so little about it that you can’t even estimate the time it will take. To arrive at that estimate, you may want to first play around with that component. That is spiking. Spikes have an important role to play in the project lifecycle. But it’s critical to not let them turn into endless rabbit holes.

Tips: 

Time-box a spike 

Manage spikes by time-boxing them. Allocate a definite time period, and at the end of it, you should know what to do next. Since at least one of your developers will be working on that spike for that period instead of on a production story, take that capacity away from your velocity and make adjustments. Velocity commitments for that particular iteration could be lower than usual because of the spike.

Assign a size to a spike 

You can also use points. We’ve worked with the rule of thumb that any spike should be a Small. While it is cheating a little, because we are in a way associating size with a time expectation, it does help limit the time people spend on that spike.

Trap 4 – Creating stories for supporting activities

An extension of the above trap is that some teams create a separate story for activities that will eat up a developer’s time – such as technical analysis or proving out the best strategy with testers. All this is time consuming, but it’s distinct from writing production code.

Tip: Call them tasks, not stories

We insist that stories be used only to represent code that’s getting built. That must be the sole definition of a user story. Call anything else that requires a developer’s time a “task”, and call out that the team will be short of capacity in that particular iteration.

Trap 5 – Creating 0-pointers

We’ve seen previously that as you become familiar with the work and efficient at completing stories, you subconsciously start downsizing them. This behavior can leave you with 0-point stories (downsized from what was previously a 1-pointer). Despite its seemingly negligible size, such a story still requires time and effort. But you’re discounting the velocity for it because it seems too small to even count. This leads to a tricky situation: your productivity has improved because you are getting better at your work, but your velocity does not reflect that.

Tip: Keep in sight the original sizes you started out with

Keep a reference set of stories that carry the original sizes from your first estimation exercise. Compare your current set to those and then bucket them into the correct relative sizes. This will prevent random 0-pointers – stories you assume require no time, but which still take effort that should count toward velocity.

Trap 6 – Making the velocity math work…somehow

Often you may convince yourself that you will protect your velocity commitment no matter what; that you will sum up all the activities that add up to that velocity and somehow make the math work. What that really translates into is that you’re shying away from critical conversations. If you find yourself spending more time today doing tech analysis, for example, it could be that complex features are the trend going forward. You then need to talk about addressing that trend by adding developers or calling in relevant experts if required. 

Tip: Engage in the right conversations at the right time

Surface the issue for the whole team and the stakeholders so you’ll get more options. Even if the options are not viable, at least there will be clarity about the nature of work that goes into building your software right. At all costs, refrain from contrived efforts to hit the velocity numbers. Doing so hides the true state of the project and will prove detrimental.

The key takeaway

Be vigilant at all stages of the project in order to catch downward trends. Agile strives to uphold transparency, clarity and visibility in the project and your way of working. Having meaningful conversations as needed will ensure these attributes. This will go a long way in successfully forging a solid relationship – strongly rooted in trust and reliability – with your client stakeholders.

Estimation antipattern – Losing relativity in the backlog

In this article, we’re going to explain a common phenomenon we’ve observed: losing relativity of the sizes of the stories in your backlog. This is an antipattern that can leave your team feeling pressured, because despite your productivity having improved, the numbers do not show it – which raises concern among stakeholders.

What is this antipattern?

If you’re on a long-running project that spans multiple months or even years, your team gains familiarity with the tech landscape, the domain, the environment in which they work, and the actual work they are doing. Over time, your ability to get work done improves. However, the velocity does not reflect that (and we’ll get into why). There is no data to match your feeling of getting more productive or effective. The velocity numbers stay flat, dip, or oscillate.

The root of this antipattern lies in incorrect sizing. The current sizes are not in sync with the relative sizes that you originally came up with, say, 6 months ago, and that is why the velocity doesn’t reflect your improved productivity.

Explaining it with numbers 

Let’s do some math here. Six months ago, you could complete 10 “medium” sized stories (2-pointers) in one iteration, so your velocity was 20 points. Now, say you’ve become twice as productive at this work, so you are able to do 20 “medium” stories in an iteration. But because of your familiarity with the application, you size these “medium” stories as “small” (1-point) stories. The net effect is that your velocity remains 20 points.
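Here is the same arithmetic written out as a small Python sketch (sizes follow the example above: small = 1 point, medium = 2 points):

```python
# Six months ago: 10 genuinely "medium" stories per iteration.
velocity_then = 10 * 2              # 20 points

# Now: twice as productive, so 20 comparable stories per iteration...
velocity_downsized = 20 * 1         # ...but re-sized as "small" -> still 20 points
velocity_relative = 20 * 2          # sized against the original baseline -> 40 points

print(velocity_then, velocity_downsized, velocity_relative)  # 20 20 40
```

Only the last number shows the doubling in productivity.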

The velocity doesn’t show your improvement because your story sizes are not relative to what you started out with. 

Understanding using an analogy 

The analogy of a pizza works well to understand how we end up messing with sizing. As a child, you were barely able to finish a small pizza. As a teenager, you could easily devour an entire medium-sized pizza by yourself. Did the pizza size change? No! The 6-inch pizza remained the same. What changed was your appetite and ability to eat pizza. 

Connecting that to stories, the size of the story remains the same from what you’d estimated 6 months ago. Now, it seems smaller because of your enhanced ability. In your mind, a 2-pointer now feels more like a 1-pointer. 

The “How big” and “How fast” get mixed up, leading to the antipattern

To understand the “why”, we’ll revisit the process of story point estimation. Time-based estimates go wrong because they ask one compound – and incorrect – question: “How long will this take?” Story points replace that with 2 meaningful questions: “How big is this task?” and “How fast can it be done?” Each question is then answered independently.

The errors in sizing, and the loss of relativity between sizes, occur because teams start to mix up these two questions. Something seems smaller (and easier) than before, not because it is indeed smaller, but because you’ve got better at doing it. The distinction between the two questions, and the objectivity of each, are lost.

This mixing up of questions happens when teams begin to consider only the current set of stories, or the most recent ones, during the sizing exercise. Stories done in the past are ignored. The real meaning of a 1-pointer is lost. It had a certain definition 6 months ago, but that changed over time, leading to this antipattern. And when clients do not see any improvement in the velocity numbers, they get restless: “You’ve been on this project for such a long time. What do you have to show for it? How do we get to see that your productivity has improved?” As you grapple with these questions, you find yourself stuck in a whirlpool of pressure and self-doubt – the very negative behaviors that story point estimation is supposed to avoid.

Ways to avoid this antipattern 

Maintain a set of reference stories as examples of what Small, Medium, or Large stories look like. That way, when you get a new set of stories to estimate, you will not lose sight of the older, original sizes you set out with. You will avoid downsizing and your velocity, too, will reflect the correct picture. The improvement in your productivity will get its due visibility and appreciation.   

This practice leverages a key benefit of story sizing, i.e., speed. Comparing your new stories to a reference and then bucketing them is a quicker process because you don’t reinvent the wheel of what a 1-pointer is with each new set. You could change these reference stories as needed, so long as you are diligent that they match the original sizes. 
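One way to make the reference set tangible is to keep it alongside the backlog and allow only its buckets during estimation. The sketch below is a minimal illustration with made-up story names and buckets; the actual comparison remains a human judgement made in the estimation session.

```python
# Reference stories from the first estimation exercise (made-up examples).
REFERENCE_STORIES = {
    1: "Add a 'forgot password' link to the login page",  # Small
    2: "Show order history filtered by date range",       # Medium
    4: "Integrate the new payment gateway (happy path)",  # Large
}

backlog_sizes = {}

def size_story(story, points):
    """Record a size only if it maps onto one of the original reference buckets."""
    if points not in REFERENCE_STORIES:
        raise ValueError(
            f"{points} is not a bucket; compare '{story}' against the reference stories"
        )
    backlog_sizes[story] = points

size_story("Export invoices as CSV", 2)  # OK: about as big as the Medium reference
# size_story("Fix typo in footer", 0)    # rejected: 0-pointers are not a bucket
```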

Display the reference stories prominently in the team area on a wall. Make it a habit to carry out estimation around that wall. This will also work well while onboarding new members into the team’s estimation exercise. All they will need to do is understand those reference stories and their sizes well enough to start contributing immediately and productively to estimation. 

With reference stories, you will put into place a sure-fire method to eliminate errors in sizing and to maintain the relativity of the stories in your backlog. By displaying them such that they are the focus of every estimation exercise, you will establish a best practice to avoid this antipattern.

Estimation antipattern – Comparing teams’ velocities

In this article, we’re going to examine another antipattern around story point estimation. Last time we discussed the fallout of using story points as targets. Now, we caution you against using story points to compare teams. It’s a common enough trap, but we must hold ourselves back. By the end of this article, you’ll know why such comparison is a meaningless exercise. 

What’s the comparison about?

To put it simply, using story points to compare teams has accusatory implications: “Hey, this other team is doing more points per iteration than your team, so that makes them better.” So Team A could be doing 20 points and Team B could be completing 12 points, and because Team A’s velocity is greater, it’s deemed the better team. When put like that, we will naturally witness Team B desperately trying to outdo Team A, or at least match its velocity. End result: pressure builds up, the environment can get ugly, team members may resort to workarounds to meet the numbers, and little thought will go toward what should really be in focus – the quality of the software that’s going into production.

Here’s why such comparison is meaningless 

Despite the exercise having no real meaning or value, we still find many teams engaging in it during story point estimation. We reiterate that such thinking, i.e., comparing teams on the basis of completed points, is totally flawed, and here are our reasons for taking this stance: 

Every team is unique

Each team has its own distinct personality and comes with diverse skill sets and levels of experience. No two teams are alike in their interpersonal and working relationships – how well they gel together, how effectively they collaborate, and so on. Then there are further factors, such as whether they have dependencies and the environment in which they work.

All these factors contribute to the uniqueness of each team. Hence it is completely unfair and impractical to compare velocities: it would be impossible to create two teams that have the exact same skills and personalities and do the same kind of work in the same environment.

Velocity is a reflection of many things happening within a team

As we saw above, each team comes with its unique composition, skills, experience, and dynamics. All of these have a bearing on how long that team will take to complete a story and how much they can pack into one iteration. This output will vary across teams, and there is no ground for comparison.

The understanding of a point is not common across teams

The basis of comparison is itself skewed. As we saw earlier, the comparison takes the form: Team A completed 20 story points, but Team B could complete only 12 in the same duration. What we fail to realize is that each team’s understanding of what makes up 1 story point is limited to that team, and it will differ across teams. This understanding is not standard; it’s relative to each team. What Team A considers to be 1 point can be very different from the perspective that Team B takes. Note that 1 point is just a numeric representation of a “small” story, and 4 points is just a numeric representation of a large story. But the bucket of what counts as a small story is only relevant and meaningful within the team. How then can we compare performance and efficiency on the basis of the number of points completed?

A standard definition of 1 point doesn’t work

To counter the above problem, some folks go a step further and say everyone must agree upon a standard definition of 1 point, so everyone can align the sizes of stories. While this sounds reasonable, it is very difficult to achieve in practice. The cost of the bureaucracy required to enforce it – especially on a large program or across the entire organisation – is hardly worth the ability to compare teams’ velocities.

The nature of each team’s work can be different

So let’s say there are two teams in the organization that are doing fundamentally different kinds of work. One is building some APIs and the other is building a mobile app. It’s a no-brainer that the tech landscape and the kinds of complexities faced by each team are extremely different. In such a scenario, comparing how fast they are completing their points is fundamentally meaningless!

Similar work cannot be compared either

Let’s look at cases where teams are involved in similar work – say both teams are building on top of a common API. One is building an Android app and the other an iOS app. It may seem like they are doing similar work, but even then, it’s not so straightforward that we can begin comparing their velocities. That’s because contributing factors such as environment, skills, etc. remain different.

Let’s look at a typical development scenario where two teams are working on the same backlog – one is onshore, the other is offshore. On one hand, we can safely assume that the onshore team will go faster because the Product Owners are sitting next to them, providing quick answers and feedback. On the other hand, this same situation may work to their detriment. The team members may find themselves getting distracted because the client stakeholders are right there. The offshore team doesn’t face that pressure and may be able to focus more effectively and actually end up with a higher velocity. But can we truly conclude that they are more efficient? We cannot, because the comparison itself has no real meaning.

How to avoid the antipattern of comparing teams using story points

The above reasons illustrate the futility of such a comparison. That’s why we strongly recommend against comparing teams: saying one team is better than another is not, by itself, a valuable exercise. But if you have to compare, or you want to, then track the trends of each team, such as the level of predictability, frequency of releases, or the business impact they create.

Predictability

Have the teams set up short-term goals or iteration-level targets and then track how often each team is able to meet its targets. The targets can vary across iterations, depending on factors like yesterday’s weather. But if a team says on a Monday that in the next two-week-long iteration, they will complete 10 points, then how often do they actually achieve that completion? Or how close do they get to it every time? This level of predictability can be compared, if at all one really has to.
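If you do compare predictability, the measure is each team against its own commitments, not one team’s points against another’s. A minimal sketch, with made-up committed/completed pairs per iteration:

```python
# (committed, completed) points per iteration - hypothetical numbers.
team_a = [(10, 10), (12, 11), (10, 10), (8, 8), (10, 9)]
team_b = [(20, 14), (18, 18), (20, 12), (16, 16), (20, 15)]

def predictability(iterations):
    """Fraction of iterations in which the team met or beat its own commitment."""
    met = sum(1 for committed, completed in iterations if completed >= committed)
    return met / len(iterations)

print(f"Team A: {predictability(team_a):.0%}")  # 60% - small, rare misses
print(f"Team B: {predictability(team_b):.0%}")  # 40% - bigger commitments, bigger swings
```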

Frequency of releases

The more frequently the team is putting software into production and into the hands of real users, the more value they are creating. So you can compare how frequently each team is able to create such value.

Business Impact

Mature XP teams don’t even bother with intermediate output; they are more concerned with the outcome, i.e., the business impact the team is able to create. The way you do that is to have the team sign up for key OKRs. These could be key business results they want to achieve such as improving customer conversion or reducing churn in customers. That’s the business objective they work toward. And whether they deliver software to achieve that or do something else – that’s entirely up to the team.

This extreme practice of measuring only business outcomes may not be feasible in every software development scenario, but the takeaway is to not measure just output; measure these more meaningful aspects instead.

However, it’s again important to bear in mind that you shouldn’t be comparing on parameters such as the revenue each team is able to generate. That’s because each team functions in a different context and could have distinct strategies. One team’s focus could be revenue generation, another team could be driving consumer engagement–which again eliminates any common ground for comparison. So, even if we’re comparing on the parameter of business impact, the OKRs are not to be compared. How far each team is able to achieve its OKRs is more important. 

By now, you’ve seen the various reasons to avoid comparing teams. Comparing teams on velocity and the points they’ve delivered has absolutely no meaning. If you do have to compare, or want to, then you can certainly find ways to make the comparison more relevant and meaningful. The outcomes of such comparisons have more value.

How Story Points Make Our Life Better

The entire series on estimation has focused on the inherent problems of time-based estimates. We introduced and recommended story points as a more viable option to time estimates. In this series finale article, we look at story point estimation more closely and understand how it resolves the issues that crop up with time estimates. 

Creates confidence by asking the right questions

A prime concern with time estimates is that they are almost always wrong. That’s because you’re asking too complex a question–combining two aspects, viz., how big something is and how fast you can go. Story point estimation makes things easier at the very outset by separating these two questions. 

Further, it simplifies the “How big?” question by asking how big the stories are relative to a benchmark we’ve set, which is typically the smallest story in the backlog. This process of Relative Sizing – allocating a size to a story by comparing it with a standard – can be completed quickly. In addition to speed, this process generates a high level of confidence in the sizes. In fact, we need not call them estimates at all; they are definite sizes. There is no scope for ambiguity and there is no need to revisit or change the sizes.

The second of the 2 sub-questions that story points ask is “How fast can you go?”. We admit upfront that nobody really knows the answer. So we move on to the next best way to approach this question and that’s through informed and logical guesswork–using the exercise of Raw Velocity. This involves multiple developers picking up multiple and diverse items from the backlog with their estimates hidden and stating which ones they can finish in a given time period (an iteration). Then we add up the estimates for the items that were picked up to find the “gut feel” of the story points that can be completed in an iteration. We do this math over several rounds with each of the developers. The average derived thus reflects a gut feeling of the majority of the team regarding the time required to complete the given backlog. 

While it’s accepted as guesswork, these numbers evoke a higher level of confidence because the developers have answered a comparatively easier question–regarding their own capabilities and not that of someone else. Hence the answer is likely to be accurate. 

Moreover, we use the Raw Velocity numbers only for a short duration – during the first couple of iterations. After those initial iterations, we are in a position to make more informed velocity calculations because we have real data from the real work done during the first few iterations. Going forward, we use our judgement based on this data and not on the guesswork we started out with. Thus the answer to “How fast?” is now rooted in accuracy and we can proceed with confidence and conviction in our sign-up for the subsequent iteration.

Story point estimation, thus, instills confidence by asking the right questions. The responses to these questions leave no room for confusion or doubt. From one question we get definite sizes and the response to the other is grounded in real data. We can successfully avoid the feeling of being wrong–which is typically what happens in time estimation. On the contrary, we are now confident and convinced about how much the team can do especially in the short term.

Eliminates pressure

Time-based estimation also creates pressure–at the 2 stages of estimation and execution. Let’s dive deep…

During estimation

There are 2 problem-creating scenarios with time estimates during the estimation
phase:

  • The compound question that kickstarts time estimation is itself a pressure point because you’ve mixed up 2 sub-questions. In an attempt to respond to that compound question, you try to cover all possible scenarios and come up with precise numbers. Too much time is spent on seeking too many details too early on. 
  • The other problem is that you have too many options to choose from in terms of time period. This wide range of choices significantly raises the probability of errors. You may end up either over- or under-estimating. In either case, you end up feeling pressured. Typically you will react to such pressure by being extra vigilant about every possible scenario or then by adding buffers to play it safe. 

Story point estimation has no scope for such problems. Firstly, it limits the number of size options to just 3 or 4. All we’re doing is picking up each story and comparing it to a sample set in terms of size. Once we’ve assigned a size, it’s frozen. We don’t need any more details; we don’t need to revisit the sizes. The issues–and consequent pressure–of time estimates do not even exist in the story points world.

During execution

In a time estimates scenario, you may realize that the size of the task is different from what was assumed during planning, or that the pace of work is getting affected by extraneous conditions. However, you’ve already committed to getting an X amount of work done in a given time-frame. You naturally feel pressured because time is running out. 

Story points avoid this pressure by setting the right expectations from the start. We are cognizant that Raw Velocity is a guess and those numbers will change in the face of real work. We take a cue from previously completed iterations and incorporate those learnings into Estimated Velocity. The guesswork progresses into informed estimation, which definitely has more value. Over time, this boosts the team’s confidence too. 

Replaces negative pressure with positive ambition

Using Relative Sizing and Velocity, we’ve eliminated the pressure of completing an X amount of work in a given period of time. With that pressure gone, what we experience is a positive emotion and the aspiration and ambition to strive for continuous delivery and improvement. 

Having sifted out questions that have no real meaning, we zero in on the most relevant ones in our effort to achieve continuous delivery – What should we take up in the next iteration? What will be of highest value to go next? With these questions, we strive to make the delivery process better, more productive and centered around excellence.

Empowers the team

Story points create a transition from individual performance and individual goals to team goals. The commitment now is to put out a set of features into production. The onus is on the team as a whole and so developers are likely to help each other and even help other roles to complete what has been started. There is no rush to start new items from the backlog. The emphasis is on completing what’s been started–in the right sequence, with the right quality, with mutual collaboration as a team. The debilitating competitive pressure gives way to a positive environment and a collective intention to achieve. The atmosphere is still charged with high intensity, but it’s energizing and collaborative.

Retains the focus on completion

At this point, it’s important to remember that story points work around a rule of averages, without any direct conversion of points into days. If we still did the conversion in our minds, we would lose some of the benefits intrinsic to this type of estimation. There can be no blanket rule that a story of X points will always get done in Y days. Conditions are fluid, and each story and each developer has their own pace. Accepting this removes some of the pressure of dates and time commitments faced by developers.

However, it’s imperative to be aware that doing away with the pressure of time commitments does not mean development meanders along at its own pace, or that there is no accountability around completion. Questions about progress and completion must still be asked, but with the intention of removing impediments and achieving progress. The ticking clock must not be allowed to assume nightmarish proportions. Developers should breathe easy and strive for excellence and continuous improvement instead of struggling for self-preservation.

Enables effective tracking

Time-based estimation poses a tracking hazard because we cannot segregate how much time was spent doing real development work and how much was lost in breaks or leaves. If we do attempt to track break and leave time, it would lead to micromanagement of the team, and we do not want to go there. 

Separately, with time estimates there is no systematic way to apply the learnings from previous work to what will be taken up next. For example, if one story gets delayed, we have no basis for predicting the completion time of the next story.

Story point estimation is able to counter such problems. Velocity is an average across multiple developers and multiple iterations. After the initial stories, the guesswork is replaced by insights gathered from real work and real data. We are in a good position to estimate how much time will be required to complete the next story, based on our learnings from previous iterations. This is valuable and it helps boost the team’s confidence in its own capabilities. 

Similarly, if we have visibility into who’s going to be away next week, we count them out of capacity and make adjustments accordingly. For example, if the full team’s velocity is 10, we might plan for only 8, knowing that there will be two fewer people on the team.
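As a rough sketch (assuming, purely for illustration, a team of ten in which two people are away for the whole iteration), the adjustment can be as simple as scaling the full-team velocity by the capacity that is actually available:

```python
FULL_TEAM = 10          # people
FULL_VELOCITY = 10      # points per iteration with everyone present

def planning_velocity(people_away):
    """Scale the full-team velocity by the fraction of the team that is available."""
    available = FULL_TEAM - people_away
    return FULL_VELOCITY * available / FULL_TEAM

print(planning_velocity(2))  # 8.0 - plan for 8 points, as in the example above
```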

Clarity that results from learning from previous iterations or from accounting for leaves and breaks prevents us from erroneously concluding that the team hasn’t been working hard enough if our targets aren’t met. The team’s morale isn’t affected adversely and we stay focused on guiding the next iteration in the correct way, using practices that can ensure accomplishment and achievement as a team. 

Targets continuous delivery even if the project is lagging

Despite the best intentions and care, the project can go off track. Imagine that it’s not possible to meet the target of 10 points per iteration. At such a time, story point estimation adopts a rational problem-solving approach instead of indulging in the blame game. Since XP’s version of the Triple Constraints Triangle keeps Scope variable, we can proceed by going live with the most important, prioritized stories on the committed date. The rest of the features can get added over the next few iterations.

This ensures that we deliver on the committed date, even if it’s not the entire bulk of scheduled deliverables. This practice of Continuous Delivery instills confidence in the client stakeholders that there is no major delay harming the project. We’ve ensured that the most relevant features are already in production by the half-way mark, and what didn’t get completed will be delivered in the next few iterations, which will be just a couple of weeks out. We’ve succeeded in derisking the project to a large extent quite early in the game. 

Story point estimation makes life better

At the end of this series, we’ve firmly established that story point estimation and the XP way of planning makes estimation easier and more accurate. It creates the right atmosphere for individuals and the team as a whole. Software quality and excellence remain the star of the show throughout. Harmful after-effects and negative behavior patterns arising out of pressure find no place in this process. Stakeholders also experience a high level of confidence as they get a clear view into the project’s progress and can take efficient and timely decisions in response to possible changes. Story point estimation is a win-win for all parties, and it’s the only effective method for project estimation and planning. 

The XP Game

Let’s begin this article with a recap of what we’ve covered. We started out with the argument against time-based estimates. We then sought to answer the question “How big is this task?” using story points and relative sizing. Next, we attempted to meaningfully respond to “How fast can we go?” and explored the concept of velocity in that context. Now, we move on to “When will all of this get done?”

The question is not meaningful

While we may be eager to know the answer, the question “When will all of this get done?” is not meaningful and needs to be rephrased for 2 reasons:

  1. While Raw Velocity is a logical starting point to estimate how fast you will go, the averages derived from it remain guesswork. It offers us a reasonable starting point that must be periodically revisited once actual work commences. The numbers are bound to change in real work situations and that presents a challenge. To illustrate this: If your scope is 200 points and the Raw Velocity is 10 points per iteration, then it’s safe to say it will take 20 iterations to complete the entire scope. But once you start working, you may realize that this 10 itself is going to change–it may come down to 8 or 5, or even go up to 12. This fluctuating number affects the response to “When will all of this get done?”
  2. Moreover, the “all” in question, i.e., the scope, is also variable and therein lies the second challenge. You may have estimated 200 points as scope while initiating the project. But dynamic situations in the market or the organisation can impact this number. The scope is very likely to change over the course of the project and it’s prudent to proceed with that awareness.

So, what we know is that neither the scope nor the velocity will remain constant. Hence, the question “When will all of this get done?” has no real value. 
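A tiny sketch of that instability: the 200-point scope and 10-point Raw Velocity come from the example above, while the lower velocity and grown scope are hypothetical. The projected finish swings as soon as either number moves.

```python
import math

def iterations_to_finish(scope_points, velocity):
    """How many iterations the remaining scope would take at a given velocity."""
    return math.ceil(scope_points / velocity)

print(iterations_to_finish(200, 10))  # 20 - the Raw Velocity guess
print(iterations_to_finish(200, 8))   # 25 - velocity settles lower than guessed
print(iterations_to_finish(240, 12))  # 20 - scope grows, but the team speeds up
```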

What, then, should we be really asking?

The Triple Constraints Triangle

Before we frame the right question, let’s take a step back to understand the oft-cited paradigm of project management – the Triple Constraints Triangle – and how XP connects with it. Scope, Time and Cost are the 3 points of this triangle, with Quality as the central, overarching theme. What happens if we arrange these 3 constraints into different permutations? 

  • Flexible time, fixed scope and cost – This is not advisable because we could spend years building out “all” the (ever increasing) scope we can think of, without being able to reap the benefits of the software we are building. This has been the fate of many a traditional software project.
  • Flexible cost, fixed scope and time – Keeping cost flexible essentially means agreeing to vary the team size in response to variations in scope or velocity. However, growing and shrinking the team is neither feasible nor useful. One of the most important ingredients of a successful software team is the context and knowledge it builds over time about the problem at hand and the solution. A team that’s constantly changing its composition will not be able to take advantage of that.
  • Everything fixed – The biggest casualty in this scenario would be quality because if everything is fixed, quality is the only thing that can change and it will change. 

The XP view of Triple Constraints Triangle

The third permutation is typical of traditional software development. XP, however, takes a different view. We advocate keeping quality fixed at the highest possible level, freezing time and cost, and keeping scope as the only variable. Freezing the team size (i.e., the cost) follows the age-old wisdom that adding people to a late software project makes it later. Time must be frozen too: commit to yourself and to your stakeholders that you will go live by a certain date.

However, be mindful that the concept of “freezing time” in XP refers to freezing time in multiple small chunks or iterations. At the beginning of each iteration, ask yourself how much you will get done in that period. And at the end of each iteration, be sure to deliver something into production.

The meaningful questions to ask

Having understood and accepted this XP-variant of the Triple Constraints Triangle, we frame the right and relevant questions that you should really be concerned about: 

  1. “How much are you able to do in each iteration?”
  2. “What should you do in the next iteration?”

Now the focus shifts to prioritization. Instead of debating over the timeframe for the entire project completion and then scrambling about to achieve that, you are now asking what should be completed next. You are systematically breaking down “all” the scope into small releases and iterations. By asking yourself the above 2 questions regularly and frequently, say every 2 weeks, you nail down a priority list. This also helps in appropriately spreading work out over iterations. 

The correct questions reduce pressure

The exercise of Raw Velocity–as we saw in previous articles–provides a good starting point. Since it’s an average derived from multiple developers across multiple stories, it instills some confidence even if it’s guesswork. As you progress with these numbers, the guesswork gets replaced by real numbers derived from real data. The first few iterations give a more realistic estimate of how many story points can be completed per iteration, and this Actual Velocity–based on real experience–guides future iterations. 

Having understood how much can get done in each iteration, you can now prioritize what to take up in the next iteration. For example, if the last 3 iterations showed you that it’s possible to complete 10 points per iteration (that’s the response to our first question), you can then decide which 10 points to take up next (the response to the second question). Such prioritization helps you move the most important features into production first. And this momentum of continuous delivery ensures the project doesn’t lag terribly or go completely off track.
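A minimal sketch of the two questions answered together, with made-up story names and sizes: yesterday’s weather gives the capacity, and the prioritized backlog is filled up to that capacity.

```python
# "How much can we do?" - yesterday's weather from the last three iterations.
last_three = [9, 11, 10]
capacity = sum(last_three) // len(last_three)   # 10 points

# "What should we do next?" - backlog already ordered by business priority.
backlog = [("Checkout flow", 4), ("Order tracking", 2), ("Saved addresses", 2),
           ("Wishlist", 2), ("Gift wrapping", 1)]

next_iteration, used = [], 0
for story, size in backlog:
    if used + size <= capacity:
        next_iteration.append(story)
        used += size

print(capacity, next_iteration)
# 10 ['Checkout flow', 'Order tracking', 'Saved addresses', 'Wishlist']
```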

Here’s a practical scenario borrowed from the real world to illustrate the above: 

We’d said we’d go live with a set of features in 3 months. At the end of 3 months, we’ve built 80% of the features. It’s imperative to now go live with that 80%. The good part is that because we have effectively prioritized what to take up in every iteration, we have covered the most important aspects of those features within that 80% and within the estimated 3 months. The remaining 20% will get covered in the next few weeks. The client doesn’t have to choose between going live with everything or nothing. They are reassured that the most important 80% is out there already and the remaining pieces will follow shortly. Likewise, the team doesn’t feel pressured about performance or productivity, keeping the environment positive and collaborative.

Prioritization enhances outcome

We reiterate that such prioritization and focus on “What to do in the next iteration?” also ensures that you take the most valuable pieces into production first. What needs to get delivered on priority gets picked up on priority. When you’re thinking about what should go into production next, here are 2 useful approaches: 

  1. Learn from software that’s already in production. For example, you previously built a feature to increase your customer conversion rate. Now, you may want to evaluate the progress of that metric and assess how it can be made more effective. So you pick that up as the next piece to deliver, based on what’s already been done.
  2. Get early feedback from real users and use that as a guide. Build prototypes to carry out user testing with real users – even before you write a single line of code. Analyze the results to determine what your users really want and schedule that as your next to-do item.

It’s important to note that we are making space for new scope. We’ve accepted that scope will change and we welcome that. Based on yesterday’s weather – the average of actual velocity over the last few iterations – we can go back and identify what to pick up next. We may even end up identifying new features to build in the next iteration based on user or market response. Our process of prioritization and our velocity numbers give us room to be flexible, to deliver what is critical, and to remain responsive to what users want or what the market needs. In a nutshell, we’ve ensured the outcome is truly effective and efficient.

Summary

Let’s pause at this point for a quick review. 

“When will all of this get done?” is really not the question you should fuss about. Instead, you should try to answer the question “How much are we actually able to do every iteration?” and then concentrate on prioritising the most valuable functionality at the beginning of each iteration. We recommend keeping scope flexible, while freezing time, cost, and most importantly, quality. Freezing time in XP refers to the concept of iterations – with the emphasis being on continuous release into production at the boundary of each iteration, or even more frequently if possible. This practice of Continuous Delivery works hand in hand with Continuous Discovery – as we investigate and assess the most valuable thing to take up next. And to do that right, we either learn from software that’s already in production or use feedback from user testing.

Going forward, we will see how this XP perspective of software development and project management successfully overcomes many of the typical hurdles that are common in the traditional development space that uses time-based estimates. 

Answering the question “How fast will we go?”

To corroborate our advocacy of story points over time-based estimates, our last article recommended the use of Relative Sizing to answer the question “How big is this task?” We ended that piece by highlighting how the sizing process provides a good grip on the project’s scope. We now move ahead in the estimation and planning journey to the next question “How fast can it be done?”

“How fast?” has no concrete answer

The stark truth is that the question “How fast?” cannot really have a correct and definitive answer. The only honest answer is “I don’t know.” Many factors contribute to this ambiguity: 

  1. Team composition – How fast the team can get something done depends directly on who is on the team, their experience with the specific technology, and their overall exposure and experience within the given domain.
  2. Team dynamics – However, it’s not just about putting the right team together and ticking off the technical requirements. It’s important for the team to collaborate efficiently and harmoniously. They must be able to work well together in order to get things done. 
  3. Environmental factors – The ecosystem in which the team will work also affects their speed. For example, will they use fast machines, fast internet, and fast servers that serve data quickly? Or will it be virtual networking and logging on to remote terminals that can compromise the pace of work? Even if the team has ideal experience and exposure levels, such environmental factors will have an effect. 
  4. Business stakeholders’ response time – This impacts task completion in a big way. Are clients available to answer your questions as soon as required? Can anything in their work environment lead to deprioritization of your questions? Maybe they don’t have the answers ready and need to conduct research or ask around…delaying their response time and consequently the team’s progress. 

Answering the “How fast?” question helps planning

Given the above unknowns, the only honest answer remains “I don’t know.” Any claim to a precise response should raise suspicion because nobody can really know.

If not knowing is the only thing we really know, do we still need to even address this question? And the answer is yes, we do, because its response guides the important activity of planning. Planning is a must because it gives us a chance to uncover assumptions, become aware of risks, and get a clearer picture of the reality of the project. Of course, with the caveat that the response is going to be based on guesswork. 

Keep the guesswork logical

Within this framework, we must be as logical as possible while estimating how fast we will go. That can be achieved by having more people guess how fast they can get through various items in your backlog. Ask them about different stories, types of technologies, the UI, database, services, and the interplay between these. Gather all these responses and average them out. That way you’ve got as many inputs and insights as possible into a wide range of relevant concerns. Averaging all that will help you make headway – but it will still be guesswork.

Raw Velocity provides a logical starting point

We recommend doing this guesswork logically by using the Raw Velocity exercise. Let’s quickly understand what it is and why we recommend it (a small sketch of the arithmetic follows the steps below):

  1. Enlist 4 or 5 developers with diverse experience and exposure to your technology and domain. The more diverse they are, the better the results of the guesswork will be.
  2. Assign Developer A a batch of 10 random stories from your backlog of, say, 100 stories. Offer a mix of story sizes, features, etc., and make sure you hide the sizes or points. Ask the developer how many of these they can do in 1 iteration (the timespan of an iteration is fixed, e.g., 1 week). Essentially, you are asking Developer A how many stories they can get done in 1 week.
  3. The developer assesses the size of the stories and responds. For example, they could pick stories P, Q, and R from this batch, which turn out to be 2 Smalls and 1 Medium. So Dev A can do 4 points in 1 iteration.
  4. Give Dev A more batches of 10, and at the end, average out their responses. 
  5. Conduct similar exercises with the remaining developers.
  6. Finally average across rounds and across people–and that average is your Raw Velocity. 
  7. You then factor in other variables such as paid time off, unplanned leave, etc. and calculate the team’s planning velocity. 
  8. This is then used to create an initial plan that is taken to the stakeholders, and expectations are set. You can start development with the points you get from the Raw Velocity exercise. But 2-3 weeks later, revisit these expectations, because the real picture and real feedback once you start work will provide additional – and more valuable – lessons. This initial plan will keep evolving over time based on real, observed velocity.
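Here is a minimal sketch of the arithmetic in steps 4 to 7. The per-batch point totals are made up, and the scaling to a team of four plus a 10% allowance for PTO and unplanned leave are assumptions for illustration only.

```python
from statistics import mean

# For each developer: total points of the stories they said they could finish
# in one iteration, for each batch of 10 random stories shown to them.
picked_points = {
    "dev_a": [4, 5, 3, 4],
    "dev_b": [6, 4, 5, 5],
    "dev_c": [3, 4, 4, 3],
    "dev_d": [5, 6, 4, 5],
}

# Steps 4-6: average across batches, then across people.
raw_velocity_per_dev = mean(mean(rounds) for rounds in picked_points.values())

# Step 7 (assumed numbers): scale to a 4-developer team and shave ~10% for
# PTO and unplanned leave to arrive at a planning velocity.
team_size = 4
planning_velocity = raw_velocity_per_dev * team_size * 0.9

print(round(raw_velocity_per_dev, 2), round(planning_velocity, 1))  # 4.38 15.8
```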

Advantages of Raw Velocity 

While we are fully aware that Raw Velocity also entails guesswork, the process is better informed. 

  • It is an average of multiple developers picking up multiple and diverse items from the backlog and doing the math over several rounds. The average arrived at thus reflects the gut feeling of the majority of your team. 
  • Each developer is answering a much easier question: how many stories they themselves can complete in an iteration. It’s about their own capabilities, not a notional answer about someone else. Hence it has more potential for accuracy.
  • This averaging exercise is done speedily–10 rounds with 4 developers each can be accomplished in just 2-3 hours. 
  • Since you hide the story points from each developer, you ensure that all possibility of bias is removed. What you get are fresh, original estimates from each developer about the batch in each iteration.  

Call it Estimated Velocity

Since it’s based on guesswork, it is best to call this “Estimated Velocity”. This is in contrast to the definite sizes we got from the process of Relative Sizing, where we recommended not using the word “estimated”.

It’s important to bear in mind that just because you guessed a number, it does not follow that it is the correct number. Hence some of the antipatterns in this space must be avoided, viz., “Target Velocity” or “Planned Velocity”. These names create the impression that those numbers are targets that must be achieved. No – you’re just guessing, and the guess is bound to differ from the ground reality. Let’s be honest and call this what it is: guessed velocity, estimated velocity, etc.

Set clear expectations 

With the understanding that this is guessed or estimated velocity, set clear expectations within your team and with your stakeholders. Emphasise that this guesswork will need to be revisited periodically – it’s more of a short-term view. Take a cue from previously completed iterations and incorporate those learnings into the next version of estimated velocity. Staying in touch with reality is what will refine your guesswork and make it more accurate. Over time, your confidence in your guesses will improve too.

Conclusion

To sum it up, the only honest answer to “How fast can this be done?” is “I don’t know.” The next best thing to do is use logical guesswork to arrive at estimated velocity. Furthermore, stay aligned with the real picture and be prepared to relook at this estimated velocity as needed. 

In our next article, we’ll talk about the real show–what happens when you start actually working on the project, what kind of exceptional situations can crop up, and the resultant reactions… Keep watching this space! 

T-shirts & story points – get the size right

In our previous article on estimation, we concluded that time-based estimates are a bad idea and must be avoided at all costs. That approach involves too much guesswork, which results in imprecise numbers and leads to harmful pressure on the team. Here, we offer story points as a more viable unit of measurement and make a case for them over estimating in hours or days.


Time-based estimates are a bad idea

Time-based estimation is frequently used in software development projects, even though it is far from being accurate or efficient. Here’s a deep dive into the inherent problems that should prompt both developers and managers to avoid this approach.
