One of the first questions I always get asked when I’m talking to someone who is new to scrum is this: Why use points to estimate the effort involved in completing a story, instead of estimating in hours or days?
Agile points seem arbitrary, and it’s hard to see how they can help a team with its real-world concerns, like meeting deadlines.
Hours and days, on the other hand, are easy to understand, measure and weigh against practical external requirements.
Additionally, one of the key rewards that gets teams interested in implementing an agile workflow is the promise that they can learn to accurately estimate the amount of work the team can do in a certain number of days or weeks. Using points seems like taking a step back from that goal.
So why use points?
There is a practical, measurable, real-world value to using a relative estimation tool like points.
To illustrate this, let’s set up two imaginary teams–one that uses hours to estimate effort and one that uses points.
To keep the playing field even, let’s assume that both teams are just beginning to implement an agile workflow. The stories the teams need to complete in their first sprint are clearly defined, and the teams have a very solid understanding of the codebase, so they have a relatively good idea of what’s required to accomplish their goals.
Further, let’s even assume that the teams have been together for several months before this attempt to adopt agile, and that they have completed similar tasks before.
Estimating with hours
The first team is using hours to estimate their effort.
When looking at a story, they all try to compare it to work that they’ve done in the past. So for a given story, they think back to the last project that feels similar to this one. They know that it took one of their engineers about two days to complete, so they estimate this story at 16 hours.
Using this approach, they go through the stack of stories until they’ve estimated enough stories to fill the days of the engineers on the team for the duration of the sprint, and they add these to their first sprint backlog.
Note that the team has already made a few assumptions.
First of all, can an engineer realistically work 8 hours a day for the entire length of the sprint? Will anyone need to review the code? Will there be any overhead for meetings or other interruptions?
The team has also arbitrarily decided that eight is the number of hours in a day, and that is already an abstract estimate.
Once the engineer assigned to the two-day story gets started, he realizes that the story is actually a little bit more complicated than the team anticipated, and it involves some extra research. This happens, regardless of whether you’re using points or using hours.
By the time the story is completed, it turns out that the engineer spent about 30 hours over the course of four days on it–about twice as long as the team had anticipated.
And at the end of the sprint, the team makes note of how long it took to complete this story and the others in the backlog for reference in future estimates.
At the next sprint planning meeting, the product owner comes in with another story that’s very similar to the previous story.
The question is, how should the team estimate the number of hours it’s going to take to complete the story? Do they look back at history and say that stories like this generally take 16 hours? Do they assign the story 30 hours based on the most recent experience? Has their ability to complete stories like this actually changed since the first story, or are the stories themselves getting more complicated?
After only one sprint, there’s very little clarity about how to estimate the number of hours this second story will take.
The team decides to be conservative, and it estimates that this story is going to take 30 hours to complete. But when the engineer assigned to the story starts working on it, she realizes that this particular story is only going to take 10 hours, and she can complete it in under two days. Again, she makes note of the mis-estimation at the sprint retrospective.
At the next sprint planning, the team is presented with a third story, very similar to the previous two.
What scale should be used to estimate how long it’s actually going to take? If they say it will take 30 hours, they’re probably being overly conservative. If they say it’s going to take 10 hours, they may end up trying to accomplish too much in too little time. How many hours the story will actually take to complete is anybody’s guess.
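To make the first team's dilemma concrete, here is a minimal Python sketch that uses only the numbers from the two stories above: 16 hours estimated and 30 spent, then 30 estimated and 10 spent. The function name, and the idea of scaling a fresh 16-hour estimate by past ratios, are illustrative assumptions rather than something the team in the example actually does.

```python
# (estimated_hours, actual_hours) for the two similar stories described above
history = [(16, 30), (30, 10)]


def actual_to_estimate_ratio(estimated, actual):
    """Return the actual effort as a multiple of the original estimate."""
    return actual / estimated


ratios = [actual_to_estimate_ratio(est, act) for est, act in history]

# Assumption for illustration: the team's gut feeling for the third story
# is the same 16 hours they started with in the first sprint.
gut_estimate = 16
low = gut_estimate * min(ratios)
high = gut_estimate * max(ratios)

print(f"Ratios of actual to estimated hours: {[round(r, 2) for r in ratios]}")
print(f"Plausible range for the third story: {low:.1f} to {high:.1f} hours")
```

With only two data points that disagree this much, the range is so wide that it offers the team very little guidance.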
Estimating with points
Now let’s consider another team, presented with the exact same stories, and with the exact same work history. But in this case the team will be estimating using points.
The team is presented with the new story. Relative to the other stories the team has estimated, this one falls somewhere in the middle of their point scale, which runs from 1 to 21 following a Fibonacci sequence of 1, 2, 3, 5, 8, 13, and 21. So they estimate it will take about five points of effort to complete. For them, five points means about the amount of effort that has been involved in stories like this that they’ve encountered in the past.
After working about twice as hard on the story as he had on previous stories like it, the engineer who completed this story mentions at the retrospective that it took longer than he had expected. As a result, the team was unable to complete all the stories in the backlog for this sprint, lowering their velocity.
At the next sprint planning meeting, the team is presented with a second story, very similar to the first. Relative to the other stories being presented, and the stories they completed in the previous sprint, this story now feels like an eight. The points that they are estimating reflect the relative effort the team believes will be involved in completing a story similar to this one, compared to the other stories they need to work on.
An engineer starts to work on the story, and notices that it actually takes less effort than the previous similar story did. During the retrospective, she brings up the fact that the story didn’t seem as difficult as the points the team had assigned. But for the sake of tracking, it was still an eight-point story, and it is completed and recorded as eight points on the team’s burndown chart.
As a result of overestimating the points for the story, the team was able to take on more work from the icebox during the sprint. That also means that their average velocity, or the average number of points the team can complete in a single sprint, goes up.
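Average velocity is simple arithmetic, and a short Python sketch may help show how it moves from sprint to sprint. The point totals below (18 and 26) are hypothetical numbers chosen for illustration; the article does not say how many points the team completed in either sprint.

```python
def average_velocity(completed_points_per_sprint):
    """Average number of points completed per sprint."""
    if not completed_points_per_sprint:
        raise ValueError("need at least one completed sprint")
    return sum(completed_points_per_sprint) / len(completed_points_per_sprint)


# Hypothetical totals: a slow first sprint, then a second sprint in which
# the team pulled extra stories from the icebox.
sprint_totals = [18, 26]

velocity_after_one = average_velocity(sprint_totals[:1])
velocity_after_two = average_velocity(sprint_totals)

print(velocity_after_one)
print(velocity_after_two)
```

The eight-point story counts as eight points in the second total regardless of how easy it turned out to be, which is why the average rises.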
Now when the next story like this comes in, the team has an easy decision about how to estimate it.
Relative to the other stories presented, and the team’s history, they estimate that the new story will take five points of effort. The number of points accepted into the sprint backlog reflects what they believe they can get done in a single sprint, based on the experience of two sprints. And this value is adjusted constantly to match the quality of the stories the team is receiving, and the team’s improving ability to estimate their capacity.
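One way to picture how velocity feeds into the sprint backlog is the sketch below, which accepts stories in priority order until the next one would push the total past the team's average velocity. The story names, their point values, the velocity of 22, and the stop-at-first-overflow rule are all assumptions made for illustration; real teams often negotiate the cut-off rather than apply it mechanically.

```python
def plan_sprint(prioritized_stories, velocity):
    """Accept stories in priority order until the next one would exceed velocity.

    prioritized_stories: list of (name, points) tuples, highest priority first.
    Returns the accepted story names and their total points.
    """
    accepted, total = [], 0
    for name, points in prioritized_stories:
        if total + points > velocity:
            break  # stop at the first story that no longer fits
        accepted.append(name)
        total += points
    return accepted, total


# Hypothetical stories, estimated on the 1, 2, 3, 5, 8, 13, 21 scale.
stories = [
    ("story A", 5),
    ("story B", 8),
    ("story C", 3),
    ("story D", 5),
    ("story E", 13),
]

backlog, committed_points = plan_sprint(stories, velocity=22)
print(backlog, committed_points)
```

Because the cut-off is expressed in points, it shifts automatically as the team's average velocity changes after each sprint.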
Points make for more accurate estimates
Did you notice how the first team’s estimate of hours, and the second team’s estimate of points, actually had very little to do with how much work the team had to put in to get the stories done?
In fact, the first team was almost using hours as if they represented some abstract concept, rather than a measurable, objective hour. Just calling their unit of effort “hours” made their attempts to estimate much more complicated, because an “hour” for them didn’t map directly to real-world hours.
Furthermore, without the relative scaling that comes from points, engineers may be tempted to inflate the number of hours they estimate a story might require to allow for practical overhead, or underestimate the hours to demonstrate their own heroic ability to get more done in less time than is sustainable or realistic.
After just two sprints, the second team already has an advantage over the first team: They know how to estimate new stories going forward, and how to incorporate them into their capacity planning. When they estimate a story is going to require eight points of effort, that means something specific to the team.
Points help the team members understand their own ability to take on a new story and accomplish it, relative to other stories they have already completed. Without the useful abstraction provided by points, a team has a much harder time predicting how much work they can actually accomplish in any given period of time.
I've worked as a Web Engineer, Writer, Communications Manager, and Marketing Director at companies such as Apple, Salon.com, StumbleUpon, and Moovweb. My research into the Social Science of Telecommunications at UC Berkeley, and while earning an MBA in Organizational Behavior, showed me that the human instinct to network is vital enough to thrive in any medium that allows one person to connect to another.