Software development effort estimation
Software development effort estimation is the process of predicting the most realistic amount of effort required to develop or maintain software, based on incomplete, uncertain, and/or noisy input. Effort estimates may be used as input to project plans, iteration plans, budgets, investment analyses, pricing processes, and bidding rounds.
State-of-practice
Published surveys on estimation practice suggest that expert estimation is the dominant strategy when estimating software development effort. Typically, effort estimates are over-optimistic, and there is a strong over-confidence in their accuracy. The mean effort overrun seems to be about 30% and is not decreasing over time. For a review of effort estimation error surveys, see. However, the measurement of estimation error is not unproblematic; see Assessing and interpreting the accuracy of effort estimates below.
The strong over-confidence in the accuracy of effort estimates is illustrated by the finding that, on average, if a software professional is 90% confident or “almost sure” that a minimum-maximum interval will include the actual effort, the observed frequency of including the actual effort is only 60-70%.
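The hit rate behind this finding is straightforward to measure. The following sketch, with hypothetical project data, computes the fraction of projects whose actual effort fell inside the stated minimum-maximum interval:

```python
def interval_hit_rate(intervals, actuals):
    """Fraction of actual efforts that fall inside the stated [min, max] intervals."""
    hits = sum(1 for (lo, hi), a in zip(intervals, actuals) if lo <= a <= hi)
    return hits / len(actuals)

# Hypothetical 90%-confidence min-max estimates vs. actual effort (person-hours).
intervals = [(80, 140), (50, 90), (150, 250), (30, 60), (100, 180)]
actuals = [150, 70, 200, 65, 120]
print(interval_hit_rate(intervals, actuals))  # prints 0.6: only 3 of 5 intervals hit
```

A hit rate well below the stated confidence level, as in this toy data, is exactly the over-confidence pattern described above.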
Currently, the term “effort estimate” is used to denote concepts as different as the most likely use of effort (modal value), the effort that corresponds to a probability of 50% of not being exceeded (median), the planned effort, the budgeted effort, and the effort used to propose a bid or price to the client. This is believed to be unfortunate, because communication problems may occur and because the concepts serve different goals.
History
Software researchers and practitioners have been addressing the problems of effort estimation for software development projects since at least the 1960s; see, e.g., work by Farr and Nelson.

Most of the research has focused on the construction of formal software effort estimation models. The early models were typically based on regression analysis or mathematically derived from theories from other domains. Since then, a large number of model-building approaches have been evaluated, such as approaches founded on case-based reasoning, classification and regression trees, simulation, neural networks, Bayesian statistics, lexical analysis of requirement specifications, genetic programming, linear programming, economic production models, soft computing, fuzzy logic modeling, statistical bootstrapping, and combinations of two or more of these models. Perhaps the most common estimation products today, e.g., the formal estimation models COCOMO and SLIM, have their basis in estimation research conducted in the 1970s and 1980s. The estimation approaches based on functionality-based size measures, e.g., function points, are also based on research conducted in the 1970s and 1980s, but re-appear with modified size measures under different labels, such as “use case points” in the 1990s and COSMIC in the 2000s.
Estimation approaches
There are many ways of categorizing estimation approaches; see, for example. The top-level categories are the following:
- Expert estimation: The quantification step, i.e., the step where the estimate is produced, is based on judgmental processes.
- Formal estimation model: The quantification step is based on mechanical processes, e.g., the use of a formula derived from historical data.
- Combination-based estimation: The quantification step is based on a judgmental or mechanical combination of estimates from different sources.
Below are examples of estimation approaches within each category.
Estimation approach | Category | Examples of support of implementation of estimation approach |
---|---|---|
Analogy-based estimation | Formal estimation model | ANGEL, Weighted Micro Function Points |
WBS-based (bottom up) estimation | Expert estimation | Project management software, company specific activity templates |
Parametric models | Formal estimation model | COCOMO, SLIM, SEER-SEM |
Size-based estimation models | Formal estimation model | Function Point Analysis, Use Case Analysis, SSU (Software Size Unit), Story points-based estimation in Agile software development |
Group estimation | Expert estimation | Planning poker, Wideband Delphi |
Mechanical combination | Combination-based estimation | Average of an analogy-based and a Work breakdown structure-based effort estimate |
Judgmental combination | Combination-based estimation | Expert judgment based on estimates from a parametric model and group estimation |
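To illustrate the “mechanical” quantification step of a formal estimation model, the following sketch fits a log-linear relationship, effort = a · size^b, to hypothetical historical project data by least squares in log space. The data, parameter names, and the specific model form are invented for illustration; real parametric models such as COCOMO use calibrated parameters and additional cost drivers.

```python
import math

def fit_loglinear(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares on log-transformed data."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b

# Hypothetical historical data: size in function points, effort in person-hours.
sizes = [100, 200, 400, 800]
efforts = [500, 1100, 2300, 5000]
a, b = fit_loglinear(sizes, efforts)
estimate = a * 300 ** b  # mechanical estimate for a new 300-function-point project
```

This is the sense in which a formal model's quantification is mechanical: once the historical data are fixed, the estimate follows from a formula rather than from judgment.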
Selection of estimation approach
The evidence on differences in estimation accuracy of different estimation approaches and models suggests that there is no “best approach” and that the relative accuracy of one approach or model in comparison to another depends strongly on the context. This implies that different organizations benefit from different estimation approaches. Findings, summarized in, that may support the selection of an estimation approach based on its expected accuracy include:
- Expert estimation is on average at least as accurate as model-based effort estimation. In particular, situations with unstable relationships and information of high importance not included in the model may suggest use of expert estimation. This assumes, of course, that experts with relevant experience are available.
- Formal estimation models not tailored to a particular organization’s own context may be very inaccurate. Use of the organization’s own historical data is consequently crucial if one cannot be sure that the estimation model’s core relationships (e.g., formula parameters) are based on similar project contexts.
- Formal estimation models may be particularly useful in situations where the model is tailored to the organization’s context (either through use of its own historical data or because the model is derived from similar projects and contexts), and/or it is likely that the experts’ estimates will be subject to a strong degree of wishful thinking.
The most robust finding, in many forecasting domains, is that a combination of estimates from independent sources, preferably applying different approaches, will on average improve the estimation accuracy.
In addition, other factors, such as ease of understanding and communicating the results of an approach, ease of use of an approach, and cost of introduction of an approach, should be considered in a selection process.
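The simplest mechanical combination is an unweighted average of independent estimates. The following sketch, with hypothetical numbers, shows the averaging effect the finding above describes:

```python
# Hypothetical estimates for the same task from three independent sources
# (e.g., an expert, a parametric model, and an analogy-based model), in person-hours.
expert, parametric, analogy = 120.0, 160.0, 145.0

combined = (expert + parametric + analogy) / 3  # unweighted mechanical combination

actual = 150.0  # hypothetical actual effort, known only after the fact
errors = {name: abs(e - actual) for name, e in
          [("expert", expert), ("parametric", parametric),
           ("analogy", analogy), ("combined", combined)]}
```

In this toy data the combined estimate's error (about 8.3 hours) is smaller than the average error of the individual sources (15 hours), which is the "on average" sense of the finding; the combination need not beat the single best source in every case.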
Assessing and interpreting the accuracy of effort estimates
The most common measure of average estimation accuracy is the MMRE (Mean Magnitude of Relative Error), where MRE is defined as:

MRE = |actual effort − estimated effort| / |actual effort|
This measure has been criticized, and there are several alternative measures, such as more symmetric measures, Weighted Mean of Quartiles of relative errors (WMQ), and Mean Variation from Estimate (MVFE).
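As a sketch, MMRE is simply the mean of the per-project MRE values defined above; the project data below are hypothetical:

```python
def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error: mean of |actual - estimate| / actual."""
    assert len(actuals) == len(estimates)
    return sum(abs(a - e) / abs(a) for a, e in zip(actuals, estimates)) / len(actuals)

# Hypothetical actual vs. estimated effort for three projects, in person-hours.
actuals = [120.0, 80.0, 200.0]
estimates = [100.0, 90.0, 150.0]
print(round(mmre(actuals, estimates), 3))  # prints 0.181, i.e., about 18% mean relative error
```

Note that because MRE divides by the actual effort, over- and under-estimates of the same absolute size are penalized asymmetrically, which is one basis of the criticism mentioned above.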
A high estimation error cannot automatically be interpreted as an indicator of low estimation ability. Alternative, competing, or complementing reasons include low cost control of the project, high complexity of the development work, and more delivered functionality than originally estimated. A framework for improved use and interpretation of estimation error measurement is included in.
Psychological issues related to effort estimation
There are many psychological factors potentially explaining the strong tendency towards over-optimistic effort estimates that need to be dealt with to increase the accuracy of effort estimates. These factors are essential even when using formal estimation models, because much of the input to these models is judgment-based. Factors that have been demonstrated to be important are: wishful thinking, anchoring, the planning fallacy, and cognitive dissonance. A discussion of these and other factors can be found in work by Jørgensen and Grimstad.
- It's easy to estimate what you know.
- It's hard to estimate what you know you don't know.
- It's very hard to estimate things that you don't know you don't know.
See also
- Software Cost Factors
- Estimation in software engineering
- Wideband Delphi
- Project management
- Planning poker
- Cost overrun
- COCOMO
- SEER-SEM
- Function points
- Weighted Micro Function Points
- Story points
- Proxy-based estimating
- Putnam model
- PERT
- Comparison of development estimation software
- Gearing factor Function Points to ESLOC
External links
- Industry Productivity data for Input into Software Development Estimates and guidance and tools for Estimation - International Software Benchmarking Standards Group: http://www.isbsg.org
- Free first-order benchmarking utility from Software Benchmarking Organization: http://www.sw-benchmarking.org/report.php
- Special Interest Group on Software Effort Estimation: http://www.forecastingprinciples.com/Software_Estimation/index.html
- General forecasting principles: http://www.forecastingprinciples.com
- Project estimation tools: http://www.projectmanagementguides.com/TOOLS/project_estimation_tools.html
- Downloadable research papers on effort estimation: http://simula.no/research/engineering/projects/best
- Mike Cohn's Estimating With Use Case Points from article from Methods & Tools: http://www.methodsandtools.com/archive/archive.php?id=25
- Resources on Software Estimation from Steve McConnell: http://www.construx.com/Page.aspx?nid=297
- Resources on Software Estimation from Dan Galorath: http://www.galorath.com/wp/