Cohesion (computer science)
In computer programming, cohesion is a measure of how strongly related the pieces of functionality expressed by the source code of a software module are. Methods of measuring cohesion range from qualitative measures, which classify the source text using a rubric and a hermeneutic approach, to quantitative measures, which examine textual characteristics of the source code to arrive at a numerical cohesion score. Cohesion is an ordinal type of measurement and is usually described as “high cohesion” or “low cohesion”. Modules with high cohesion tend to be preferable because high cohesion is associated with several desirable traits of software, including robustness, reliability, reusability, and understandability, whereas low cohesion is associated with undesirable traits such as being difficult to maintain, test, reuse, and understand.
Cohesion is often contrasted with coupling, a different concept. Nonetheless, high cohesion often correlates with loose coupling, and vice versa. The software quality metrics of coupling and cohesion were developed by Larry Constantine, based on characteristics of “good” programming practices that reduced maintenance and modification costs.
High cohesion
In computer programming, cohesion is a measure of how strongly related or focused the responsibilities of a single module are. As applied to object-oriented programming, if the methods that serve a given class tend to be similar in many aspects, then the class is said to have high cohesion. In a highly cohesive system, code readability and the likelihood of reuse are increased, while complexity is kept manageable.
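One way to make "methods that are similar in many aspects" concrete is to count how often pairs of methods touch the same instance attributes. The sketch below follows the general idea of the LCOM ("lack of cohesion in methods") family of metrics, which this article does not itself describe; the function names, the choice to skip `__init__`, and the two sample classes are all assumptions made for illustration, and real tools differ in the details.

```python
import ast
from itertools import combinations


def attributes_used(method: ast.FunctionDef) -> set[str]:
    """Return every name accessed as `self.<name>` inside a method.

    Calls such as `self.helper()` are counted too, which is a simplification.
    """
    return {
        node.attr
        for node in ast.walk(method)
        if isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "self"
    }


def lack_of_cohesion(class_source: str) -> int:
    """LCOM-style score for the first class found (0 means most cohesive).

    Score = (method pairs sharing no attribute) - (pairs sharing at least one),
    floored at zero. The constructor is skipped (an assumption of this sketch)
    because it usually touches every attribute.
    """
    tree = ast.parse(class_source)
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))
    used = [
        attributes_used(m)
        for m in cls.body
        if isinstance(m, ast.FunctionDef) and m.name != "__init__"
    ]
    pairs = list(combinations(used, 2))
    disjoint = sum(1 for a, b in pairs if not (a & b))
    sharing = len(pairs) - disjoint
    return max(disjoint - sharing, 0)


# Hypothetical sample classes, passed as source text.
COHESIVE = """
class Account:
    def __init__(self, balance):
        self.balance = balance
    def deposit(self, amount):
        self.balance += amount
    def withdraw(self, amount):
        self.balance -= amount
"""

SCATTERED = """
class Utilities:
    def __init__(self):
        self.log_path = "app.log"
        self.tax_rate = 0.2
        self.greeting = "hello"
    def write_log(self, line):
        return self.log_path + ": " + line
    def add_tax(self, price):
        return price * (1 + self.tax_rate)
    def greet(self, name):
        return self.greeting + ", " + name
"""

print(lack_of_cohesion(COHESIVE))   # 0: both methods work on `balance`
print(lack_of_cohesion(SCATTERED))  # 3: no two methods share an attribute
```

A score like this only captures shared state, so it is a rough signal to investigate rather than a verdict on the design.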
Cohesion is decreased if:
- The functionalities embedded in a class, accessed through its methods, have little in common.
- Methods carry out many varied activities, often using coarse-grained or unrelated sets of data.
Disadvantages of low cohesion (or “weak cohesion”) are:
- Increased difficulty in understanding modules.
- Increased difficulty in maintaining a system, because logical changes in the domain affect multiple modules, and because changes in one module require changes in related modules.
- Increased difficulty in reusing a module because most applications won’t need the random set of operations provided by a module.
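The contrast can be sketched with a small, hypothetical order-handling example. All class and method names below are invented for illustration. The first class mixes pricing, presentation, and storage; the classes after it each keep one focused responsibility.

```python
from dataclasses import dataclass


# Low cohesion: pricing, formatting, and persistence live in one class, so a
# change to any of those concerns means editing (and retesting) this class.
class OrderManager:
    def total(self, prices: list[float], tax_rate: float) -> float:
        return round(sum(prices) * (1 + tax_rate), 2)

    def format_receipt(self, total: float) -> str:
        return f"Total due: {total:.2f}"

    def save(self, store: dict, order_id: str, total: float) -> None:
        store[order_id] = total


# Higher cohesion: each class below has a single focused responsibility.
@dataclass
class Order:
    """Holds order data and the one calculation that depends on it."""
    prices: list[float]
    tax_rate: float

    def total(self) -> float:
        return round(sum(self.prices) * (1 + self.tax_rate), 2)


class ReceiptFormatter:
    """Knows only how to present an order."""
    def format(self, order: Order) -> str:
        return f"Total due: {order.total():.2f}"


class OrderStore:
    """Knows only how to keep and retrieve order totals (in memory here)."""
    def __init__(self) -> None:
        self._orders: dict[str, float] = {}

    def save(self, order_id: str, order: Order) -> None:
        self._orders[order_id] = order.total()

    def get(self, order_id: str) -> float:
        return self._orders[order_id]


order = Order(prices=[10.0, 5.0], tax_rate=0.2)
store = OrderStore()
store.save("A-1", order)
print(ReceiptFormatter().format(order))
```

In the second design, an application that only needs to format receipts can reuse `ReceiptFormatter` without pulling in storage code, which is the reuse benefit the list above refers to.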
Types of cohesion
Cohesion is a qualitative measure, meaning that the source code to be measured is examined using a rubric to determine a cohesion classification. The types of cohesion, in order from worst to best, are as follows:
Coincidental cohesion (worst): Coincidental cohesion is when parts of a module are grouped arbitrarily; the only relationship between the parts is that they have been grouped together (e.g. a “Utilities” class).
Logical cohesion: Logical cohesion is when parts of a module are grouped because they are logically categorized as doing the same kind of thing, even if they differ in nature (e.g. grouping all mouse and keyboard input handling routines).
Temporal cohesion: Temporal cohesion is when parts of a module are grouped by when they are processed; the parts are processed at a particular time in program execution (e.g. a function, called after an exception is caught, that closes open files, creates an error log, and notifies the user).
Procedural cohesion: Procedural cohesion is when parts of a module are grouped because they always follow a certain sequence of execution (e.g. a function which checks file permissions and then opens the file).
Communicational cohesion: Communicational cohesion is when parts of a module are grouped because they operate on the same data (e.g. a module which operates on the same record of information).
Sequential cohesion: Sequential cohesion is when parts of a module are grouped because the output from one part is the input to another part, like an assembly line (e.g. a function which reads data from a file and processes the data).
Functional cohesion (best): Functional cohesion is when parts of a module are grouped because they all contribute to a single well-defined task of the module (e.g. tokenizing a string of XML).
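Several of the examples in the list above can be sketched in code. The following is an illustrative sketch only: the function and class names are hypothetical, and the tokenizer is a deliberately simplified stand-in that splits tags from text rather than parsing real XML.

```python
import os
import re


# Coincidental cohesion: unrelated helpers share a class only because they
# needed somewhere to live (the "Utilities" case).
class Utilities:
    @staticmethod
    def celsius_to_fahrenheit(celsius: float) -> float:
        return celsius * 9 / 5 + 32

    @staticmethod
    def reverse(text: str) -> str:
        return text[::-1]


# Procedural cohesion: the steps are grouped because they always run in this
# order (check permissions, then open), not because one feeds the other.
def open_if_readable(path: str):
    if not os.access(path, os.R_OK):
        raise PermissionError(f"cannot read {path}")
    return open(path, encoding="utf-8")


# Sequential cohesion: the output of each step is the input of the next.
def count_words_in_file(path: str) -> int:
    with open(path, encoding="utf-8") as handle:
        text = handle.read()   # step 1: read the data
    words = text.split()       # step 2: process what step 1 produced
    return len(words)


# Functional cohesion: every statement contributes to one well-defined task.
def tokenize(markup: str) -> list[str]:
    """Split simple markup into tags and non-empty text runs."""
    tokens = re.findall(r"<[^>]+>|[^<]+", markup)
    return [token for token in tokens if token.strip()]


print(tokenize("<a>hi</a>"))  # ['<a>', 'hi', '</a>']
```

The same lines of code can often be classified differently depending on how the module is described, which is one reason the classification relies on a rubric and judgment rather than a mechanical test.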
Although cohesion is a ranking type of scale, the ranks do not indicate a steady progression of improved cohesion. Studies by various people, including Larry Constantine, Edward Yourdon, and Steve McConnell, indicate that the first two types of cohesion are inferior; communicational and sequential cohesion are very good; and functional cohesion is superior.
While functional cohesion is considered the most desirable type of cohesion for a software module, it may not be achievable. There are cases where communicational cohesion is the highest level of cohesion that can be attained under the circumstances.
See also
- Coupling (computer science)
- List of object-oriented programming terms
- Static code analysis