Gini coefficient

Overview

The **Gini coefficient** is a measure of statistical dispersion developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper "Variability and Mutability".

The Gini coefficient is a measure of the inequality of a distribution, a value of 0 expressing perfect equality and a value of 1 maximal inequality. It has found application in the study of inequalities in disciplines as diverse as sociology, economics, health science, ecology, chemistry, engineering and agriculture.

It is commonly used as a measure of inequality of income or wealth.

Encyclopedia


tend to have Gini indices between 0.24 and 0.36, the United States' and Mexico's Gini indices are both above 0.40, indicating that the United States (according to the US Census Bureau) and Mexico have greater inequality. Using the Gini can help quantify differences in welfare and compensation policies and philosophies. However, it should be borne in mind that the Gini coefficient can be misleading when used to make political comparisons between large and small countries (see criticisms section).

The Gini index for the entire world has been estimated by various parties to be between 0.56 and 0.66. The graph shows the values expressed as a percentage, in their historical development for a number of countries.

or gross domestic product. The simplicity of Gini makes it easy to use for comparison across diverse countries and also allows comparison of income distributions across different groups as well as countries; for example, the Gini coefficient for urban areas differs from that of rural areas in many countries (though not in the United States). Like any time-based measure, Gini coefficients can be used to compare income distribution over time; thus it is possible to see if inequality is increasing or decreasing independent of absolute incomes. The Gini coefficient satisfies four principles suggested to be important:

In addition, Gini does not address causes: income inequality may reflect differences in opportunity or in capability. For example, some countries may have a social class structure that presents barriers to upward mobility; some people may have more skills than others.

By measuring inequality in income, the Gini ignores the differential efficiency of use of household income. By ignoring wealth (except as it contributes to income) the Gini can create the appearance of inequality when the people compared are at different stages in their life. Wealthy countries (e.g. Sweden) can appear more equal, yet have high Gini coefficients for wealth (for instance 77% of the share value owned by households is held by just 5% of Swedish shareholding households). These factors are not assessed in income-based Gini.

Gini has some mathematical limitations as well. For instance, different sets of people cannot be averaged to obtain the Gini coefficient of all the people in the sets: if a Gini coefficient were to be calculated for each person it would always be zero. For a large, economically diverse country, a much higher coefficient will be calculated for the country as a whole than will be calculated for each of its regions. (The coefficient is usually applied to measurable nominal income rather than local purchasing power, tending to increase the calculated coefficient across larger areas.)
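The point about regions can be illustrated with a minimal sketch. The two regions and their incomes are hypothetical, and `gini` is an illustrative helper that uses the rank-weighted population formula:

```python
def gini(values):
    """Population Gini coefficient of non-negative values (rank-weighted form)."""
    y = sorted(values)
    n = len(y)
    return 2 * sum(i * v for i, v in enumerate(y, start=1)) / (n * sum(y)) - (n + 1) / n


# Hypothetical regions: incomes are perfectly equal inside each one.
region_a = [1.0] * 50   # everyone earns 1
region_b = [3.0] * 50   # everyone earns 3
combined = region_a + region_b

gini_a = gini(region_a)
gini_b = gini(region_b)
gini_combined = gini(combined)
```

Each region has a coefficient of zero, so any average of the regional values is also zero, yet the combined population has a clearly positive coefficient because the difference between regions only appears once the two are pooled.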

As is the case for any single measure of a distribution, economies with similar incomes and Gini coefficients can still have very different income distributions. This results from differing shapes of the Lorenz curve. For example, consider a society where half of individuals had no income and the other half shared all the income equally (i.e. whose Lorenz curve is linear from (0,0) to (0.5,0) and then linear to (1,1)). As is easily calculated, this society has Gini coefficient 0.5 -- the same as that of a society in which 75% of people equally shared 25% of income while the remaining 25% equally shared 75% (i.e. whose Lorenz curve is linear from (0,0) to (0.75,0.25) and then linear to (1,1)).
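The two societies described above can be checked numerically. This is a small sketch using hypothetical four-person populations that match the stated proportions; `gini` and `bottom_half_share` are illustrative helper names:

```python
def gini(values):
    """Population Gini coefficient of non-negative values (rank-weighted form)."""
    y = sorted(values)
    n = len(y)
    return 2 * sum(i * v for i, v in enumerate(y, start=1)) / (n * sum(y)) - (n + 1) / n


def bottom_half_share(values):
    """Share of total income held by the poorer half of the population."""
    y = sorted(values)
    return sum(y[: len(y) // 2]) / sum(y)


# Hypothetical populations matching the two cases in the text.
half_have_nothing = [0, 0, 6, 6]     # half have no income, half share it equally
three_quarters_low = [1, 1, 1, 9]    # 75% share 25% of income, 25% hold 75%

gini_first = gini(half_have_nothing)
gini_second = gini(three_quarters_low)
share_first = bottom_half_share(half_have_nothing)
share_second = bottom_half_share(three_quarters_low)
```

Both populations give the same coefficient, while the share of income held by the poorer half differs, which is the kind of detail a single summary number cannot carry.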

Too often only the Gini coefficient is quoted without describing the proportions of the quantiles used for measurement. As with other inequality coefficients, the Gini coefficient is influenced by the granularity of the measurements. For example, five 20% quantiles (low granularity) will usually yield a lower Gini coefficient than twenty 5% quantiles (high granularity) taken from the same distribution. This is an often encountered problem with measurements.
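The effect of granularity can be sketched as follows. The income list is hypothetical, and `gini_grouped` is an illustrative helper that assumes equal-sized quantile groups and straight-line interpolation of the Lorenz curve between group boundaries:

```python
def gini_grouped(incomes, groups):
    """Trapezoid Gini estimate from equal-sized quantile groups of sorted incomes."""
    y = sorted(incomes)
    n = len(y)
    if n % groups != 0:
        raise ValueError("this sketch assumes the population divides evenly into groups")
    size = n // groups
    total = sum(y)
    cumulative = 0.0
    previous_share = 0.0
    area_sum = 0.0
    for g in range(groups):
        cumulative += sum(y[g * size:(g + 1) * size])
        share = cumulative / total           # cumulative income share at this boundary
        area_sum += (1.0 / groups) * (share + previous_share)
        previous_share = share
    return 1.0 - area_sum


incomes = [float(i * i) for i in range(1, 101)]  # hypothetical skewed incomes
g5 = gini_grouped(incomes, 5)      # five 20% quantiles
g20 = gini_grouped(incomes, 20)    # twenty 5% quantiles
g100 = gini_grouped(incomes, 100)  # every individual separately
```

Because the Lorenz curve is convex, joining fewer points with straight lines overstates the area under it, so the coarser grouping yields the smaller coefficient for the same data.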

Care should be taken in using the Gini coefficient as a measure of egalitarianism, as it is properly a measure of income dispersion. For example, if two equally egalitarian countries pursue different immigration policies, the country accepting a higher proportion of low-income or impoverished migrants will be assessed as less equal (gain a higher Gini coefficient).

Expanding on the importance of life-span measures, the Gini coefficient, as a point estimate of equality at a certain time, ignores life-span changes in income. Typically, increases in the proportion of young or old members of a society will drive apparent changes in equality, simply because people generally have lower incomes and wealth when they are young than when they are old. Because of this, factors such as age distribution within a population and mobility within income classes can create the appearance of differential equality when none exists once demographic effects are taken into account. Thus a given economy may have a higher Gini coefficient at any one point in time than another, while its Gini coefficient calculated over individuals' lifetime income is actually lower than that of the apparently more equal (at a given point in time) economy. Essentially, what matters is not just inequality in any particular year, but the composition of the distribution over time.

As one result of this criticism, entropy measures (e.g. the Theil Index and the Atkinson index) are frequently used in addition to, or in competition with, the Gini coefficient. These measures attempt to compare the distribution of resources by intelligent agents in the market with a maximum entropy random distribution, which would occur if these agents acted like non-intelligent particles in a closed system following the laws of statistical physics.

systems in credit risk management.

The discriminatory power refers to a credit risk model's ability to differentiate between defaulting and non-defaulting clients. The above formula may be used for the final model and also at individual model factor level, to quantify the discriminatory power of individual factors. This is as a result of too many non-defaulting clients falling into the lower points scale, e.g. a factor has a 10-point scale and 30% of non-defaulting clients are being assigned the lowest points available, e.g. 0 or negative points. This indicates that the factor is behaving in a counter-intuitive manner and would require further investigation at the model development stage.

, where the cumulative proportion of species is plotted against cumulative proportion of individuals. In health, it has been used as a measure of the inequality of health-related quality of life in a population. In education, it has been used as a measure of the inequality of universities. In chemistry, it has been used to express the selectivity of protein kinase inhibitors against a panel of kinases. In engineering, it has been used to evaluate the fairness achieved by Internet routers in scheduling packet transmissions from different flows of traffic. In statistics, when building decision trees, it is used to measure the purity of possible child nodes, with the aim of maximising the average purity of two child nodes when splitting, and it has been compared with other equality measures.

The **Gini coefficient** is a measure of statistical dispersion developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper "Variability and Mutability".

The Gini coefficient is a measure of the inequality of a distribution, a value of 0 expressing perfect equality and a value of 1 maximal inequality. It has found application in the study of inequalities in disciplines as diverse as sociology, economics, health science, ecology, chemistry, engineering and agriculture.

It is commonly used as a measure of inequality of income or wealth. Worldwide, Gini coefficients for income range from approximately 0.23 (Sweden) to 0.70 (Namibia), although not every country has been assessed.

## Definition

The Gini coefficient is usually defined mathematically based on the Lorenz curve, which plots the proportion of the total income of the population (y axis) that is cumulatively earned by the bottom x% of the population (see diagram). The line at 45 degrees thus represents perfect equality of incomes. The Gini coefficient can then be thought of as the ratio of the area that lies between the line of equality and the Lorenz curve (marked 'A' in the diagram) over the total area under the line of equality (marked 'A' and 'B' in the diagram); i.e., G=A/(A+B).
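As a rough illustration of the area ratio, the sketch below builds the Lorenz curve of a small income list and computes G = A/(A+B). The function names are illustrative, and the curve is assumed to be the straight-line polygon through the cumulative points, with a positive total income:

```python
def lorenz_points(incomes):
    """Return the (x, y) points of the Lorenz curve for non-negative incomes."""
    values = sorted(incomes)
    total = sum(values)
    n = len(values)
    xs, ys = [0.0], [0.0]
    running = 0.0
    for i, v in enumerate(values, start=1):
        running += v
        xs.append(i / n)             # cumulative share of the population
        ys.append(running / total)   # cumulative share of income
    return xs, ys


def gini_from_areas(incomes):
    """Compute G = A / (A + B), with B the area under the Lorenz polygon."""
    xs, ys = lorenz_points(incomes)
    # Area under the piecewise-linear Lorenz curve, as a sum of trapezoids.
    b = sum((xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2 for k in range(1, len(xs)))
    a = 0.5 - b  # the area under the 45-degree line is 1/2
    return a / (a + b)
```

For example, `gini_from_areas([5, 5, 5, 5])` puts the Lorenz curve on the 45-degree line, so area A vanishes and the ratio is zero.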

The Gini coefficient can range from 0 to 1; it is sometimes expressed as a percentage ranging between 0 and 100. More specifically, the upper bound of the Gini coefficient equals 1 only in populations of infinite size. In a population of size N with no negative incomes, the population formula gives an upper bound of (N − 1)/N = 1 − 1/N, reached when a single member receives everything.

A low Gini coefficient indicates a more equal distribution, with 0 corresponding to complete equality, while higher Gini coefficients indicate more unequal distribution, with 1 corresponding to complete inequality. To be validly computed, no negative goods can be distributed. Thus, if the Gini coefficient is being used to describe household income inequality, then no household can have a negative income. When used as a measure of income inequality, the most unequal society will be one in which a single person receives 100% of the total income and the remaining people receive none (G approaches 1 as the population grows); and the most equal society will be one in which every person receives the same income (G=0).

Some find it more intuitive (and it is mathematically equivalent) to think of the Gini coefficient as half of the relative mean difference. The mean difference is the average absolute difference between two items selected randomly from a population, and the relative mean difference is the mean difference divided by the average, to normalize for scale.
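The mean-difference view translates almost directly into code. The following is a minimal sketch, assuming the two items are drawn independently (so a value may be paired with itself) and that the mean is positive; the function name is illustrative:

```python
def gini_mean_difference(values):
    """Gini coefficient as half of the relative mean difference."""
    n = len(values)
    mean = sum(values) / n
    # Average absolute difference over all ordered pairs, including (i, i).
    mean_diff = sum(abs(a - b) for a in values for b in values) / (n * n)
    relative_mean_diff = mean_diff / mean
    return relative_mean_diff / 2
```

This version costs a number of operations proportional to the square of the population size, so it is better suited to checking a faster formula on small inputs than to large data sets.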

## Calculation

The Gini index is defined as a ratio of the areas on the Lorenz curve diagram. If the area between the line of perfect equality and the Lorenz curve is A, and the area under the Lorenz curve is B, then the Gini index is A/(A+B). Since A+B = 0.5, the Gini index, G = A/(0.5) = 2A = 1-2B. If the Lorenz curve is represented by the function Y = L(X), the value of B can be found with integration and:

$$G = 1 - 2\int_0^1 L(X)\,dX$$
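The relation G = 1 − 2B can be checked numerically when the Lorenz curve is available as a function. This is a sketch only: `gini_from_lorenz` is an illustrative name, and the midpoint rule is one simple choice of integration method among many:

```python
def gini_from_lorenz(lorenz, steps=10_000):
    """Approximate G = 1 - 2B, where B is the integral of the Lorenz function on [0, 1]."""
    width = 1.0 / steps
    # Midpoint rule for the area B under the Lorenz curve.
    b = sum(lorenz((k + 0.5) * width) for k in range(steps)) * width
    return 1.0 - 2.0 * b
```

With the hypothetical curve L(X) = X², the area under the curve is 1/3, so `gini_from_lorenz(lambda x: x ** 2)` comes out close to 1 − 2/3 = 1/3.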

In some cases, this equation can be applied to calculate the Gini coefficient without direct reference to the Lorenz curve. For example:

- For a population uniform on the values $y_i$, $i = 1$ to $n$, indexed in non-decreasing order ($y_i \le y_{i+1}$):

  $$G = \frac{1}{n}\left(n + 1 - 2\,\frac{\sum_{i=1}^{n}(n+1-i)\,y_i}{\sum_{i=1}^{n} y_i}\right)$$

  This may be simplified to:

  $$G = \frac{2\sum_{i=1}^{n} i\,y_i}{n\sum_{i=1}^{n} y_i} - \frac{n+1}{n}$$

- For a discrete probability function $f(y)$, where $y_i$, $i = 1$ to $n$, are the points with nonzero probabilities and which are indexed in increasing order ($y_i < y_{i+1}$):

  $$G = 1 - \frac{\sum_{i=1}^{n} f(y_i)\,(S_{i-1} + S_i)}{S_n}$$

  where $S_i = \sum_{j=1}^{i} f(y_j)\,y_j$ and $S_0 = 0$.

- For a cumulative distribution function $F(y)$ that is piecewise differentiable, has a mean $\mu$, and is zero for all negative values of $y$:

  $$G = 1 - \frac{1}{\mu}\int_0^\infty \bigl(1 - F(y)\bigr)^2\,dy = \frac{1}{\mu}\int_0^\infty F(y)\,\bigl(1 - F(y)\bigr)\,dy$$

- Since the Gini coefficient is half the relative mean difference, it can also be calculated using formulas for the relative mean difference. For a random sample $S$ consisting of values $y_i$, $i = 1$ to $n$, that are indexed in non-decreasing order ($y_i \le y_{i+1}$), the statistic:

  $$G(S) = \frac{1}{n-1}\left(n + 1 - 2\,\frac{\sum_{i=1}^{n}(n+1-i)\,y_i}{\sum_{i=1}^{n} y_i}\right)$$

  is a consistent estimator of the population Gini coefficient, but is not, in general, unbiased. Like $G$, $G(S)$ has a simpler form:

  $$G(S) = 1 - \frac{2}{n-1}\left(n - \frac{\sum_{i=1}^{n} i\,y_i}{\sum_{i=1}^{n} y_i}\right)$$
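The sorted-value expressions for a population and for a sample can be sketched in a few lines. The code assumes the usual rank-weighted forms, which differ only in dividing by n or by n − 1; the function names are illustrative:

```python
def gini_population(values):
    """G = (n + 1 - 2 * sum((n + 1 - i) * y_i) / sum(y_i)) / n for sorted y_1 <= ... <= y_n."""
    y = sorted(values)
    n = len(y)
    total = sum(y)
    # Weight n goes to the smallest value and weight 1 to the largest.
    weighted = sum((n + 1 - i) * v for i, v in enumerate(y, start=1))
    return (n + 1 - 2 * weighted / total) / n


def gini_sample(values):
    """Same expression with the divisor n replaced by n - 1 (consistent, not unbiased)."""
    n = len(values)
    return gini_population(values) * n / (n - 1)
```

Sorting first means the caller does not have to supply ordered data. Both functions assume non-negative values with a positive sum, and `gini_sample` needs at least two observations.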

As with the relative mean difference, there does not exist a sample statistic that is in general an unbiased estimator of the population Gini coefficient.

For some functional forms, the Gini index can be calculated explicitly. For example, if y follows a lognormal distribution with the standard deviation of logs equal to $\sigma$, then $G = 2\Phi\left(\sigma/\sqrt{2}\right) - 1$, where $\Phi$ is the cumulative distribution function of the standard normal distribution.
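As a sanity check on the lognormal case, the sketch below compares the standard closed form G = 2Φ(σ/√2) − 1 with the coefficient of a simulated sample. The seed, sample size and value of σ are arbitrary choices made for illustration, and the helper names are not from any library:

```python
import math
import random


def gini_lognormal(sigma):
    """Closed form 2 * Phi(sigma / sqrt(2)) - 1, with Phi the standard normal CDF."""
    z = sigma / math.sqrt(2)
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))  # standard normal CDF at z
    return 2.0 * phi - 1.0


def gini_sorted(values):
    """Population Gini coefficient from sorted values (rank-weighted form)."""
    y = sorted(values)
    n = len(y)
    return 2 * sum(i * v for i, v in enumerate(y, start=1)) / (n * sum(y)) - (n + 1) / n


rng = random.Random(0)  # fixed seed so the comparison is repeatable
sample = [rng.lognormvariate(0.0, 0.5) for _ in range(20_000)]
closed_form = gini_lognormal(0.5)
simulated = gini_sorted(sample)
```

The two numbers should agree to within sampling error, and the agreement tightens as the sample grows.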

Sometimes the entire Lorenz curve is not known, and only values at certain intervals are given. In that case, the Gini coefficient can be approximated by using various techniques for interpolating the missing values of the Lorenz curve. If ($X_k$, $Y_k$) are the known points on the Lorenz curve, with the $X_k$ indexed in increasing order ($X_{k-1} < X_k$), so that:

- $X_k$ is the cumulated proportion of the population variable, for $k = 0, \ldots, n$, with $X_0 = 0$, $X_n = 1$.
- $Y_k$ is the cumulated proportion of the income variable, for $k = 0, \ldots, n$, with $Y_0 = 0$, $Y_n = 1$.
- $Y_k$ should be indexed in non-decreasing order ($Y_k \ge Y_{k-1}$).

If the Lorenz curve is approximated on each interval as a line between consecutive points, then the area B can be approximated with trapezoids and:

$$G_1 = 1 - \sum_{k=1}^{n} (X_k - X_{k-1})(Y_k + Y_{k-1})$$

is the resulting approximation for G. More accurate results can be obtained using other methods to approximate the area B, such as approximating the Lorenz curve with a quadratic function across pairs of intervals, or building an appropriately smooth approximation to the underlying distribution function that matches the known data. If the population mean and boundary values for each interval are also known, these can also often be used to improve the accuracy of the approximation.
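A minimal sketch of the trapezoid approximation, assuming the known Lorenz points are given as two lists that start at (0, 0) and end at (1, 1). The function name and the quintile shares in the example are hypothetical:

```python
def gini_trapezoid(xs, ys):
    """Trapezoid estimate: 1 - sum((X_k - X_{k-1}) * (Y_k + Y_{k-1})) for k = 1..n."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return 1.0 - sum(
        (xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) for k in range(1, len(xs))
    )


# Hypothetical quintile data: income shares of 5%, 10%, 15%, 25% and 45%.
population_shares = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
income_shares = [0.0, 0.05, 0.15, 0.30, 0.55, 1.0]
quintile_gini = gini_trapezoid(population_shares, income_shares)
```

Because straight segments lie above a convex Lorenz curve, this estimate tends to understate the coefficient, which is why the smoother interpolation methods can be more accurate.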

The Gini coefficient calculated from a sample is a statistic and its standard error, or confidence intervals for the population Gini coefficient, should be reported. These can be calculated using bootstrap techniques but those proposed have been mathematically complicated and computationally onerous even in an era of fast computers. Ogwang (2000) made the process more efficient by setting up a “trick regression model” in which the incomes in the sample are ranked with the lowest income being allocated rank 1. The model then expresses the rank (dependent variable) as the sum of a constant A and a normal error term whose variance is inversely proportional to $y_k$.

Ogwang showed that G can be expressed as a function of the weighted least squares estimate of the constant A and that this can be used to speed up the calculation of the jackknife estimate for the standard error. Giles (2004) argued that the standard error of the estimate of A can be used to derive that of the estimate of G directly without using a jackknife at all. This method only requires the use of ordinary least squares regression after ordering the sample data. The results compare favorably with the estimates from the jackknife, with agreement improving with increasing sample size. The paper describing this method can be found here: http://web.uvic.ca/econ/ewp0202.pdf

However it has since been argued that this is dependent on the model’s assumptions about the error distributions (Ogwang 2004) and the independence of error terms (Reza & Gastwirth 2006) and that these assumptions are often not valid for real data sets. It may therefore be better to stick with jackknife methods such as those proposed by Yitzhaki (1991) and Karagiannis and Kovacevic (2000). The debate continues.
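To make the jackknife idea concrete, here is a minimal Python sketch of a plain delete-one jackknife standard error. It is a brute-force illustration only: it does not implement Ogwang's regression shortcut, Giles's OLS approach, or the specific estimators of Yitzhaki or Karagiannis and Kovacevic. The sample incomes and function names are made up for the example.

```python
import math


def gini(incomes):
    """Sample Gini coefficient from ranks, with the lowest income ranked 1."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(xs, start=1))
    return weighted / (n * total)


def jackknife_se(incomes):
    """Delete-one jackknife standard error of the Gini coefficient."""
    n = len(incomes)
    # Recompute the statistic n times, leaving out one observation each time.
    leave_one_out = [gini(incomes[:i] + incomes[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((g - mean) ** 2 for g in leave_one_out)
    return math.sqrt(variance)


sample = [12.0, 15.0, 18.0, 22.0, 30.0, 41.0, 55.0, 90.0]  # made-up incomes
g_hat = gini(sample)
se_hat = jackknife_se(sample)
print(f"G = {g_hat:.4f}, jackknife SE = {se_hat:.4f}")
```

Each leave-one-out estimate re-sorts the data, so this direct version becomes slow for large samples; that cost is what the regression-based shortcuts aim to avoid.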

The Gini coefficient can be calculated if you know the mean of a distribution, the number of people (or percentiles), and the income of each person (or percentile). Princeton development economist Angus Deaton (1997, 139) simplified the Gini calculation to one easy formula:

$$G = \frac{N+1}{N-1} - \frac{2}{N(N-1)\,u} \sum_{i=1}^{N} P_i X_i$$

where $u$ is the mean income of the population and $P_i$ is the income rank of person $i$, with income $X_i$, such that the richest person receives a rank of 1 and the poorest a rank of $N$. This effectively gives higher weight to poorer people in the income distribution, which allows the Gini to meet the Transfer Principle.

## Generalised inequality index

The Gini coefficient and other standard inequality indices reduce to a common form. Perfect equality (the absence of inequality) exists when and only when the inequality ratio, $r_j = x_j / \bar{x}$, equals 1 for all $j$ units in some population (for example, there is perfect income equality when everyone's income $x_j$ equals the mean income $\bar{x}$, so that $r_j = 1$ for everyone). Measures of inequality, then, are measures of the average deviations of the $r_j$ from 1; the greater the average deviation, the greater the inequality. Based on these observations the inequality indices have this common form:

$$\text{Inequality} = \sum_j p_j \, f(r_j)$$

where $p_j$ weights the units by their population share, and $f(r_j)$ is a function of the deviation of each unit's $r_j$ from 1, the point of equality. The insight of this generalised inequality index is that inequality indices differ because they employ different functions of the distance of the inequality ratios (the $r_j$) from 1.

## Gini coefficient of income distributions

While developed European nations and Canada tend to have Gini indices between 0.24 and 0.36, the United States' and Mexico's Gini indices are both above 0.40, indicating that the United States (according to the US Census Bureau) and Mexico have greater inequality. Using the Gini can help quantify differences in welfare and compensation policies and philosophies. However, it should be borne in mind that the Gini coefficient can be misleading when used to make political comparisons between large and small countries (see criticisms section).

The Gini index for the entire world has been estimated by various parties to be between 0.56 and 0.66. The graph shows the values expressed as a percentage, in their historical development for a number of countries.

## Advantages and disadvantages

### Advantages of Gini coefficient as a measure of inequality

The Gini coefficient's main advantage is that it is a measure of inequality by means of a ratio analysis. This makes it easily interpretable, and avoids references to a statistical average or position unrepresentative of most of the population, such as per capita income or gross domestic product. The simplicity of Gini makes it easy to use for comparison across diverse countries and also allows comparison of income distributions across different groups as well as countries; for example, the Gini coefficient for urban areas differs from that of rural areas in many countries (though not in the United States). Like any time-based measure, Gini coefficients can be used to compare income distribution over time; thus it is possible to see whether inequality is increasing or decreasing independently of absolute incomes. The Gini coefficient satisfies four principles suggested to be important:

- Anonymity: it does not matter who the high and low earners are.
- Scale independence: the Gini coefficient does not consider the size of the economy, the way it is measured, or whether it is a rich or poor country on average.
- Population independence: it does not matter how large the population of the country is.
- Transfer principle: if income (less than the difference) is transferred from a rich person to a poor person, the resulting distribution is more equal.
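The four principles can be checked numerically on a small example. The Python sketch below uses a rank-based sample Gini formula and a made-up income list; the helper name `gini` and all numbers are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def gini(incomes):
    """Sample Gini coefficient from ranks, with the lowest income ranked 1."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(xs, start=1))
    return weighted / (n * total)


base = [1.0, 2.0, 3.0, 10.0]  # made-up incomes
g_base = gini(base)

# Anonymity: reordering who earns what leaves the coefficient unchanged.
g_reordered = gini(list(reversed(base)))

# Scale independence: changing the currency unit changes nothing.
g_scaled = gini([1000.0 * x for x in base])

# Population independence: three copies of the same society, same coefficient.
g_replicated = gini(base * 3)

# Transfer principle: move 2 units from the richest (10) to the poorest (1).
g_after_transfer = gini([3.0, 2.0, 3.0, 8.0])

print(g_base, g_reordered, g_scaled, g_replicated, g_after_transfer)
```

The first four values agree, while the value after the transfer is smaller, as the transfer principle requires.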

### Disadvantages of Gini coefficient as a measure of inequality

The limitations of Gini largely lie in its relative nature: it loses information about absolute national and personal incomes. Countries may have identical Gini coefficients, but differ greatly in wealth. Basic necessities may be equal (available to all) in a rich country, while in a poor country, even basic necessities are unequally available. In addition, Gini does not address causes: income inequality may reflect differences in opportunity or capability. For example, some countries may have a social class structure that presents barriers to upward mobility; some people may have more skills than others.

By measuring inequality in income, the Gini ignores the differential efficiency of use of household income. By ignoring wealth (except as it contributes to income) the Gini can create the appearance of inequality when the people compared are at different stages in their life. Wealthy countries (e.g. Sweden) can appear more equal, yet have high Gini coefficients for wealth (for instance 77% of the share value owned by households is held by just 5% of Swedish shareholding households). These factors are not assessed in income-based Gini.

Gini has some mathematical limitations as well. For instance, different sets of people cannot be averaged to obtain the Gini coefficient of all the people in the sets: if a Gini coefficient were to be calculated for each person it would always be zero. For a large, economically diverse country, a much higher coefficient will be calculated for the country as a whole than will be calculated for each of its regions. (The coefficient is usually applied to measurable nominal income rather than local purchasing power, tending to increase the calculated coefficient across larger areas.)
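The point that regional coefficients cannot simply be averaged can be illustrated with a deliberately extreme, made-up example in Python: two regions that are each perfectly equal internally but have different income levels. The region data and the helper name `gini` are assumptions for illustration.

```python
def gini(incomes):
    """Sample Gini coefficient from ranks, with the lowest income ranked 1."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(xs, start=1))
    return weighted / (n * total)


region_a = [10.0] * 5  # everyone in region A earns 10
region_b = [30.0] * 5  # everyone in region B earns 30

g_a = gini(region_a)
g_b = gini(region_b)
g_average = (g_a + g_b) / 2.0
g_country = gini(region_a + region_b)

print(g_a, g_b, g_average, g_country)
```

Both regional coefficients are zero, so their average is zero, yet the coefficient for the combined population is 0.25 because of the gap between the regions.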

As is the case for any single measure of a distribution, economies with similar incomes and Gini coefficients can still have very different income distributions. This results from differing shapes of the Lorenz curve. For example, consider a society where half of individuals had no income and the other half shared all the income equally (i.e. whose Lorenz curve is linear from (0,0) to (0.5,0) and then linear to (1,1)). As is easily calculated, this society has Gini coefficient 0.5, the same as that of a society in which 75% of people equally shared 25% of income while the remaining 25% equally shared 75% (i.e. whose Lorenz curve is linear from (0,0) to (0.75,0.25) and then linear to (1,1)).
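The calculation for the two societies can be written out as a short Python sketch. It takes the Gini coefficient as one minus twice the area under the Lorenz curve, and because both curves are piecewise linear the trapezoid areas are exact. The function name is an illustrative choice.

```python
def gini_from_lorenz(points):
    """Gini from (population share, income share) points joined by straight lines."""
    area = 0.0
    for (p0, l0), (p1, l1) in zip(points, points[1:]):
        area += (p1 - p0) * (l0 + l1) / 2.0  # trapezoid under one segment
    return 1.0 - 2.0 * area


# Half the population has nothing; the other half shares everything equally.
society_a = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0)]

# 75% of people share 25% of income; the remaining 25% share 75%.
society_b = [(0.0, 0.0), (0.75, 0.25), (1.0, 1.0)]

gini_a = gini_from_lorenz(society_a)
gini_b = gini_from_lorenz(society_b)
print(gini_a, gini_b)
```

Both calls give 0.5, although the two Lorenz curves bend at different points.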

Too often only the Gini coefficient is quoted without describing the proportions of the quantiles used for measurement. As with other inequality coefficients, the Gini coefficient is influenced by the granularity of the measurements. For example, five 20% quantiles (low granularity) will usually yield a lower Gini coefficient than twenty 5% quantiles (high granularity) taken from the same distribution. This is an often encountered problem with measurements.
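The effect of granularity can be seen in a small Python sketch. It uses a made-up, deterministic income list, replaces each person's income by the mean of their quantile group, and compares the coefficient for 5 groups, 20 groups, and the ungrouped data. The function names and data are assumptions for illustration, and the grouping helper assumes the population size is divisible by the number of groups.

```python
def gini(incomes):
    """Sample Gini coefficient from ranks, with the lowest income ranked 1."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(xs, start=1))
    return weighted / (n * total)


def grouped_gini(incomes, groups):
    """Gini computed from the mean incomes of equal-sized quantile groups."""
    xs = sorted(incomes)
    size = len(xs) // groups  # assumes len(xs) is divisible by groups
    means = [sum(xs[k * size:(k + 1) * size]) / size for k in range(groups)]
    return gini(means)


incomes = [float(i * i) for i in range(1, 1001)]  # made-up skewed incomes

g_quintiles = grouped_gini(incomes, 5)
g_twentieths = grouped_gini(incomes, 20)
g_individual = gini(incomes)
print(g_quintiles, g_twentieths, g_individual)
```

Grouping discards the inequality within each group, so the coarser the grouping, the lower the reported coefficient.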

Care should be taken in using the Gini coefficient as a measure of egalitarianism, as it is properly a measure of income dispersion. For example, if two equally egalitarian countries pursue different immigration policies, the country accepting a higher proportion of low-income or impoverished migrants will be assessed as less equal (gain a higher Gini coefficient).

Expanding on the importance of life-span measures, the Gini coefficient, as a point estimate of equality at a certain time, ignores life-span changes in income. Typically, increases in the proportion of young or old members of a society will drive apparent changes in equality, simply because people generally have lower incomes and wealth when they are young than when they are old. Because of this, factors such as age distribution within a population and mobility within income classes can create the appearance of differential equality when none exists once demographic effects are taken into account. Thus a given economy may have a higher Gini coefficient at any one point in time than another, while the Gini coefficient calculated over individuals' lifetime income is actually lower than that of the apparently more equal (at a given point in time) economy. Essentially, what matters is not just inequality in any particular year, but the composition of the distribution over time.

#### General problems of measurement

- Comparing income distributions among countries may be difficult because benefits systems may differ. For example, some countries give benefits in the form of money while others give food stamps, which might not be counted by some economists and researchers as income in the Lorenz curve and therefore not taken into account in the Gini coefficient. Income in the United States is counted before benefits, while in France it is counted after benefits, which may lead the United States to appear somewhat more unequal vis-a-vis France. In another example, the Soviet Union was measured to have relatively high income inequality: by some estimates, in the late 1970s, the Gini coefficient of its urban population was as high as 0.38, which is higher than that of many Western countries today. This number would not reflect those benefits received by Soviet citizens that were not monetized for measurement, which may include child care for children as young as two months, elementary, secondary and higher education, cradle-to-grave medical care, and heavily subsidized or provided housing. In this example, a more accurate comparison between the 1970s Soviet Union and Western countries may require one to assign monetary values to all benefits, a difficult task in the absence of free markets. Similar problems arise whenever a comparison between more liberalized economies and partially socialist economies is attempted. Benefits may take various and unexpected forms: for example, major oil producers such as Venezuela and Iran provide indirect benefits to their citizens by subsidizing the retail price of gasoline.

- Similarly, in some societies people may have significant income in other forms than money, for example through subsistence farming or bartering. Like non-monetary benefits, the value of these incomes is difficult to quantify. Different quantifications of these incomes will yield different Gini coefficients.

- The measure will give different results when applied to individuals instead of households. When different populations are not measured with consistent definitions, comparison is not meaningful.

- As for all statistics, there may be systematic and random errors in the data. The meaning of the Gini coefficient decreases as the data become less accurate. Also, countries may collect data differently, making it difficult to compare statistics between countries.

As one result of this criticism, entropy measures (e.g. the Theil index and the Atkinson index) are frequently used in addition to or in competition with the Gini coefficient. These measures attempt to compare the distribution of resources by intelligent agents in the market with a maximum entropy random distribution, which would occur if these agents acted like non-intelligent particles in a closed system following the laws of statistical physics.

#### Credit risk

The Gini coefficient is also commonly used for the measurement of the discriminatory power of rating systems in credit risk management.

The discriminatory power refers to a credit risk model's ability to differentiate between defaulting and non-defaulting clients. The above formula may be used for the final model and also at individual model factor level, to quantify the discriminatory power of individual factors. This is as a result of too many non-defaulting clients falling into the lower points scale, e.g. a factor has a 10-point scale and 30% of non-defaulting clients are assigned the lowest points available (e.g. 0 or negative points). This indicates that the factor is behaving in a counter-intuitive manner and would require further investigation at the model development stage.
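One common convention in credit scoring derives the Gini coefficient from ROC analysis as G = 2 · AUC - 1, where AUC is the probability that a randomly chosen non-defaulting client scores higher than a randomly chosen defaulting client. The Python sketch below follows that convention; it may not be the exact formula the passage refers to. It assumes that higher scores mean better credit quality, and the scores and function names are made up.

```python
def auc(default_scores, good_scores):
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    correct = 0.0
    for d in default_scores:
        for g in good_scores:
            if d < g:
                correct += 1.0
            elif d == g:
                correct += 0.5
    return correct / (len(default_scores) * len(good_scores))


def gini_discrimination(default_scores, good_scores):
    """Gini coefficient of a scoring factor under the 2 * AUC - 1 convention."""
    return 2.0 * auc(default_scores, good_scores) - 1.0


# Made-up factor scores on a points scale.
g_perfect = gini_discrimination([1, 2, 3], [4, 5, 6])    # full separation
g_useless = gini_discrimination([3, 3, 3], [3, 3, 3])    # no information
g_inverted = gini_discrimination([4, 5, 6], [1, 2, 3])   # counter-intuitive factor
print(g_perfect, g_useless, g_inverted)
```

Under this convention a value near 1 indicates strong discrimination, a value near 0 indicates none, and a negative value indicates that the factor ranks clients in the opposite direction to the one expected.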

## Other uses

Although the Gini coefficient is most popular in economics, it can in theory be applied in any field of science that studies a distribution. For example, in ecology the Gini coefficient has been used as a measure of biodiversity, where the cumulative proportion of species is plotted against cumulative proportion of individuals. In health, it has been used as a measure of the inequality of health-related quality of life in a population. In education, it has been used as a measure of the inequality of universities. In chemistry it has been used to express the selectivity of protein kinase inhibitors against a panel of kinases. In engineering, it has been used to evaluate the fairness achieved by Internet routers in scheduling packet transmissions from different flows of traffic. In statistics, when building decision trees, it is used to measure the purity of possible child nodes, with the aim of maximising the average purity of two child nodes when splitting, and it has been compared with other equality measures.

## See also

- Human Poverty Index
- Pareto distribution
- Robin Hood index
- ROC analysis

- Social welfare provision
- The Spirit Level
- Suits index
- Welfare economics
- Economic inequality

- List of countries by distribution of wealth
- List of U.S. states by income equality

## Further reading

- Gini, Corrado (1912). "Variabilità e mutabilità" Reprinted in Memorie di metodologica statistica (Ed. Pizetti E, Salvemini, T). Rome: Libreria Eredi Virgilio Veschi (1955).
- Giorgi, G. M. (1990). A bibliographic portrait of the Gini ratio, Metron, 48, 183-231. The Chinese version of this paper appears in

## External links

- Deutsche Bundesbank: Do banks diversify loan portfolios?, 2005 (on using e.g. the Gini coefficient for risk evaluation of loan portfolios)
- Forbes Article, In praise of inequality
- Measuring Software Project Risk With The Gini Coefficient, an application of the Gini coefficient to software
- The World Bank: Measuring Inequality
- Travis Hale, University of Texas Inequality Project: The Theoretical Basics of Popular Inequality Measures, online computation of examples: 1A, 1B
- Article from The Guardian analysing inequality in the UK 1974 - 2006
- World Income Inequality Database
- Income Distribution and Poverty in OECD Countries
- Software:
- A Matlab Inequality Package, including code for computing Gini, Atkinson, Theil indexes and for plotting the Lorenz Curve. Many examples are available.
- Free Online Calculator computes the Gini Coefficient, plots the Lorenz curve, and computes many other measures of concentration for any dataset
- Free Calculator: Online and downloadable scripts (Python and Lua) for Atkinson, Gini, and Hoover inequalities
- Users of the R data analysis software can install the "ineq" package which allows for computation of a variety of inequality indices including Gini, Atkinson, Theil.
- LORENZ 3.0 is a Mathematica notebook available from http://pure.au.dk/portal/en/cfd@dmu.dk which draws sample Lorenz curves and calculates Gini coefficients and Lorenz asymmetry coefficients from data in an Excel sheet Lorenz_for_download_2.zip.