H-index
The h-index is an index that attempts to measure both the productivity and impact of the published work of a scientist or scholar. The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other publications. The index can also be applied to the productivity and impact of a group of scientists, such as a department, university, or country. The index was suggested by Jorge E. Hirsch, a physicist at UCSD, as a tool for determining theoretical physicists' relative quality and is sometimes called the Hirsch index or Hirsch number.

Definition and purpose

The index is based on the distribution of citations received by a given researcher's publications. Hirsch writes:
A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Np − h) papers have no more than h citations each.


In other words, a scholar with an index of h has published h papers each of which has been cited in other papers at least h times. Thus, the h-index reflects both the number of publications and the number of citations per publication. The index is designed to improve upon simpler measures such as the total number of citations or publications. The index works properly only for comparing scientists working in the same field; citation conventions differ widely among different fields.
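As an illustration, the definition above can be turned into a short computation: sort the citation counts in descending order and find the largest rank h at which the paper at that rank still has at least h citations. The following Python sketch assumes the citation counts are already available as a plain list; the function name h_index is chosen here for illustration and does not come from Hirsch's paper.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            # This paper (and every later one) has too few citations.
            break
        h = rank
    return h


if __name__ == "__main__":
    # Hypothetical citation counts for five papers.
    print(h_index([10, 8, 5, 4, 3]))
```

In this hypothetical example four papers have at least four citations each, but there are not five papers with at least five, so the index is 4. A single very highly cited paper, as in [25, 8, 5, 3, 3], does not raise the index beyond 3.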

The h-index serves as an alternative to more traditional journal impact factor metrics in the evaluation of the impact of the work of a particular researcher. Because only the most highly cited articles contribute to the h-index, its determination is a relatively simple process. Hirsch has demonstrated that h has high predictive value for whether a scientist has won honors like National Academy membership or the Nobel Prize. In physics, a moderately productive scientist should have an h equal to the number of years of service, while biomedical scientists tend to have higher values. The h-index grows as citations accumulate and thus it depends on the 'academic age' of a researcher.

Hirsch suggested (with large error bars) that, for physicists, a value for h of about 12 might be typical for advancement to tenure (associate professor) at major research universities. A value of about 18 could mean a full professorship, 15–20 could mean a fellowship in the American Physical Society, and 45 or higher could mean membership in the United States National Academy of Sciences. Little systematic investigation has been made on how academic recognition correlates with h-index over different institutions, nations and fields of study.

Among the 22 scientific disciplines listed in the Thomson Reuters Essential Science Indicators Citation Thresholds, physics has the second most citations after space science. During the period January 1, 2000 to February 28, 2010, a physicist had to receive 2073 citations to be among the most cited 1% of physicists in the world. The threshold for Space Science is the highest (2236 citations), and Physics is followed by Clinical Medicine (1390) and Molecular Biology & Genetics (1229). Most disciplines, such as Environment/Ecology (390), have fewer scientists, fewer papers, and fewer citations. Therefore, these disciplines have lower citation thresholds in the Essential Science Indicators, with the lowest citation thresholds observed in Social Sciences (154), Computer Science (149), and Multidisciplinary Sciences (147).

Calculation

The h-index can be manually determined using citation databases or using automatic tools. Subscription-based databases such as Scopus and the Web of Knowledge provide automated calculators. Harzing's Publish or Perish program calculates the h-index based on Google Scholar entries. In July 2011 Google trialled a tool which allows a limited number of scholars to keep track of their own citations and also produces an h-index and an i10-index. Each database is likely to produce a different h for the same scholar, because of different coverage: Google Scholar has more citations than Scopus and Web of Science, but the smaller citation collections tend to be more accurate. In addition, specific databases, such as the Stanford Physics Information Retrieval System (SPIRES), can automatically calculate the h-index for researchers working in High Energy Physics.

The topic has been studied in detail by Lokman I. Meho and Kiduk Yang. Web of Knowledge was found to have strong coverage of journal publications, but poor coverage of high impact conferences. Scopus has better coverage of conferences, but poor coverage of publications prior to 1996; Google Scholar has the best coverage of conferences and most journals (though not all), but like Scopus has limited coverage of pre-1990 publications. The exclusion of conference preprints is a problem for scholars in computer science, where conference preprints are considered an important part of the literature, but reflects common practice in most scientific fields where conference preprints are unrefereed and are accorded less weight in evaluating academic productivity. The Scopus and Web of Knowledge calculations also fail to count the citations that a publication gathers while 'in press' (i.e. after being accepted for publication but before being printed on paper); with electronic pre-publication and very long printing lags for some journals, these 'in press' citations can be considerable. Google Scholar has been criticized for producing "phantom citations," including gray literature in its citation counts, and failing to follow the rules of Boolean logic when combining search terms. For example, the Meho and Yang study found that Google Scholar identified 53% more citations than Web of Knowledge and Scopus combined, but noted that because most of the additional citations reported by Google Scholar were from low-impact journals or conference proceedings, they did not significantly alter the relative ranking of the individuals. To deal with the sometimes wide variation in h for a single academic measured across the possible citation databases, it has been suggested that one should assume false negatives in the databases are more problematic than false positives and take the maximum h measured for an academic.

Advantages

Hirsch intended the h-index to address the main disadvantages of other bibliometric indicators, such as total number of papers or total number of citations. Total number of papers does not account for the quality of scientific publications, while total number of citations can be disproportionately affected by participation in a single publication of major influence (for instance, methodological papers proposing successful new techniques, methods or approximations, which can generate a large number of citations), or having many publications with few citations each. The h-index is intended to measure simultaneously the quality and quantity of scientific output.

Criticism

There are a number of situations in which h may provide misleading information about a scientist's output (most of these, however, are not exclusive to the h-index):
  • The h-index does not account for the number of authors of a paper. In the original paper, Hirsch suggested partitioning citations among co-authors. Even in the absence of explicit gaming, the h-index and similar indexes tend to favor fields with larger groups, e.g. experimental over theoretical.
  • The h-index does not account for the typical number of citations in different fields. Different fields, or journals, traditionally use different numbers of citations.
  • The h-index discards the information contained in author placement in the authors' list, which in some scientific fields (but not in high energy physics, where Hirsch works) is significant.
  • The h-index is bounded by the total number of publications. This means that scientists with a short career are at an inherent disadvantage, regardless of the importance of their discoveries. For example, Évariste Galois' h-index is 2, and will remain so forever. Had Albert Einstein died after publishing his four groundbreaking Annus Mirabilis papers in 1905, his h-index would be stuck at 4 or 5. This is also a problem for any measure that relies on the number of publications. However, as Hirsch indicated in the original paper, the index is intended as a tool to evaluate researchers in the same stage of their careers. It is not meant as a tool for historical comparisons.
  • The h-index does not consider the context of citations. For example, citations in a paper are often made simply to flesh out an introduction, having no other significance to the work. h also does not resolve other contextual instances: citations made in a negative context and citations made to fraudulent or retracted work. This is also a problem for regular citation counts.
  • The h-index gives books the same count as articles, making it difficult to compare scholars in fields that are more book-oriented, such as the humanities.
  • The h-index does not account for confounding factors such as "gratuitous authorship", the so-called Matthew effect, and the favorable citation bias associated with review articles. Again, this is a problem for all other metrics using publications or citations.
  • The h-index has been found to have slightly less predictive accuracy and precision than the simpler measure of mean citations per paper. However, this finding was contradicted by another study.
  • The h-index is a natural number, which reduces its discriminatory power. Ruane and Tol therefore propose a rational h-index that interpolates between h and h+1.
  • The h-index can be manipulated through self-citations, and if based on Google Scholar output, then even computer-generated documents can be used for that purpose, e.g. using SCIgen.
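
The rational h-index mentioned in the list above can be sketched in code. The text does not give the formula, so the one used here is an assumption: h_rat = h + 1 − n/(2h + 1), where n is the smallest number of additional citations that would raise the index from h to h + 1, and 2h + 1 is the largest value n can take. The function names are illustrative only.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return h


def rational_h_index(citations):
    """Interpolate between h and h + 1 (assumed formula, see lead-in)."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    target = h + 1
    # The h + 1 most cited papers; a paper that does not exist yet counts as 0.
    top = ranked[:target]
    top += [0] * (target - len(top))
    # Citations still missing before all of them reach h + 1 citations.
    missing = sum(max(0, target - count) for count in top)
    return target - missing / (2 * h + 1)


if __name__ == "__main__":
    # Hypothetical record: h = 4, and three more citations would give h = 5.
    print(round(rational_h_index([10, 8, 5, 4, 3]), 3))
```

Under this assumed formula the result always lies in the interval from h up to, but not including, h + 1, so two researchers with the same h can still be distinguished by how close each is to the next integer value.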

Alternatives and modifications

Various proposals to modify the h-index in order to emphasize different features have been made:
  • An individual h-index normalized by the average number of co-authors in the h-core has been introduced by Batista et al. They also found that the distribution of the h-index, although it depends on the field, can be normalized by a simple rescaling factor. For example, taking the h values for biology as the standard, the distribution of h for mathematics collapses onto it if h is multiplied by three; that is, a mathematician with h = 3 is equivalent to a biologist with h = 9. This method has not been readily adopted, perhaps because of its complexity. It might be simpler to divide citation counts by the number of authors before ordering the papers and obtaining the h-index, as originally suggested by Hirsch.
  • The m-index is defined as h/n, where n is the number of years since the first published paper of the scientist; also called m-quotient.
  • A generalization of the h-index and some other indices that gives additional information about the shape of the author's citation function (heavy-tailed, flat/peaked, etc.) was proposed by Gągolewski and Grzegorzewski.
  • A successive Hirsch-type index was introduced independently by Kosmulski and Prathap. A scientific institution has a successive Hirsch-type index of i when at least i researchers from that institution have an h-index of at least i.
  • K. Dixit and colleagues argue that "For an individual researcher, a measure such as Erdős number captures the structural properties of network whereas the h-index captures the citation impact of the publications. One can be easily convinced that ranking in coauthorship networks should take into account both measures to generate a realistic and acceptable ranking." Several author ranking systems such as eigenfactor (based on eigenvector centrality) have been proposed already, for instance the Phys Author Rank Algorithm.
  • The c-index accounts not only for the citations but for the quality of the citations in terms of the collaboration distance between citing and cited authors. A scientist has c-index n if n of [his/her] N citations are from authors which are at collaboration distance at least n, and the other (N − n) citations are from authors which are at collaboration distance at most n.
  • Bornmann, Mutz, and Daniel recently proposed three additional metrics, h2lower, h2center, and h2upper, to give a more accurate representation of the distribution shape. The three h2 metrics measure the relative area within a scientist's citation distribution in the low impact area, h2lower, the area captured by the h-index, h2center, and the area from publications with the highest visibility, h2upper. Scientists with high h2upper percentages are perfectionists, whereas scientists with high h2lower percentages are mass producers. As these metrics are percentages, they are intended to give a qualitative description to supplement the quantitative h-index.
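
Three of the simpler variants in the list above can be sketched in a few lines of Python: the m-index (h divided by the number of years since the first publication), an author-adjusted h-index obtained by dividing each paper's citation count by its number of authors before ranking, and the successive Hirsch-type index of an institution. The function names and the sample data are hypothetical and serve only as illustration.

```python
def h_index(counts):
    """Largest h such that h items have a value of at least h each."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, value in enumerate(ranked, start=1):
        if value < rank:
            break
        h = rank
    return h


def m_index(h, years_since_first_paper):
    """m-index (m-quotient): h divided by years since the first published paper."""
    if years_since_first_paper <= 0:
        raise ValueError("years_since_first_paper must be positive")
    return h / years_since_first_paper


def author_adjusted_h_index(papers):
    """h-index after dividing each citation count by the paper's author count.

    `papers` is a list of (citations, number_of_authors) pairs.
    """
    return h_index([cites / authors for cites, authors in papers])


def successive_h_index(researcher_h_values):
    """Institution-level index: i researchers each have an h-index of at least i."""
    return h_index(researcher_h_values)


if __name__ == "__main__":
    print(m_index(12, 6))
    print(author_adjusted_h_index([(10, 2), (9, 3), (4, 1), (2, 2)]))
    print(successive_h_index([5, 3, 3, 1]))
```

Because the successive index applies the same rule to researchers' h values that the h-index applies to citation counts, it can reuse the same helper.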

See also

  • Bibliometrics
  • Impact factor
  • g-index
  • h-b index
  • Eddington number (cycling), an earlier metric of the same form
  • Durfee square, a quantity defined in the same way for integer partitions

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 