NIST (metric) - AbsoluteAstronomy.com

NIST is a method for evaluating the quality of text that has been translated using machine translation. Its name comes from the US National Institute of Standards and Technology.

It is based on the BLEU metric, but with some alterations. Where BLEU simply calculates n-gram precision, giving equal weight to each n-gram, NIST also calculates how informative a particular n-gram is. That is, when a correct n-gram is found, the rarer that n-gram is, the more weight it is given.

For example, a correct match of the bigram "on the" receives a lower weight than a correct match of the bigram "interesting calculations", because the latter is less likely to occur.
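The weighting idea can be illustrated with a short Python sketch. It assumes the information weight usually given for the metric (not stated in this article): the base-2 logarithm of the count of an n-gram's (n-1)-word prefix divided by the count of the full n-gram, both counted over the reference translations, with the total word count standing in as the prefix count for unigrams. The function names and the toy reference sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def info_weights(reference_sentences, max_n=2):
    """Compute an information weight for every n-gram in the references.

    Assumed formula: info(w1..wn) = log2(count(w1..w(n-1)) / count(w1..wn)).
    For unigrams the numerator is the total number of reference words.
    """
    counts = Counter()
    total_words = 0
    for sentence in reference_sentences:
        tokens = sentence.split()
        total_words += len(tokens)
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))

    weights = {}
    for gram, count in counts.items():
        prefix_count = counts[gram[:-1]] if len(gram) > 1 else total_words
        weights[gram] = math.log2(prefix_count / count)
    return weights


# Toy reference set, invented purely for illustration.
REFS = [
    "the cat sat on the mat",
    "she put it on the table",
    "he did interesting calculations on a train",
    "they found interesting results",
]

weights = info_weights(REFS)
print(weights[("on", "the")])
print(weights[("interesting", "calculations")])
```

In this toy data "on" is followed by "the" in two of its three occurrences, while "interesting" is followed by "calculations" in only one of two, so the second bigram receives the larger weight.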

NIST also differs from BLEU in its calculation of the brevity penalty, insofar as small variations in translation length do not impact the overall score as much.
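The difference can be made concrete with a minimal Python sketch that compares the two penalties. It assumes the commonly cited forms, neither of which is given in this article: BLEU uses exp(1 - r/c) when the candidate length c is shorter than the reference length r, and NIST uses exp(beta * ln(min(c/r, 1))^2), with beta chosen so that the penalty equals 0.5 when the candidate is two thirds of the average reference length. The function names are hypothetical.

```python
import math


def bleu_brevity_penalty(sys_len, ref_len):
    """BLEU-style penalty (assumed form): exp(1 - r/c) for short output, else 1."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)


# Assumed constant: makes the NIST penalty 0.5 at a length ratio of 2/3.
BETA = math.log(0.5) / math.log(1.5) ** 2


def nist_brevity_penalty(sys_len, ref_len):
    """NIST-style penalty (assumed form): exp(BETA * ln(min(c/r, 1))**2).

    ref_len is taken to be the average reference length.
    """
    ratio = min(sys_len / ref_len, 1.0)
    return math.exp(BETA * math.log(ratio) ** 2)


# Compare the penalties for a candidate that is slightly or much too short.
for sys_len in (100, 95, 90, 67):
    print(sys_len,
          round(bleu_brevity_penalty(sys_len, 100), 4),
          round(nist_brevity_penalty(sys_len, 100), 4))
```

Under these assumed forms, a candidate that is 5% shorter than the reference is penalized less by the NIST formula than by the BLEU formula, which matches the behaviour described above.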

See also

BLEU
F-Measure
METEOR
ROUGE (metric)
Word Error Rate (WER)
Noun-Phrase Chunking

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
