Metric tree
Encyclopedia
A metric tree is any tree
Tree (data structure)
In computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is an ordered directed tree, more specifically an arborescence: an acyclic connected graph where each node has zero or more children nodes and at...

 data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

 specialized to index data in metric space
Metric space
In mathematics, a metric space is a set where a notion of distance between elements of the set is defined.The metric space which most closely corresponds to our intuitive understanding of space is the 3-dimensional Euclidean space...

s. Metric trees exploit properties of metric spaces such as the triangle inequality
Triangle inequality
In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side ....

 to make accesses to the data more efficient. Examples include the M-tree
M-tree
M-trees are tree data structures that are similar to R-trees and B-trees. It is constructed using a metric and relies on the triangle inequality for efficient range and k-NN queries....

, vp-tree
Vp-tree
A vantage-point tree, or vp-tree is a BSP tree that segregates data in a metric space by choosing a position in the space and dividing the data points into two partitions: those that are nearer to the vantage point than a threshold, and those that are not...

s, cover tree
Cover tree
The cover tree is a type of data structure in computer science that is specifically designed to facilitate the speed-up of a nearest neighbor search...

s, MVP Tree
MVP Tree
An MVP tree is an abstract data structure for indexing objects from large metric spaces for similarity search queries....

s, and bk tree
Bk tree
A BK-tree is a metric tree suggested by Walter Austin Burkhard and Robert M. Keller specifically adapted to discrete metric spaces.For simplicity, let us consider integer discrete metric d. Then, BK-tree is defined in the following way. An arbitrary element a is selected as root node. The root...

s.

Multidimensional search

Most algorithms and data structures for searching a dataset are based on the classical binary search algorithm, and generalizations such as the k-d tree or range tree
Range tree
In computer science, a range tree is an ordered tree data structure to hold a list of points. It allows all points within a given range to be efficiently retrieved, and is typically used in two or higher dimensions....

 work by interleaving the binary search algorithm over the separate coordinates and treating each spatial coordinate as an independent search constraint. These data structures are well-suited for range query
Range query
A range query is a common database operation that retrieves all records where some value is between an upper and lower boundary. For example, list all employees with 3 to 5 years experience. Range queries are unusual because it is not generally known in advance how many entries a range query will...

 problems asking for every point that satisfies and .

A limitation of these multidimensional search structures is that they are only defined for searching over objects that can be treated as vectors. They aren't applicable for the more general case in which the algorithm is given only a collection of objects and a function for measuring the distance or similarity between two objects. If, for example, someone were to create a function that returns a value indicating how similar one image is to another, a natural algorithmic problem would be to take a dataset of images and find the ones that are similar according to the function to a given query image.

Metric data structures

If there is no structure to the similarity measure then a brute force search requiring the comparison of the query image to every image in the dataset is the best that can be done. If, however, the similarity function satisfies the triangle inequality
Triangle inequality
In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side ....

 then it is possible to use the result of each comparison to prune the set of candidates to be examined.

The first article on metric trees, as well as the first use of the term "metric tree", published in the open literature was by Jeffrey Uhlmann
Jeffrey Uhlmann
Jeffrey Uhlmann is an American research scientist who is probably best known for his mathematical generalizations of the Kalman filter. Most of his publications and patents have been in the field of data fusion...

 in 1991. Other researchers were working independently on similar data structures, and research on metric tree data structures blossomed in the late 1990s and included an examination by Google co-founder Sergey Brin
Sergey Brin
Sergey Mikhaylovich Brin is a Russian-born American computer scientist and internet entrepreneur who, with Larry Page, co-founded Google, one of the largest internet companies. , his personal wealth is estimated to be $16.7 billion....

of their use for very large databases. The first textbook on metric data structures was published in 2006.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK