Head-driven phrase structure grammar
Head-driven phrase structure grammar (HPSG) is a highly lexicalized, non-derivational generative grammar theory developed by Carl Pollard and Ivan Sag. It is the immediate successor to generalized phrase structure grammar. HPSG draws from other fields such as computer science (data type theory and knowledge representation) and uses Ferdinand de Saussure's notion of the sign. It uses a uniform formalism and is organized in a modular way, which makes it attractive for natural language processing.

An HPSG grammar includes principles, grammar rules, and lexicon entries, the last of which are normally not considered to belong to a grammar. The formalism is based on lexicalism. This means that the lexicon is more than just a list of entries; it is in itself richly structured. Individual entries are marked with types, and the types form a hierarchy. Early versions of the grammar were very lexicalized, with few grammatical rules (schemata). More recent research has tended to add more and richer rules, making the theory more like Construction Grammar.

The basic type HPSG deals with is the sign. Words and phrases are two different subtypes of sign. A word has two features: [PHON] (the sound, the phonetic form) and [SYNSEM] (the syntactic and semantic information), both of which are split into subfeatures. Signs and rules are formalized as typed feature structures.
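
The relationship between the sign type and its two subtypes can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names TYPE_PARENT, is_subtype and make_sign are hypothetical, the hierarchy holds just the three types named above, and a feature structure is modelled as a plain dictionary rather than in any real HPSG system's notation.

```python
# Hypothetical, minimal type hierarchy: each type maps to its parent type.
TYPE_PARENT = {"sign": None, "word": "sign", "phrase": "sign"}


def is_subtype(t, ancestor):
    """Return True if type t equals ancestor or inherits from it."""
    while t is not None:
        if t == ancestor:
            return True
        t = TYPE_PARENT.get(t)
    return False


def make_sign(sign_type, phon, synsem):
    """Build a typed feature structure as a plain dict (illustrative only)."""
    if not is_subtype(sign_type, "sign"):
        raise ValueError(f"{sign_type!r} is not a subtype of sign")
    return {"TYPE": sign_type, "PHON": phon, "SYNSEM": synsem}


# A word is a sign with a PHON value and a SYNSEM value.
walks = make_sign("word", ["walks"], {"HEAD": "verb"})
```

Real grammars use far larger hierarchies, often with multiple inheritance, which this single-parent table does not attempt to capture.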

A Sample Grammar

HPSG generates strings by combining signs, which are defined by their location within a type hierarchy and by their internal feature structure, represented by attribute-value matrices (AVMs).
Features take types or lists of types as their values, and these values may in turn have their own feature structure. Grammatical rules are largely expressed through the constraints signs place on one another. A sign's feature structure describes its phonological, syntactic, and semantic properties. In common notation, AVMs are written with features in upper case and types in italicized lower case. Numbered indices in an AVM represent token-identical values, that is, a single shared value rather than two values that merely look alike.

In the simplified AVM for the word "walks" below, the verb's categorical information is divided into features that describe it (HEAD) and features that describe its arguments (VALENCE).
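
A rough plain-data rendering of such an AVM is sketched here as nested Python dictionaries. It is an approximation under stated assumptions, not the notation of any particular HPSG textbook or system: the feature names CATEGORY, COMPS, INDEX, PER, NUM, RELN and WALKER are chosen for illustration, and token identity (a numbered index in an AVM) is modelled by reusing one Python object in two places.

```python
# Shared index: the same Python object stands in for a numbered tag such as [1].
# PER and NUM are assumed agreement features for "third person singular".
index_1 = {"PER": "3rd", "NUM": "sing"}

walks = {
    "TYPE": "word",
    "PHON": ["walks"],
    "SYNSEM": {
        "CATEGORY": {
            # HEAD: features that describe the verb itself.
            "HEAD": {"TYPE": "verb"},
            # VALENCE: features that describe the verb's arguments.
            "VALENCE": {
                "SUBJ": [
                    {
                        "CATEGORY": {"HEAD": {"TYPE": "noun"}},
                        "CONTENT": {"INDEX": index_1},
                    }
                ],
                "COMPS": [],  # intransitive: no complements
            },
        },
        # The walker is the very same index as the subject's, not a copy.
        "CONTENT": {"RELN": "walk", "WALKER": index_1},
    },
}

subject_index = walks["SYNSEM"]["CATEGORY"]["VALENCE"]["SUBJ"][0]["CONTENT"]["INDEX"]
walker = walks["SYNSEM"]["CONTENT"]["WALKER"]
same_token = subject_index is walker  # identity check, not mere equality
```

Using `is` rather than `==` in the last line is the point of the sketch: two equal but separate dictionaries would be type-identical values, whereas a numbered index asserts that there is only one value.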

"Walks" is a sign of type word with a head of type verb. As an intransitive verb, "walks" has no complement but requires a subject that is a third person singular noun. The semantic value of the subject (CONTENT) is co-indexed with the verb's only argument (the individual doing the walking). The following AVM for "she" represents a sign with a SYNSEM value that could fulfill those requirements.



Signs of type phrase unify with one or more children and propagate information upward. The following AVM encodes the immediate dominance rule for a head-subj-phrase, which requires two children: the head child (a verb) and a non-head child that fulfills the verb's SUBJ constraints.



The end result is a sign with a verb head, empty subcategorization features, and a phonological value that orders the two children.
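
The combination just described can be approximated with a small unification routine. The sketch below is a deliberate simplification and rests on several assumptions: feature structures are flattened nested dictionaries, the AGR feature and the function names unify and head_subj_phrase are hypothetical, and neither the type hierarchy nor token identity is tracked, so it illustrates the idea rather than the formalism itself.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts, lists, or atoms.

    Returns the merged structure, or None when the two values conflict.
    Simplification: no type hierarchy and no reentrancy tracking.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = b_val
        return result
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return None
        merged = [unify(x, y) for x, y in zip(a, b)]
        return None if any(m is None for m in merged) else merged
    return a if a == b else None


def head_subj_phrase(subj, head):
    """Combine a subject sign with a verbal head sign, or return None."""
    valence = head["SYNSEM"]["VALENCE"]
    if head["SYNSEM"]["HEAD"] != "verb" or len(valence["SUBJ"]) != 1:
        return None
    # The non-head child must satisfy the head's SUBJ constraint.
    if unify(valence["SUBJ"][0], subj["SYNSEM"]) is None:
        return None
    return {
        "TYPE": "phrase",
        "PHON": subj["PHON"] + head["PHON"],  # subject precedes the head
        "SYNSEM": {
            "HEAD": head["SYNSEM"]["HEAD"],  # head information propagates upward
            "VALENCE": {"SUBJ": [], "COMPS": valence["COMPS"]},
        },
    }


def pronoun(phon, number):
    """Build a toy third-person pronoun sign (illustrative lexicon entry)."""
    return {
        "TYPE": "word",
        "PHON": [phon],
        "SYNSEM": {
            "HEAD": "noun",
            "AGR": {"PER": "3rd", "NUM": number},
            "VALENCE": {"SUBJ": [], "COMPS": []},
        },
    }


walks = {
    "TYPE": "word",
    "PHON": ["walks"],
    "SYNSEM": {
        "HEAD": "verb",
        "VALENCE": {
            "SUBJ": [{"HEAD": "noun", "AGR": {"PER": "3rd", "NUM": "sing"}}],
            "COMPS": [],
        },
    },
}

sentence = head_subj_phrase(pronoun("she", "sing"), walks)
rejected = head_subj_phrase(pronoun("they", "plur"), walks)
```

Here `sentence` is a phrase whose PHON list orders the subject before the verb and whose SUBJ list is empty, while `rejected` is None because the plural pronoun conflicts with the verb's singular requirement.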

Although the actual grammar of HPSG is composed entirely of feature structures, linguists often use trees to represent the unification of signs where the equivalent AVM would be unwieldy.


Implementations

Various parsers based on the HPSG formalism have been written, and optimizations are currently being investigated. An example of a system analyzing German sentences is provided by the Freie Universität Berlin. In addition, the Grammar Group of the Freie Universität Berlin provides open-source grammars that were implemented in the TRALE system. Currently there are grammars for German, Mandarin Chinese, Maltese, and Persian that share a common core and are publicly available. For Dutch, the wide-coverage dependency parser Alpino has been developed at the University of Groningen.

Large HPSG grammars of various languages are being developed in the Deep Linguistic Processing with HPSG Initiative (DELPH-IN). Wide-coverage grammars of German, English and Japanese are available under an open-source license. These grammars can be used with the open-source HPSG systems LKB and PET. DELPH-IN grammars can typically be used for both parsing and generation. Treebanks also distributed by DELPH-IN are being used to develop and test the grammars, as well as to train ranking models that decide on plausible interpretations when parsing (or realizations when generating).

Enju is a freely available wide-coverage probabilistic HPSG parser for English developed by the Tsujii Laboratory at The University of Tokyo in Japan. Its robustness sets it apart from most other HPSG parsers.

See also

  • Generalised phrase structure grammar
  • Lexical-functional grammar
  • Syntax
  • Relational grammar
  • Transformational grammar

Further reading

  • Carl Pollard, Ivan A. Sag (1987): Information-based Syntax and Semantics. Volume 1: Fundamentals. Stanford: CSLI Publications.
  • Carl Pollard, Ivan A. Sag (1994): Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press. (http://cslipublications.stanford.edu/site/0226674479.html)
  • Ivan A. Sag, Thomas Wasow, Emily Bender (2003): Syntactic Theory: a formal introduction, Second Edition. Chicago: University of Chicago Press. (http://cslipublications.stanford.edu/site/1575864002.html)

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 