Aggregation (linguistics)
Aggregation is a subtask of natural language generation, which involves merging syntactic constituents (such as sentences and phrases) together. Sometimes aggregation is also done at a conceptual level.

Examples

A simple example of syntactic aggregation is merging the two sentences "John went to the shop" and "John bought an apple" into the single sentence "John went to the shop and bought an apple."
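
The shared-subject case can be illustrated with a minimal sketch. It assumes a toy clause representation (a subject string plus the rest of the clause); the `Clause` class and the `aggregate` method are hypothetical names introduced here for illustration, not part of any real generation library, and a real system would work on syntactic structures rather than strings.

```java
public class SubjectSharingAggregation {
    /** Hypothetical toy clause: a subject plus the remainder of the clause. */
    static final class Clause {
        final String subject;
        final String predicate;

        Clause(String subject, String predicate) {
            this.subject = subject;
            this.predicate = predicate;
        }
    }

    /**
     * Coordinates two clauses with "and". When both clauses have the same
     * subject, the second occurrence of the subject is dropped.
     */
    static String aggregate(Clause first, Clause second) {
        if (first.subject.equals(second.subject)) {
            return first.subject + " " + first.predicate
                    + " and " + second.predicate + ".";
        }
        // Different subjects: plain coordination, nothing is dropped.
        return first.subject + " " + first.predicate
                + " and " + second.subject + " " + second.predicate + ".";
    }

    public static void main(String[] args) {
        Clause a = new Clause("John", "went to the shop");
        Clause b = new Clause("John", "bought an apple");
        // Prints: John went to the shop and bought an apple.
        System.out.println(aggregate(a, b));
    }
}
```

String concatenation is enough for this one pattern, but it cannot handle cases such as embedding one constituent inside another, which need a structured representation.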

Syntactic aggregation can be much more complex than this. For example, aggregation can embed one of the constituents in the other; e.g., we can aggregate "John went to the shop" and "The shop was closed" into the sentence "John went to the shop, which was closed."

From a pragmatic perspective, aggregating sentences together often suggests to the reader that these sentences are related to each other. If this is not the case, the reader may be confused. For example, someone who reads "John went to the shop and bought an apple" may infer that the apple was bought in the shop; if this is not the case, then these sentences should not be aggregated.

A simple example of conceptual aggregation is replacing "Saturday and Sunday" with "weekend".
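
One way to picture conceptual aggregation is as a lookup from a set of specific concepts to a more general one. The sketch below assumes a small hand-written table; the class name, the `GENERALISATIONS` table and the `aggregate` method are all hypothetical, and a real system would more plausibly consult an ontology or domain model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConceptualAggregation {
    // Hypothetical hand-written table: a set of concepts and the single
    // more general concept that can stand in for all of them.
    static final Map<Set<String>, String> GENERALISATIONS =
            Map.of(Set.of("Saturday", "Sunday"), "weekend");

    /**
     * Replaces every complete group of concepts found in the table with its
     * generalisation, inserted where the first member of the group stood.
     */
    static List<String> aggregate(List<String> concepts) {
        List<String> result = new ArrayList<>(concepts);
        for (Map.Entry<Set<String>, String> rule : GENERALISATIONS.entrySet()) {
            Set<String> members = rule.getKey();
            if (result.containsAll(members)) {
                int position = result.size();
                for (String member : members) {
                    position = Math.min(position, result.indexOf(member));
                }
                result.removeAll(members);
                result.add(position, rule.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints: [Friday, weekend]
        System.out.println(aggregate(List.of("Friday", "Saturday", "Sunday")));
    }
}
```

The replacement only fires when every member of a group is present, so a list containing Saturday but not Sunday is left unchanged.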

Algorithms and issues

Aggregation algorithms must do two things:
  • Decide when two constituents should be aggregated
  • Decide how two constituents should be aggregated, and create the aggregated structure
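
The two decisions above can be kept as separate functions, as in the following sketch. Everything in it is an assumption made for illustration: the `Clause` class, the "same subject and not too long" test in `shouldAggregate`, and the `maxWords` threshold are hypothetical stand-ins, not a published aggregation algorithm.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregationPass {
    /** Hypothetical toy clause: a subject plus the remainder of the clause. */
    static final class Clause {
        final String subject;
        final String predicate;

        Clause(String subject, String predicate) {
            this.subject = subject;
            this.predicate = predicate;
        }
    }

    /** Decision 1 (when): toy rule, same subject and a bounded combined length. */
    static boolean shouldAggregate(Clause a, Clause b, int maxWords) {
        int words = (a.predicate + " and " + b.predicate).split("\\s+").length;
        return a.subject.equals(b.subject) && words <= maxWords;
    }

    /** Decision 2 (how): build the aggregated structure by coordinating predicates. */
    static Clause aggregate(Clause a, Clause b) {
        return new Clause(a.subject, a.predicate + " and " + b.predicate);
    }

    /** Greedy left-to-right pass that merges each clause into the previous one when allowed. */
    static List<Clause> run(List<Clause> clauses, int maxWords) {
        List<Clause> out = new ArrayList<>();
        for (Clause next : clauses) {
            int last = out.size() - 1;
            if (last >= 0 && shouldAggregate(out.get(last), next, maxWords)) {
                out.set(last, aggregate(out.get(last), next));
            } else {
                out.add(next);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Clause> input = List.of(
                new Clause("John", "went to the shop"),
                new Clause("John", "bought an apple"),
                new Clause("Mary", "stayed home"));
        for (Clause c : run(input, 10)) {
            System.out.println(c.subject + " " + c.predicate + ".");
        }
    }
}
```

Keeping the "when" test apart from the "how" step means the decision rule can be replaced without touching the code that builds the aggregated clause.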


The first issue, deciding when to aggregate, is poorly understood. Aggregation decisions certainly depend on the semantic relations between the constituents, as mentioned above; they also depend on the genre (e.g., bureaucratic texts tend to be more aggregated than instruction manuals). They probably should depend on rhetorical and discourse structure. The literacy level of the reader is also probably important (poor readers need shorter sentences). But we have no integrated model which brings all these factors together into a single algorithm.

With regard to the second issue, there have been some studies of different types of aggregation, and how they should be carried out. Harbusch and Kempen describe several syntactic aggregation strategies. In their terminology, "John went to the shop and bought an apple" is an example of forward conjunction reduction.

Much less is known about conceptual aggregation. Di Eugenio et al. show how conceptual aggregation can be done in an intelligent tutoring system, and demonstrate that performing such aggregation makes the system more effective (and that conceptual aggregation makes a bigger impact than syntactic aggregation).

Software

Unfortunately, there is not much software available for performing aggregation. However, the simplenlg system does include limited support for basic aggregation. For example, the following code causes simplenlg to print out "The man is hungry and buys an apple."


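// Assumes nlgFactory (an NLGFactory) and realiser (a Realiser) have already been created.
SPhraseSpec s1 = nlgFactory.createClause("the man", "be", "hungry");
SPhraseSpec s2 = nlgFactory.createClause("the man", "buy", "an apple");
// Coordinate the two clauses, which share the subject "the man", into one clause.
NLGElement result = new ClauseCoordinationRule().apply(s1, s2);
System.out.println(realiser.realiseSentence(result));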

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.