Language Modeling for Information Retrieval, Hardback Book

Edited by W. Bruce Croft, John Lafferty

Part of The Information Retrieval Series

Description

A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text.

Such a definition is general enough to include an endless variety of schemes.

However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories.

The first statistical language modeler was Claude Shannon.

In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text.

To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text.

The ability of language models to be quantitatively evaluated in this way is one of their important virtues.

Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly.

Yet fifty years after Shannon's study, language models remain, by all measures, far from the Shannon entropy limit in terms of their predictive power.

However, this has not kept them from being useful for a variety of text processing tasks, and moreover can be viewed as encouragement that there is still great room for improvement in statistical language modeling.

Information

  • Format: Hardback
  • Pages: 246 pages, XIV, 246 p.
  • Publisher: Springer-Verlag New York Inc.
  • Publication Date:
  • Category:
  • ISBN: 9781402012167
