Littera Deusto

Modern Languages, Basque Studies and Humanities

Information extraction (Questionnaire 2)

May 19th, 2009 · No comments

Information extraction (IE) is, in natural language processing, a task often described as a type of information retrieval. Its goal is to automatically extract structured information, that is, categorized and contextually and semantically well-defined data from a certain domain, from unstructured machine-readable documents. Another of its goals is to allow computation to be done on the previously unstructured data, and to allow logical reasoning to draw inferences from the content of the input data.
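As a rough illustration of what "structured information from unstructured documents" can mean, here is a minimal Python sketch. It assumes a toy domain (company acquisitions) and a single hand-written pattern; the `Acquisition` record, the `extract_acquisitions` function and the sample sentences are all made up for this example, and real systems rely on much richer linguistic analysis than one regular expression.

```python
import re
from dataclasses import dataclass


@dataclass
class Acquisition:
    """One structured fact pulled out of free text (hypothetical schema)."""
    buyer: str
    target: str
    amount_millions: int


# A capitalised name made of one or more capitalised words, e.g. "Acme Corp".
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Toy pattern for sentences such as "X acquired Y for $N million".
PATTERN = re.compile(
    rf"(?P<buyer>{NAME}) acquired (?P<target>{NAME}) for \$(?P<amount>\d+) million"
)


def extract_acquisitions(text: str) -> list[Acquisition]:
    """Turn every sentence matching the pattern into a structured record."""
    return [
        Acquisition(m["buyer"], m["target"], int(m["amount"]))
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "Acme Corp acquired Widget Ltd for $12 million in 2008. "
        "Analysts were surprised. "
        "Globex acquired Initech for $5 million last year."
    )
    records = extract_acquisitions(sample)
    for record in records:
        print(record)
    # Once the data is structured, ordinary computation becomes possible.
    print(sum(r.amount_millions for r in records))
```

The last line is the point of the exercise: summing the amounts only becomes possible once the facts have been lifted out of the running text and into typed fields.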

Information Extraction is not Information Retrieval in the usual sense: Information Extraction does not retrieve from a collection the subset of documents which are relevant to a query. Instead, it extracts from the documents salient facts about prespecified types of events, entities or relationships. These facts are entered automatically into a database, which may then be used to analyse the data or to give a natural language summary.
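The difference can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the three documents are invented, "retrieval" is reduced to a keyword match, "extraction" to one regular expression for a single prespecified event type (appointments), and the table layout is hypothetical. It uses the standard `re` and `sqlite3` modules.

```python
import re
import sqlite3

# Invented mini-collection of documents.
DOCS = [
    "Jane Smith was appointed CEO of Acme in 2007.",
    "The weather in Bilbao was mild this spring.",
    "After a long search, Raj Patel was appointed CFO of Globex.",
]

# One prespecified event type: "<Person> was appointed <ROLE> of <Org>".
APPOINTMENT = re.compile(
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) was appointed "
    r"(?P<role>[A-Z]+) of (?P<org>[A-Z][a-z]+)"
)


def retrieve(docs, query):
    """Information retrieval (simplified): return the relevant documents."""
    return [d for d in docs if query.lower() in d.lower()]


def extract_appointments(docs):
    """Information extraction: return facts, not documents."""
    facts = []
    for doc in docs:
        for m in APPOINTMENT.finditer(doc):
            facts.append((m["person"], m["role"], m["org"]))
    return facts


def load_into_database(facts):
    """Enter the extracted facts into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE appointments (person TEXT, role TEXT, org TEXT)")
    conn.executemany("INSERT INTO appointments VALUES (?, ?, ?)", facts)
    conn.commit()
    return conn


if __name__ == "__main__":
    print(retrieve(DOCS, "appointed"))       # whole documents come back
    facts = extract_appointments(DOCS)
    print(facts)                             # only the salient facts come back
    conn = load_into_database(facts)
    rows = conn.execute(
        "SELECT org FROM appointments WHERE role = 'CEO'"
    ).fetchall()
    print(rows)                              # the database can now be queried
```

Retrieval hands back text that a person still has to read, while extraction hands back rows that a program can query directly.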

Because the problem is very complex, many different research communities are bringing techniques from machine learning, databases, information retrieval, and computational linguistics to bear on various aspects of information extraction.

