Littera Deusto

Modern Languages, Basque Studies and Humanities

Types of Corpora

May 12th, 2009

There are many kinds of corpora. They can contain written or spoken uses of a language, modern or old texts, whole books…

A General Corpus is made up of general texts that do not belong to a single field or register. Some corpora include texts from a particular dialect or variety of a language. These are called Sublanguage Corpora.

  • A corpus can be composed of texts in a single language or of texts in more than one language. If the same texts appear in more than one language, as in translations, the corpus is called a Parallel Corpus. In this kind of corpus the direction of the translation is not relevant: the texts may have been translated from language A into language B, or from B into A. In most cases the direction of the translation is not known to the user. Some examples of Parallel Corpora are:

The Canadian Hansard[2] (English and French).

WHO bilingual documents[3] (English-French, English-Spanish).
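As a rough illustration of why the direction of translation does not matter to the user, a parallel corpus can be pictured as a list of aligned segments keyed by language. This is only a minimal sketch: the sentence pairs are invented for the example (they are not taken from the Hansard or from WHO documents), and the `lookup` helper is a hypothetical name.

```python
import string

# Invented sentence pairs for illustration only (not from any real corpus).
# Each segment stores the same content in both languages; nothing records
# which side was the original and which the translation.
parallel_corpus = [
    {"en": "The committee meets tomorrow.", "fr": "Le comité se réunit demain."},
    {"en": "Thank you, Mr. Speaker.", "fr": "Merci, monsieur le Président."},
]

def lookup(corpus, word, lang):
    """Return every aligned segment whose `lang` side contains `word`."""
    word = word.lower()
    hits = []
    for segment in corpus:
        # Lowercase and strip surrounding punctuation from each token.
        tokens = [t.strip(string.punctuation) for t in segment[lang].lower().split()]
        if word in tokens:
            hits.append(segment)
    return hits

# The same query works from either language.
for segment in lookup(parallel_corpus, "committee", "en"):
    print(segment["en"], "<->", segment["fr"])
```

Because each segment is symmetric, searching from the French side works in exactly the same way as searching from the English side.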

  • A corpus that contains a collection of similar texts is called a Comparable Corpus. The goal of this type of corpus is to compare languages or varieties as they are used in similar communicative circumstances. Some examples:

ICE Project[4] (International Corpus of English)

The LOB Corpus (British English)

Kolhapur Corpus (Indian English)
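The kind of comparison a Comparable Corpus allows can be sketched with a simple word-frequency count over two samples. This is a toy illustration under stated assumptions: the two one-sentence samples are invented, the labels `variety_a` and `variety_b` are placeholders, and the helper names are hypothetical. Real comparable corpora are far larger and need proper tokenization.

```python
from collections import Counter

# Toy samples invented for this sketch; not drawn from ICE, LOB or Kolhapur.
samples = {
    "variety_a": "the colour of the house is grey",
    "variety_b": "the color of the house is gray",
}

def relative_freq(text):
    """Map each word to its share of all tokens in the text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    return {word: count / len(tokens) for word, count in counts.items()}

freqs = {name: relative_freq(text) for name, text in samples.items()}

def only_in(a, b):
    """Words attested in sample `a` but not in sample `b`, sorted alphabetically."""
    return sorted(set(freqs[a]) - set(freqs[b]))

print(only_in("variety_a", "variety_b"))
print(only_in("variety_b", "variety_a"))
```

Because the samples are collected under similar circumstances, differences like these can be attributed to the variety rather than to the topic or the register.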

The following Spanish corpora are available online. We will be using them for our analysis:

  • CREA (Corpus de Referencia del Español Actual, Real Academia Española) [1]
  • CORDE (Corpus Diacrónico del Español, Real Academia Española) [2]
  • Corpus del español (Mark Davies) [3]
  • Archivo Gramatical de la Lengua Española de Salvador Fernández Ramírez [4]
  • LEXESP (Corpus parcial del castellano) [5]

Oral Corpora

  • Corpus oral de referencia del español contemporáneo. [6]
  • Corpus de conversación coloquial, Grupo Val.Es.Co [7]
  • PRESEEA (Proyecto para el Estudio Sociolingüístico del Español de España y de América) [8]
  • COSER (Corpus oral y sonoro del español rural) [9]

Posted in lr09
