Littera Deusto

Modern Languages, Basque Studies and Humanities

Different kinds of Corpora.

May 16th, 2009

I am going to give a brief introduction to these different kinds of corpora. It is not going to be technical, because if we wanted technical examples we would go to the pages themselves. It is going to be a description from my own experience: how these systems work and what we can find in them.

The British National Corpus: It is a corpus of British English, both spoken and written. The search engine is simple: you type a word and the corpus gives you about 50 examples of that word in different contexts.
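To give an idea of what this kind of search does, here is a minimal sketch in Python of a keyword-in-context lookup. It is not the BNC's actual search engine, only an illustration under my own assumptions: the function name `concordance`, the `width` and `limit` parameters and the `sample` text are all made up for the example, and the default limit of 50 simply mirrors the number of examples mentioned above.

```python
import re


def concordance(text, word, width=30, limit=50):
    """Return up to `limit` keyword-in-context lines for `word` in `text`.

    Hypothetical helper for illustration; not part of any corpus tool.
    """
    # Match the word as a whole word, ignoring upper/lower case.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take `width` characters of context on each side of the hit.
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        left = text[start:match.start()]
        right = text[match.end():end]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
        if len(lines) == limit:
            break
    return lines


# Made-up sample text standing in for a real corpus.
sample = (
    "The corpus is large. A corpus can be spoken or written. "
    "Every corpus has a purpose."
)

for line in concordance(sample, "corpus", width=20):
    print(line)
```

Each printed line shows one occurrence of the word in brackets with a little text on either side, which is roughly what we see on the results page of a corpus search.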

The American National Corpus: It is a corpus of American English, both spoken and written. It has two releases, the first and the second. The problem is that the page is not up to date and the search engine is not working, which is why we had to turn to another webpage, the Corpus of Contemporary American English, which works wonders, I must say.

The Corpus of Contemporary American English: It is a very powerful system which combines the best of the BNC and the ANC. The result is a great search system in which you can not only find different examples of words in American English, but also compare two words and their different uses in context. A great webpage for students or teachers. The only problem is that it requires registration, but the process is very simple. I did the registration myself and it only took me 3 minutes.

The International Corpus of English: It is a corpus which has examples of the different varieties of English.

As a matter of fact, the search engine used in the BNC is SARA and the one used in the ANC is Xaira.

Check out our wiki page, the bibliography part, for links to corpora: http://wiki.littera.deusto.es/en/index.php/Lr0809/I
