Littera Deusto

Modern Languages, Basque Studies and Humanities

The International Corpus of English

June 6th, 2009

The International Corpus of English (ICE) project was initiated in 1988 by the late Sidney Greenbaum, then Director of the Survey of English Usage, University College London. In a brief notice in World Englishes, Greenbaum pointed out that grammatical studies had been greatly facilitated by the availability of two computerized corpora of printed English: the Brown Corpus of American English and the LOB (Lancaster-Oslo/Bergen) Corpus of British English. Greenbaum continued:

We should now be thinking of extending the scope for computerized comparative studies in three ways: (1) to sample standard varieties from other countries where English is the first language, for example Canada and Australia; (2) to sample national varieties from countries where English is an official additional language, for example India and Nigeria; and (3) to include spoken and manuscript English as well as printed English. (Greenbaum 1988)

In response, linguists from around the world came forward to discuss Greenbaum’s proposal, and ultimately to put it into effect (Greenbaum 1991). The project soon became known as the International Corpus of English (ICE), and was coordinated by Greenbaum until 1996. From 1996 to 2001, ICE was coordinated by Charles Meyer, University of Massachusetts-Boston. It is now coordinated by Gerald Nelson in Hong Kong. The ICE project involves research teams in each of the countries or regions shown below.

Australia 
Cameroon 
Canada 
East Africa (Kenya, Malawi, Tanzania)
Fiji 
Great Britain
Hong Kong
India
Ireland
Jamaica 
Kenya 
Malta
Malaysia
New Zealand
Nigeria 
Pakistan
Philippines
Sierra Leone 
Singapore
South Africa 
Sri Lanka 
Trinidad and Tobago 
USA

Each ICE team is compiling – or has already compiled – a one-million-word corpus of its own national or regional variety of English. Crucially, each team follows a common corpus design and a common annotation scheme to ensure maximum comparability between the components (Nelson 1996). The long-term aim of ICE is to produce up to twenty corpora of one million words each, all syntactically analysed according to a common parsing scheme and supplied with the retrieval software, ICECUP.

Each ICE corpus samples the English of adults (age 18 or over) who have been educated through the medium of English to at least the end of secondary schooling. Furthermore, each component corpus is grammatically analysed using a common grammatical annotation scheme.
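The sampling criteria just described can be read as a simple eligibility check. The sketch below is only an illustration of that rule, not part of the ICE tooling: the `Speaker` record, its field names, and the `is_eligible` function are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical speaker record; ICE does not prescribe these field names.
@dataclass
class Speaker:
    age: int
    english_medium_education: bool  # educated through the medium of English
    completed_secondary: bool       # reached at least the end of secondary schooling


MIN_AGE = 18  # "adults (age 18 or over)"


def is_eligible(speaker: Speaker) -> bool:
    """Return True if the speaker meets the sampling criteria stated above."""
    return (
        speaker.age >= MIN_AGE
        and speaker.english_medium_education
        and speaker.completed_secondary
    )


# Invented example data, purely for illustration.
candidates = [
    Speaker(age=34, english_medium_education=True, completed_secondary=True),
    Speaker(age=17, english_medium_education=True, completed_secondary=True),
    Speaker(age=45, english_medium_education=False, completed_secondary=True),
]
eligible = [s for s in candidates if is_eligible(s)]
```

Only the first of the three invented candidates passes: the second is under 18, and the third was not educated through the medium of English.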

Source: The International Corpus of English (2009, June 6). In The International Corpus of English (ICE). Retrieved 17:24, June 6, 2009, from http://www.ucl.ac.uk/english-usage/projects/ice.htm
