Corpus del Español


There are several resources that are based on the older version of the Corpus del Español (which was released in 2001), such as:

The older Corpus del Español was quite small, however (only 20 million words for the 1900s). As a result, many types of resources that we have created for English could not be created for Spanish until a much larger corpus was available. With the new two-billion-word corpus, we can create many of these resources. They will include:

  • Full-text data, which means that you'd have nearly the entire two billion words of data on your machine

  • Updated data similar to the word frequency, collocates, and n-grams data (including the top 40,000 lemmas of Spanish)

  • WordAndPhrase for Spanish, which will allow you to browse through the top 40,000 lemmas and see frequency information, definitions, collocates, concordances, and synonyms, all on one page. In addition, you'll be able to input your own texts and analyze them with the corpus data (available Summer 2017).