Corpus construction.

We automatically extract texts from the Internet and build corpora from them using tools developed by our team. The corpora can be monolingual or parallel.

webcorpusak.elhuyar.eus/

Technical features

Tools developed at Elhuyar enable us to detect bilingual documents on the Internet and align them sentence by sentence.
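
To give a rough idea of what sentence-by-sentence alignment involves, here is a minimal sketch of a length-based aligner using dynamic programming. It is not Elhuyar's tool: the function names (`align`, `length_cost`), the allowed groupings in `BEADS`, and the penalty values are all assumptions chosen for illustration, and the approach is a simplified variant of classic length-based alignment. It assumes both documents are already split into sentences.

```python
from typing import List, Tuple

# Hypothetical alignment "beads": (source sentences used, target sentences used, penalty).
# 1-1 is the default; insertions, deletions and 2-1 / 1-2 merges cost extra.
BEADS = [(1, 1, 0.0), (1, 0, 10.0), (0, 1, 10.0), (2, 1, 2.0), (1, 2, 2.0)]


def length_cost(src: List[str], tgt: List[str]) -> float:
    """Cost grows with the relative difference in total character length."""
    a = sum(len(s) for s in src)
    b = sum(len(t) for t in tgt)
    return abs(a - b) / max(a, b, 1) * 10.0


def align(src: List[str], tgt: List[str]) -> List[Tuple[str, str]]:
    """Return the cheapest sequence of aligned (source, target) text pairs."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Forward pass: extend every reachable state with each allowed bead.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + penalty + length_cost(src[i:ni], tgt[j:nj])
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)

    # Backward pass: follow the stored pointers from the end to the start.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((" ".join(src[pi:i]), " ".join(tgt[pj:j])))
        i, j = pi, pj
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    source = ["Hola.", "¿Cómo estás esta mañana?"]
    target = ["Hello.", "How are you this morning?"]
    for s, t in align(source, target):
        print(f"{s!r} <-> {t!r}")
```

A production aligner would typically combine length with lexical evidence (for example, a bilingual dictionary) rather than rely on length alone; the sketch only shows the overall shape of the problem.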

Success stories

Elhuyar’s web corpus.