Language resources and tools.

Web services

  • Modela

    Intelligent machine translation for Basque.

  • Xuxen

    Spelling and grammar checker for Basque.

  • TermKate

    Online platform for the creation of specialized dictionaries.

  • Elhuyar Dictionaries

    Online dictionaries: Basque<>Spanish, Basque<>French, Basque<>English

  • Automatic dictionaries

    Website for consulting bilingual dictionaries built automatically using pivot techniques.

  • Elhuyar web corpusak

    Website for querying two large corpora automatically compiled from the web: one Basque and one parallel Spanish-Basque.

  • CorpEus

    Website for searching the web for words or terms in Basque, with the results shown in context, as in a corpus query.

  • Elebila

    Search engine for Basque, the only one that allows you to limit the results to Basque.
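
The "pivot techniques" mentioned for the automatic dictionaries above can be illustrated with a minimal sketch. It assumes the simplest variant: two existing bilingual dictionaries that share a pivot language are composed to produce a new language pair. The function name `pivot_dictionary` and the toy entries are hypothetical and chosen only for illustration; this is not the method used to build the Elhuyar dictionaries, which the page does not describe.

```python
from collections import defaultdict


def pivot_dictionary(src_to_pivot, pivot_to_tgt):
    """Compose two bilingual dictionaries through a shared pivot language.

    Both arguments map a word to a list of its translations.
    Returns a dict mapping each source word to a sorted list of
    target-language candidates reached through any pivot word.
    """
    candidates = defaultdict(set)
    for src_word, pivot_words in src_to_pivot.items():
        for pivot_word in pivot_words:
            # Pivot words missing from the second dictionary add nothing.
            for tgt_word in pivot_to_tgt.get(pivot_word, ()):
                candidates[src_word].add(tgt_word)
    return {word: sorted(found) for word, found in candidates.items()}


# Toy data (illustrative only): Basque -> English -> Spanish.
eu_en = {"etxe": ["house"], "liburu": ["book"], "banku": ["bank"]}
en_es = {"house": ["casa"], "book": ["libro"], "bank": ["banco", "orilla"]}

eu_es = pivot_dictionary(eu_en, en_es)
for word, translations in eu_es.items():
    print(word, "->", ", ".join(translations))
```

In this toy data, the ambiguous pivot word "bank" yields both "banco" and "orilla" as candidates for "banku". Plain composition over-generates in this way, so a practical system would typically add a filtering step to discard wrong candidates.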


Opinion Mining - Sentiment Analysis


Spanish polarity lexicon.


Basque polarity lexicon.

Basque Opinion Dataset

Polarity-annotated Basque sentences.

BEC2016 opinion dataset

Basque regional election campaign 2016 opinion dataset (BEC2016). 25,000 tweets with entity-level polarity annotations (pos|neg).

Behagunea Opinion dataset

Collection of tweets about the DSS2016 Cultural Capital project, annotated with message-level polarity (pos|neg|neu) in Basque (3,000 tweets) and Spanish (4,754 tweets).

EliXa polarity classification models (EliXa 1.0.x)

Models for polarity classification, trained on cultural-domain (Behagunea) tweets.
Previous versions: v 0.9.x

EliXa resources (EliXa 1.0.x <=)

Language-specific resources: polarity lexicons and other resources for text normalization. Resources are currently provided for four languages: Basque (eu), Spanish (es), English (en) and French (fr). Also includes POS tagging models for the ixa-pipe-pos tool.
Previous versions: v 0.9.x (Ixa-pipes POS models not included)

Ixa-Pipes models for EliXa 0.9.x

Ixa-Pipes models (1.5.0) used by EliXa 0.9.x as its default models for lemmatization and POS tagging.


Basque-English Parallel corpus

Basque-English parallel corpus automatically gathered using the PaCo2 tool.

Basque-Spanish Parallel corpus

Basque-Spanish parallel corpus automatically gathered using the PaCo2 tool. It contains 640K segments.

Elhuyar web corpus

Corpus of 186M tokens in Basque. Automatically crawled and cleaned from the Web.
Ref: Leturia, I. 2014. The Web as a Corpus of Basque. PhD Thesis. Faculty of Informatics, UPV/EHU. Donostia.