XNLI: Evaluating Cross-lingual Sentence Representations

XNLI is a subset of a few thousand examples from MNLI that has been translated into 14 different languages, some of them relatively low-resource. As with MNLI, the goal is to predict textual entailment: given two sentences, decide whether sentence A entails, contradicts, or is neutral with respect to sentence B. It is therefore a three-way classification task.

Identifier Task Type Metric License Website Code Download
XNLI NLI / Entailment Accuracy CC BY-NC 4.0

Data Source

Manually annotated sentences. The training and development sets are translated from English.

Data Description

# Train Dev Test
Examples 392,702 5,010 2,490

Label Distribution

train validation test
contradiction 0.333 0.333 0.333
entailment 0.333 0.333 0.333
neutral 0.333 0.333 0.333
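
As a rough illustration of how proportions like those above can be derived, the sketch below counts labels and normalizes by the total. The function name `label_distribution` and the toy label list are assumptions made for this example; they are not part of the XNLI release.

```python
from collections import Counter


def label_distribution(labels):
    """Return the share of each label, rounded to three decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(count / total, 3) for label, count in sorted(counts.items())}


# Toy, perfectly balanced label list (hypothetical, not real XNLI data).
toy_labels = ["entailment", "neutral", "contradiction"] * 4
dist = label_distribution(toy_labels)
print(dist)
```

With a perfectly balanced split, each of the three labels receives one third of the mass, which matches the 0.333 entries in the table.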

Vocabulary Overlap

Each cell is the number of unique words shared by the row split and the column split, divided by the total number of unique words in the row split.


   train validation test
train 1.000 0.704 0.691
validation 0.053 1.000 0.258
test 0.087 0.436 1.000


   train validation test
train 1.000 0.712 0.684
validation 0.064 1.000 0.290
test 0.100 0.478 1.000


   train validation test
train 1.000 0.702 0.674
validation 0.075 1.000 0.306
test 0.116 0.496 1.000
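
The overlap ratio defined at the start of this section can be sketched as follows. This is a minimal sketch that follows the stated definition (shared unique words divided by the unique words of the row split) and assumes plain whitespace tokenization with no lowercasing; the tokenization actually used for the tables is not stated. The names `vocabulary`, `overlap_matrix`, and the toy sentences are hypothetical.

```python
def vocabulary(sentences):
    """Return the set of unique whitespace-separated tokens in the sentences."""
    return {token for sentence in sentences for token in sentence.split()}


def overlap_matrix(splits):
    """Map each (row, column) pair of splits to |V_row & V_col| / |V_row|.

    `splits` maps a split name to a list of sentences. Every split is
    assumed to be non-empty, so the denominator is never zero.
    """
    vocabs = {name: vocabulary(sentences) for name, sentences in splits.items()}
    return {
        row: {
            col: len(vocabs[row] & vocabs[col]) / len(vocabs[row])
            for col in vocabs
        }
        for row in vocabs
    }


# Toy sentences (assumed for illustration, not taken from XNLI).
toy_splits = {
    "train": ["a b c d", "a e"],
    "validation": ["a b x"],
    "test": ["a y"],
}
matrix = overlap_matrix(toy_splits)
for row, cols in matrix.items():
    print(row, {col: round(value, 3) for col, value in cols.items()})
```

Because the denominator depends on the row split, the matrix is not symmetric, and the diagonal is always 1.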


language	gold_label	sentence1_binary_parse	sentence2_binary_parse	sentence1_parse	sentence2_parse	sentence1	sentence2	promptID	pairID	genre	label1	label2	label3	label4	label5	sentence1_tokenized	sentence2_tokenized	match
bg	neutral					И той каза: Мамо, у дома съм.	Той се обади на майка си веднага щом училищният автобус го е оставил.	1	1	facetoface	neutral	contradiction	neutral	neutral	neutral	И той каза : Мамо , у дома съм .	Той се обади на майка си веднага щом училищният автобус го е оставил .	True
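
A record in the tab-separated layout shown above could be read roughly as follows. This is a sketch under stated assumptions: the sample is embedded as a string rather than read from a file, quoting is disabled so that quote characters inside sentences are treated as plain text, and the helper names `read_rows` and `majority_label` are hypothetical. Whether `gold_label` is always the majority of `label1` to `label5` is also an assumption; it does hold for this row.

```python
import csv
import io
from collections import Counter

COLUMNS = [
    "language", "gold_label", "sentence1_binary_parse", "sentence2_binary_parse",
    "sentence1_parse", "sentence2_parse", "sentence1", "sentence2", "promptID",
    "pairID", "genre", "label1", "label2", "label3", "label4", "label5",
    "sentence1_tokenized", "sentence2_tokenized", "match",
]

# The Bulgarian sample row from above; the four parse fields are empty.
ROW_VALUES = [
    "bg", "neutral", "", "", "", "",
    "И той каза: Мамо, у дома съм.",
    "Той се обади на майка си веднага щом училищният автобус го е оставил.",
    "1", "1", "facetoface",
    "neutral", "contradiction", "neutral", "neutral", "neutral",
    "И той каза : Мамо , у дома съм .",
    "Той се обади на майка си веднага щом училищният автобус го е оставил .",
    "True",
]

sample = "\t".join(COLUMNS) + "\n" + "\t".join(ROW_VALUES) + "\n"


def read_rows(handle):
    """Parse tab-separated records into dictionaries keyed by column name."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def majority_label(row):
    """Return the most frequent non-empty annotator label among label1..label5."""
    votes = Counter(row[f"label{i}"] for i in range(1, 6) if row[f"label{i}"])
    return votes.most_common(1)[0][0]


rows = read_rows(io.StringIO(sample))
row = rows[0]
print(row["language"], row["gold_label"], majority_label(row))
```

For a classifier, `sentence1` and `sentence2` would serve as the input pair and `gold_label` as the target.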


Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). See the LICENSE file.