CheckThat! Check-Worthiness Estimation

The Check-Worthiness Estimation dataset is part of Task 1 of the 2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. The aim of the task is to determine whether a piece of text is worth fact-checking. More precisely, given a set of tweets, the goal is to produce a ranked list of those tweets, ordered by their check-worthiness.

Identifier Task Type Metric License Website Code Download
CT21.T1 Check-Worthiness Estimation Average Precision
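
Because the task is scored as a ranking with Average Precision, a small example may help make the metric concrete. The following is a minimal sketch, not the official scorer: the function names, the toy scores, and the toy labels are all made up for illustration, and a label of 1 is assumed to mean "check-worthy".

```python
def rank_by_score(scores, labels):
    """Return the gold labels reordered from highest to lowest model score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(ranked_labels):
    """Average precision of one ranked list.

    ranked_labels[i] is 1 if the item at rank i + 1 is check-worthy, else 0.
    Precision is taken at each rank that holds a check-worthy item and then
    averaged over the number of check-worthy items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy data (hypothetical): four tweets with model scores and gold labels.
scores = [0.9, 0.2, 0.7, 0.4]
labels = [1, 0, 0, 1]

ranked = rank_by_score(scores, labels)
ap = average_precision(ranked)
print(ranked, round(ap, 4))
```

In this toy case the check-worthy tweets land at ranks 1 and 3, so the result is the mean of 1/1 and 2/3.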

Data Source

The dataset was collected from Twitter and focuses on COVID-19. Each tweet was labeled by three annotators; disagreements were resolved first by majority voting and then by a consolidator.

Data Description

#          Train   Dev   Test
Examples   2,995   350   357

Label Distribution

               train   validation   test
Normal         0.869   0.823        0.787
Check-worthy   0.131   0.177        0.213
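
Proportions like the ones in this table can be derived from a list of per-tweet labels. The sketch below assumes the labels are available as plain strings; the helper name and the toy label list are hypothetical and do not reproduce the actual data.

```python
from collections import Counter


def label_distribution(labels):
    """Map each label to its share of the examples in one split."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy split (hypothetical): 7 normal tweets and 3 check-worthy ones.
toy_labels = ["Normal"] * 7 + ["Check-worthy"] * 3
dist = label_distribution(toy_labels)

for label, share in dist.items():
    print(f"{label}: {share:.3f}")
```

Running the same function on each split separately would give one column of the table per call.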

Vocabulary Overlap

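Each cell is the number of unique words shared by the row split and the column split, divided by the number of unique words in the column split. For example, 66.1% of the unique words in validation also occur in train, whereas the shared words make up only 11.3% of the much larger train vocabulary.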

             train   validation   test
train        1.000   0.661        0.642
validation   0.113   1.000        0.287
test         0.116   0.303        1.000
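
An overlap table of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the tokenization (lowercasing and whitespace splitting), the function names, and the toy sentences are all hypothetical. The values in the table (the larger train split holding the larger numbers in its row) look consistent with dividing by the column split's vocabulary, so that is the default here; the `normalize_by` parameter switches to the row split instead.

```python
def vocabulary(texts):
    """Unique lowercased, whitespace-separated tokens (assumed tokenization)."""
    return {token for text in texts for token in text.lower().split()}


def overlap_matrix(splits, normalize_by="column"):
    """Pairwise vocabulary overlap between named splits.

    Each cell is |V_row & V_col| divided by the vocabulary size of either
    the column split (default) or the row split.
    """
    vocabs = {name: vocabulary(texts) for name, texts in splits.items()}
    matrix = {}
    for row, row_vocab in vocabs.items():
        matrix[row] = {}
        for col, col_vocab in vocabs.items():
            denom_vocab = col_vocab if normalize_by == "column" else row_vocab
            shared = len(row_vocab & col_vocab)
            matrix[row][col] = shared / len(denom_vocab) if denom_vocab else 0.0
    return matrix


# Toy splits (hypothetical): train has 6 unique words, validation has 3,
# and the two splits share 2 of them.
toy_splits = {
    "train": ["masks reduce spread", "vaccine trial results"],
    "validation": ["masks reduce risk"],
}
by_column = overlap_matrix(toy_splits)
by_row = overlap_matrix(toy_splits, normalize_by="row")

for row, cells in by_column.items():
    print(row, {col: round(value, 3) for col, value in cells.items()})
```

With column normalization, the row of the larger split carries the larger off-diagonal value, which matches the asymmetry visible in the table.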


[1] Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News (Shaar et al., CLEF 2021)

  author    = {Shaden Shaar and
               Maram Hasanain and
               Bayan Hamdan and
               Zien Sheikh Ali and
               Fatima Haouari and
               Alex Nikolov and
               Mucahid Kutlu and
               Yavuz Selim Kartal and
               Firoj Alam and
               Da San Martino, Giovanni and
               Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
               Rub\'{e}n M\'{i}guez and
               Tamer Elsayed and
               Preslav Nakov},
  title     = {Overview of the {CLEF}-2021 {CheckThat}! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates},
  year      = {2021},
  booktitle = {Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum},
  series    = {CLEF~'2021},
  address   = {Bucharest, Romania (online)},

