The Check-Worthiness Estimation dataset is part of the 2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News (Task 1). The aim of the task is to determine whether a piece of text is worth fact-checking. More precisely, given a set of tweets, systems must produce a ranked list of those tweets, ordered by their check-worthiness.
Identifier | Task Type | Metric | License | Website | Code | Download |
---|---|---|---|---|---|---|
CT21.T1 | Check-Worthiness Estimation | Average Precision | | | | |
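Since systems are scored with Average Precision (AP) over the ranked list, a minimal sketch of the metric may help. The function name `average_precision` and the toy labels below are illustrative only, not part of the official scorer.

```python
def average_precision(labels):
    """Average Precision for a ranked list.

    `labels` is the sequence of gold labels (1 = check-worthy,
    0 = not) in the order the system ranked the tweets, best first.
    """
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

# Toy example: two check-worthy tweets, ranked 1st and 3rd.
print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2 ≈ 0.833
```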
The dataset is collected from Twitter and focuses on COVID-19. Each tweet was annotated by three annotators; disagreements were resolved first by majority voting and then by a consolidator.
# | Train | Dev | Test |
---|---|---|---|
Examples | 2,995 | 350 | 357 |
Label distribution (fraction of examples per split):

Label | train | validation | test |
---|---|---|---|
Normal | 0.869 | 0.823 | 0.787 |
Check-worthy | 0.131 | 0.177 | 0.213 |
Vocabulary overlap between splits: the number of words shared by the row split and the column split, divided by the total number of unique words in the row split (which is why the matrix is not symmetric). A sketch of this computation follows the table.
  | train | validation | test |
---|---|---|---|
train | 1.000 | 0.661 | 0.642 |
validation | 0.113 | 1.000 | 0.287 |
test | 0.116 | 0.303 | 1.000 |
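Below is a minimal sketch of how such an overlap matrix can be computed. The whitespace tokenizer, lowercasing, and the `splits` dictionary of raw tweet texts are assumptions and may differ from the preprocessing actually used for the numbers above.

```python
def vocab(texts):
    """Unique lowercased whitespace tokens across a list of texts."""
    return {tok for text in texts for tok in text.lower().split()}

def overlap_matrix(splits):
    """splits: dict mapping split name -> list of tweet texts.

    Cell (row, col) = |vocab(row) & vocab(col)| / |vocab(row)|.
    """
    vocabs = {name: vocab(texts) for name, texts in splits.items()}
    return {
        row: {col: len(v_row & v_col) / len(v_row)
              for col, v_col in vocabs.items()}
        for row, v_row in vocabs.items()
    }

# Toy usage with made-up tweets:
splits = {
    "train": ["covid vaccine works", "masks help"],
    "test": ["vaccine rollout news"],
}
print(overlap_matrix(splits)["train"]["test"])  # 1/5 = 0.2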
[1] Overview of the CLEF-2021 CheckThat! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates (Shaar et al., CLEF 2021)
@InProceedings{clef-checkthat:2021:task1,
author = {Shaden Shaar and
Maram Hasanain and
Bayan Hamdan and
Zien Sheikh Ali and
Fatima Haouari and
Alex Nikolov and
Mucahid Kutlu and
Yavuz Selim Kartal and
Firoj Alam and
Da San Martino, Giovanni and
Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
Rub\'{e}n M\'{i}guez and
Tamer Elsayed and
Preslav Nakov},
title = "Overview of the {CLEF}-2021 {CheckThat}! Lab Task 1 on Check-Worthiness Estimation in Tweets and Political Debates",
year = {2021},
booktitle = "Working Notes of CLEF 2021---Conference and Labs of the Evaluation Forum",
series = {CLEF~'2021},
address = {Bucharest, Romania (online)},
url={http://ceur-ws.org/Vol-2936/paper-28.pdf}
}