The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form Factuality Evaluation

📰 arXiv cs.AI

Researchers introduce RiDiC, a pipeline for generating multilingual datasets with a controlled popularity distribution, intended for evaluating the long-form factuality of LLMs.

Level: advanced. Published 2 Apr 2026

Action Steps

Use Wikipedia and Wikidata as data sources to select entities with specified characteristics
Configure the pipeline to control the popularity distribution and other entity characteristics
Generate multilingual datasets for evaluating the long-form factuality of LLMs
Use the RiDiC dataset as a worked example when evaluating LLM performance
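
The idea of controlling a popularity distribution can be illustrated with a minimal sketch. This is not the RiDiC implementation. It assumes each entity already carries a numeric popularity score (for example, page views, which is an assumption here since the summary does not say how popularity is measured), and it draws the same number of entities from each popularity bin. The function name, parameters, and toy data are all hypothetical.

```python
import bisect
import random
from collections import defaultdict


def sample_by_popularity(entities, bin_edges, per_bin, seed=0):
    """Draw up to `per_bin` entities from each popularity bin.

    entities  : list of (name, popularity) pairs (hypothetical input format)
    bin_edges : ascending thresholds; bin i holds scores in [edge[i-1], edge[i])
    per_bin   : target number of entities per bin
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    bins = defaultdict(list)
    for name, popularity in entities:
        # bisect_right returns the index of the bin the score falls into
        bins[bisect.bisect_right(bin_edges, popularity)].append(name)

    sample = {}
    for i in range(len(bin_edges) + 1):
        pool = sorted(bins[i])  # sort so results do not depend on input order
        sample[i] = rng.sample(pool, min(per_bin, len(pool)))
    return sample


# Toy data: 60 made-up entities with popularity scores from 1 to 100000.
toy_entities = [("entity_%d" % i, 10 ** (i % 6)) for i in range(60)]
balanced = sample_by_popularity(toy_entities, bin_edges=[100, 10000], per_bin=5)
for bin_index, names in balanced.items():
    print(bin_index, len(names))
```

Equal-size bins are only one possible target. Changing `per_bin` to a per-bin list of counts would give a skewed distribution instead, for instance one that over-represents rarely viewed entities.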

Who Needs to Know This

NLP engineers and researchers can use this pipeline to evaluate the factuality of LLMs' long-form generation. Data scientists and ML researchers can use the generated datasets for model training and testing.

Key Insight

💡 The RiDiC pipeline enables controlled generation of datasets for evaluating LLMs' long-form factuality, complementing existing short-form QA datasets.