CNN news dataset
edition.cnn.com · JSON
This dataset contains over 27,000 news articles sourced from CNN.com, including full content, metadata, and media fields. Each article includes its publish date, author information, a short description, and both raw and cleaned content, making it suitable for media research, sentiment analysis, topic modeling, and natural language processing (NLP) projects.
Last crawled in July 2021, this collection offers a historical snapshot of CNN’s reporting and editorial content.
Use Cases:
- News content analysis
- Fake news detection & bias tracking
- Topic classification and clustering
- Training AI/NLP models
- Historical news trend research
- Media monitoring tools
Update Frequency:
Archived: no further updates are planned, so the data is best suited to snapshot-based analysis
Fields
title, url, published_at, last_modified_at, author, short_description, header_image, raw_content, content, crawled_at, _id, source
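As a rough illustration of working with these fields, the Python sketch below parses records and reports which of the listed fields are missing from each one. It assumes the JSON is delivered either as a single array or as one object per line; the listing does not say which, so both are handled. The helper names load_articles and missing_fields are hypothetical, and the sample record is made up for illustration.

```python
import json

# Field names as given in the Fields list above.
EXPECTED_FIELDS = [
    "title", "url", "published_at", "last_modified_at", "author",
    "short_description", "header_image", "raw_content", "content",
    "crawled_at", "_id", "source",
]


def load_articles(text):
    """Parse records from a JSON array or from JSON Lines text.

    The actual file layout is an assumption; the listing only says "JSON".
    """
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


if __name__ == "__main__":
    # Made-up sample record for illustration only; not taken from the dataset.
    sample = json.dumps([{name: None for name in EXPECTED_FIELDS}])
    articles = load_articles(sample)
    print(len(articles), missing_fields(articles[0]))
```

To run this against the real file, read its contents as text (for example from a local copy with a name of your choosing) and pass the string to load_articles. Checking for missing fields first is a cheap way to spot incomplete records before any analysis.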
Pricing
Availability: immediate
Records: 27,000