Ingredients Dataset – 18K+ Product Records with Ingredients Data from Beauty, Pets, Groceries & Health (CSV for AI & NLP)

target.com · CSV

The Ingredients Dataset (18K+ records) provides a high-quality, structured collection of product information with detailed ingredients data. Covering a wide variety of categories including beauty, pet care, groceries, and health products, this dataset is designed to power AI, NLP, and machine learning applications that require domain-specific knowledge of consumer products.

Why This Dataset Matters

In today’s data-driven economy, access to structured and clean datasets is critical for building intelligent systems. For industries like healthcare, beauty, food-tech, and retail, the ability to analyze product ingredients enables deeper insights, including:

  • Identifying allergens or harmful substances

  • Comparing ingredient similarities across brands

  • Training LLMs and NLP models for better understanding of consumer products

  • Supporting regulatory compliance and labeling standards

  • Enhancing recommendation engines for personalized shopping

This dataset bridges the gap between raw, unstructured product data and actionable information by providing well-organized CSV files with fields that are easy to integrate into your workflows.
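
As a rough illustration of the "comparing ingredient similarities" idea, the sketch below computes a Jaccard similarity between two ingredient lists. It assumes the `ingredients` field holds a comma-separated string, which the listing does not confirm; the sample strings and the function names are hypothetical.

```python
def ingredient_set(text):
    """Split a comma-separated ingredient string into a normalized set.

    Assumption: ingredients are comma-separated; real rows may need
    more careful parsing (parentheses, percentages, "and/or" phrases).
    """
    return {part.strip().lower() for part in text.split(",") if part.strip()}


def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B| for two ingredient strings (0.0 if both empty)."""
    set_a, set_b = ingredient_set(a), ingredient_set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical ingredient strings, not taken from the dataset.
cream_a = "Water, Glycerin, Shea Butter"
cream_b = "water, glycerin, Aloe"
similarity = jaccard(cream_a, cream_b)  # 2 shared of 4 distinct ingredients
```

Set overlap ignores ingredient order, which on many labels reflects relative quantity, so this is only a coarse first pass.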

Dataset Coverage

The 18,000+ product records span several consumer categories:

  • Beauty & Personal Care – cosmetics, skincare, haircare products with full ingredient transparency

  • 🐾 Pet Supplies – pet food and wellness products with detailed formulations

  • 🥫 Groceries & Packaged Foods – snacks, beverages, pantry staples with structured ingredients lists

  • 💊 Health & Wellness – supplements, vitamins, and healthcare products with nutritional components

By including multiple categories, this dataset allows cross-domain analysis and model training that reflects real-world product diversity.
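
A quick way to see how records are spread across categories before any cross-domain work is to count rows per `category` value. The sketch below uses hypothetical rows; only the `category` field name comes from the listing, and the actual category labels in the file may differ.

```python
from collections import Counter

# Hypothetical rows; real category labels may be worded differently.
rows = [
    {"title": "Face Cream", "category": "Beauty"},
    {"title": "Dog Treats", "category": "Pets"},
    {"title": "Lip Balm", "category": "Beauty"},
    {"title": "Unlabeled Item", "category": ""},
]


def count_by_category(rows):
    """Count rows per category, grouping blank values under 'unknown'."""
    return Counter((row.get("category") or "").strip() or "unknown" for row in rows)


category_counts = count_by_category(rows)
```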

Key Features
  • 📂 18,000+ records with structured ingredient fields

  • 🧾 Covers beauty, pet care, groceries, and health products

  • 📊 Delivered in CSV format, ready to use for analytics or machine learning

  • 🏷 Includes categories and breadcrumbs for taxonomy and classification

  • 🔎 Useful for AI, NLP, LLM fine-tuning, allergen detection, and product recommendation systems

Use Cases
  1. AI & NLP Training – fine-tune LLMs on structured ingredients data for food, beauty, and healthcare applications.

  2. Retail Analytics – analyze consumer product composition across categories to inform pricing, positioning, and product launches.

  3. Food & Health Research – detect allergens, evaluate ingredient safety, and study nutritional compositions.

  4. Recommendation Engines – build smarter product recommendation systems for e-commerce platforms.

  5. Regulatory & Compliance Tools – ensure products meet industry and government standards through ingredient validation.
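
The allergen-detection use case can be sketched as a simple keyword match over the `ingredients` text. This is a minimal illustration, not a validated screening method: the keyword set is a small hypothetical sample rather than a regulatory allergen list, and the comma- or semicolon-separated format is an assumption.

```python
import re

# Illustrative keywords only; NOT a complete or authoritative allergen list.
ALLERGEN_KEYWORDS = {"milk", "egg", "peanut", "peanuts", "soy", "wheat", "almonds"}


def split_ingredients(text):
    """Split on commas/semicolons and normalize case and whitespace."""
    return [part.strip().lower() for part in re.split(r"[,;]", text) if part.strip()]


def flag_allergens(text, keywords=ALLERGEN_KEYWORDS):
    """Return the sorted keywords that appear as whole words in any ingredient."""
    found = set()
    for ingredient in split_ingredients(text):
        words = set(re.findall(r"[a-z]+", ingredient))
        found |= words & keywords
    return sorted(found)


# Hypothetical ingredient string, not taken from the dataset.
flags = flag_allergens("Oats, Honey, Almonds, Soy Lecithin")
```

Whole-word matching avoids flagging substrings (for example "egg" inside "eggplant"), but it will still miss synonyms and derived ingredients such as "casein" for milk.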

Why Choose This Dataset

Unlike generic product feeds, this dataset emphasizes ingredient transparency across multiple categories. With 18K+ records, it strikes a balance between being comprehensive and affordable, making it suitable for startups, researchers, and enterprise teams looking to experiment with product intelligence.

Note: Each record includes a url (main page) and a buy_url (purchase page). Records are keyed on buy_url so that each row represents a unique, product-level record.
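
If you want to confirm that uniqueness yourself after loading the file, one option is to count repeated `buy_url` values. The rows below are hypothetical stand-ins for parsed CSV rows; only the `buy_url` field name comes from the listing.

```python
from collections import Counter

# Hypothetical rows standing in for parsed CSV records.
rows = [
    {"title": "Oat Bar", "buy_url": "https://example.com/p/1"},
    {"title": "Oat Bar 6-pack", "buy_url": "https://example.com/p/2"},
    {"title": "Oat Bar (repeat)", "buy_url": "https://example.com/p/1"},
]


def duplicate_keys(rows, key="buy_url"):
    """Return the sorted key values that occur in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


duplicates = duplicate_keys(rows)
```

An empty result would be consistent with one row per product; any values returned point to rows worth inspecting.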

Fields
link, title, category, breadcrumbs, brand, department, avg_rating, reviews_count, reviews, tcin, buy_url, meta_data, current_price, regular_price, savings_in_percent, formatted_current_price_type, currency, upc, primary_image, alternate_images, description, specification, highlights, ingredients, nutrients, serving_size, serving_description, warning, at_a_glance, uniq_id, scraped_at
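
A minimal sketch of reading the CSV with Python's standard `csv` module and keeping a subset of the fields above. The inline sample is hypothetical and uses only a few of the listed column names; the real file's values, quoting, and ingredient formatting may differ.

```python
import csv
import io

# Hypothetical two-row sample; values are invented for illustration.
SAMPLE_CSV = """title,category,brand,ingredients,buy_url
Oat Bar,Grocery,ExampleBrand,"Oats, Honey, Almonds",https://example.com/p/1
Face Cream,Beauty,ExampleBrand,"Water, Glycerin, Shea Butter",https://example.com/p/2
"""


def load_records(handle, fields=("title", "category", "ingredients")):
    """Read CSV rows from a file-like object, keeping only the requested columns."""
    reader = csv.DictReader(handle)
    return [{name: row.get(name, "") for name in fields} for row in reader]


records = load_records(io.StringIO(SAMPLE_CSV))
```

For the delivered file, the same function could be called with something like `open("ingredients.csv", newline="", encoding="utf-8")`, where the filename and encoding are assumptions.
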
Pricing
$145.00

Availability: immediately

Records: 18,000
