From the makers of Foresight-32B
Generate verified datasets at scale.
Quality data is the biggest blocker for most LLM projects. LightningRod makes it easy to generate, transform, and verify datasets grounded in real sources—in just a few lines of Python.
import lightningrod as lr
# Get antitrust news to train a domain expert
seeds = lr.NewsSeedGenerator(
query="antitrust investigation",
start_date="2025-01-01"
)
# Define the scope and style of the questions
questioner = lr.QuestionGenerator(
instructions="Write forward-looking, self-contained questions with explicit dates/entities.",
examples=[
"What is the likely outcome of the DOJ lawsuit?",
"Which specific Sherman Act violations are cited?"
]
)
# Verify answers against live sources
labeler = lr.WebSearchLabeler()
# Run pipeline
pipeline = lr.Pipeline(seeds, questioner, labeler)
dataset = pipeline.batch(100)Built For
SFT Training RL Training RAG Evaluation Model Benchmarking