Show HN: Misata – synthetic data engine using LLM and Vectorized NumPy

https://news.ycombinator.com/rss Hits: 5
Summary

🧠 Misata

Generate realistic multi-table datasets from natural language. No schema writing. No training data. Just describe what you need.

✨ What Makes Misata Different

Feature                   Faker   SDV   Misata
Natural language input    ❌      ❌    ✅
Auto schema generation    ❌      ❌    ✅
Relational integrity      ❌      ✅    ✅
Business constraints      ❌      ❌    ✅
No training data needed   ✅      ❌    ✅
Streaming (10M+ rows)     ❌      ❌    ✅

🚀 Quick Start

    pip install misata

With Groq (Free, Fast)

    export GROQ_API_KEY=your_key  # Get free: https://console.groq.com
    misata generate --story "A SaaS with 50K users, subscriptions, and payments" --use-llm

With OpenAI

    export OPENAI_API_KEY=your_key
    misata generate --story "E-commerce with products and orders" --use-llm --provider openai

With Ollama (Local, Free, Private)

    ollama run llama3  # Start Ollama first
    misata generate --story "Fitness app with workouts" --use-llm --provider ollama

📊 Example Output

    $ misata generate --story "A fitness app with 50K users" --use-llm
    🧠 Using Groq (llama-3.3-70b-versatile) for intelligent parsing...
    ✅ LLM schema generated successfully!
    📋 Schema: FitnessApp
    Tables: 5
    Relationships: 4
    🔧 Generating 5 table(s)...
    ✓ exercises (10 rows)
    ✓ plans (5 rows)
    ✓ users (50,000 rows)
    ✓ subscriptions (45,000 rows)
    ✓ workouts (500,000 rows)
    ⏱️ Generation time: 2.34 seconds
    🚀 Performance: 213,675 rows/second
    💾 Data saved to: ./generated_data

💻 Python API

    from misata import DataSimulator, SchemaConfig
    from misata.llm_parser import LLMSchemaGenerator

    # Generate schema from story
    llm = LLMSchemaGenerator(provider="groq")  # or "openai", "ollama"
    config = llm.generate_from_story(
        "A mobile fitness app with 50K users, workout tracking, "
        "premium subscriptions, and January signup spikes"
    )

    # Generate data
    for table_name, batch in DataSimulator(config).generate_all():
        print(f"Generated {len(batch)} rows for {table_name}")

🔧 CLI Reference

    # Basic generation (rule-based, no API key needed)
    misata generate --story "SaaS company with users and subscriptions"

    # LLM-...
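
The summary lists relational integrity as a feature, and the title mentions vectorized NumPy. As a rough illustration of how those two ideas can fit together, the sketch below builds a parent table and a child table whose foreign keys are sampled from the parent's keys in a single vectorized call. This is not Misata's code: the function names (`generate_users`, `generate_workouts`), the column names, and the value ranges are all hypothetical, and only standard NumPy calls are used.

```python
import numpy as np


def generate_users(n_users, rng):
    """Build a parent table as a dict of column arrays (hypothetical layout)."""
    return {
        "user_id": np.arange(1, n_users + 1),
        "signup_day": rng.integers(0, 365, size=n_users),
    }


def generate_workouts(users, n_workouts, rng):
    """Build a child table whose user_id column references only existing users."""
    # Sample foreign keys from the parent's primary keys in one vectorized call,
    # so no child row can point at a user that does not exist.
    user_ids = rng.choice(users["user_id"], size=n_workouts)
    return {
        "workout_id": np.arange(1, n_workouts + 1),
        "user_id": user_ids,
        # Assumed range: workouts last between 10 and 120 minutes inclusive.
        "duration_min": rng.integers(10, 121, size=n_workouts),
    }


rng = np.random.default_rng(seed=42)
users = generate_users(50_000, rng)
workouts = generate_workouts(users, 500_000, rng)

# Integrity check: count child rows whose user_id is missing from the parent.
orphans = ~np.isin(workouts["user_id"], users["user_id"])
print(f"orphaned workouts: {orphans.sum()}")
```

Because every foreign key is drawn from the parent's key column, the orphan count is zero by construction. The check at the end is there only to make that property visible.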

First seen: 2025-12-20 00:20

Last seen: 2025-12-20 04:22