How We Built an AI Generation Engine for a SaaS Creative Suite | Gigaviz
Feb 17, 2026
A deep dive into the architecture behind Gigaviz Studio's AI generation engine: GPT-4o-mini for content, DALL-E 3 for images, recharts for visualization, token-gated generation, and fire-and-forget auto-triggers. Real code patterns for SaaS builders.
AI Generation · GPT-4o-mini · DALL-E 3 · SaaS Architecture · Studio · OpenAI · Token Economy · Creative Suite · Next.js · TypeScript
Most SaaS platforms add AI as an afterthought: a chatbot in the corner, a "summarize" button nobody clicks. We took a different approach: AI generation is the core value engine of Gigaviz Studio. Every document, chart, image, video storyboard, music composition, and dashboard can be generated from a single prompt.
This post covers the architecture decisions, the code patterns, and the lessons learned from building a unified AI generation system that serves 6 different content types inside a multi-tenant SaaS platform.
The problem: 6 content types, 1 interface
Gigaviz Studio is a creative suite with three modules: Office (documents), Graph (charts, dashboards, images, videos), and Tracks (music). Each module has its own database tables, its own API routes, and its own detail pages.
The challenge: how do you add AI generation to all 6 content types without duplicating logic, breaking the existing CRUD patterns, or making the token economy impossible to manage?
We rejected the obvious approach of adding generation sub-routes to each CRUD API. Instead, we built a single unified generation endpoint that handles all 6 types through a discriminated union schema.
One schema. One validation path. Six content types. The discriminated union gives us type safety at the boundary: TypeScript knows exactly which fields exist for each type after validation.
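To make the idea concrete, here is a minimal sketch in plain TypeScript. The field names and the validation logic are illustrative assumptions, not the production schema (which presumably uses a validation library), but the narrowing behavior on the `type` discriminant is the point:

```typescript
// Hypothetical request shapes; field names are illustrative, not the real schema.
type GenerateRequest =
  | { type: "document"; workspaceId: string; prompt: string }
  | { type: "chart"; workspaceId: string; prompt: string }
  | { type: "dashboard"; workspaceId: string; prompt: string }
  | { type: "image"; workspaceId: string; prompt: string; width: number; height: number }
  | { type: "video"; workspaceId: string; prompt: string }
  | { type: "music"; workspaceId: string; prompt: string };

const TYPES = ["document", "chart", "dashboard", "image", "video", "music"] as const;

// One validation path for all six types. After this returns non-null,
// narrowing on `type` gives each branch exactly its own field set.
function parseRequest(body: unknown): GenerateRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as Record<string, unknown>;
  if (!TYPES.includes(b.type as (typeof TYPES)[number])) return null;
  if (typeof b.workspaceId !== "string" || typeof b.prompt !== "string") return null;
  // Type-specific fields: only `image` carries dimensions in this sketch.
  if (b.type === "image" && (typeof b.width !== "number" || typeof b.height !== "number")) return null;
  return b as unknown as GenerateRequest;
}
```

A request with an unknown `type`, or an `image` request missing its dimensions, is rejected before any authorization or token work happens.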
The generation pipeline
Every generation request follows the same 5-step pipeline:
1. Validate: the discriminated union schema parses the request body
2. Authorize: check user session, workspace membership, and entitlement (`graph`, `office`, or `tracks`)
3. Deduct tokens: call an atomic token deduction function before doing any AI work (fail fast on insufficient balance)
4. Generate: call the appropriate AI function (GPT-4o-mini or DALL-E 3)
5. Persist: update the entity's status to `completed` and write generated data to the database
If step 4 fails, we roll back the entity status to `failed` so the user can retry. If step 3 fails (insufficient tokens), we return `402` immediately; no AI calls are made.
AI model selection
We use two OpenAI models:
GPT-4o-mini for structured content generation:
Documents (title, sections with heading + body, summary)
Charts (chart type recommendation, labels, datasets, config)
Video storyboards (scenes, narration, script, music suggestion)
Music compositions (structure, instruments, mood, waveform data)
DALL-E 3 for image generation:
Maps user dimensions to DALL-E's supported sizes (1024×1024, 1024×1792, 1792×1024)
Prepends style instructions to the prompt (e.g., "Create a photo-realistic image: ...")
Returns both the image URL and the revised prompt (DALL-E 3 rewrites prompts for quality)
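The dimension mapping and style-prefixing can be sketched like this. The aspect-ratio cutoffs and the exact prompt wording are assumptions for illustration, not the production values:

```typescript
type DalleSize = "1024x1024" | "1024x1792" | "1792x1024";

// Map arbitrary requested dimensions onto DALL-E 3's three supported sizes.
// The 1.2 aspect-ratio cutoff is an illustrative assumption.
function mapToDalleSize(width: number, height: number): DalleSize {
  const ratio = width / height;
  if (ratio > 1.2) return "1792x1024";     // clearly landscape
  if (ratio < 1 / 1.2) return "1024x1792"; // clearly portrait
  return "1024x1024";                      // near-square
}

// Style prefix prepended to the user's prompt (wording is an assumption).
function buildImagePrompt(style: string, prompt: string): string {
  return `Create a ${style} image: ${prompt}`;
}
```

A user asking for an 800×600 canvas gets `1792x1024`; the styled prompt and DALL-E 3's revised prompt are then both stored alongside the result.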
Why GPT-4o-mini over GPT-4o? Cost. At ~$0.15/1M input tokens vs $2.50/1M, that is a more than 16× difference for structured output that doesn't need frontier reasoning. For a token-gated SaaS, cost efficiency is existential.
Token economy integration
Every AI action has a token cost defined in our rate table:
Tokens are deducted before the AI call via an atomic database operation with row-level locking. If two requests race, only one succeeds. The 402 response triggers a clear "Insufficient tokens" message in the UI.
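The check-and-deduct semantics look roughly like this. This is an in-memory sketch only: in production the same logic lives in a single atomic database operation (e.g. a Postgres function using `SELECT ... FOR UPDATE`), so concurrent requests serialize on the balance row instead of racing in application code:

```typescript
// In-memory stand-in for the balance table. In the real system this check and
// decrement happen in one atomic, row-locked database operation.
const balances = new Map<string, number>();

function deductTokens(userId: string, cost: number): boolean {
  const balance = balances.get(userId) ?? 0;
  if (balance < cost) return false; // caller maps this to HTTP 402
  balances.set(userId, balance - cost);
  return true;
}
```

The important contract is that check and decrement are one operation: a caller never observes a balance that passed the check but was spent by a concurrent request.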
Fire-and-forget: auto-triggering generation on creation
The UX insight that made everything click: users who write a prompt during entity creation want generation to start immediately. They don't want to create the entity, wait for the page to load, then click "Generate."
So after the create form submits successfully, if a prompt was provided, we fire a generation request in the background. The client doesn't wait for the result β it navigates to the detail page immediately.
The user navigates to the detail page immediately. When the generation completes (typically 2-8 seconds), a page refresh shows the result. The entity's status transitions from `draft` → `generating` → `completed`.
Visualization: recharts for real-time chart rendering
AI-generated chart data needs to be rendered β not just stored as JSON. We chose recharts (built on D3) for its React-native API and support for 6 chart types:
Bar, Line, Area: standard data visualization with a dark theme
Pie: percentage labels with custom color cells
Radar, Scatter: advanced data patterns
The `ChartRenderer` component accepts the same JSON structure that GPT-4o-mini generates (labels + datasets), transforms it to recharts' record format, and renders with a professional dark-theme tooltip and responsive container.
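The transform at the heart of that component is small. Assuming the labels-plus-datasets shape described above (the interface below is a guess at the exact field names), recharts wants one record per x-axis point with each series as a keyed field:

```typescript
// Assumed AI output shape: shared labels plus one or more named datasets.
interface ChartData {
  labels: string[];
  datasets: { label: string; data: number[] }[];
}

// recharts consumes an array of records: one per x-axis point,
// with each dataset's value under its label as the key.
function toRechartsRecords(chart: ChartData): Record<string, string | number>[] {
  return chart.labels.map((name, i) => {
    const row: Record<string, string | number> = { name };
    for (const ds of chart.datasets) row[ds.label] = ds.data[i] ?? 0;
    return row;
  });
}
```

The result feeds straight into `<BarChart data={...}>` (or Line/Area/etc.), with one `<Bar dataKey={ds.label}>` per dataset.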
Canvas-based waveform for music
Music compositions get a visual waveform: a canvas-drawn visualization with gradient bars (purple → cyan), play/pause animation, and progress tracking. The waveform data is a 128-element array of amplitudes generated by GPT-4o-mini.
Why canvas instead of SVG? Performance. 128 animated bars at 60fps would create DOM pressure with SVG elements. Canvas draws directly to pixels.
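The layout math behind the draw loop is worth separating from the pixel work. A sketch of the geometry step (the function name and the minimum-height rule are assumptions): given the 128 amplitudes, compute each bar's x position and pixel height, then let the `requestAnimationFrame` callback just fill rects with the gradient:

```typescript
// Pure geometry for the canvas draw loop: given amplitudes in [0, 1],
// compute each bar's x position and pixel height for a given canvas size.
function computeBars(
  amplitudes: number[],
  canvasWidth: number,
  canvasHeight: number
): { x: number; height: number }[] {
  const barWidth = canvasWidth / amplitudes.length;
  return amplitudes.map((a, i) => ({
    x: i * barWidth,
    // Clamp to at least 1px so silent samples stay visible.
    height: Math.max(1, Math.round(a * canvasHeight)),
  }));
}
```

Keeping this pure means the animation frame does nothing but `fillRect` calls, which is exactly where canvas beats SVG.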
Dashboard widget grid
AI-generated dashboards produce widget arrays with 4 types: stat (KPI card with trend), chart (embedded ChartRenderer), text (markdown block), and table (rows + columns). A responsive CSS grid renders them with configurable column spans.
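A hypothetical shape for that widget array, with a default column span per type (the span values are pure assumptions, not the shipped layout rules):

```typescript
// Discriminated union mirroring the 4 widget types.
type Widget =
  | { kind: "stat"; label: string; value: number; trend: number }
  | { kind: "chart"; data: unknown }
  | { kind: "text"; markdown: string }
  | { kind: "table"; columns: string[]; rows: string[][] };

// Default column span on a 12-column grid; values are illustrative.
function defaultSpan(widget: Widget, gridColumns = 12): number {
  switch (widget.kind) {
    case "stat": return 3;            // four KPI cards per row
    case "chart": return 6;           // two charts side by side
    case "text": return gridColumns;  // full-width markdown block
    case "table": return gridColumns; // tables take the full width
  }
}
```

Because the union is discriminated on `kind`, the switch is exhaustive: adding a fifth widget type makes TypeScript flag every renderer that doesn't handle it.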
Status state machine
Every generatable entity follows the same state machine:
```
draft ──→ generating ──→ completed
  │            │              │
  │            ▼              └──→ regenerate (loops back to generating)
  │          failed ──→ retry (returns to generating)
  └──→ (skip if no prompt)
```
This is enforced in the API β only entities in `draft` or `completed` status can trigger generation. The detail page renders different UI based on status (loading animation for `generating`, error state for `failed`, full visualization for `completed`).
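The enforcement can be expressed as a transition table. This is a sketch of the rule set described above, not the actual API code:

```typescript
type EntityStatus = "draft" | "generating" | "completed" | "failed";

// Legal transitions per the state machine: only `draft` and `completed`
// can start a generation; `failed` re-enters via retry.
const TRANSITIONS: Record<EntityStatus, EntityStatus[]> = {
  draft: ["generating"],
  generating: ["completed", "failed"],
  completed: ["generating"], // regenerate
  failed: ["generating"],    // retry
};

function canTransition(from: EntityStatus, to: EntityStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```

Checking transitions server-side means a stale client (say, one still showing a `generating` entity) can never double-trigger a generation.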
What we'd do differently
WebSocket for real-time updates. Currently, users need to refresh the page to see generation results. A WebSocket or SSE connection would push the completed state to the client automatically.
Streaming for documents. GPT-4o-mini supports streaming responses. For long documents, streaming tokens to the UI as they generate would feel much faster than waiting for the complete response.
Image caching in Supabase Storage. DALL-E 3 returns temporary URLs. We should download and store images in Supabase Storage for permanence and faster loading.
Queue-based generation. The current approach is synchronous β the API handler waits for OpenAI to respond. For high concurrency, we should use a job queue (like the existing outbox worker pattern) to decouple request handling from AI execution.
Results
The unified generation engine serves 6 content types through 1 endpoint, 1 token system, and 1 status machine. The engine totals approximately 2,800 lines of new code across 8 files covering AI generation logic, the unified API endpoint, chart rendering, dashboard layout, waveform visualization, video storyboard display, and reusable trigger components.
All passing: typecheck, lint, and build.
If you're building AI generation into a SaaS platform, the key insight is this: don't add AI per-feature. Build a unified generation pipeline, gate it with tokens, and let the AI handle the creative heavy lifting while your CRUD layer handles persistence and security.
---
*Gigaviz Studio is live at gigaviz.com. The AI generation engine powers all 6 content types across Office, Graph, and Tracks modules.*