How We Built an AI Generation Engine for a SaaS Creative Suite | Gigaviz
Feb 17, 2026
A deep dive into the architecture behind Gigaviz Studio's AI generation engine: GPT-4o-mini for content, DALL-E 3 for images, recharts for visualization, token-gated generation, and fire-and-forget auto-triggers. Real code patterns for SaaS builders.
AI Generation · GPT-4o-mini · DALL-E 3 · SaaS Architecture · Studio · OpenAI · Token Economy · Creative Suite · Next.js · TypeScript
Most SaaS platforms add AI as an afterthought: a chatbot in the corner, a "summarize" button nobody clicks. We took a different approach: AI generation is the core value engine of Gigaviz Studio. Every document, chart, image, video storyboard, music composition, and dashboard can be generated from a single prompt.
This post covers the architecture decisions, the code patterns, and the lessons learned from building a unified AI generation system that serves 6 different content types inside a multi-tenant SaaS platform.
The problem: 6 content types, 1 interface
Gigaviz Studio is a creative suite with three modules: Office (documents), Graph (charts, dashboards, images, videos), and Tracks (music). Each module has its own database tables, its own API routes, and its own detail pages.
The challenge: how do you add AI generation to all 6 content types without duplicating logic, breaking the existing CRUD patterns, or making the token economy impossible to manage?
We rejected the obvious approach of adding generation sub-routes to each CRUD API. Instead, we built a single unified generation endpoint that handles all 6 types through a discriminated union schema.
One schema. One validation path. Six content types. The discriminated union gives us type safety at the boundary: TypeScript knows exactly which fields exist for each type after validation.
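To make the idea concrete, here is a minimal sketch in plain TypeScript. The field names and the validation logic are illustrative assumptions, not the production schema (which presumably uses a validation library), but the narrowing behavior on the `type` discriminant is the point:

```typescript
// Hypothetical request shapes; field names are illustrative, not the real schema.
type GenerateRequest =
  | { type: "document"; workspaceId: string; prompt: string }
  | { type: "chart"; workspaceId: string; prompt: string }
  | { type: "dashboard"; workspaceId: string; prompt: string }
  | { type: "image"; workspaceId: string; prompt: string; width: number; height: number }
  | { type: "video"; workspaceId: string; prompt: string }
  | { type: "music"; workspaceId: string; prompt: string };

const TYPES = ["document", "chart", "dashboard", "image", "video", "music"] as const;

// One validation path for all six types. After this returns non-null,
// narrowing on `type` gives each branch exactly its own field set.
function parseRequest(body: unknown): GenerateRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as Record<string, unknown>;
  if (!TYPES.includes(b.type as (typeof TYPES)[number])) return null;
  if (typeof b.workspaceId !== "string" || typeof b.prompt !== "string") return null;
  // Type-specific fields: only `image` carries dimensions in this sketch.
  if (b.type === "image" && (typeof b.width !== "number" || typeof b.height !== "number")) return null;
  return b as unknown as GenerateRequest;
}
```

A request with an unknown `type`, or an `image` request missing its dimensions, is rejected before any authorization or token work happens.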
The generation pipeline
Every generation request follows the same 5-step pipeline:
1. Validate: the discriminated union schema parses the request body
2. Authorize: check user session, workspace membership, and entitlement (`graph`, `office`, or `tracks`)
3. Deduct tokens: call an atomic token deduction function before doing any AI work (fail fast on insufficient balance)
4. Generate: call the appropriate AI function (GPT-4o-mini or DALL-E 3)
5. Persist: update the entity's status to `completed` and write generated data to the database
If step 4 fails, we roll back the entity status to `failed` so the user can retry. If step 3 fails (insufficient tokens), we return `402` immediately; no AI calls are made.
AI model selection
We use two OpenAI models:
GPT-4o-mini for structured content generation:
Documents (title, sections with heading + body, summary)
Charts (chart type recommendation, labels, datasets, config)
Video storyboards (scenes, narration, script, music suggestion)
Music compositions (structure, instruments, mood, waveform data)
DALL-E 3 for image generation:
Maps user dimensions to DALL-E's supported sizes (1024×1024, 1024×1792, 1792×1024)
Prepends style instructions to the prompt (e.g., "Create a photo-realistic image: ...")
Returns both the image URL and the revised prompt (DALL-E 3 rewrites prompts for quality)
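The dimension mapping and style-prefixing can be sketched like this. The aspect-ratio cutoffs and the exact prompt wording are assumptions for illustration, not the production values:

```typescript
type DalleSize = "1024x1024" | "1024x1792" | "1792x1024";

// Map arbitrary requested dimensions onto DALL-E 3's three supported sizes.
// The 1.2 aspect-ratio cutoff is an illustrative assumption.
function mapToDalleSize(width: number, height: number): DalleSize {
  const ratio = width / height;
  if (ratio > 1.2) return "1792x1024";     // clearly landscape
  if (ratio < 1 / 1.2) return "1024x1792"; // clearly portrait
  return "1024x1024";                      // near-square
}

// Style prefix prepended to the user's prompt (wording is an assumption).
function buildImagePrompt(style: string, prompt: string): string {
  return `Create a ${style} image: ${prompt}`;
}
```

A user asking for an 800×600 canvas gets `1792x1024`; the styled prompt and DALL-E 3's revised prompt are then both stored alongside the result.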
Why GPT-4o-mini over GPT-4o? Cost. At ~$0.15/1M input tokens vs $2.50/1M, that is a more than 16× difference for structured output that doesn't need frontier reasoning. For a token-gated SaaS, cost efficiency is existential.
Token economy integration
Every AI action has a token cost defined in our rate table:
Tokens are deducted before the AI call via an atomic database operation with row-level locking. If two requests race, only one succeeds. The 402 response triggers a clear "Insufficient tokens" message in the UI.
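The check-and-deduct semantics look roughly like this. This is an in-memory sketch only: in production the same logic lives in a single atomic database operation (e.g. a Postgres function using `SELECT ... FOR UPDATE`), so concurrent requests serialize on the balance row instead of racing in application code:

```typescript
// In-memory stand-in for the balance table. In the real system this check and
// decrement happen in one atomic, row-locked database operation.
const balances = new Map<string, number>();

function deductTokens(userId: string, cost: number): boolean {
  const balance = balances.get(userId) ?? 0;
  if (balance < cost) return false; // caller maps this to HTTP 402
  balances.set(userId, balance - cost);
  return true;
}
```

The important contract is that check and decrement are one operation: a caller never observes a balance that passed the check but was spent by a concurrent request.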
Fire-and-forget: auto-triggering generation on creation
The UX insight that made everything click: users who write a prompt during entity creation want generation to start immediately. They don't want to create the entity, wait for the page to load, then click "Generate."
So after the create form submits successfully, if a prompt was provided, we fire a generation request in the background. The client doesn't wait for the result β it navigates to the detail page immediately.
The user navigates to the detail page immediately. When the generation completes (typically 2-8 seconds), a page refresh shows the result. The entity's status transitions from `draft` → `generating` → `completed`.
Visualization: recharts for real-time chart rendering
AI-generated chart data needs to be rendered β not just stored as JSON. We chose recharts (built on D3) for its React-native API and support for 6 chart types:
Bar, Line, Area: standard data visualization with a dark theme
Pie: percentage labels with custom color cells
Radar, Scatter: advanced data patterns
The `ChartRenderer` component accepts the same JSON structure that GPT-4o-mini generates (labels + datasets), transforms it to recharts' record format, and renders with a professional dark-theme tooltip and responsive container.
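The transform at the heart of that component is small. Assuming the labels-plus-datasets shape described above (the interface below is a guess at the exact field names), recharts wants one record per x-axis point with each series as a keyed field:

```typescript
// Assumed AI output shape: shared labels plus one or more named datasets.
interface ChartData {
  labels: string[];
  datasets: { label: string; data: number[] }[];
}

// recharts consumes an array of records: one per x-axis point,
// with each dataset's value under its label as the key.
function toRechartsRecords(chart: ChartData): Record<string, string | number>[] {
  return chart.labels.map((name, i) => {
    const row: Record<string, string | number> = { name };
    for (const ds of chart.datasets) row[ds.label] = ds.data[i] ?? 0;
    return row;
  });
}
```

The result feeds straight into `<BarChart data={...}>` (or Line/Area/etc.), with one `<Bar dataKey={ds.label}>` per dataset.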
Canvas-based waveform for music
Music compositions get a visual waveform: a canvas-drawn visualization with gradient bars (purple → cyan), play/pause animation, and progress tracking. The waveform data is a 128-element array of amplitudes generated by GPT-4o-mini.
Why canvas instead of SVG? Performance. 128 animated bars at 60fps would create DOM pressure with SVG elements. Canvas draws directly to pixels.
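The layout math behind the draw loop is worth separating from the pixel work. A sketch of the geometry step (the function name and the minimum-height rule are assumptions): given the 128 amplitudes, compute each bar's x position and pixel height, then let the `requestAnimationFrame` callback just fill rects with the gradient:

```typescript
// Pure geometry for the canvas draw loop: given amplitudes in [0, 1],
// compute each bar's x position and pixel height for a given canvas size.
function computeBars(
  amplitudes: number[],
  canvasWidth: number,
  canvasHeight: number
): { x: number; height: number }[] {
  const barWidth = canvasWidth / amplitudes.length;
  return amplitudes.map((a, i) => ({
    x: i * barWidth,
    // Clamp to at least 1px so silent samples stay visible.
    height: Math.max(1, Math.round(a * canvasHeight)),
  }));
}
```

Keeping this pure means the animation frame does nothing but `fillRect` calls, which is exactly where canvas beats SVG.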
Dashboard widget grid
AI-generated dashboards produce widget arrays with 4 types: stat (KPI card with trend), chart (embedded ChartRenderer), text (markdown block), and table (rows + columns). A responsive CSS grid renders them with configurable column spans.
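A hypothetical shape for that widget array, with a default column span per type (the span values are pure assumptions, not the shipped layout rules):

```typescript
// Discriminated union mirroring the 4 widget types.
type Widget =
  | { kind: "stat"; label: string; value: number; trend: number }
  | { kind: "chart"; data: unknown }
  | { kind: "text"; markdown: string }
  | { kind: "table"; columns: string[]; rows: string[][] };

// Default column span on a 12-column grid; values are illustrative.
function defaultSpan(widget: Widget, gridColumns = 12): number {
  switch (widget.kind) {
    case "stat": return 3;            // four KPI cards per row
    case "chart": return 6;           // two charts side by side
    case "text": return gridColumns;  // full-width markdown block
    case "table": return gridColumns; // tables take the full width
  }
}
```

Because the union is discriminated on `kind`, the switch is exhaustive: adding a fifth widget type makes TypeScript flag every renderer that doesn't handle it.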
Status state machine
Every generatable entity follows the same state machine:
```
draft ──→ generating ──→ completed
  │            │              │
  │            ▼              └──→ regenerate (loops back to generating)
  │          failed ──→ retry (returns to generating)
  └──→ (skip if no prompt)
```
This is enforced in the API β only entities in `draft` or `completed` status can trigger generation. The detail page renders different UI based on status (loading animation for `generating`, error state for `failed`, full visualization for `completed`).
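The enforcement can be expressed as a transition table. This is a sketch of the rule set described above, not the actual API code:

```typescript
type EntityStatus = "draft" | "generating" | "completed" | "failed";

// Legal transitions per the state machine: only `draft` and `completed`
// can start a generation; `failed` re-enters via retry.
const TRANSITIONS: Record<EntityStatus, EntityStatus[]> = {
  draft: ["generating"],
  generating: ["completed", "failed"],
  completed: ["generating"], // regenerate
  failed: ["generating"],    // retry
};

function canTransition(from: EntityStatus, to: EntityStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```

Checking transitions server-side means a stale client (say, one still showing a `generating` entity) can never double-trigger a generation.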
What we'd do differently
WebSocket for real-time updates. Currently, users need to refresh the page to see generation results. A WebSocket or SSE connection would push the completed state to the client automatically.
Streaming for documents. GPT-4o-mini supports streaming responses. For long documents, streaming tokens to the UI as they generate would feel much faster than waiting for the complete response.
Image caching in Supabase Storage. DALL-E 3 returns temporary URLs. We should download and store images in Supabase Storage for permanence and faster loading.
Queue-based generation. The current approach is synchronous β the API handler waits for OpenAI to respond. For high concurrency, we should use a job queue (like the existing outbox worker pattern) to decouple request handling from AI execution.
Results
The unified generation engine serves 6 content types through 1 endpoint, 1 token system, and 1 status machine. The engine totals approximately 2,800 lines of new code across 8 files covering AI generation logic, the unified API endpoint, chart rendering, dashboard layout, waveform visualization, video storyboard display, and reusable trigger components.
All passing: typecheck, lint, and build.
If you're building AI generation into a SaaS platform, the key insight is this: don't add AI per-feature. Build a unified generation pipeline, gate it with tokens, and let the AI handle the creative heavy lifting while your CRUD layer handles persistence and security.
---
*Gigaviz Studio is live at gigaviz.com. The AI generation engine powers all 6 content types across Office, Graph, and Tracks modules.*