
AI Integration

CruzJS provides AIService, an injectable facade for AI capabilities. The underlying provider is determined by the adapter — Workers AI on Cloudflare, Amazon Bedrock on AWS, Vertex AI on GCP, and OpenAI-compatible APIs on Azure, DigitalOcean, and Docker (including Ollama for local inference). All methods return typed results or null on failure — they never throw.

Configuration depends on your deployment adapter.

Cloudflare (Workers AI)

Set the following environment variables and enable the Workers AI binding:

CF_AI_GATEWAY_ID=your-gateway-id
CF_AIG_TOKEN=your-gateway-token

Then add the binding to wrangler.toml:

[ai]
binding = "AI"

For full details, see the Cloudflare AI guide.

Amazon Bedrock (AWS)

Amazon Bedrock credentials and region are configured automatically by the AWS adapter. Optionally override the region:

AWS_BEDROCK_REGION=us-east-1

Vertex AI (GCP)

GCP credentials are picked up automatically via Application Default Credentials. No additional environment variables are required when deploying to Cloud Run or Cloud Functions.

Azure, DigitalOcean, Docker (OpenAI-compatible)

Set an API key and, optionally, a custom base URL:

OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1 # default

For local inference with Ollama, point the base URL to your Ollama instance:

OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
Inject AIService into any service and call its methods:

import { AIService, Inject, Injectable } from '@cruzjs/core';

@Injectable()
export class MyService {
  constructor(@Inject(AIService) private ai: AIService) {}

  async process(text: string) {
    const summary = await this.ai.chat({
      prompt: text,
      system: 'Summarize the following text concisely.',
      size: 'medium',
    });
    return summary; // string | null
  }
}
Tune calls with optional parameters:

const response = await this.ai.chat({
  prompt: 'Explain edge computing in one paragraph.',
  system: 'You are a technical writer.',
  size: 'small', // 'small' | 'medium' | 'large'
  temperature: 0.7,
  maxTokens: 200,
});
Size    Model                  Best For
small   Gemini 2.5 Flash Lite  Simple tasks, fast
medium  Gemini 2.5 Flash       General purpose (default)
large   Gemini 2.5 Pro         Complex reasoning

Extract typed data from unstructured text with automatic Zod validation and retry:

import { z } from 'zod';

const schema = z.object({
  title: z.string(),
  topics: z.array(z.string()),
  sentiment: z.enum(['positive', 'negative', 'neutral']),
});

const result = await this.ai.extractStructured({
  prompt: articleText,
  system: 'Extract metadata from this article.',
  schema,
  schemaName: 'ArticleMetadata',
  size: 'medium',
  maxRetries: 3,
});

if (result) {
  console.log(result.title); // Fully typed
}

Generate vector embeddings for search and similarity:

const vectors = await this.ai.embed(['search query', 'document text']);
// vectors: number[][] | null

Model options: 'small' (384d), 'base' (768d, default), 'large' (1024d).
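A common next step after embed() is ranking documents by similarity to a query vector. A minimal cosine-similarity helper, shown below, is a plain utility sketch, not part of the AIService API:

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns 1 for identical direction, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// With vectors returned by this.ai.embed(['search query', 'document text']):
// const [query, doc] = vectors;
// const score = cosineSimilarity(query, doc);
```

Because embed() returns null on failure, check the result before indexing into it.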

Describe an image, optionally guiding the model with a prompt:

const description = await this.ai.describeImage(imageBuffer);
// Or with a custom prompt:
const analysis = await this.ai.describeImage(imageBuffer, 'What objects are in this image?');

Classify the sentiment of a piece of text:

const sentiment = await this.ai.analyzeSentiment('This product is amazing!');
// { label: 'POSITIVE', score: 0.95 } | null
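In practice you often want to act only on a confident classification. The helper below is a hypothetical sketch: the { label, score } shape matches analyzeSentiment()'s documented return value, but the 0.8 threshold is an illustrative choice, not a library default:

```typescript
// Shape matching analyzeSentiment()'s documented result (or null on failure).
type Sentiment = { label: string; score: number } | null;

// True only for a non-null POSITIVE result at or above the confidence
// threshold; the default of 0.8 is an arbitrary illustrative value.
function isConfidentlyPositive(s: Sentiment, threshold = 0.8): boolean {
  return s !== null && s.label === 'POSITIVE' && s.score >= threshold;
}
```

Handling the null case inside the helper keeps calling code free of repeated null checks.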
Use AIService inside a tRPC procedure by resolving it from the application container:

export const aiRouter = router({
  summarize: protectedProcedure
    .input(z.object({ text: z.string().max(10000) }))
    .mutation(async ({ input }) => {
      const container = await getAppContainer();
      const ai = container.resolve(AIService);
      return ai.chat({
        prompt: input.text,
        system: 'Summarize concisely.',
      });
    }),
});

All AIService methods return null on failure:

  • AI not configured (missing env vars)
  • Workers AI binding not available (local dev)
  • Network errors or model failures
  • Schema validation failures (after retries)
const result = await this.ai.chat({ prompt: 'Hello' });
if (!result) {
  // Handle gracefully — provide fallback or skip AI features
}
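Since every method follows the same return-null-on-failure contract, the fallback pattern can be factored into a small generic wrapper. This is a hypothetical helper, not part of AIService:

```typescript
// Run a null-returning AI call and substitute a default when it fails.
async function withFallback<T>(
  call: () => Promise<T | null>,
  fallback: T,
): Promise<T> {
  const result = await call();
  return result ?? fallback;
}

// const summary = await withFallback(
//   () => this.ai.chat({ prompt: text, system: 'Summarize concisely.' }),
//   'Summary unavailable.',
// );
```

The nullish-coalescing operator preserves legitimate falsy results such as an empty string, substituting the fallback only for null or undefined.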
Local development notes:

  • AI Gateway methods (chat, extractStructured) work locally if env vars are set
  • Workers AI methods (embed, describeImage, analyzeSentiment) require wrangler dev for the AI binding
  • All methods return null gracefully when unavailable
Best practices:

  1. Check isConfigured() before features that require AI
  2. Use the smallest model size that works for your task
  3. Cache responses with KVCacheService for repeated inputs
  4. Set reasonable maxTokens to control costs
  5. Use extractStructured() over chat() when you need typed output
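The caching idea in practice #3 can be sketched as a memoizing wrapper around chat(). An in-memory Map stands in for KVCacheService here, whose actual API may differ; the function signature is purely illustrative:

```typescript
// In-memory stand-in for a key-value cache such as KVCacheService.
const cache = new Map<string, string>();

// Return a cached response for a repeated prompt; otherwise call the
// model, cache a successful (non-null) result, and return it.
async function cachedChat(
  prompt: string,
  chat: (p: string) => Promise<string | null>,
): Promise<string | null> {
  const hit = cache.get(prompt);
  if (hit !== undefined) return hit;
  const result = await chat(prompt);
  if (result !== null) cache.set(prompt, result); // never cache failures
  return result;
}
```

Caching only non-null results matters: a transient failure should be retried on the next request, not pinned into the cache.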