Api | Computing Arts

LLM API Integration Patterns: Building Reliable AI-Powered Features

Adding an LLM to your application sounds simple: call the API, get a response, display it. In practice, you’re dealing with rate limits, token costs, latency spikes, and outputs that occasionally make no sense. These patterns help build LLM features that are reliable, cost-effective, and actually useful. The Basic Call Every LLM integration starts here: 1 2 3 4 5 6 7 8 9 10 11 from openai import OpenAI client = OpenAI() def ask_llm(prompt: str) -> str: response = client.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": prompt}], temperature=0.7 ) return response.choices[0].message.content This works for demos. Production needs more. ...

CORS Demystified: Why Your API Calls Get Blocked

You’ve built an API. It works perfectly in Postman. Then you call it from your frontend and get: A h i c a s c s e p s b r s e e e s t n e o n b t f l e o o t c n c k h e t d h a e t b y r ' e h C q t O u t R e p S s s t : p e / d l a i r p c e i y s . : o e u x N r a o c m e p ' . l A e c . c c e o s m s ' - C f o r n o t m r o o l r - i A g l i l n o w ' - h O t r t i p g s i : n / ' m h y e a a p d p e . r c o m ' CORS isn’t your API being broken. It’s your browser protecting users. Understanding why it exists makes fixing it straightforward. ...

API Versioning: Breaking Changes Without Breaking Clients

Every API eventually needs to change in ways that break existing clients. Field removed, response format changed, authentication updated. The question isn’t whether you’ll have breaking changes — it’s how you’ll manage them. Good versioning gives you freedom to evolve while giving clients stability and migration paths. The Three Schools of Versioning URL Path Versioning G G E E T T / / a a p p i i / / v 1 2 / / u u s s e e r r s s / / 1 1 2 2 3 3 Pros: ...

API Rate Limiting Strategies That Don't Annoy Your Users

Every API needs rate limiting. Without it, one enthusiastic script kiddie or a bug in a client application can take down your entire service. The question isn’t whether to rate limit — it’s how to do it without making your API frustrating to use. The Naive Approach (And Why It Fails) 1 2 3 # Don't do this if requests_this_minute > 100: return 429, "Rate limit exceeded" Fixed limits per time window are simple to implement and almost always wrong. They create the “thundering herd” problem: all your users hit the limit at minute :00, back off, retry at :01, and create a synchronized spike that’s worse than no limit at all. ...

Defensive API Design: Building APIs That Survive the Real World

Your API will be called wrong. Clients will send garbage. Load will spike unexpectedly. Authentication will be misconfigured. The question isn’t whether these things happen — it’s whether your API degrades gracefully or explodes. Here’s how to build APIs that survive contact with the real world. Input Validation: Trust Nothing Every field, every header, every query parameter is hostile until proven otherwise. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 from pydantic import BaseModel, Field, validator from typing import Optional import re class CreateUserRequest(BaseModel): email: str = Field(..., max_length=254) name: str = Field(..., min_length=1, max_length=100) age: Optional[int] = Field(None, ge=0, le=150) @validator('email') def validate_email(cls, v): # Don't just regex — actually validate structure if not re.match(r'^[^@]+@[^@]+\.[^@]+$', v): raise ValueError('Invalid email format') return v.lower().strip() @validator('name') def sanitize_name(cls, v): # Remove control characters, normalize whitespace v = re.sub(r'[\x00-\x1f\x7f-\x9f]', '', v) return ' '.join(v.split()) Key principles: ...

LLM Prompt Engineering for DevOps Automation

LLMs are becoming infrastructure. Not just chatbots — actual components in automation pipelines. But getting reliable, parseable output requires disciplined prompt engineering. Here’s what works for DevOps use cases. The Core Problem LLMs are probabilistic. Ask the same question twice, get different answers. That’s fine for chat. It’s terrible for automation that needs to parse structured output. The solution: constrain the output format and validate aggressively. Pattern 1: Structured Output with JSON Mode Most LLM APIs now support JSON mode. Use it. ...

API Versioning Strategies: When and How to Break Things

Every API eventually needs to change in ways that break existing clients. The question isn’t whether to version — it’s how to version in a way that doesn’t make your users hate you. Here’s what actually works, with real trade-offs. The Three Schools of Versioning URL Path Versioning G G E E T T / / a a p p i i / / v 1 2 / / u u s s e e r r s s / / 1 1 2 2 3 3 Pros: ...

LLM API Integration Patterns: Building Resilient AI Applications

Integrating Large Language Model APIs into production applications requires more than just calling an endpoint. Here are battle-tested patterns for building resilient, cost-effective LLM integrations. The Retry Cascade LLM APIs are notorious for rate limits and transient failures. A simple exponential backoff isn’t enough — you need a cascade strategy: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 import asyncio from dataclasses import dataclass from typing import Optional @dataclass class LLMResponse: content: str model: str tokens_used: int class LLMCascade: def __init__(self): self.providers = [ ("anthropic", "claude-sonnet-4-20250514", 3), ("openai", "gpt-4o", 2), ("anthropic", "claude-3-haiku-20240307", 5), ] async def complete(self, prompt: str) -> Optional[LLMResponse]: for provider, model, max_retries in self.providers: for attempt in range(max_retries): try: return await self._call_provider(provider, model, prompt) except RateLimitError: await asyncio.sleep(2 ** attempt) except ProviderError: break # Try next provider return None The cascade falls through primary to fallback models, attempting retries at each level before moving on. ...

API Versioning: Strategies for Evolving Without Breaking

APIs are contracts. Breaking changes break trust. But APIs must evolve—new features, better designs, deprecated endpoints. The question isn’t whether to change, but how to change without leaving clients stranded. Why Versioning Matters Without versioning, you have two bad options: Never change: Your API calcifies, accumulating cruft forever Change freely: Clients break unexpectedly, trust erodes Versioning gives you a third path: evolve deliberately, with clear communication and migration windows. Versioning Strategies URL Path Versioning The most explicit approach—version in the URL: ...

Rate Limiting: Protecting Your APIs from Abuse and Overload

Every public API needs rate limiting. Without it, one misbehaving client can take down your entire service—whether through malice, bugs, or just enthusiasm. Rate limiting protects your infrastructure, ensures fair usage, and creates predictable behavior for all clients. The Core Algorithms Fixed Window Count requests in fixed time intervals (e.g., per minute): 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 class FixedWindowLimiter { constructor(redis, limit, windowSeconds) { this.redis = redis; this.limit = limit; this.windowSeconds = windowSeconds; } async isAllowed(clientId) { const window = Math.floor(Date.now() / 1000 / this.windowSeconds); const key = `ratelimit:${clientId}:${window}`; const count = await this.redis.incr(key); if (count === 1) { await this.redis.expire(key, this.windowSeconds); } return count <= this.limit; } } Pros: Simple, memory-efficient. Cons: Burst at window boundaries. Client could hit 100 requests at 0:59 and 100 more at 1:00. ...