Afunana Documentation

AI Pipeline & Chat

LLM routing, build stages, specification documents, chat/RAG system, and MCP tools.

LLM Provider Routing

Supported Providers

| Provider | Models | Use Case |
| --- | --- | --- |
| Anthropic | Claude Sonnet 4.6, Claude Haiku 4.5 | Primary provider for all roles |
| OpenAI | GPT-4.1, GPT-4.1-mini, GPT-4o | Fallback or alternative provider |
| Ollama | Any local model | Air-gapped environments |

Role-Based Model Assignment

Each AI task has a "role" with its own model configuration:

| Role | Default Model | Fallback | Purpose |
| --- | --- | --- | --- |
| builder | Claude Sonnet 4.6 | GPT-4.1 | Program documentation generation |
| file_docs | Claude Sonnet 4.6 | GPT-4.1-mini | File/field documentation |
| system_overview | Claude Sonnet 4.6 | GPT-4.1 | Architecture analysis |
| chat_planner | Claude Haiku 4.5 | GPT-4.1-mini | Query intent detection, context planning |
| chat_answer | Claude Sonnet 4.6 | GPT-4.1 | Full chat responses with source code |
| chat_answer_simple | Claude Haiku 4.5 | GPT-4.1-mini | Quick answers without code analysis |
| chat_classifier | Claude Haiku 4.5 | GPT-4.1-mini | Intent classification |
| code_developer | Claude Sonnet 4.6 | GPT-4.1 | COBOL code generation from specs |
| spec_doc | Claude Haiku 4.5 | GPT-4.1-mini | Specification document generation |
| attachment_ocr | Claude Haiku 4.5 | GPT-4o | Image text extraction (OCR) |

Fallback & Extended Thinking

Each role has a primary and fallback model. If the primary fails (rate limit, timeout, error), the fallback is tried automatically.
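The primary/fallback behavior can be sketched as follows. This is a minimal illustration, not Afunana's actual code: `ROLE_MODELS`, `call_with_fallback`, and the injected `call_model` callable are hypothetical names, and only two roles from the table are included.

```python
from typing import Callable

# Hypothetical excerpt of the role table: role -> (primary model, fallback model).
ROLE_MODELS = {
    "builder": ("Claude Sonnet 4.6", "GPT-4.1"),
    "chat_planner": ("Claude Haiku 4.5", "GPT-4.1-mini"),
}


def call_with_fallback(role: str, prompt: str,
                       call_model: Callable[[str, str], str]) -> str:
    """Try the role's primary model; on any failure, try the fallback once."""
    primary, fallback = ROLE_MODELS[role]
    try:
        return call_model(primary, prompt)
    except Exception:
        # Rate limits, timeouts, and other errors all take the same path here.
        # If the fallback also fails, its exception propagates to the caller.
        return call_model(fallback, prompt)
```

Passing the model call in as a parameter keeps the sketch independent of any particular provider SDK.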

For complex tasks, "extended thinking" lets the model reason internally before responding. The thinking budget is configurable per role (e.g., BUILDER_THINKING_BUDGET=8000 tokens). Extended thinking requires the Anthropic provider.
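As a rough sketch of how a per-role budget might be turned into request parameters: the `{ROLE}_THINKING_BUDGET` naming pattern is inferred from BUILDER_THINKING_BUDGET, and the helper name `thinking_params` and the provider string `"anthropic"` are assumptions. The returned dictionary follows the shape of the `thinking` parameter in the Anthropic Messages API.

```python
import os


def thinking_params(role: str, provider: str, env=os.environ) -> dict:
    """Return extra request parameters for extended thinking, or {} if unused."""
    # Assumed naming convention: BUILDER_THINKING_BUDGET for the "builder" role.
    budget = int(env.get(f"{role.upper()}_THINKING_BUDGET", "0"))
    if provider != "anthropic" or budget <= 0:
        return {}
    return {"thinking": {"type": "enabled", "budget_tokens": budget}}
```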

Prompt Caching

For Anthropic models, large system prompts (>4000 chars) are automatically marked for ephemeral caching, reducing cost on repeated calls with the same system prompt.
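The caching rule might look like the sketch below. The function name and constant are hypothetical; the block structure with `cache_control` of type `ephemeral` follows the Anthropic Messages API format for system prompts.

```python
CACHE_MIN_CHARS = 4000  # threshold stated above


def build_system_blocks(system_prompt: str) -> list[dict]:
    """Wrap a system prompt as a content block, marking large prompts cacheable."""
    block = {"type": "text", "text": system_prompt}
    if len(system_prompt) > CACHE_MIN_CHARS:
        # Only prompts strictly longer than the threshold are marked.
        block["cache_control"] = {"type": "ephemeral"}
    return [block]
```

The returned list would be passed as the `system` argument of the request; short prompts go through unchanged apart from the wrapping.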

Build Pipeline

Stage 1: Parse

Reads CSV metadata files from the AS/400 extraction: TTPGMOUT.csv (program list), TTPGM2PGM.csv (call relationships), TTPGM2FIL.csv (file access), TTFIL2FLD.csv (field definitions).
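A minimal sketch of the parse stage, assuming each CSV has a header row and is UTF-8 encoded (neither is stated above). The dictionary keys and the function name `load_metadata` are illustrative only.

```python
import csv
from pathlib import Path

# File names come from the text; the short keys are hypothetical labels.
METADATA_FILES = {
    "programs": "TTPGMOUT.csv",
    "calls": "TTPGM2PGM.csv",
    "file_access": "TTPGM2FIL.csv",
    "fields": "TTFIL2FLD.csv",
}


def load_metadata(directory: str) -> dict[str, list[dict[str, str]]]:
    """Read the four extraction files into lists of row dictionaries."""
    tables = {}
    for key, filename in METADATA_FILES.items():
        path = Path(directory) / filename
        with open(path, newline="", encoding="utf-8") as handle:
            tables[key] = list(csv.DictReader(handle))
    return tables
```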

Stage 2: Build Program Documents

For each program, the LLM analyzes the source code and generates structured documentation.

Output per program (output_program/{PROGRAM}.json):

{
  "documentMetadata": { "programName", "generatedAt", "language" },
  "programInfo": { "purpose", "businessContext", "module", "description" },
  "io": { "inputs": [...], "outputs": [...] },
  "fileAccess": [{ "fileName", "accessType", "fields": [...] }],
  "callGraph": { "callsThisPgm": [...], "thisPgmCalls": [...] },
  "errorHandling": [{ "code", "description" }],
  "changeImpactReport": { ... }
}
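A consumer of these files could check that the top-level sections listed above are present before using a document. This is only a sketch; `missing_sections` is a hypothetical helper and it does not validate the nested fields.

```python
import json

# Top-level keys taken from the output structure shown above.
EXPECTED_SECTIONS = [
    "documentMetadata",
    "programInfo",
    "io",
    "fileAccess",
    "callGraph",
    "errorHandling",
    "changeImpactReport",
]


def missing_sections(document_json: str) -> list[str]:
    """Return the expected top-level sections absent from a program document."""
    document = json.loads(document_json)
    return [name for name in EXPECTED_SECTIONS if name not in document]
```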

Stage 3: System Overview

The LLM analyzes the complete call graph, file dependencies, and program documentation to generate a system-level architecture document with narrative description, module identification, key business flows, risk assessment, and statistics.

Stage 4: Embedding & Indexing

Stage 5: Auto-Tagging

The AI classifies each program by business function and assigns appropriate tags from the collection's tag set.

Progress Tracking

Specification Document Generation

On-demand generation of formal specification documents at three audience levels:

| Audience | Title | Content Focus |
| --- | --- | --- |
| business | Business Specification | Business purpose, rules, stakeholder impact |
| analyst | Systems Analysis Document | Data flows, interfaces, business logic |
| programmer | Program Specs | IO structures, call graph, error handling, change impact |

Documents are cached in the generated_spec_docs table. Stale detection compares the document's generation timestamp with the modification time of the program JSON; a document older than its program JSON is treated as stale. Manual regeneration is available via the "Regenerate" button.
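The staleness check could be sketched like this, assuming the generation timestamp is stored as a timezone-aware datetime and the program JSON lives on the local filesystem. `is_stale` is a hypothetical name.

```python
import os
from datetime import datetime, timezone


def is_stale(generated_at: datetime, program_json_path: str) -> bool:
    """True when the program JSON was modified after the spec was generated."""
    modified = datetime.fromtimestamp(
        os.path.getmtime(program_json_path), tz=timezone.utc
    )
    # generated_at must be timezone-aware for this comparison to be valid.
    return modified > generated_at
```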

Chat System

Afunana's chat interface allows users to ask natural language questions about their codebase using a Retrieval-Augmented Generation (RAG) pipeline.

| Mode | Purpose | API |
| --- | --- | --- |
| Ask | Q&A about the codebase (no changes) | /alerts or /alerts/v2 |
| Plan | Generate change plans with approval workflow | /alerts/v2 with planning |

Query Processing Pipeline

Step 1: Intent Classification

The chat_classifier role (Claude Haiku) determines the query type: bug analysis, feature request, design question, documentation lookup, or code generation.

Step 2: Query Planning

The chat_planner role analyzes the query and determines what context is needed: which programs are relevant, which files to examine, whether source code is needed, and what search queries to run.

Step 3: Context Retrieval (Hybrid Search)

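Two search systems run in parallel: semantic search and full-text (BM25) search, sized by CHROMA_SEARCH_K and BM25_TOP_K (see Chat Configuration). Matches are weighted by source: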

| Source | Weight |
| --- | --- |
| Source code match | 0.50 |
| Documentation hit | 0.35 |
| Code keyword match | 0.25 |
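The text does not say how these weights are combined. One plausible reading, sketched below purely as an assumption, is that each candidate program accumulates the weight of every kind of match it received (counted once per kind) and candidates are ranked by the total. The match-kind keys and `rank_programs` are hypothetical names.

```python
# Weights from the table above; the keys are illustrative labels.
WEIGHTS = {"source_code": 0.50, "documentation": 0.35, "code_keyword": 0.25}


def rank_programs(hits: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Rank programs by summed weights. hits are (program, match_kind) pairs."""
    kinds_by_program: dict[str, set[str]] = {}
    for program, kind in hits:
        # A set ensures each match kind counts once per program.
        kinds_by_program.setdefault(program, set()).add(kind)
    scored = [
        (program, round(sum(WEIGHTS[kind] for kind in kinds), 2))
        for program, kinds in kinds_by_program.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```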

Step 4: Source Code Selection

| Scenario | Max Lines |
| --- | --- |
| Normal query | 500 lines |
| Bug analysis | 1,500 lines |
| Large program threshold | 500 lines |
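A simplified sketch of applying these limits. It assumes the limit is enforced by keeping the first N lines, which the text does not confirm; the real selection may instead pick the most relevant sections. The scenario keys and `select_source` are hypothetical.

```python
# Limits from the table above, keyed by hypothetical scenario names.
MAX_LINES = {"normal": 500, "bug_analysis": 1500}


def select_source(source: str, scenario: str = "normal") -> str:
    """Trim program source to the line budget for the given scenario."""
    limit = MAX_LINES.get(scenario, MAX_LINES["normal"])
    lines = source.splitlines()
    # Assumption: simple head truncation stands in for the real selection logic.
    return "\n".join(lines[:limit])
```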

Step 5: Response Generation

The chat_answer role (Claude Sonnet) generates the response with extended thinking enabled (10,000-token budget by default). The response includes markdown formatting, code snippets, and line references, with citations linking to specific programs and line numbers.

Step 6: Streaming

Responses stream token by token to the frontend, so users see the answer as it is generated. A Stop button cancels generation mid-stream.
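Cancellable streaming can be illustrated with a generator that checks a stop flag between tokens. This is a generic sketch, not the product's transport layer; `stream_tokens` and the use of `threading.Event` as the stop signal are assumptions.

```python
import threading
from typing import Iterable, Iterator


def stream_tokens(tokens: Iterable[str], stop: threading.Event) -> Iterator[str]:
    """Yield tokens one at a time until the source ends or stop is set."""
    for token in tokens:
        if stop.is_set():
            # The Stop button would set this event from another thread or task.
            break
        yield token
```

Because the flag is checked before each token, a stop request takes effect at the next token boundary rather than instantly.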

Chat Session Management

Chat Attachments

| Type | Formats | Processing |
| --- | --- | --- |
| Images | PNG, JPEG, WebP, GIF | OCR via vision LLM |
| Text | TXT, CSV, LOG | Direct text extraction |

Limits are 10 MB per file and 5 files per message. Extracted text is added to the query context.
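The limits and format routing could be enforced along these lines. The sketch assumes 1 MB means 1024 * 1024 bytes, that formats are detected by file extension, and that `.jpg` is accepted alongside `.jpeg`; none of this is confirmed above, and `plan_attachments` is a hypothetical name.

```python
from pathlib import Path

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: binary megabytes
MAX_FILES_PER_MESSAGE = 5
PROCESSING_BY_EXTENSION = {
    ".png": "ocr", ".jpg": "ocr", ".jpeg": "ocr", ".webp": "ocr", ".gif": "ocr",
    ".txt": "text", ".csv": "text", ".log": "text",
}


def plan_attachments(attachments: list[tuple[str, int]]) -> list[str]:
    """Validate (filename, size_bytes) pairs and return a processing step each."""
    if len(attachments) > MAX_FILES_PER_MESSAGE:
        raise ValueError("too many attachments for one message")
    plan = []
    for filename, size_bytes in attachments:
        if size_bytes > MAX_FILE_BYTES:
            raise ValueError(f"{filename} exceeds the size limit")
        suffix = Path(filename).suffix.lower()
        if suffix not in PROCESSING_BY_EXTENSION:
            raise ValueError(f"{filename} has an unsupported format")
        plan.append(PROCESSING_BY_EXTENSION[suffix])
    return plan
```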

MCP Tools (Model Context Protocol)

Afunana exposes 7 MCP tools for integration with Claude Desktop and other MCP clients:

| Tool | Purpose |
| --- | --- |
| get_alerts | Chat query with RAG retrieval |
| build_all | Trigger full collection rebuild |
| get_info_pgm | Fetch program metadata |
| get_all_pgms | List all programs |
| get_all_fils | List all files |
| tool_get_html_tre | Get call tree HTML |
| add_doc_to_docs | Upload documentation |
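MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for get_info_pgm. The argument name `program` and the value `PGM001` are hypothetical, since the tools' input schemas are not documented here; a client would normally discover them via `tools/list`.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Hypothetical argument name and program identifier, for illustration only.
print(tools_call_request(1, "get_info_pgm", {"program": "PGM001"}))
```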

Chat Configuration

| Setting | Default | Description |
| --- | --- | --- |
| CHROMA_SEARCH_K | 10 | Semantic search result count |
| BM25_TOP_K | 15 | Full-text search result count |
| CHAT_ANSWER_MAX_TOKENS | 12000 | Max response tokens |
| CHAT_ANSWER_THINKING_BUDGET | 10000 | Extended reasoning budget |
| CHAT_SOURCE_CODE_MAX_LINES | 500 | Max source lines in context |
| CHAT_SOURCE_CODE_BUG_MAX_LINES | 1500 | Max lines for bug analysis |
| CHAT_CACHE_ENABLED | false | Cache similar responses |
| CHAT_CACHE_THRESHOLD | 0.82 | Similarity threshold for cache hit |
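Assuming these settings are supplied as environment variables (the text does not say how they are set), loading them with the documented defaults might look like this. `load_chat_config` and the accepted spellings of boolean values are assumptions.

```python
import os

# Defaults from the table above.
DEFAULTS = {
    "CHROMA_SEARCH_K": 10,
    "BM25_TOP_K": 15,
    "CHAT_ANSWER_MAX_TOKENS": 12000,
    "CHAT_ANSWER_THINKING_BUDGET": 10000,
    "CHAT_SOURCE_CODE_MAX_LINES": 500,
    "CHAT_SOURCE_CODE_BUG_MAX_LINES": 1500,
    "CHAT_CACHE_ENABLED": False,
    "CHAT_CACHE_THRESHOLD": 0.82,
}


def load_chat_config(env=os.environ) -> dict:
    """Merge overrides from env onto the defaults, keeping each default's type."""
    config = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            config[name] = default
        elif isinstance(default, bool):
            # Checked before int because bool is a subclass of int.
            config[name] = raw.strip().lower() in {"1", "true", "yes"}
        else:
            config[name] = type(default)(raw)
    return config
```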

Privacy

Logging of chat content to the audit trail is configurable (AUDIT_LOG_CHAT_CONTENT). When it is disabled, only the event type and metadata are logged. Chat sessions are scoped per user and per collection, so users cannot see each other's conversations.
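The effect of AUDIT_LOG_CHAT_CONTENT on an audit entry can be sketched as follows. The record fields and the function name `audit_record` are hypothetical; only the behavior (content omitted when logging is disabled) comes from the text.

```python
def audit_record(event_type: str, metadata: dict, content: str,
                 log_content: bool) -> dict:
    """Build an audit entry, including chat content only when enabled."""
    record = {"event_type": event_type, "metadata": dict(metadata)}
    if log_content:
        record["content"] = content
    return record
```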