Afunana — System Architecture

High-Level Architecture

+------------------+     +------------------+     +------------------+
|                  |     |                  |     |                  |
|   IBM i / AS400  |<--->|   Afunana App    |<--->|   SQL Server     |
|   (Source System)|     |   (Docker)       |     |   (Docker)       |
|                  |     |                  |     |                  |
+------------------+     +--------+---------+     +------------------+
                                  |
                         +--------+---------+
                         |                  |
                         |   Caddy Proxy    |
                         |   (HTTPS/TLS)    |
                         |                  |
                         +--------+---------+
                                  |
                    +-------------+-------------+
                    |                           |
              +-----+------+            +------+------+
              |            |            |             |
              |  Browser   |            |  VS Code    |
              |  (React)   |            |  Extension  |
              |            |            |             |
              +------------+            +-------------+

Components

1. Frontend (React SPA)

Framework: React 18 + TypeScript + Vite
UI Library: Shadcn UI (Radix primitives + Tailwind CSS)
Routing: React Router v6 with lazy-loaded pages
State: React Context (language, theme) + TanStack Query (server state)
Internationalization: Custom LanguageContext with 500+ translation keys, full RTL support
Theme: Light/dark mode with system preference detection

Key pages: Programs, Files, Tree, System Overview, Data Dictionary, Cross Reference, Chat, Tools, Admin (12+ sub-pages).

2. Backend (FastAPI)

Framework: Python FastAPI + Uvicorn
Auth: JWT (HS256; 15-minute idle timeout, 8-hour maximum session)
Database: SQL Server via pyodbc
AI: Multi-provider LLM routing (Anthropic Claude Sonnet 4.6/Haiku 4.5, OpenAI GPT-4.1/4.1-mini, Ollama)
Search: ChromaDB (semantic embeddings) + BM25 (full-text)
AS/400: JDBC via jaydebeapi + jt400.jar
MCP: Model Context Protocol server (7 tools for Claude Desktop integration)
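
The two session limits on the Auth line can be read as a sliding expiry capped by a hard ceiling. The following is a minimal sketch using only the Python standard library, not Afunana's actual auth code: the `sst` claim, the function names, and the idea of re-issuing a token on each request are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

IDLE_TIMEOUT_S = 15 * 60       # 15-minute idle timeout
MAX_SESSION_S = 8 * 60 * 60    # 8-hour maximum session


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, user: str, session_start: float, now: float) -> str:
    """Sign an HS256 JWT whose expiry honours both limits.

    Assumption: the server re-issues a token on each authenticated request,
    so `exp` slides forward by the idle timeout but never past the ceiling.
    """
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "sub": user,
        "sst": int(session_start),  # hypothetical claim: session start time
        "exp": int(min(now + IDLE_TIMEOUT_S, session_start + MAX_SESSION_S)),
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str, now: float) -> Optional[dict]:
    """Return the payload if the signature matches and the token is unexpired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        return None
    payload = json.loads(_b64url_decode(payload_b64))
    return payload if now < payload["exp"] else None
```

A production service would normally rely on a maintained JWT library and would also consult the revocation tables; this sketch only shows how the two timeouts interact.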

3. Database (SQL Server 2022)

Containerized or external instance
Tables: users, collections, config, audit log, chat sessions, tags, build history, spec docs, token revocation
TDE encryption at rest (AES-256)
Automated daily backups with 7-day retention

4. Caddy (Reverse Proxy)

Automatic HTTPS via Let's Encrypt
Routes: /api/* to backend, everything else to React SPA
Security headers (HSTS, CSP, X-Frame-Options, etc.)
Gzip compression
Serves static landing site at root domain

5. IBM i Connector

JDBC connection via jt400.jar
Extraction job submission (SBMJOB to TTDOCPGM1)
IFS file download (FTP binary mode)
Source member read/write for plan execution
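
As a rough illustration of the connector, the sketch below builds the JDBC URL and the SBMJOB command string and submits the job over the JDBC connection. Several details are assumptions, not taken from this document: the library name, the job name, the jar path, and the use of the `QSYS2.QCMDEXC` SQL procedure to run the CL command. The `jaydebeapi` import is deferred so the string helpers work without it.

```python
DRIVER_CLASS = "com.ibm.as400.access.AS400JDBCDriver"  # JDBC driver class in jt400.jar


def jdbc_url(host: str) -> str:
    """JDBC URL format used by the jt400 driver."""
    return f"jdbc:as400://{host}"


def sbmjob_command(library: str, job_name: str = "TTDOC") -> str:
    """CL command that submits TTDOCPGM1 as a batch job.

    `library` and `job_name` are hypothetical; the document does not name them.
    """
    return f"SBMJOB CMD(CALL PGM({library}/TTDOCPGM1)) JOB({job_name})"


def qcmdexc_sql(command: str) -> str:
    """Wrap a CL command in a QSYS2.QCMDEXC call, doubling single quotes."""
    escaped = command.replace("'", "''")
    return f"CALL QSYS2.QCMDEXC('{escaped}')"


def submit_extraction(host: str, user: str, password: str, library: str,
                      jar_path: str = "jt400.jar") -> None:
    """Open a JDBC connection and submit the extraction job (assumed flow)."""
    import jaydebeapi  # deferred: requires a JVM and jt400.jar at runtime

    conn = jaydebeapi.connect(DRIVER_CLASS, jdbc_url(host), [user, password], jar_path)
    try:
        cursor = conn.cursor()
        cursor.execute(qcmdexc_sql(sbmjob_command(library)))
        cursor.close()
    finally:
        conn.close()
```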

Data Flow

Source Extraction Flow

IBM i -> TTDOCPGM1 batch job -> IFS output files -> FTP download -> Data/{collection}/

Extracted files:

TTPGMOUT.csv — Program list with metadata
TTPGM2PGM.csv — Call relationships (caller to callee)
TTPGM2FIL.csv — Program-to-file usage (read/write/update)
TTFIL2FLD.csv — File-to-field definitions
TTFLDKEY.csv — Key field definitions
Source members, written to the programs/ directory and converted from EBCDIC to UTF-8
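
To make the extracted files more concrete, here is a small sketch that loads the call relationships into an adjacency map and decodes an EBCDIC source member. The column names `CALLER` and `CALLED` and the code page `cp037` are assumptions; the real CSV header and the system's CCSID may differ.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Set


def load_call_graph(csv_text: str) -> Dict[str, Set[str]]:
    """Build a caller -> {callees} map from TTPGM2PGM.csv content.

    Assumes a header row with CALLER and CALLED columns (hypothetical names).
    """
    graph: Dict[str, Set[str]] = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph[row["CALLER"].strip()].add(row["CALLED"].strip())
    return dict(graph)


def decode_source_member(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to a Python string, ready to be written as UTF-8.

    cp037 (US/Canada EBCDIC) is an assumed default, not taken from the document.
    """
    return raw.decode(codepage)
```

For example, `decode_source_member(b"\xc8\xc5\xd3\xd3\xd6")` yields `"HELLO"` under cp037.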

Build Pipeline Flow

CSV files -> Parse -> LLM Analysis (batched) -> JSON output -> Embeddings -> Search indices

Stages:

Parse — CSV to structured data (programs, files, relationships)
Build Documents — LLM generates program JSON (purpose, IO, calls, files, errors)
System Overview — LLM generates architecture analysis
Embeddings — ChromaDB stores semantic vectors for RAG
Auto-Tagging — Classify programs by business function
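
The batched LLM analysis step could group inputs along the lines of the sketch below. It assumes the batch limits stated under Scalability Considerations (18 sections, 80K chars) are both upper bounds per batch, and it uses a simple greedy packing strategy; the actual pipeline may split work differently.

```python
from typing import Iterable, List

MAX_SECTIONS_PER_BATCH = 18      # limit from Scalability Considerations
MAX_CHARS_PER_BATCH = 80_000     # limit from Scalability Considerations


def make_batches(sections: Iterable[str],
                 max_sections: int = MAX_SECTIONS_PER_BATCH,
                 max_chars: int = MAX_CHARS_PER_BATCH) -> List[List[str]]:
    """Greedily pack sections into batches without exceeding either limit.

    A single section longer than max_chars still gets a batch of its own,
    since this sketch does not split sections.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    size = 0
    for section in sections:
        would_overflow = current and (
            len(current) >= max_sections or size + len(section) > max_chars
        )
        if would_overflow:
            batches.append(current)
            current, size = [], 0
        current.append(section)
        size += len(section)
    if current:
        batches.append(current)
    return batches
```

Forty short sections would come out as batches of 18, 18, and 4, while a few very large sections would be split earlier by the character limit.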

Request Flow

Browser -> Caddy (HTTPS) -> FastAPI -> Auth middleware -> Route handler -> Response
                                             |
                                        SQL Server (user/collection validation)
                                             |
                                        Data/{collection}/ (program JSON, source code)
                                             |
                                        LLM Provider (if chat or doc generation)

Data Storage

Per-Collection Directory

Data/{collection_name}/
|-- system_overview.json          # Architecture analysis
|-- output_program/               # One JSON per program (structured metadata)
|-- programs/                     # COBOL/RPG/CL source text
|-- programs_csv/                 # Call relationship data
|-- info-from-as400/              # Raw metadata CSVs from extraction
|-- chroma_store/                 # ChromaDB vector embeddings
|-- bm25_store/                   # BM25 full-text indices
|-- build_progress.json           # Live build status
+-- generated_docs/               # User-uploaded documentation
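
Because collection and program names end up in file paths, a handler that reads from this layout would typically validate them first. The sketch below is one way to resolve a program's JSON file safely; the `{program}.json` naming scheme and the function name are assumptions.

```python
from pathlib import Path

DATA_ROOT = Path("Data")


def program_json_path(collection: str, program: str, root: Path = DATA_ROOT) -> Path:
    """Resolve Data/{collection}/output_program/{program}.json.

    Raises ValueError if either name would escape the expected directory
    (for example via ".." segments or embedded path separators).
    """
    root_resolved = root.resolve()
    base = (root_resolved / collection / "output_program").resolve()
    candidate = (base / f"{program}.json").resolve()
    # base must sit exactly two levels below the data root,
    # and the file must sit directly inside base.
    if base.parent.parent != root_resolved or candidate.parent != base:
        raise ValueError("collection or program name escapes the data directory")
    return candidate
```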

Database Tables

Table	Purpose
`AFUNANA_USERS`	User accounts (username, email, password hash, role, status)
`user_collections`	User-to-collection access mapping
`col_packs`	Collection metadata (name, language, status, AS/400 config)
`app_config`	Runtime configuration (200+ keys, categorized)
`security_event_log`	Immutable audit trail with hash chain
`build_history`	Build status, timing, error tracking
`chat_sessions` / `chat_messages`	Persistent chat history
`generated_spec_docs`	Cached AI-generated specification documents
`collection_tags` / `entity_tags`	Tag definitions and assignments
`revoked_tokens` / `revoked_sessions`	Token/session invalidation
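
The hash chain behind `security_event_log` can be illustrated with a short sketch. It assumes SHA-256 over the previous hash plus a canonical JSON encoding of the event; the actual column layout and hashing scheme are not described in this document.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # assumed starting value for an empty log


def entry_hash(prev_hash: str, event: Dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: List[Dict], event: Dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, event),
    })


def verify_chain(log: List[Dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered row breaks the chain."""
    prev_hash = GENESIS_HASH
    for row in log:
        if row["prev_hash"] != prev_hash:
            return False
        if row["hash"] != entry_hash(prev_hash, row["event"]):
            return False
        prev_hash = row["hash"]
    return True
```

Changing any stored event after the fact makes `verify_chain` return `False` from that row onward, which is what makes the trail tamper-evident.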

Network Architecture

Port	Service	Access
443	Caddy (HTTPS)	Public
80	Caddy (HTTP redirect)	Public
8001	FastAPI	Internal (127.0.0.1 only)
1433	SQL Server	Internal (container network)
8080	Adminer (DB GUI)	Internal (127.0.0.1 only)
9000	Deploy Receiver	Internal (Docker bridge)

Scalability Considerations

LLM Processing: Configurable parallelism (MAX_PARALLEL_LLM, default 6 concurrent calls)
Build Batching: Programs processed in batches (18 sections, 80K chars per batch)
Caching: In-memory caches for config, programs, collections; session storage for UI state
Database: Connection pooling, parameterized queries, indexed lookups
Frontend: Code splitting, lazy loading, TanStack Query cache (5-minute stale time, 30-minute garbage collection)
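
The LLM parallelism limit can be pictured as a semaphore around each provider call. This is a minimal sketch, assuming `MAX_PARALLEL_LLM` is read from an environment variable and that the provider call is an async function; `run_bounded` and `worker` are hypothetical names, and the real setting may live in `app_config` instead.

```python
import asyncio
import os
from typing import Any, Awaitable, Callable, Iterable, List

# Assumption: the setting is exposed as an environment variable; default is 6.
MAX_PARALLEL_LLM = int(os.environ.get("MAX_PARALLEL_LLM", "6"))


async def run_bounded(jobs: Iterable[Any],
                      worker: Callable[[Any], Awaitable[Any]],
                      limit: int = MAX_PARALLEL_LLM) -> List[Any]:
    """Run worker(job) for every job with at most `limit` calls in flight.

    Results are returned in the same order as the input jobs.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Any) -> Any:
        async with semaphore:  # waits here once `limit` calls are active
            return await worker(job)

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A caller would pass the list of batches and an async function that sends one batch to the configured provider, for example `asyncio.run(run_bounded(batches, call_llm))`, where `call_llm` is a placeholder.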