Tutorials · March 31, 2026 · 16 min read · 3,742 words

Deploy AI Chatbot Docker: Production Setup in 5 Min

Deploy a production AI chatbot with Docker Compose in 5 minutes. Self-hosted five-container stack with RAG, multi-LLM support, and admin panel. €79 one-time.

getagent.chat

Most "deploy AI chatbot with Docker" tutorials show you a toy. A single Python container, a hardcoded OpenAI key, no database, no admin panel, no way to manage anything. You follow the steps, it works on localhost, and then you're on your own the moment you need a real user-facing product.

This guide builds the real thing. We're deploying AI Chat Agent — a production-ready, five-container stack with a Node.js API server, React admin panel, PostgreSQL with pgvector for RAG, Redis, and Nginx — all wired together with a single docker-compose.yml. By the end you'll have a fully operational self-hosted AI chatbot on your VPS: knowledge base, embeddable widget, and support for multiple LLM providers. No monthly SaaS fees. Full data control.

[Diagram: the 5-container production stack — internet traffic enters via nginx (ports 80/443), which routes to server (Node.js API, :3000) and admin (React UI, :4173); both share db (PostgreSQL + pgvector, :5432) and redis (cache and sessions, :6379) over the internal Docker network.]
The full five-container stack: Nginx fronts all traffic, routing to the Node.js API and React admin panel, which share PostgreSQL and Redis on the internal Docker network.

Why Deploy AI Chatbots with Docker?

If you've ever tried to install a Node.js API, a React build server, PostgreSQL, and Redis on a bare VPS manually, you know what dependency hell looks like. Different package versions, conflicting system libraries, environment variables scattered across config files you'll never find again at 2am when something breaks.

Docker solves this by giving each service its own isolated environment. Your API server doesn't care what version of libssl is on the host. Your database doesn't conflict with the one running for another project. Everything is defined in code, version-controlled, and reproducible across any machine that runs Docker. If you're still evaluating which platform to deploy, see our self-hosted chatbot comparison first.

For AI chatbots specifically, the benefits compound:

  • Reproducibility: The exact same stack that runs on your laptop runs on your VPS. "It works on my machine" stops being a sentence anyone says.
  • One-command deploy: docker compose up -d starts every service in the right order with health checks. No manual sequencing.
  • Portability: Move from Hetzner to DigitalOcean to AWS without reinstalling anything. Your docker-compose.yml goes with you.
  • Isolation: A memory leak in the admin panel doesn't take down the API server. Containers fail independently and restart independently.
  • Upgrades without downtime: Pull a new image, restart one container. The rest keep serving traffic.

This matters even more when your stack has a vector database, a background job processor, and a WebSocket server all trying to share the same host. Docker Compose gives you a clean mental model: each container has one job, defined ports, and explicit dependencies. You can read the entire infrastructure in one file.
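The repository ships the real docker-compose.yml, but the "read the entire infrastructure in one file" idea is easier to trust once you've seen its shape. The sketch below uses the service names from the stack above; the image names, build paths, and health checks are illustrative assumptions, not the shipped file:

```yaml
# Minimal sketch of the 5-service compose layout (illustrative, not the
# repository's actual file). Only nginx, server, and admin publish ports;
# db and redis stay on the internal bridge network.
services:
  db:
    image: pgvector/pgvector:pg16        # assumption: any pgvector-enabled PG16 image
    env_file: .env
    volumes: ["pgdata:/var/lib/postgresql/data"]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER}"]
      interval: 5s
  redis:
    image: redis:7
  server:
    build: ./server                      # assumption: build context name
    env_file: .env
    ports: ["3000:3000"]
    depends_on:
      db: { condition: service_healthy } # start only after PG answers pg_isready
      redis: { condition: service_started }
  admin:
    build: ./admin
    ports: ["4173:4173"]
    depends_on: [server]
  nginx:
    image: nginx:1.27
    ports: ["80:80", "443:443"]
    depends_on: [admin, server]
volumes:
  pgdata:
```

The `depends_on` conditions are what make `docker compose up -d` start everything in the right order without manual sequencing.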

The alternative — a SaaS chatbot platform — trades this control for a monthly invoice. Compared to Intercom, for example, you're looking at €79/month just to get started with basic features. With a self-hosted Docker setup, that's your entire software cost for life. For the business case behind the switch, see how AI chatbots cut support ticket volume in practice.

Docker Compose AI Chatbot Architecture: The 5-Container Stack

Understanding what you're deploying before you deploy it is the difference between a box you trust and a black box that scares you. Here's what the stack looks like and why each container is there.

[Diagram: container data flow — browser widget and admin traffic arrives over HTTPS at nginx (SSL termination, traffic routing, security headers), which forwards to server (:3000 — JWT auth, chat sessions, RAG pipeline, LLM routing, AES-256 encryption) and admin (:4173 — React 18 SPA, 10 management pages, internal access only). The server talks to db (PostgreSQL 16 + pgvector: vector embeddings, chat and config data) over SQL, to redis (session storage, rate limiting, job queuing) as cache, and to external LLM APIs (OpenAI, Anthropic, Gemini, Ollama). All inter-container traffic stays on the internal Docker bridge network — nothing is exposed except nginx ports 80/443.]
Request flow through the stack: Nginx routes inbound HTTPS to either the API server or the admin panel. The server exclusively handles LLM calls, database reads/writes, and Redis caching. The admin panel communicates with the server only via internal REST.

server — The API Layer (port 3000)

A Node.js + Express application written in TypeScript. This is the brain of the operation: it handles chat sessions, manages bot configurations, processes knowledge base uploads, talks to AI providers, runs rate limiting, and exposes the REST API that the admin panel and the widget both consume. JWT authentication is enforced here. AES-256 encryption protects API keys at rest.

admin — The Management Panel (port 4173)

A React 18 single-page application that gives you 10 pages of controls: bot configuration, AI provider settings, knowledge base management, chat history, lead management, notification settings, and more. It talks exclusively to the server container via the internal Docker network. External users never reach it directly — Nginx decides who can.

db — PostgreSQL 16 + pgvector

The persistence layer for everything: chat sessions, bot configurations, leads, user accounts, and vector embeddings. The pgvector extension turns a standard PostgreSQL instance into a vector database, enabling semantic similarity search for the RAG knowledge base — cosine similarity search over document chunks, all inside the same container your app already trusts for structured data.

redis — Cache and Session Store

Redis 7 handles session storage, API rate limiting counters, and background job queuing. When 50 users hit the widget simultaneously, Redis absorbs the burst. It also decouples real-time operations from the database, keeping response times low under load.

nginx — The Front Door (ports 80/443)

Nginx terminates SSL, routes traffic to the right container, serves the widget's static JavaScript, and enforces security headers. It's what your domain points to. Everything else is on the internal Docker network and invisible to the outside world.

Prerequisites

Before you run a single command, make sure these are in place. Cutting corners here is how you end up spending three hours debugging something that has nothing to do with the chatbot.

  • VPS with at least 2GB RAM. The full stack idles at around 800MB–1GB, which leaves headroom for traffic. Hetzner CX22 (€4.15/month), DigitalOcean Droplet (€12/month), or Vultr are all reasonable choices for deploying a chatbot to a VPS.
  • Docker Engine 24+ and Docker Compose v2. Run docker --version and docker compose version to confirm. If you're on Ubuntu 22.04 or 24.04, the official Docker install script handles both: curl -fsSL https://get.docker.com | sh.
  • A domain name (optional but recommended). Without a domain, you can still access the admin panel by IP. With one, Nginx handles SSL via Let's Encrypt and your widget loads over HTTPS — which is required for embedding on most modern sites.
  • An AI provider API key. OpenAI is the default (GPT-4o-mini). Anthropic Claude, Google Gemini, or any OpenAI-compatible endpoint also work — you configure this per bot, not globally. You only need one to get started.
  • Git. To clone the repository. That's it.

If you're running this on Windows locally for testing, Docker Desktop covers you. For production, a Linux VPS is strongly preferred.

Step 1: Clone and Configure

Once you have the license, you'll receive access to the repository. Clone it onto your VPS:

git clone https://github.com/your-org/ai-chat-agent.git
cd ai-chat-agent

The repository includes a .env.example file. Copy it and fill in your values:

cp .env.example .env
nano .env

Critical variables to set:

# Application
NODE_ENV=production
PORT=3000

# Database
POSTGRES_DB=ai_chat_agent
POSTGRES_USER=chatbot
POSTGRES_PASSWORD=your_strong_password_here

# Redis
REDIS_URL=redis://redis:6379

# Security — generate with: openssl rand -hex 32
JWT_SECRET=your_jwt_secret_64_chars_minimum
ENCRYPTION_KEY=your_aes_256_key_exactly_32_chars

# AI Provider (choose one to start)
OPENAI_API_KEY=sk-proj-...

# Optional: Domain for SSL
DOMAIN=chat.yourdomain.com
ADMIN_DOMAIN=admin.yourdomain.com

Key points:

  • The ENCRYPTION_KEY must be exactly 32 characters — it's used for AES-256 encryption of API keys stored in the database. Generate it with openssl rand -hex 16 (produces 32 hex chars).
  • JWT_SECRET should be at least 64 random characters. Use openssl rand -hex 32.
  • Never commit .env to version control. It's in .gitignore by default, but double-check.
  • You don't need to set all AI provider keys upfront. You configure which provider each bot uses from inside the admin panel.
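A quick way to generate both secrets in one go — lengths chosen to match the constraints above (64+ characters for JWT_SECRET, exactly 32 characters for ENCRYPTION_KEY):

```shell
# Generate the two secrets the .env file requires.
# openssl rand -hex N emits 2*N hex characters.
JWT_SECRET=$(openssl rand -hex 32)       # 64 hex chars
ENCRYPTION_KEY=$(openssl rand -hex 16)   # exactly 32 hex chars (AES-256 key material)
echo "JWT_SECRET=$JWT_SECRET"
echo "ENCRYPTION_KEY=$ENCRYPTION_KEY"
```

Paste the two printed lines into `.env` rather than typing secrets by hand.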

Step 2: Deploy Your AI Chatbot with Docker Compose

With the environment configured, bring up the stack:

docker compose up -d

Docker pulls all images on first run, creates the internal network, and starts containers in dependency order: db and redis first, then server once the database is healthy, then admin, then nginx.

Tail the logs to confirm everything starts cleanly:

docker compose logs -f

On a fresh VPS, expect:

  • Image pulls: 1–2 minutes (only first time)
  • Database initialization: 15–30 seconds (schema migrations run automatically)
  • Server ready: ~10 seconds after db is healthy
  • Total (excluding image pulls): Under 60 seconds

Check that all five containers are running:

docker compose ps

You should see output like:

NAME                    STATUS          PORTS
ai-chat-agent-server    Up (healthy)    0.0.0.0:3000->3000/tcp
ai-chat-agent-admin     Up (healthy)    0.0.0.0:4173->4173/tcp
ai-chat-agent-db        Up (healthy)    5432/tcp
ai-chat-agent-redis     Up (healthy)    6379/tcp
ai-chat-agent-nginx     Up              0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp

If any container shows Exit 1, check its logs: docker compose logs server. Nine times out of ten, a misconfigured .env variable is the cause.

[Timeline: 1. Clone — git clone the repository (~30s) → 2. Configure — cp .env.example, set secrets (~1 min) → 3. Launch — docker compose up -d (~60s) → 4. Set up the bot — admin panel config + prompt (~2 min) → 5. Embed — paste the widget <script> tag (~30s). Live! Total: under 5 minutes.]
The full deployment sequence from a fresh VPS to a live chatbot widget. Steps 1–3 are terminal commands; steps 4–5 are browser-based.

Step 3: Access Admin Panel and Create Your First Bot

Open http://your-server-ip:4173 (or your configured domain) and log in with the admin credentials from .env.

The admin panel has 10 sections. For your first bot, you need three:

Create a Bot

Go to Bots → New Bot. Give it a name, set the welcome message users will see when the widget loads, and optionally add suggested questions to help users get started. Each bot is fully isolated — separate knowledge base, separate chat history, separate leads, separate configuration.

Configure AI Provider

Go to AI Provider for your bot. Select your provider (OpenAI, Anthropic, Google Gemini, or a custom OpenAI-compatible endpoint), enter your API key (stored AES-256 encrypted), and pick a model. GPT-4o-mini is the default — fast, cost-efficient, and solid for most support and FAQ workloads. Need stronger reasoning? Switch to gpt-4o or claude-3-5-sonnet. You can even run multiple LLM providers simultaneously across different bots for cost optimization and failover resilience.

Write the System Prompt

The system prompt is where the bot's personality and scope live. Keep it focused:

You are a helpful support assistant for [Company Name].
You answer questions about our products, pricing, and features.
If you don't know the answer, say so honestly and offer to connect
the user with a human agent. Be concise and friendly.

Save. Your bot is live — test it from the Dashboard by opening a chat preview.

Step 4: Add Knowledge Base (RAG)

A chatbot limited to its system prompt gives generic answers. The knowledge base feature lets you upload documentation, help articles, product pages, and FAQs — the bot retrieves relevant chunks at query time and answers from your actual content.

This is RAG (Retrieval-Augmented Generation): what separates a useful chatbot from a hallucination machine.

[Diagram: RAG pipeline, two phases. Indexing (upload time): document (PDF/HTML/DOCX) → chunking (512-token chunks, 50-token overlap) → embedding (text-embedding-3-small) → pgvector store in PostgreSQL. Query (runtime, per message): user query → embed with the same model → cosine similarity search for the top-K=3 chunks → LLM context (system prompt + top-3 chunks + user message) → grounded answer. Embeddings are generated once at upload; retrieval adds ~50–100ms per query. Accuracy depends on chunk quality, not the LLM alone.]
The RAG pipeline has two phases: indexing at upload time (top row) and retrieval at runtime (bottom row). The same embedding model must be used in both phases for cosine similarity to work correctly.

How It Works

When you upload a document or crawl a URL, the server chunks the content into 512-token segments with 50-token overlaps, then generates a vector embedding for each chunk using OpenAI's text-embedding-3-small model (or Gemini embeddings if you prefer). These embeddings are stored in PostgreSQL via the pgvector extension.

At query time, the user's message is embedded using the same model, and the top-K most similar chunks are retrieved via cosine similarity search (default K=3). Those chunks are injected into the LLM's context window alongside the system prompt, giving the model accurate, grounded information to work with.
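Under the hood, that retrieval step is a single SQL query against pgvector. The sketch below shows its general shape — the table and column names are assumptions for illustration (the repository's schema may differ), but `<=>` is pgvector's actual cosine distance operator:

```sql
-- Illustrative top-K retrieval query (table/column names hypothetical).
-- <=> is pgvector's cosine distance; 1 - distance gives cosine similarity.
SELECT content,
       1 - (embedding <=> :query_embedding) AS similarity
FROM knowledge_chunks
WHERE bot_id = :bot_id
ORDER BY embedding <=> :query_embedding   -- nearest chunks first
LIMIT 3;                                  -- default top-K = 3
```

Because this runs inside the same PostgreSQL instance as the rest of your data, there's no separate vector database to operate or keep in sync.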

Uploading Content

Navigate to Knowledge Base for your bot. You can:

  • Upload files: PDF, DOCX, and HTML files are all supported. Upload your product manual, pricing guide, or FAQ document directly.
  • Crawl a URL: Enter any URL and the crawler will recursively follow same-domain links, extracting and indexing text content. Useful for indexing your entire help center or documentation site.

After upload, the indexing happens asynchronously in the background. A progress indicator shows you when embeddings are ready. For a 50-page PDF, expect 30–60 seconds.

Test it by asking the bot something specific from your uploaded content. If it answers accurately with source context, RAG is working correctly.

Step 5: Embed the Widget

Your bot is running — now put it on your website. Go to Settings → Widget for your bot. You'll find a snippet like this:

<script
  src="https://your-domain.com/widget.js"
  data-bot-id="your-bot-uuid"
  data-position="bottom-right"
  data-theme="light"
  data-primary-color="#6366f1"
></script>

Paste this before the closing </body> tag. The widget is vanilla JavaScript, ~22KB gzipped, and loads asynchronously — it won't block page rendering.

Customization options you can set via data- attributes or the admin panel UI:

  • Position: bottom-right, bottom-left
  • Theme: light, dark, auto (follows system preference)
  • Primary color: Any hex value to match your brand
  • Welcome message: Displayed when the widget first opens
  • Suggested questions: Quick-tap prompts that appear on first load
  • Lead capture form: Pre-chat form to collect name/email, or mid-chat AI extraction from conversation

Because the widget loads from your own domain, there are no third-party scripts, no cross-origin tracking, and no GDPR compliance issues from a vendor's analytics pipeline running in your users' browsers.

Multi-LLM Support: Switch Between AI Providers

One of AI Chat Agent's most practical advantages over simpler self-hosted solutions is per-bot provider configuration. You're not locked into one LLM for everything.

From the AI Provider page of any bot, you can configure:

  • OpenAI: GPT-4o-mini (default, cost-efficient), GPT-4o (higher reasoning), or any model on the API
  • Anthropic Claude: Claude 3.5 Sonnet, Claude 3 Haiku — excellent for nuanced, context-heavy conversations
  • Google Gemini: Gemini 1.5 Flash (fast), Gemini 1.5 Pro (longer context window, useful for large document RAG)
  • Custom OpenAI-compatible endpoint: Point to any local Ollama instance, a self-hosted LLM, or a third-party provider that speaks the OpenAI API format

This means you can run a cost-sensitive FAQ bot on GPT-4o-mini, a high-stakes sales qualification bot on Claude 3.5 Sonnet, and an internal knowledge base bot on a local Ollama instance — all from the same Docker stack, all managed from the same admin panel.

API keys are stored encrypted (AES-256) in the database. Switching providers is a UI operation — no config file edits, no container restarts.
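If you want to try the Ollama route, it helps to know that Ollama exposes an OpenAI-compatible API under `/v1`, so it slots into the "custom OpenAI-compatible endpoint" option. A quick sanity check from the host (model name is whatever you've pulled locally; from inside a container on Linux, reaching the host via `host.docker.internal` requires an `extra_hosts: host-gateway` entry):

```shell
# Verify a local Ollama instance answers OpenAI-format chat requests.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "ping"}]}'
```

If this returns a chat completion, point the bot's custom endpoint at that base URL and you're running inference with zero per-token cost.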

Production Hardening

Getting the stack running is step one. Running it reliably under real traffic is the actual job. Address these before you go live.

SSL/TLS

If you set DOMAIN in your .env, Nginx is configured to request a Let's Encrypt certificate via Certbot. Run:

docker compose exec nginx certbot --nginx -d your-domain.com -d admin.your-domain.com

Auto-renewal is handled by a cron job inside the Nginx container. Check it's working: docker compose exec nginx certbot renew --dry-run.

Rate Limiting (Already Built In)

The server enforces two rate limits out of the box: 20 chat messages per minute per session, and 100 API requests per minute per IP. These live in Redis and restart cleanly if the container does. Adjust the limits in the admin panel under Settings → Rate Limiting if your use case needs different thresholds.

Security Headers

Helmet.js is configured on the Express server, handling X-Content-Type-Options, X-Frame-Options, Strict-Transport-Security, and CSP headers automatically. SSRF protection is built into the URL crawler to prevent misuse of that endpoint.

Backups

The database is your most critical asset. Set up a daily pg_dump and ship it somewhere off-box:

docker compose exec db pg_dump -U chatbot ai_chat_agent | gzip > backup-$(date +%Y%m%d).sql.gz

Automate this with a cron job on the host. For a busy production system, consider streaming WAL to an S3-compatible bucket with pgBackRest or Barman.
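A cron entry that runs the dump nightly and prunes old archives might look like this — the project path, backup directory, and 14-day retention are assumptions to adapt; note that `%` must be escaped as `\%` inside a crontab line:

```shell
# Example /etc/cron.d/chatbot-backup — runs at 03:00 daily, keeps 14 days.
0 3 * * * root cd /opt/ai-chat-agent && docker compose exec -T db pg_dump -U chatbot ai_chat_agent | gzip > /var/backups/chatbot-$(date +\%Y\%m\%d).sql.gz && find /var/backups -name 'chatbot-*.sql.gz' -mtime +14 -delete
```

The `-T` flag matters: without it, `docker compose exec` allocates a TTY and the pipe to gzip fails under cron.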

Monitoring

At minimum, set up a free uptime monitor (UptimeRobot, BetterStack) pointed at your widget endpoint. For container-level metrics, docker stats gives you a live view. For persistent metrics, add Prometheus + Grafana as optional containers — the compose file has commented-out examples you can enable.

Cost Comparison: Self-Hosted vs SaaS

The €79 one-time license changes the economics entirely. See our self-hosted vs SaaS chatbot cost breakdown for the full analysis — here's the summary:

[Chart: 5-year total cost of ownership, self-hosted vs SaaS chatbot platforms (EUR) — AI Chat Agent self-hosted ~€1,589 (best value) vs Tidio ~€1,740 (+€151), Intercom ~€4,740 (+€3,151), Drift ~€6,000 (+€4,411). Self-hosted assumes the €79 one-time license plus ~€5/mo VPS and ~€20/mo LLM API (midpoint); SaaS figures use base plan rates with no seat overages. Year 1 for AI Chat Agent includes the €79 license; year 2+ is VPS + LLM only. SaaS platforms impose seat limits, conversation caps, and branding restrictions not reflected in base price.]
Five-year TCO at median usage. The self-hosted stack's cost advantage widens every year after year one — SaaS pricing scales with your usage, self-hosting does not.
| Cost Item | AI Chat Agent (Self-Hosted) | Intercom | Drift | Tidio |
|---|---|---|---|---|
| Software license | €79 one-time | €79/month | €100/month | €29/month |
| Hosting (VPS) | ~€5/month | Included | Included | Included |
| LLM API (moderate use) | ~€10–30/month | Included (limited) | Included (limited) | Add-on cost |
| Year 1 total | ~€259–€499 | €948+ | €1,200+ | €348+ |
| Years 2–5 (annual) | ~€180–€420 | €948+ | €1,200+ | €348+ |
| 5-year total | ~€999–€2,179 | €4,740+ | €6,000+ | €1,740+ |

The SaaS platforms also impose seat limits, conversation limits, and branding restrictions at their base tiers. With the self-hosted stack, you own the infrastructure and there are no per-seat or per-conversation fees. Scale to a thousand simultaneous users and your costs increase by the marginal LLM API usage — not by a pricing tier upgrade.

For a deeper look at how the numbers play out for different business sizes, see the customer support cost breakdown we published alongside this guide.

What Your Self-Hosted AI Chatbot Includes

Here's what you now have running — a production-grade docker compose AI chatbot with:

  • A secure REST API with JWT authentication and AES-256 encrypted credentials
  • A React admin panel where you create bots, upload documents, view chat history, and manage leads
  • A PostgreSQL database with pgvector for semantic search over your knowledge base
  • Redis for session management and rate limiting under traffic spikes
  • Nginx with SSL termination and security headers
  • An embeddable widget approximately 22KB gzipped that you can put on any website with two lines of HTML
  • Support for OpenAI, Claude, Gemini, and any OpenAI-compatible model — configurable per bot, no code changes required
  • Built-in operator takeover so a human agent can jump into any conversation in real time
  • Lead capture and notifications via email, Telegram, or webhooks

That's the same infrastructure larger teams pay hundreds per month to access through SaaS platforms — running on hardware you control, with data that never leaves your server.

The customer service automation tools landscape in 2026 is dominated by vendors charging for features that should be table stakes. A Docker stack like this puts those features within reach of any developer or technical founder — five minutes of setup instead of a SaaS onboarding funnel.

Frequently Asked Questions About Docker AI Chatbot Deployment

How much RAM does a self-hosted AI chatbot Docker stack need?

The full five-container stack (Nginx, Node.js API, React admin, PostgreSQL with pgvector, and Redis) idles at around 800MB to 1GB of RAM. A VPS with 2GB RAM gives you enough headroom for moderate traffic. For high-concurrency use cases with 50+ simultaneous chat sessions, 4GB is recommended.

Can I deploy an AI chatbot with Docker Compose on any VPS provider?

Yes. Any VPS or cloud provider that runs Linux with Docker Engine 24+ works. Hetzner, DigitalOcean, Vultr, Linode, AWS EC2, and Google Compute Engine are all tested and compatible. The docker-compose.yml is provider-agnostic — you can migrate between hosts by copying the compose file and your .env configuration.

Do I need to know Docker to deploy this chatbot?

Basic terminal familiarity is enough. The deployment is three commands: clone the repository, copy the environment file, and run docker compose up -d. The compose file handles container orchestration, dependency ordering, and health checks automatically. You do not need to write Dockerfiles or understand container networking.

Which AI models does this self-hosted chatbot support?

AI Chat Agent supports OpenAI (GPT-4o, GPT-4o-mini), Anthropic Claude (3.5 Sonnet, Haiku), Google Gemini (1.5 Flash, 1.5 Pro), and any OpenAI-compatible API endpoint including local Ollama instances. Each bot can use a different provider, configured entirely from the admin panel without container restarts.

How does the RAG knowledge base work in a Docker chatbot?

When you upload documents (PDF, DOCX, or HTML) or crawl a URL, the server chunks content into 512-token segments and generates vector embeddings stored in PostgreSQL via the pgvector extension. At query time, the user's message is embedded and the top matching chunks are retrieved via cosine similarity, then injected into the LLM's context window for grounded, accurate answers.

Is a Docker-deployed chatbot GDPR compliant?

Self-hosting gives you full data sovereignty. All chat data, user information, and knowledge base content stays on your server. No data is sent to third-party SaaS platforms. The only external calls are to your chosen LLM API provider for generating responses. This architecture makes GDPR compliance significantly simpler than cloud-hosted chatbot solutions.

Try It Before You Buy — Then Own It for Life

Want to see the admin panel and widget before you commit? The live demo is at demo.getagent.chat — same stack as this guide. Create a bot, upload a document, embed a test widget, and see exactly what you're getting.

When you're ready to deploy on your own infrastructure, the AI Chat Agent license is a one-time €79. No monthly fees. No seat limits. No per-conversation charges. You get full repository access, all future updates, and the complete five-container production stack.

Get the license for €79 →

Questions about the stack, scaling, or specific integrations? The blog covers help desk software for small business, SaaS alternatives, and automation workflows. For direct support: support@getagent.chat.