If you’re in B2B SaaS and you’ve spent any time thinking about pipeline, you’ve run into the pitch: put a bot on your site, qualify visitors automatically, route hot leads to sales. That’s the promise of a conversational marketing platform — and it’s a genuinely useful category. The question is why you’d pay $2,000 to $3,000 a month for the infrastructure to do it, when the underlying components are now commodity. AI Chat Agent is one answer to that question: a self-hosted alternative that delivers the same MQL-qualification loop without the SaaS subscription. But before we get to the product, let’s make sure we’re talking about the same problem.

What Counts as a “Conversational Marketing Platform” in 2026

The term gets stretched. Support chat, live chat, customer service bots — all of these involve conversations, but none of them are conversational marketing. The distinction matters: a support tool handles post-sale issues. A conversational marketing platform operates at the MQL stage — it intercepts anonymous or identified visitors during their research phase, qualifies them through dialogue, and hands off lead data to sales or CRM.

Drift built this category, positioning squarely against form-gating: instead of making a prospect fill out a “request a demo” form and wait 48 hours for a response, you engage them the moment they land on your pricing page. Qualified took a similar approach with a heavier Salesforce integration angle. Intercom expanded from support into marketing, layering “Fin AI” on top of its existing customer communication suite.

What they all share: a website widget that can hold a conversation, a mechanism for capturing contact information, and some kind of routing or notification to the sales team. That’s the core. Everything else — firmographic enrichment, Salesforce sync, account-based routing, playbooks — is layered on top, and it’s largely what you’re paying for when you buy the enterprise tier.

For B2B SaaS companies in the sub-$10M ARR range, most of that enterprise layer is excess. You need the core: qualify the visitor, capture their contact, alert your team. A well-configured conversational marketing chatbot does that without the full enterprise stack.

The Cost Problem: Why SaaS Conversational Marketing Pricing Stopped Making Sense

Let’s be direct about the numbers. Since the Salesloft acquisition, Drift’s entry-level plans have moved upmarket: you’re looking at $2,000 to $3,000 per month at the tiers where you get meaningful AI capabilities and CRM integrations. Qualified is in a similar bracket. Intercom’s Fin AI comes in lower on paper ($39–$139 per seat per month), but seat-count economics push mid-size teams into four-digit monthly bills quickly.

What are you actually paying for? Break it down. Start with the LLM calls themselves: at current API rates, a busy conversational bot running several thousand messages a month costs a few dollars in raw inference. The widget UI, the session management, the lead storage: these are solved engineering problems. The real cost in enterprise conversational marketing software is the go-to-market wrapper: account management, uptime SLAs, out-of-the-box integrations with 50+ tools, enterprise SSO, and the brand name that makes it easier to get budget approved.

None of that is illegitimate. But if you have an engineering team, or you’re comfortable with Docker, you can replicate the core loop at a fraction of the cost. The LLM API is the same one they’re calling. The widget is JavaScript. The database is PostgreSQL. The question is whether you want to operate it. For many teams, the answer is yes — and the tradeoff is worth understanding clearly before committing either direction.

What a Self-Hosted Conversational Marketing Platform Looks Like

The architecture is straightforward. A JavaScript widget (Shadow DOM, ≈25.8 KB gzipped) loads on your site. A visitor opens it and types a message. The widget POSTs to your API server, which constructs a prompt from system instructions, knowledge base context retrieved via RAG, visitor identity if you’ve injected it, and conversation history, then calls your configured LLM provider. The response streams back, gets stored in PostgreSQL, and the widget renders it.
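To make the prompt-construction step concrete, here is a minimal sketch of how such a message list might be assembled. The function name `buildMessages`, the input field names, and the `<knowledge_base>` / `<visitor>` block labels are illustrative assumptions, not AI Chat Agent’s actual internals.

```javascript
// Hypothetical sketch: assemble a chat-completion style message list from
// system instructions, retrieved KB chunks, optional visitor identity, and history.
function buildMessages({ systemPrompt, kbChunks = [], visitor = null, history = [], userMessage }) {
  const sections = [systemPrompt.trim()];

  if (kbChunks.length > 0) {
    // Number the chunks so the model can refer back to its sources.
    const kb = kbChunks.map((chunk, i) => `[${i + 1}] ${chunk.text}`).join("\n");
    sections.push(`<knowledge_base>\n${kb}\n</knowledge_base>`);
  }

  if (visitor) {
    // Only include identity fields that are actually present.
    const lines = Object.entries(visitor)
      .filter(([, value]) => typeof value === "string" && value.length > 0)
      .map(([key, value]) => `${key}: ${value}`);
    if (lines.length > 0) sections.push(`<visitor>\n${lines.join("\n")}\n</visitor>`);
  }

  return [
    { role: "system", content: sections.join("\n\n") },
    ...history, // prior turns, already in { role, content } form
    { role: "user", content: userMessage },
  ];
}
```

Keeping the knowledge base and visitor data in clearly delimited blocks makes it easier to tell the model which parts are instructions and which parts are untrusted context.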

When a lead capture trigger fires (pre-chat gate, mid-chat form, or AI-extracted contact info), the server writes a lead record and fires your configured notification channels: SMTP email, Telegram bot message, or a webhook POST to your CRM.

[Figure: architecture diagram. Website visitor (anonymous or known) → widget JS (Shadow DOM) → POST /messages → API server (prompt builder, session manager, lead writer), which connects to the LLM provider, RAG on pgvector, lead capture, and notification channels (email, Telegram, webhook → CRM).]
Self-hosted conversational marketing stack — five containers, your LLM key, your data

The Docker Compose stack for AI Chat Agent looks like this:

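# Five-container stack: database, cache, API server, admin UI, reverse proxy
services:
  db:
    image: pgvector/pgvector:pg16   # PostgreSQL 16 with the pgvector extension
    volumes: [postgres_data:/var/lib/postgresql/data]
  redis:
    image: redis:7-alpine
  server:
    image: getagent/server:latest
    ports: ["3000:3000"]
    depends_on: [db, redis]
  admin:
    image: getagent/admin:latest
    ports: ["4173:4173"]
  nginx:
    image: nginx:alpine
    ports: ["80:80", "443:443"]

# Named volumes must be declared at the top level, or Compose rejects the file
volumes:
  postgres_data: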

You bring your own LLM API key. You own the data. The trade-off is real: you’re the one who restarts the container when Redis runs out of memory at 2 AM. If you have no infra comfort at all, that’s a meaningful cost. If you do — or if you have a $5/month VPS and are willing to spend an afternoon setting this up — the economics shift dramatically. For a detailed deployment walkthrough, this guide covers the Docker setup end to end.

Visitor Identity Passthrough: The Lead-Gen Primitive Drift Built Its Brand On

One of Drift’s core value propositions was recognizing returning visitors and skipping the “who are you?” dance for known contacts. The mechanism is simple in principle: if you know who’s on your site (logged-in users, people who clicked an email link), pass that context to the chat widget so it can pre-fill lead data and personalize the opening.

AI Chat Agent exposes this via a client-side identity API. Before the widget loads, you inject a window.aiChatAgent.user object:

// In your app's authenticated layout
window.aiChatAgent = {
  user: {
    name: currentUser.fullName,       // string
    email: currentUser.email,         // string
    phone: currentUser.phone || null, // string | null
    consentGivenAt: currentUser.gdprConsentAt // ISO 8601 timestamp
  }
};

// Then load the widget
const script = document.createElement('script');
script.src = 'https://your-domain.com/widget.js';
script.dataset.bot = 'your-bot-id';
document.head.appendChild(script);

The server sanitizes these fields, stores them on the ChatSession record (visitorName, visitorEmail, visitorPhone), and injects them into the system prompt as a fenced context block — so your bot greets the visitor by name and already knows their context. More importantly, if all required lead fields are satisfied and consent is present, the lead capture form is skipped entirely. The lead record is created from the identity data directly.
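A rough sketch of what that server-side handling could look like follows. The length cap, the loose email check, and the function names `sanitizeIdentity` and `canSkipLeadForm` are assumptions made for illustration; the source only states that the fields are sanitized and that the form is skipped when required fields and consent are present.

```javascript
// Hypothetical server-side handling of window.aiChatAgent.user.
const MAX_LEN = 200; // assumed cap; the real limit is not documented here

function cleanString(value) {
  if (typeof value !== "string") return null;
  // Replace control characters (including newlines), trim, then cap the length.
  const cleaned = value.replace(/[\u0000-\u001f\u007f]/g, " ").trim().slice(0, MAX_LEN);
  return cleaned.length > 0 ? cleaned : null;
}

function sanitizeIdentity(user = {}) {
  const email = cleanString(user.email);
  const consent = cleanString(user.consentGivenAt);
  return {
    visitorName: cleanString(user.name),
    // Very loose shape check; real validation would be stricter.
    visitorEmail: email && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email) ? email : null,
    visitorPhone: cleanString(user.phone),
    // Keep the timestamp only if it parses as a date.
    consentGivenAt: consent && !Number.isNaN(Date.parse(consent)) ? consent : null,
  };
}

// requiredFields uses the session field names, e.g. ["visitorName", "visitorEmail"].
function canSkipLeadForm(identity, requiredFields) {
  const hasAllFields = requiredFields.every((field) => identity[field] != null);
  return hasAllFields && identity.consentGivenAt != null;
}
```

Stripping control characters matters here because these values end up inside the system prompt, where a stray newline could otherwise break out of the context block.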

For a SaaS product where you want to engage churned users, trial-tier visitors checking out pricing, or enterprise prospects who’ve logged into a sandbox — this passthrough eliminates friction at exactly the moment you want to capture intent.

[Figure: identity passthrough flow. Logged-in visitor → window.aiChatAgent.user → ChatSession (visitorName, visitorEmail, visitorPhone, consentGivenAt) → context block in the system prompt → lead created from identity data with the capture form skipped → notification and CRM webhook.]
How window.aiChatAgent.user pre-fills the lead form and personalises the system prompt

UTM Passthrough and Campaign Attribution Done Right

If you’re running paid campaigns — LinkedIn Sponsored Content, Google Search, newsletter sponsorships — you need attribution to close the loop. The standard question: “Which channel is generating leads that actually convert?” Without UTM data attached to your leads, you’re guessing.

AI Chat Agent captures UTM parameters automatically on widget load from window.location.search. The five standard parameters (source, medium, campaign, term, content) are extracted, truncated to 200 characters each, and stored in a ChatSession.utm JSON field. That’s the canonical store — they’re never duplicated into other tables, just referenced via the session relation when a lead is created.
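The extraction step is small enough to sketch in full. This version uses the standard `URLSearchParams` API; the function name `extractUtm` and the choice to store absent parameters as `null` are assumptions, though the latter matches the webhook payload shown later in this section.

```javascript
// Sketch of UTM extraction from a query string such as window.location.search.
const UTM_KEYS = ["source", "medium", "campaign", "term", "content"];
const UTM_MAX_LENGTH = 200;

function extractUtm(search) {
  const params = new URLSearchParams(search); // a leading "?" is ignored
  const utm = {};
  for (const key of UTM_KEYS) {
    const raw = params.get(`utm_${key}`); // null when the parameter is absent
    utm[key] = raw === null || raw === "" ? null : raw.slice(0, UTM_MAX_LENGTH);
  }
  return utm;
}
```

In the browser this would be called once on widget load as `extractUtm(window.location.search)` and the result sent along when the session is created.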

The attribution chain looks like this: a visitor clicks your LinkedIn ad (utm_source=linkedin, utm_medium=paid, utm_campaign=q2-enterprise). They land on your pricing page. The widget loads and captures the UTMs. They start a conversation. When they submit the lead form or the AI extracts their contact info, the lead record is created with the session attached. Your webhook payload to the CRM includes the full UTM object:

{
  "lead": {
    "name": "Alex Chen",
    "email": "alex@company.com",
    "capturedAt": "2026-06-12T10:23:41Z"
  },
  "session": {
    "utm": {
      "source": "linkedin",
      "medium": "paid",
      "campaign": "q2-enterprise",
      "term": null,
      "content": "cta-variant-b"
    }
  }
}

That’s the data your sales team and your marketing analyst both need. The UTMs survive through the full session — not just the first pageview — so even if a visitor wanders across multiple pages before chatting, attribution stays intact.
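On the receiving side, most CRMs want flat properties rather than nested JSON. The sketch below maps the payload shape shown above to a flat object; the output property names (`utm_source`, `captured_at`, and so on) are hypothetical and would need to match whatever custom fields exist in your CRM.

```javascript
// Hypothetical mapping from the webhook payload to flat CRM properties.
function toCrmProperties(payload) {
  const { lead = {}, session = {} } = payload;
  const utm = session.utm ?? {};
  const props = {
    name: lead.name ?? null,
    email: lead.email ?? null,
    captured_at: lead.capturedAt ?? null,
  };
  for (const key of ["source", "medium", "campaign", "term", "content"]) {
    props[`utm_${key}`] = utm[key] ?? null;
  }
  return props;
}
```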

[Figure: UTM attribution flow. LinkedIn ad click → pricing page, widget loads and captures UTMs → ChatSession.utm (canonical store) → lead created with session linked → webhook to CRM with UTMs in the payload.]
UTM parameters captured on widget load survive to your CRM webhook payload

Lead Capture: Pre-Chat, Mid-Chat, or Auto-Detect

How and when you gate on contact information is a UX decision with real conversion implications. A pre-chat form — name and email before the conversation starts — sets expectations clearly but creates friction. Some visitors bounce. Mid-chat forms, triggered after N messages or when the bot detects buying intent, can feel more natural but require tuning. AI auto-extraction, where the bot watches for and pulls contact details from the conversation itself, works well when your use case allows for a more open-ended dialogue.

AI Chat Agent supports all three modes, configurable per bot:

  • Pre-chat gate: widget renders a form (name, email, phone — any configurable subset) before the conversation starts. Clean, predictable, sets the right expectation for sales-focused bots.
  • Mid-chat trigger: bot converses freely, then presents the form after a configured number of exchanges or when triggered by a bot message. Good for discovery-first flows.
  • AI auto-extract: the system prompt instructs the LLM to emit [LEAD:{"email":"…","name":"…"}] markers when it determines the visitor has shared contact details naturally in conversation. The server parses these markers, strips them from the displayed message, and creates the lead record silently.
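For the auto-extract mode, the marker parsing could look roughly like the sketch below. It assumes the marker carries a flat JSON object with no nested braces, and it drops malformed markers from the visible text without creating a lead; both choices are assumptions rather than documented behavior.

```javascript
// Sketch: pull [LEAD:{...}] markers out of a model reply.
// The regex assumes the JSON object is flat (no nested braces).
const LEAD_MARKER = /\[LEAD:(\{[^{}]*\})\]/g;

function extractLeadMarkers(reply) {
  const leads = [];
  const withoutMarkers = reply.replace(LEAD_MARKER, (match, json) => {
    try {
      const parsed = JSON.parse(json);
      if (parsed && typeof parsed === "object") leads.push(parsed);
    } catch {
      // Malformed marker: still hidden from the visitor, but no lead is created.
    }
    return "";
  });
  // Collapse the double space left behind where a marker was removed.
  const visibleText = withoutMarkers.replace(/[ \t]{2,}/g, " ").trim();
  return { visibleText, leads };
}
```

Because the model’s output is untrusted, anything parsed out of a marker should still go through the same validation as form input before it becomes a lead record.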

Once a lead is captured, notifications fire on all configured channels simultaneously. SMTP sends a formatted email to your sales inbox. A Telegram bot drops a message to your team channel with a deep link to the session. A webhook POST hits your CRM endpoint: HubSpot, Pipedrive, whatever accepts an HTTP POST. These channels are configured per bot, so your enterprise-prospect bot can notify a different Telegram chat or webhook endpoint than your SMB trial bot.
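One way to fire several channels at once without letting a failing channel block the others is `Promise.allSettled`. In this sketch the channel objects and their `send()` functions are placeholders standing in for real SMTP, Telegram, and webhook clients; the shape of the returned report is likewise an assumption.

```javascript
// Sketch of a notification fan-out. Each channel is { name, send(lead) }.
async function notifyAll(lead, channels) {
  // Wrapping send() in an async arrow turns synchronous throws into rejections too.
  const results = await Promise.allSettled(channels.map(async (channel) => channel.send(lead)));
  return results.map((result, i) => ({
    channel: channels[i].name,
    ok: result.status === "fulfilled",
    error: result.status === "rejected" ? String(result.reason?.message ?? result.reason) : null,
  }));
}
```

A report like this can be logged per lead, so a broken webhook shows up as a failed delivery rather than a silently missing CRM record.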

For teams thinking about ecommerce-style conversational flows rather than B2B lead-gen, the conversational commerce platform post covers that angle specifically — the use cases diverge significantly from what we’re describing here.

[Figure: three lead capture modes compared. Pre-chat gate: form before the first message, high friction, predictable conversion. Mid-chat trigger: form after N messages or on an intent signal, medium friction, tunable conversion. AI auto-extract: LLM emits a marker when contact details are shared naturally, lowest friction, variable conversion.]
Three lead capture triggers — pick the right friction profile per bot

Operator Live Reply: Human Takeover Without Losing the AI Context

High-intent enterprise leads are exactly where you want a human in the loop. The problem with most chatbot handoff implementations: the visitor knows they’ve been transferred, the context resets, and the momentum of the conversation breaks. If someone on your pricing page is asking pointed questions about your enterprise tier, you don’t want them to get a “transferring you to an agent” message and then wait in a queue.

The operator takeover model in AI Chat Agent preserves the conversation context and keeps the handoff invisible to the visitor. An admin watching the session list sees a live conversation flagged as high-intent (or spots it manually). They call POST /operator/takeover/:sessionId — this flips the session status from BOT to OPERATOR with an optimistic lock to prevent race conditions. Then POST /operator/reply/:sessionId lets them type a response. The visitor receives it as a bot message — there’s no visual indicator of the handoff.

The admin UI shows the full conversation history, the visitor’s captured identity data, their UTM attribution, and the session duration. The operator reads the context, picks up where the AI left off, and closes the conversation themselves or hands back to the bot. If the operator walks away without releasing, a 2-hour auto-release cron returns the session to bot mode — so you don’t end up with abandoned sessions stuck in operator limbo.
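The takeover and auto-release logic can be illustrated with an in-memory sketch. The `version` field, the function names, and the choice to measure the two hours from the moment of takeover are assumptions; a real server would perform the same compare-and-set in a single database UPDATE.

```javascript
// In-memory sketch of the BOT -> OPERATOR transition with a version check.
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

function takeOver(session, expectedVersion, operatorId, now = Date.now()) {
  // Optimistic lock: fail if status or version changed since the admin UI read it.
  if (session.status !== "BOT" || session.version !== expectedVersion) {
    return { ok: false, session };
  }
  return {
    ok: true,
    session: {
      ...session,
      status: "OPERATOR",
      operatorId,
      takenOverAt: now,
      version: session.version + 1,
    },
  };
}

// What a periodic job might do: hand stale operator sessions back to the bot.
function autoRelease(sessions, now = Date.now()) {
  return sessions.map((s) =>
    s.status === "OPERATOR" && now - s.takenOverAt >= TWO_HOURS_MS
      ? { ...s, status: "BOT", operatorId: null, takenOverAt: null, version: s.version + 1 }
      : s
  );
}
```

If two admins click "take over" on the same session, only the first request sees the expected version; the second gets `ok: false` and can refresh instead of silently overwriting the first operator.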

The use case is specific but valuable: an enterprise prospect asking about custom deployment, data residency, or contract terms. These are questions where a trained human closes significantly better than any bot. The takeover mechanism gives you that without breaking the experience.

Multi-LLM: Switch Providers Without Migrating Data

One of the practical risks of SaaS conversational marketing software is LLM vendor lock-in. The platform picks the model, you get whatever they’ve configured, and when OpenAI releases GPT-5 or Anthropic releases a faster Claude model at half the cost, you wait for the vendor to support it. Meanwhile your inference costs, which they’ve baked into their pricing, don’t change.

AI Chat Agent supports five provider families: OpenAI, Anthropic Claude, Google Gemini, OpenRouter (which surfaces 100+ models through a single API), and any OpenAI-compatible endpoint — Groq, Ollama, a self-hosted Llama instance, whatever. Configuration is per-bot: your lead qualification bot might run on GPT-4o for quality, while your FAQ bot runs on a cheaper fast model to keep costs down.

Switching providers is an API key change in the admin panel. The only caveat: if you switch embedding models, your RAG index needs to be re-embedded. That is mandatory when the vector dimensions change, and advisable even when they match, because vectors produced by different models are not comparable. It runs as a background job, not a migration. Your lead data, session history, and configuration stay untouched. Multi-LLM setup and model selection strategy is covered in detail separately if you want to dig into the provider tradeoffs.

This matters for cost control. At current rates, cheap-but-capable models run around $0.001–$0.005 per 1K tokens. A bot handling a few thousand messages a month in a realistic B2B context is a few dollars in inference, not hundreds. When that cost is yours to manage rather than bundled into a platform fee, the incentive to optimize is real and the savings are yours.
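The arithmetic behind that claim is easy to check. In the sketch below, the message volume and the average of 1,000 tokens per message (prompt with retrieved context plus reply) are assumptions to replace with your own numbers; the per-token prices are the range quoted above.

```javascript
// Back-of-envelope monthly inference cost. All inputs are assumptions to adjust.
function monthlyInferenceCost({ messagesPerMonth, tokensPerMessage, pricePer1kTokens }) {
  const totalTokens = messagesPerMonth * tokensPerMessage;
  return (totalTokens / 1000) * pricePer1kTokens;
}

const low = monthlyInferenceCost({ messagesPerMonth: 3000, tokensPerMessage: 1000, pricePer1kTokens: 0.001 });
const high = monthlyInferenceCost({ messagesPerMonth: 3000, tokensPerMessage: 1000, pricePer1kTokens: 0.005 });
console.log(low.toFixed(2), high.toFixed(2)); // prints: 3.00 15.00
```

Under these assumptions, 3,000 messages a month lands between $3 and $15, which is the "few dollars" range rather than hundreds.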

RAG Grounding: The Bot Refuses When Your KB Doesn’t Cover the Question

Generic LLMs hallucinate. Ask GPT-4 about your specific product’s enterprise pricing model or your SLA terms and it will generate a plausible-sounding answer with no grounding in reality. For a conversational marketing bot on your website, that’s a liability: a prospect asking about your data residency options gets confidently wrong information and you’ve damaged trust at the worst possible moment.

The RAG implementation in AI Chat Agent uses a hybrid retrieval approach: dense vector search (pgvector cosine similarity) combined with lexical search (full-text), fused via Reciprocal Rank Fusion. Results are reranked by an LLM and expanded with neighboring chunks for context coherence. The key behavior is the similarity threshold gate: when retrieved chunks score below a configured similarity floor, the system instructs the bot to decline rather than guess.
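Reciprocal Rank Fusion and the threshold gate are both compact enough to sketch. The constant k = 60 is the value commonly used for RRF; the similarity floor of 0.75 and both function names are placeholders, since the product’s actual values are not stated here.

```javascript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank), rank starting at 1.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Threshold gate: decline unless the best dense-similarity score clears the floor.
// An empty result set counts as "no grounding" and also declines.
function shouldDecline(chunks, similarityFloor = 0.75) {
  const best = Math.max(0, ...chunks.map((chunk) => chunk.similarity));
  return best < similarityFloor;
}
```

For example, fusing a dense ranking `["a", "b", "c"]` with a lexical ranking `["b", "d", "a"]` puts `b` first, because it ranks highly in both lists, followed by `a`, `d`, and `c`.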

“I don’t have that specific information in my knowledge base. For questions about enterprise contract terms, you can reach our sales team at sales@yourcompany.com.”

That’s the correct behavior. It’s also the behavior that differentiates a properly grounded RAG bot from a generic ChatGPT wrapper. For deeper context on how RAG-grounded bots handle knowledge boundaries, this post on RAG for customer support covers the retrieval mechanics in detail.

Each response includes per-page source attribution — the bot can cite which KB document it’s drawing from, which adds credibility in a sales context. You upload markdown files through the admin panel. The system chunks, embeds, and indexes them. Updates are incremental.

The Cost Math: Self-Hosted vs SaaS Over 3 Years

Here’s the table that makes the decision concrete. These numbers assume a single-team deployment — one product, one primary use case, realistic message volumes for a B2B SaaS with moderate inbound traffic.

Cost item | Self-hosted (AI Chat Agent) | SaaS (Drift/Qualified bracket) | SaaS (Intercom Fin AI bracket)
License / subscription | €79 one-time | $2,000–3,000 / mo | $39–139 / seat / mo
Infrastructure | €60–360 / yr VPS | included | included
LLM API cost | €3–36 / yr (your key) | included (capped/throttled) | included (capped/throttled)
Year-1 total | ≈€140–475 | ≈$24,000–36,000 | ≈$6,000–18,000
Year-3 total | ≈€270–1,270 | ≈$72,000–108,000 | ≈$18,000–54,000

[Figure: bar chart of year-3 total cost for self-hosted, the Intercom Fin AI bracket, and the Drift/Qualified bracket. Axis break applied; bars not to linear scale.]
Year-3 total cost: self-hosted vs SaaS conversational marketing platforms

The break-even is immediate. At any of the SaaS price points in the table, you’re ahead in month one. The self-hosted range is wide because VPS costs vary (a $5/month shared instance vs. a dedicated €30/month server) and LLM costs depend entirely on your message volume and model choice. Even at the high end of the self-hosted range, about €1,270 over three years, you’re comparing against $18,000 or more for the SaaS alternatives, and six figures at the top of the Drift/Qualified bracket. See also the detailed Drift comparison for a feature-by-feature breakdown beyond just pricing.
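The totals can be reproduced from the line items directly. This sketch sums the license, VPS, and LLM rows for the self-hosted column and multiplies out the monthly fee for the Drift/Qualified bracket; exact sums may differ slightly from the rounded figures in the table. Currencies are left as quoted (EUR for self-hosted, USD for SaaS), and the function names are illustrative.

```javascript
// Totals from the table's line items: license + n years of VPS and LLM cost.
function selfHostedTotal(years, { license = 79, vpsPerYear, llmPerYear }) {
  return license + years * (vpsPerYear + llmPerYear);
}

function saasTotal(years, monthlyFee) {
  return years * 12 * monthlyFee;
}

const lowEnd = { vpsPerYear: 60, llmPerYear: 3 };
const highEnd = { vpsPerYear: 360, llmPerYear: 36 };

console.log(selfHostedTotal(1, lowEnd), selfHostedTotal(1, highEnd)); // 142 475
console.log(selfHostedTotal(3, lowEnd), selfHostedTotal(3, highEnd)); // 268 1267
console.log(saasTotal(3, 2000), saasTotal(3, 3000));                   // 72000 108000
```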

When Self-Hosted Doesn’t Win

Honest answer: sometimes SaaS is the right call. Here’s when self-hosted loses:

  • No Docker comfort on your team. If nobody on your team has deployed a Docker Compose stack before and you don’t want to learn, the operational burden will erode the cost advantage. SaaS wins on total effort.
  • You need a vendor SLA. If your procurement process requires a signed uptime guarantee with financial penalties, a self-hosted open-core product can’t provide that. You need a vendor with a contract.
  • You need 50+ pre-built integrations. Native Salesforce two-way sync, HubSpot workflow triggers, Marketo lead scoring integration — if you need these working out of the box without engineering effort, enterprise SaaS platforms have them. Webhook-based integration is powerful but requires your team to build the glue.
  • Enterprise SSO and SAML are requirements. Large-org procurement often requires SAML/SCIM provisioning. Self-hosted AI Chat Agent doesn’t ship that today.
  • Multi-team with permission hierarchies. The multi-bot architecture handles isolation well, but if you need role-based access across a 20-person marketing team with granular permissions, enterprise SaaS platforms are built for that. The admin interface here is designed for small teams.

If none of those apply to you — you have a small technical team, you own your infrastructure, and your integration needs are webhook-addressable — then self-hosted is the better economic choice. The sales engagement platform post is worth reading if you’re evaluating where conversational marketing fits in your broader GTM stack.

Implementation: From Zero to Lead Capture in 30 Minutes

Here’s the actual sequence, assuming you have Docker installed and an OpenAI or Anthropic API key:

  1. Pull and start the stack. Clone or extract the Docker Compose files, copy .env.example to .env, set your database password and a JWT secret, then run docker compose up -d. The admin panel comes up on port 4173.
  2. Set your LLM API key. Log into the admin panel (default credentials in your .env), navigate to Settings → AI Providers, paste your OpenAI or Anthropic key, and select your preferred model. Save.
  3. Create your first bot. Click “New Bot”, give it a name, write your system prompt (describe your product, your ICP, how you want it to qualify visitors), and configure the lead capture mode — pre-chat form for simplicity to start.
  4. Upload your knowledge base. Drag markdown files into the KB section — your product docs, FAQ, pricing page copy, whatever you want the bot to be able to answer from. The system indexes them in the background; you’ll see chunk counts update.
  5. Configure notifications. Add your email address for SMTP alerts, or paste a Telegram bot token and chat ID for instant lead pings. Test with the “Send test notification” button.
  6. Drop the widget snippet on your site. Copy the two-line embed code from the bot’s “Install” tab. Paste it before </body> on your marketing site. If you have user identity data, inject window.aiChatAgent.user before the script tag in your authenticated layout.
  7. Verify end-to-end. Open your site in an incognito window, start a conversation, submit the lead form. Check that the lead appears in the admin panel and that your notification fired.

The full setup including domain, TLS via Let’s Encrypt (Caddy can handle this automatically if you run it as the reverse proxy in place of the nginx container shown earlier), and widget embed typically takes under an hour. The operational surface is small: a five-container Compose stack on a $5–$30/month VPS is stable for the message volumes a typical B2B SaaS site generates. For GDPR compliance considerations (data retention, deletion endpoints, consent tracking), this post covers the specifics.

If you want to see the full experience before committing, the live demo at demo.getagent.chat has a working admin panel with sample bots and sessions. The license is €79 one-time — that’s the entire cost of the software, no seat fees, no monthly subscription, no usage limits imposed by the platform. You bring your own infrastructure and your own LLM key, and you own everything that follows.