I architect LLM systems that cut inference costs 12x

— and scale to 800K+ users.

David Dominguez

CTO · LLM Engineer · Ships from Zero


Experience

Closer AI · Bogotá
  • CTO
    Sept 2025 — Present

    Most AI projects fail on cost before they scale. Architected and shipped an AI sales agent from zero, reducing per-conversation cost from $0.25 to $0.02 (12x) while sustaining sub-second response times. Built a routing layer across 20+ LLM providers that dynamically balances cost and quality per conversation turn. Own hiring, technical budget, and product roadmap, and report directly to investors.

12x
Inference Cut
20+
LLM Providers
<1s
Latency
MercadoLibre · Bogotá
  • Technical Lead
    May 2024 — Sept 2025

    Led a 10-person engineering team that built and scaled the Affiliates program within MercadoLibre’s ecosystem from first user to 800K+, with zero SLA breaches across 16 months. The program grew to contribute 1% of MercadoLibre’s global revenue. Responsible for core architecture decisions, sprint delivery, and team growth.

800K+
Users Scaled
99.9%
Uptime Held
$0
SLA Penalties
Agua · Bogotá
  • CTO
    Aug 2022 — May 2024

    Designed a fully client-side architecture for a low-code platform, keeping infrastructure costs near zero at scale. Built two custom engines: a 2D rendering layer for the visual editor and a transpiler that converted user designs into production code. Owned hiring, technical budget, and product roadmap, and reported directly to investors.

~$0
Infra Cost
2
Custom Engines
100%
Client-Side
Mr. Pink · Bogotá
  • Lead Developer
    Dec 2021 — Aug 2022

    Served as the primary technical lead for the agency’s largest enterprise account, building direct trust with their CTO. Led a team of 6 engineers delivering their core platform rebuild.

  • Senior Developer
    Aug 2021 — Dec 2021

    Diagnosed a delivery bottleneck across client projects and built an automated scaffolding platform that cut project setup time in half, compressing timelines across the full portfolio.

2x
Output Velocity
6
Engineers Led
1
Enterprise Retained
Credibanco · Bogotá
  • Senior Web Developer
    Dec 2020 — Aug 2021

    Credibanco’s proprietary POS hardware required C, which limited terminal development to a small specialist team. Built a Python-to-C transpiler that opened POS development to the company’s broader Python developer base, then led the rollout across thousands of payment terminals nationwide.

10x
Dev Pool
1000s
Terminals
C→Python
Stack Opened
Other dev jobs · Bogotá
  • Developer
    Oct 2018 — Dec 2020

    Built production applications in TypeScript, React, and Node.js across 10+ client projects — from MVPs to full-stack platforms — developing the full-stack depth that informed later architecture work at scale.

10+
Projects
2+
Years
3
Core Stack

Work

[AGT-001] 2025

Closer AI

AI sales agent that closes pipeline while you sleep

Built the core intelligence layer of an AI sales agent that handles full conversations autonomously — reasoning through objections, maintaining context across sessions, and routing to the optimal model per turn. Reduced per-conversation cost from $0.25 to $0.02 (12x) while sustaining sub-second response times.

12x
Cost Cut
20+
AI Models
<1s
Response
TypeScript·Node.js·LLM Routing·Semantic Search
usecloser.ai
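
To illustrate what per-turn routing can look like, here is a minimal sketch, not the production router. It assumes each model carries an estimated cost per turn and a quality score from some offline evaluation; the `ModelOption` shape, the `routeTurn` name, and every number are hypothetical.

```typescript
// Hypothetical model descriptor; field names and values are illustrative only.
interface ModelOption {
  name: string;
  costPerTurnUsd: number; // estimated cost of one conversation turn
  quality: number; // 0..1 score from an offline evaluation (assumed to exist)
}

// Pick the cheapest model whose quality meets the turn's requirement.
// If no model qualifies, fall back to the highest-quality model available.
function routeTurn(models: ModelOption[], minQuality: number): ModelOption {
  if (models.length === 0) {
    throw new Error("no models configured");
  }
  const eligible = models.filter((m) => m.quality >= minQuality);
  if (eligible.length > 0) {
    return eligible.reduce((best, m) =>
      m.costPerTurnUsd < best.costPerTurnUsd ? m : best
    );
  }
  return models.reduce((best, m) => (m.quality > best.quality ? m : best));
}

// Example catalog with made-up entries.
const catalog: ModelOption[] = [
  { name: "small", costPerTurnUsd: 0.002, quality: 0.6 },
  { name: "medium", costPerTurnUsd: 0.01, quality: 0.8 },
  { name: "large", costPerTurnUsd: 0.05, quality: 0.95 },
];
```

A greeting might be routed with a low `minQuality` and land on the cheapest model, while an objection-handling turn would raise the threshold and pay for a stronger one.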
[LIB-001] 2025

LLM Rate Limiter

Hard spending limits across every AI provider — zero infrastructure

Built a TypeScript library that enforces hard spending and usage caps across multiple AI providers — preventing budget overruns before they happen. Supports automatic failover to backup models when limits are hit and recycles unused capacity in real time. Drop-in installation, no servers required. Currently in production at Closer AI and open-sourced.

0
Extra Infra
$0
Overages
Multi
Provider
TypeScript·Node.js·Redis·LLM APIs
github.com/daviddominguezh/llm-rate-limiter
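
The idea of a hard cap with failover and capacity recycling can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the llm-rate-limiter API: the `SpendCap` class, its `reserve` and `settle` methods, and the use of integer cents are all hypothetical.

```typescript
// Hypothetical in-memory budget tracker; NOT the llm-rate-limiter API.
// Amounts are integer cents to avoid floating-point drift.
class SpendCap {
  private readonly capsCents: Record<string, number>;
  private readonly spentCents = new Map<string, number>();

  constructor(capsCents: Record<string, number>) {
    this.capsCents = capsCents;
  }

  // Reserve the estimated cost on the first provider that still has budget.
  // Returns the chosen provider, or null if every cap would be exceeded.
  reserve(preferenceOrder: string[], estimateCents: number): string | null {
    for (const provider of preferenceOrder) {
      const cap = this.capsCents[provider];
      if (cap === undefined) continue; // unknown provider: skip it
      const used = this.spentCents.get(provider) ?? 0;
      if (used + estimateCents <= cap) {
        this.spentCents.set(provider, used + estimateCents);
        return provider;
      }
    }
    return null;
  }

  // Replace the estimate with the actual cost, releasing any unused budget.
  settle(provider: string, estimateCents: number, actualCents: number): void {
    const used = this.spentCents.get(provider) ?? 0;
    this.spentCents.set(provider, Math.max(0, used - estimateCents + actualCents));
  }
}
```

Reserving before the call and settling after it is what makes the cap hard: a request that would exceed the budget is never sent, and a request that costs less than estimated hands the difference back.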
[LIB-002] 2025

LLM Markdown WhatsApp

Ship AI on WhatsApp in one function call

Built a zero-configuration TypeScript library that converts raw AI output into properly formatted WhatsApp messages — handling lists, product cards, links, and Spanish punctuation automatically. One function call replaces days of custom formatting work per integration.

1
Function Call
0
Config
2-3d
Per Integration
TypeScript·Node.js·NLP·WhatsApp
github.com/daviddominguezh/llm-markdown-whatsapp
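
As a rough illustration of the kind of conversion involved, the sketch below maps a few Markdown constructs to WhatsApp's text formatting (`*bold*`, `~strikethrough~`). It is not the library's implementation or API: the `toWhatsApp` function is hypothetical, and it does not attempt lists, product cards, Spanish punctuation, or nested emphasis.

```typescript
// Hypothetical helper; NOT the llm-markdown-whatsapp API.
// Converts a small subset of Markdown to WhatsApp text formatting.
function toWhatsApp(markdown: string): string {
  return markdown
    // Headings have no WhatsApp equivalent, so render them as bold lines.
    .replace(/^#{1,6}\s+(.+)$/gm, "*$1*")
    // Markdown **bold** becomes WhatsApp *bold*.
    .replace(/\*\*(.+?)\*\*/g, "*$1*")
    // Markdown ~~strikethrough~~ becomes WhatsApp ~strikethrough~.
    .replace(/~~(.+?)~~/g, "~$1~")
    // Inline links are flattened to "text: url" so the URL stays tappable.
    .replace(/\[([^\]]+)\]\(([^)]+)\)/g, "$1: $2");
}

const reply = toWhatsApp(
  "## Offer\n**Price**: see [catalog](https://example.com)"
);
// reply === "*Offer*\n*Price*: see catalog: https://example.com"
```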
[LIB-003] 2025

LLM Graph Builder

Persistent, relational memory for AI agents that make real decisions

Built a TypeScript library that gives AI agents a structured knowledge graph memory — capturing how entities relate to each other across sessions, not just isolated facts. Addresses the memory problem that limits most agents to single-session tasks. Powers Closer AI’s conversation memory in production.

Typed
Schema
0
Setup
Live
Powers Closer AI
TypeScript·Node.js·Graph DB·Agents
github.com/daviddominguezh/llm-graph-builder
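
The difference between isolated facts and relational memory can be shown with a small sketch. It stores subject-predicate-object relations and restores them in a later session; the `GraphMemory` class, its method names, and the JSON serialization are hypothetical and are not the llm-graph-builder API.

```typescript
// Hypothetical relation shape; NOT the llm-graph-builder schema.
interface Relation {
  subject: string;
  predicate: string;
  object: string;
}

class GraphMemory {
  private relations: Relation[] = [];

  // Store a relation once; exact duplicates are ignored.
  remember(subject: string, predicate: string, object: string): void {
    const exists = this.relations.some(
      (r) => r.subject === subject && r.predicate === predicate && r.object === object
    );
    if (!exists) this.relations.push({ subject, predicate, object });
  }

  // Every relation in which the entity appears, as subject or object.
  about(entity: string): Relation[] {
    return this.relations.filter((r) => r.subject === entity || r.object === entity);
  }

  // Persist between sessions as JSON (a real system would likely use a graph store).
  serialize(): string {
    return JSON.stringify(this.relations);
  }

  static restore(json: string): GraphMemory {
    const memory = new GraphMemory();
    for (const r of JSON.parse(json) as Relation[]) {
      memory.remember(r.subject, r.predicate, r.object);
    }
    return memory;
  }
}
```

Asking `about("Acme")` after storing that Ana works at Acme and that Acme is evaluating a plan returns both relations, which is the cross-entity context a flat list of facts would miss.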
[PLT-001] 2023

Agua

A low-code IDE built on near-zero infrastructure spend

Architected the client-side IDE for a low-code platform from the ground up, including the 2D rendering engine and the transpiler that converted user designs into production code. The fully client-side architecture removed application servers from the design, keeping infrastructure spend near zero at scale.

~$0
Infra Cost
0
Servers
2
Custom Engines
React·TypeScript·Graphics Engine·Compiler
[SCL-001] 2024

MercadoLibre

MercadoLibre’s Affiliates program — from first user to 800K, 1% of global revenue

Led a 10-person engineering team building MercadoLibre’s Affiliates program through its full growth phase, from first user to 800K. The program grew to contribute 1% of MercadoLibre’s global revenue. Responsible for core architecture decisions under real growth pressure: observability, scaling strategy, reliability thresholds. Zero SLA breaches across 16 months of growth.

800K
Users
10
Engineers
0
SLA Breaches
Java·React·TypeScript·Docker
mercadolibre.com.mx/l/afiliados
[SYS-001] 2021

Credibanco POS

Turned a 10-person bottleneck into a platform hundreds could build on

Built a Python-to-C transpiler that opened POS terminal development from a small C-specialist team to the company’s broader developer base. Deployed across thousands of payment terminals nationwide. The transpiler removed a structural hiring and knowledge bottleneck that had been limiting the company’s hardware roadmap for years.

10x
Dev Pool
1000s
Terminals
Nationwide
Deploy
Python·C·Embedded·Jenkins
[DEV-001] 2021

K8s Scaffolding Platform

Eliminated the hidden tax every engineer paid on every new project

Identified that engineers were rebuilding identical scaffolding from scratch on every client engagement. Designed and built an automated Kubernetes-based platform that cut setup time in half and standardized delivery quality across the full engineering team.

2x
Faster Setup
0
Manual Steps
100%
Team Adoption
Kubernetes·Docker·Node.js·CI/CD

Education

Universidad de los Andes / Bogotá
Computer Science & Software Engineering (B.S.)
Jan 2018 — Dec 2022
QS Top 200 Worldwide · #1 in Colombia · Top 10 Latin America · ABET Accredited

My Philosophy

01

Adoption is an engineering problem.

The hardest tools to adopt are the ones that ignore how people already work. I build to fit existing workflows, existing habits, existing mental models — because the hidden cost of software is never the license. It’s the change management, the resistance, the workarounds your team builds around tools they were told to use.

02

Speed is a strategy. Waiting is a decision.

Every week a tool isn’t in production is a week your team is working around the problem instead of past it. I optimize for sustainable velocity — scoped releases, real users, real feedback — without sacrificing reliability. The compounding cost of delay is invisible until it isn’t. But so is the cost of shipping something that breaks trust.

03

Privacy is a design constraint, not an afterthought.

I treat data privacy as an architectural decision, not a compliance checkbox. At Agua, the fully client-side architecture meant zero data egress — both a product requirement and a privacy guarantee. At Closer AI, I face the inverse problem: third-party LLM calls are inherently data-leaving events. Privacy there means governing what leaves, where it goes, and how long it’s retained. The approach changes with the problem. The discipline doesn’t.


Stack

TypeScript
React
Node.js
Jest
CI/CD
MongoDB
Docker
AWS
GCP
MySQL
Python
AI Agents
Kubernetes
React Native
Java
LLM Routing

Contact

David Dominguez

I take one engagement at a time and work with teams where engineering decisions directly impact business outcomes. If your next 90 days depend on serious engineering judgment, this is a conversation worth having.

l.david.dominguez.12@gmail.com