I architect LLM systems that cut inference costs 12x and scale to 800K+ users.

David Dominguez

CTO · LLM Engineer · Ships from Zero

Experience

Closer AI/ Bogotá

CTO
Sept 2025 — Present
Most AI projects fail on cost before they scale. Architected and shipped an AI sales agent from zero, reducing per-conversation cost from $0.25 to $0.02 (12x) while sustaining sub-second response times. Built a routing layer across 20+ LLM providers that dynamically balances cost and quality per conversation turn. Own hiring, technical budget, and product roadmap; report directly to investors.

12x

Inference Cut

20+

LLM Providers

<1s

Latency

MercadoLibre/ Bogotá

Technical Lead
May 2024 — Sept 2025
Led a 10-person engineering team that built and scaled the Affiliates program within MercadoLibre’s ecosystem from first user to 800K+, with zero SLA breaches across 16 months. The program grew to contribute 1% of MercadoLibre’s global revenue. Responsible for core architecture decisions, sprint delivery, and team growth.

800K+

Users Scaled

99.9%

Uptime Held

0

SLA Penalties

Agua/ Bogotá

CTO
Aug 2022 — May 2024
Designed a fully client-side architecture for a low-code platform, keeping infrastructure costs near zero at scale. Built two custom engines: a 2D rendering layer for the visual editor and a transpiler that converted user designs into production code. Owned hiring, technical budget, and product roadmap; reported directly to investors.

~$0

Infra Cost

2

Custom Engines

100%

Client-Side

Mr. Pink/ Bogotá

Lead Developer
Dec 2021 — Aug 2022
Served as the primary technical lead for the agency’s largest enterprise account, building direct trust with their CTO. Led a team of 6 engineers delivering their core platform rebuild.
Senior Developer
Aug 2021 — Dec 2021
Diagnosed a delivery bottleneck across client projects and built an automated scaffolding platform that cut project setup time in half, compressing timelines across the full portfolio.

Output Velocity

6

Engineers Led

Enterprise Retained

Credibanco/ Bogotá

Senior Web Developer
Dec 2020 — Aug 2021
Credibanco’s proprietary POS hardware required C, which limited terminal development to a small specialist team. Built a Python-to-C transpiler that opened POS development to the company’s broader Python developer base, then led the rollout across thousands of payment terminals nationwide.

10x

Dev Pool

1000s

Terminals

C→Python

Stack Opened

Other dev jobs/ Bogotá

Developer
Oct 2018 — Dec 2020
Built production applications in TypeScript, React, and Node.js across 10+ client projects — from MVPs to full-stack platforms — developing the full-stack depth that informed later architecture work at scale.

10+

Projects

Years

Core Stack

Work

[AGT-001]2025

Closer AI

AI sales agent that closes pipeline while you sleep

Built the core intelligence layer of an AI sales agent that handles full conversations autonomously — reasoning through objections, maintaining context across sessions, and routing to the optimal model per turn. Reduced per-conversation cost from $0.25 to $0.02 (12x) while sustaining sub-second response times.

12x

Cost Cut

20+

AI Models

<1s

Response

TypeScript·Node.js·LLM Routing·Semantic Search

usecloser.ai
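
As a rough illustration of per-turn routing, the sketch below picks the cheapest model that clears a quality bar for the current turn. It is not Closer AI's actual router: the model names, prices, quality scores, and the `routeTurn` function are all hypothetical.

```typescript
// Hypothetical catalog: names, prices, and quality scores are illustrative only.
interface ModelOption {
  name: string;
  costPer1kTokens: number; // USD, assumed
  quality: number; // 0..1, assumed offline evaluation score
}

const catalog: ModelOption[] = [
  { name: "small-fast", costPer1kTokens: 0.0002, quality: 0.6 },
  { name: "mid-tier", costPer1kTokens: 0.002, quality: 0.8 },
  { name: "frontier", costPer1kTokens: 0.02, quality: 0.95 },
];

// Return the cheapest model whose quality meets the bar for this turn.
// If no model qualifies, fall back to the highest-quality model available.
function routeTurn(requiredQuality: number, models: ModelOption[] = catalog): ModelOption {
  if (models.length === 0) {
    throw new Error("model catalog is empty");
  }
  const eligible = models.filter((m) => m.quality >= requiredQuality);
  if (eligible.length > 0) {
    return eligible.reduce((a, b) => (b.costPer1kTokens < a.costPer1kTokens ? b : a));
  }
  return models.reduce((a, b) => (b.quality > a.quality ? b : a));
}
```

A greeting might be routed with a low bar such as `routeTurn(0.5)`, while an objection-handling turn might use `routeTurn(0.9)`. How the bar is chosen per turn is left out of this sketch.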

[LIB-001]2025

LLM Rate Limiter

Hard spending limits across every AI provider — zero infrastructure

Built a TypeScript library that enforces hard spending and usage caps across multiple AI providers — preventing budget overruns before they happen. Supports automatic failover to backup models when limits are hit and recycles unused capacity in real time. Drop-in installation, no servers required. Currently in production at Closer AI and open-sourced.

Extra Infra

Overages

Multi

Provider

TypeScript·Node.js·Redis·LLM APIs

github.com/daviddominguezh/llm-rate-limiter
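
The idea of a hard cap with failover and capacity recycling can be sketched in a few lines. This is not the library's API: `BudgetLimiter`, `reserve`, `settle`, and the in-memory store are assumptions made for illustration, and amounts are in integer cents to avoid floating-point drift.

```typescript
// Hypothetical in-memory spending limiter; a real deployment would need shared state.
class BudgetLimiter {
  private caps: Record<string, number>;
  private spent = new Map<string, number>();

  constructor(caps: Record<string, number>) {
    this.caps = caps;
  }

  // Reserve the estimated cost on the first provider (in priority order)
  // that still has room under its cap. Returns null when every cap is hit.
  reserve(priority: string[], estimatedCents: number): string | null {
    for (const provider of priority) {
      const cap = this.caps[provider];
      if (cap === undefined) continue;
      const used = this.spent.get(provider) ?? 0;
      if (used + estimatedCents <= cap) {
        this.spent.set(provider, used + estimatedCents);
        return provider;
      }
    }
    return null;
  }

  // Hand back the unused part of a reservation once the actual cost is known.
  settle(provider: string, estimatedCents: number, actualCents: number): void {
    const used = this.spent.get(provider) ?? 0;
    const refund = Math.max(0, estimatedCents - actualCents);
    this.spent.set(provider, Math.max(0, used - refund));
  }

  remaining(provider: string): number {
    return (this.caps[provider] ?? 0) - (this.spent.get(provider) ?? 0);
  }
}
```

Reserving before the call and settling afterwards keeps the cap hard: a request is rejected or rerouted before any money is spent, and over-estimates flow back into the budget.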

[LIB-002]2025

LLM Markdown WhatsApp

Ship AI on WhatsApp in one function call

Built a zero-configuration TypeScript library that converts raw AI output into properly formatted WhatsApp messages — handling lists, product cards, links, and Spanish punctuation automatically. One function call replaces days of custom formatting work per integration.

Function Call

Config

2-3d

Per Integration

TypeScript·Node.js·NLP·WhatsApp

github.com/daviddominguezh/llm-markdown-whatsapp
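
To show the kind of conversion involved, here is a minimal sketch that maps common Markdown to WhatsApp's text formatting (single asterisks for bold, single tildes for strikethrough). The function name `markdownToWhatsApp` is hypothetical and is not the library's API; product cards, italics, and Spanish punctuation are not covered here.

```typescript
// Hypothetical converter covering headings, bullets, bold, strikethrough, and links.
function markdownToWhatsApp(markdown: string): string {
  return markdown
    .split("\n")
    .map((line) => {
      // "# Title" has no WhatsApp equivalent, so render it as a bold line.
      const heading = line.match(/^#{1,6}\s+(.*)$/);
      if (heading) return `*${heading[1].trim()}*`;
      // "- item" or "* item" becomes a bullet character.
      return line.replace(/^(\s*)[-*]\s+/, "$1• ");
    })
    .join("\n")
    // Markdown **bold** becomes WhatsApp *bold*.
    .replace(/\*\*(.+?)\*\*/g, "*$1*")
    // Markdown ~~strike~~ becomes WhatsApp ~strike~.
    .replace(/~~(.+?)~~/g, "~$1~")
    // WhatsApp does not render link syntax, so show "text: url".
    .replace(/\[([^\]]+)\]\(([^)]+)\)/g, "$1: $2");
}
```

A call such as `markdownToWhatsApp(modelOutput)` would sit between the LLM response and the message-sending step.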

[LIB-003]2025

LLM Graph Builder

Persistent, relational memory for AI agents that make real decisions

Built a TypeScript library that gives AI agents a structured knowledge graph memory — capturing how entities relate to each other across sessions, not just isolated facts. Addresses the memory problem that limits most agents to single-session tasks. Powers Closer AI’s conversation memory in production.

Typed

Schema

Setup

Live

Powers Closer AI

TypeScript·Node.js·Graph DB·Agents

github.com/daviddominguezh/llm-graph-builder
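
The difference between relational memory and isolated facts can be sketched with a small triple store. `GraphMemory`, `Relation`, and the sample entities in the note below are hypothetical and do not reflect the library's API or storage layer.

```typescript
// Hypothetical relation record: one edge in the knowledge graph.
interface Relation {
  subject: string;
  predicate: string;
  object: string;
  sessionId: string; // session in which the relation was first observed
}

class GraphMemory {
  private relations: Relation[] = [];

  // Store a relation once, even if later sessions repeat it.
  add(r: Relation): void {
    const duplicate = this.relations.some(
      (x) => x.subject === r.subject && x.predicate === r.predicate && x.object === r.object,
    );
    if (!duplicate) this.relations.push(r);
  }

  // All relations that touch the entity, in either direction.
  neighbors(entity: string): Relation[] {
    return this.relations.filter((r) => r.subject === entity || r.object === entity);
  }

  // Entities reachable from the start entity within maxHops edges (breadth-first).
  related(entity: string, maxHops: number): Set<string> {
    const seen = new Set<string>([entity]);
    let frontier = [entity];
    for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
      const next: string[] = [];
      for (const current of frontier) {
        for (const r of this.neighbors(current)) {
          const other = r.subject === current ? r.object : r.subject;
          if (!seen.has(other)) {
            seen.add(other);
            next.push(other);
          }
        }
      }
      frontier = next;
    }
    seen.delete(entity);
    return seen;
  }
}
```

If one session records that Ana works at Acme and a later one records that Acme is evaluating a plan, a two-hop query from Ana surfaces the plan even though no single session linked them.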

[PLT-001]2023

Agua

A low-code IDE built on near-zero infrastructure spend

Architected the client-side IDE for a low-code platform from the ground up — including the 2D rendering engine and the transpiler that converted user designs into production code. The fully client-side architecture eliminated server costs entirely, keeping infrastructure spend near zero at scale.

~$0

Infra Cost

Servers

2

Custom Engines

React·TypeScript·Graphics Engine·Compiler

[SCL-001]2024

MercadoLibre

MercadoLibre’s Affiliates program — from first user to 800K, 1% of global revenue

Led a 10-person engineering team building MercadoLibre’s Affiliates program through its full growth phase — from first user to 800K. The program grew to contribute 1% of MercadoLibre’s global revenue. Responsible for core architecture decisions under real growth pressure: observability, scaling strategy, reliability thresholds. Zero SLA breaches across 16 months of 800x growth.

800K

Users

10

Engineers

0

SLA Breaches

Java·React·TypeScript·Docker

mercadolibre.com.mx/l/afiliados

[SYS-001]2021

Credibanco POS

Turned a 10-person bottleneck into a platform hundreds could build on

Built a Python-to-C transpiler that opened POS terminal development from a small C-specialist team to the company’s broader developer base. Deployed across thousands of payment terminals nationwide. The transpiler removed a structural hiring and knowledge bottleneck that had been limiting the company’s hardware roadmap for years.

10x

Dev Pool

1000s

Terminals

Nationwide

Deploy

Python·C·Embedded·Jenkins

[DEV-001]2021

K8s Scaffolding Platform

Eliminated the hidden tax every engineer paid on every new project

Identified that engineers were rebuilding identical scaffolding from scratch on every client engagement. Designed and built an automated Kubernetes-based platform that cut setup time in half and standardized delivery quality across the full engineering team.

Faster Setup

Manual Steps

100%

Team Adoption

Kubernetes·Docker·Node.js·CI/CD

Education

Universidad de los Andes/ Bogotá

Computer Science & Software Engineering (B.S.)

Jan 2018 — Dec 2022

QS Top 200 Worldwide · #1 in Colombia · Top 10 Latin America · ABET Accredited

My Philosophy

Adoption is an engineering problem.

The hardest tools to adopt are the ones that ignore how people already work. I build to fit existing workflows, existing habits, existing mental models — because the hidden cost of software is never the license. It’s the change management, the resistance, the workarounds your team builds around tools they were told to use.

Speed is a strategy. Waiting is a decision.

Every week a tool isn’t in production is a week your team is working around the problem instead of past it. I optimize for sustainable velocity — scoped releases, real users, real feedback — without sacrificing reliability. The compounding cost of delay is invisible until it isn’t. But so is the cost of shipping something that breaks trust.

Privacy is a design constraint, not an afterthought.

I treat data privacy as an architectural decision, not a compliance checkbox. At Agua, the fully client-side architecture meant zero data egress — both a product requirement and a privacy guarantee. At Closer AI, I face the inverse problem: third-party LLM calls are inherently data-leaving events. Privacy there means governing what leaves, where it goes, and how long it’s retained. The approach changes with the problem. The discipline doesn’t.

Stack

TypeScript

React

Node.js

Jest

CI/CD

MongoDB

Docker

AWS

GCP

MySQL

Python

AI Agents

Kubernetes

React Native

Java

LLM Routing

Contact

David Dominguez

I take one engagement at a time and work with teams where engineering decisions directly impact business outcomes. If your next 90 days depend on serious engineering judgment, this is a conversation worth having.

l.david.dominguez.12@gmail.com

GitHub · LinkedIn