Building Privacy-First AI: Why 42Gen Chose Multi-Model over Vendor Lock-In
How an early-stage startup is solving for speed, privacy, and cost control with a multi-model approach.
TL;DR: When you’re building an AI assistant that knows everything about someone’s home, your architecture decisions become privacy decisions. Here’s how 42Gen’s engineering team solved for speed, privacy, and cost control with a multi-model approach.
42Gen is tackling something most AI companies ignore: helping people manage their physical world without surrendering privacy to big tech.
Their AI assistant answers everyday questions:
Should I buy a new TV? Will it fit my space?
What filter in my house needs changing?
Where’s the manual for my water heater?
What should I do with all the things in my garage?
How do we help mom with 40 years of stuff when she moves to a retirement home?
David Samuelson, 42Gen’s co-founder and CTO, says: “A persistent adult question is like, oh, man, is there a filter that I don’t know about that I should change? We don’t know a lot of the time.”
The challenge? This requires intimate knowledge of people’s homes. It’s the kind of data you might not want to share with large corporations.
David spent seven years at Google. He’s a proud former employee. But: “You don’t want to give large companies a lot of really, really personal data, especially when they have so much already of your online life.”
Their solution: on-device reasoning + tightly scoped, anonymized queries to LLMs only when necessary.
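A minimal sketch of what that split could look like, in Python. All names and rules here are illustrative assumptions, not 42Gen’s actual code: a local layer answers from on-device data wherever it can, and only a scrubbed, narrowly scoped question is ever eligible to leave the machine.

```python
import re

# Hypothetical on-device inventory; in a real system this would live in a
# local database on the user's device, never on a remote server.
LOCAL_INVENTORY = {
    "water heater manual": "stored in kitchen drawer; PDF saved on device",
    "hvac filter size": "20x25x1, last changed 2024-11-02",
}

def scrub(query: str) -> str:
    """Strip obviously personal details (here, a crude street-address
    pattern) before any text is allowed to reach a remote LLM."""
    return re.sub(r"\b\d{1,5}\s+\w+\s+(Street|St|Ave|Road|Rd)\b",
                  "[ADDRESS]", query)

def answer(query: str) -> tuple[str, str]:
    """Return (source, answer). Prefer on-device data; fall back to a
    scoped, anonymized remote query only when necessary."""
    for key, value in LOCAL_INVENTORY.items():
        if key in query.lower():
            return ("on-device", value)
    # Only the scrubbed text would ever be sent to an LLM provider.
    return ("remote", scrub(query))
```

The key design choice is that the remote path only ever receives a sanitized string, never the underlying home inventory.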
The Challenge: Tiny team, big ambitions
Current team:
3 co-founders (2 technical)
1 design contractor
2 moonlighting developers
Timeline: Targeting pilot launch in early January
Challenge: Time is the scarcest resource. Small team. Developers work irregular hours. They need every productivity multiplier they can get.
Solution: AI-first from day one.
“My philosophy doc that I published before I even started said, ‘We must be AI first. Here’s what that means. Here’s how we sprinkle it all in,’” David explains. “We should be using AI in all the places we can for ideation, for code, for this, for that.”
Why Kilo Code: Multi-model or bust
While at Google, David pushed for multi-model support in Gemini CLI. “I believed it had to be multi-model. Otherwise, it doesn’t matter how good the CLI is, because people are just going to copy it and make it multi-model.”
Google went a different direction. David’s conviction remained.
When searching for AI coding tools, his requirements:
Plain connection to OpenRouter
Mix and match models
Keep all context local
Zero vendor lock-in
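OpenRouter exposes an OpenAI-compatible chat-completions API, so in principle swapping models is a one-string change. A hedged sketch of what mixing and matching can look like (the routing table and model IDs are illustrative, not 42Gen’s configuration):

```python
import json

# Illustrative routing table: pick a model per task so no single vendor
# owns the workflow. IDs follow OpenRouter's "provider/model" convention;
# substitute whichever models you actually prefer.
MODEL_FOR_TASK = {
    "ideation": "anthropic/claude-sonnet-4",
    "code": "openai/gpt-4.1",
    "cheap-bulk": "meta-llama/llama-3.1-8b-instruct",
}

def build_request(task: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload for OpenRouter's
    /api/v1/chat/completions endpoint. All context massaging happens
    locally, before this payload is ever sent anywhere."""
    return {
        "model": MODEL_FOR_TASK[task],
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("code", "Write a unit test for the filter-reminder job.")
print(json.dumps(payload, indent=2))
```

Because the request shape is provider-neutral, the routing logic travels with you when you switch providers, which is exactly the lock-in escape hatch David describes below.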
When he found Kilo Code: “Oh! This is it! This is the thing. They’re doing what I hoped someone was doing. Great! I want to use them.”
What sealed the deal
Model Freedom
“I do not like vendor lock-in,” David says. “All the features that these big companies are making to try and lure you in and get vendor lock-in on their flagship models is not something I’m interested in.”
His philosophy: Use LLMs as commodities. “The model layer is critical because that’s where you decide, okay, how much do I rely on large companies for the magic? Is it just like, hey, you built this cool foundational model. That’s all I want it to be. I’ll manage all the context massaging locally, so I can take that logic with me when I switch providers.”
Privacy and customization
“I want the future to be private. I want people to have control of their own digital destinies.”
Example: David built custom voice integration with Open Whisper. One MacBook shortcut opens a terminal with pre-loaded context. He speaks his coding instructions, and Kilo Code executes the job in secure containers and creates a PR.
Voice stays local: “No one’s recording my voice anymore. It’s not going to servers.”
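A rough sketch of how a local pipeline like that can be assembled (the flags mirror the open-source whisper.cpp CLI, but the binary path and filenames are hypothetical): dictation is transcribed entirely on-device, and only the resulting text is handed to the coding agent.

```python
from pathlib import Path

def whisper_command(audio: Path, model: Path) -> list[str]:
    """Build a whisper.cpp invocation that transcribes speech entirely
    on-device -- nothing is uploaded. Flags mirror the whisper.cpp CLI:
    -m points at the local model file, -f at the recorded audio."""
    return ["./main", "-m", str(model), "-f", str(audio), "--no-timestamps"]

cmd = whisper_command(Path("dictation.wav"), Path("models/ggml-base.en.bin"))
# In the real workflow, the transcript would then be passed to the coding
# agent, which runs the job in a secure container and opens a PR.
print(" ".join(cmd))
```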
Transparent, fair pricing
“With Kilo, we’re paying OpenRouter pricing. So the cost is low.”
The math: AI compute is cheap. Founder time is expensive.
“My time costs more than the AI doing it. Anything we can actually offload, and the more we can make the LLMs work harder, we should do it.”
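That arithmetic is easy to make concrete. With illustrative numbers only (not 42Gen’s actual rates or token volumes):

```python
# Illustrative numbers only -- substitute your own rates.
founder_rate_per_hour = 150.0     # opportunity cost of founder time, USD
tokens_per_task = 500_000         # a heavy agentic coding session
cost_per_million_tokens = 3.0     # blended input/output price, USD

ai_cost = tokens_per_task / 1_000_000 * cost_per_million_tokens
hours_saved = 2.0                 # time the founder would have spent

print(f"AI cost: ${ai_cost:.2f}")                                         # $1.50
print(f"Founder time saved: ${founder_rate_per_hour * hours_saved:.2f}")  # $300.00
```

Even at heavy agentic usage, the compute bill is a rounding error next to the founder time it replaces.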
With the Kilo Teams dashboard, David tracks exact costs and makes time-vs-money tradeoffs: “How much does this cost? Because the game of how much time do I need to spend to make this give me the right outcome.” Usually? Let the AI iterate.
The dashboard also shows team adoption: “Seeing the use per person is very helpful because you can say, hey, are you leveraging the tools like you should be?”
One developer wanted to stick with GitHub Copilot due to existing workflow integration. David explained the reasoning for Kilo Code. The developer tried it and is now a happy user.
The learning curve: Who adapts best
Getting results with AI coding tools requires a mental shift. David has noticed patterns in who adapts best.
Unexpected winners: Technical writers
“It’s interesting that technical writers could be potentially much more valuable now because they see things very clearly. They’re crisp.” Technical writers think comprehensively about edge cases and articulate requirements precisely, skills that translate directly to better AI prompting.
The real skill is metacognition
“Am I delegating these high level tasks? Am I delegating very precise, like crystal clear, tiny tasks? When I’m reviewing this thing, am I trying to capture more context so I can teach the LLM to do the next job better?” Each mode requires different approaches and has different time investments. This metacognition mirrors Kilo’s different agent modes, for example, knowing when to use Orchestrator for complex delegation versus Code mode for precise implementation.
Human code review is a non-negotiable
“We have to have humans review the code. I’m still in tech lead mode. I have to review it because I have to know and actually understand what all this stuff is doing so I can modify it. Code still needs to be understandable by humans, and some humans actually need to actually understand it right now.”
Without that understanding? You hit “the cliff”: the sudden realization that you don’t understand your own codebase and can’t modify it effectively.
The Results
42Gen is shipping toward their January pilot with working demos in hand.



