#042 Cactus shrank Gemini to 14MB on-device, Rossmann grabbed Bambu's fork, DuckDB beat Postgres 32x

YC outfit Cactus just distilled Google’s Gemini into a 14MB tool-calling model small enough to live on a smartwatch and fast enough to fine-tune on a laptop. They open-sourced the weights, training code, and dataset scripts the same day.

Co-founder Henry Ndubuaku’s bet: tool routing is retrieval-and-assembly, not reasoning, so a transformer’s heaviest blocks were dead weight. If the bet holds, your next agentic mobile app stops calling an API for routing.

In today’s indie hacker news:

🌵 Cactus shrank a tool-calling model down to watchface size
⚖️ Rossmann’s Fulu Foundation absorbed the fork Bambu killed
🦆 DuckDB shipped Quack, a remote protocol that ate Postgres on bulk
🖱️ DeepMind rebuilt the mouse pointer as a Gemini agent
📒 Obsidian’s tiny team automated review at app-store scale

TOP STORIES

ATTENTION IS ALL YOU NEED, LITERALLY

Cactus’ Needle drops every MLP block and beats every named rival on single-shot tool calls.

Cactus Needle distilled Gemini tool calling into a 14MB on-device model

The story: Needle is a 26M-parameter network YC S25 outfit Cactus Compute distilled from Gemini 3.1 Flash Lite for one job: match a user query to a tool, emit the JSON, stop. Per the architecture note, the model drops every MLP block. What remains is attention plus gating: eight decoder heads, four KV heads, d=512, and an 8192-token BPE vocabulary. Pretraining ran 200B tokens across 16 TPU v6e chips in 27 hours. Specialization on synthesized function-call data took 2B tokens and 45 minutes.
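A quick back-of-envelope on those numbers, as a sketch rather than anything from the Cactus repo. It assumes the eight heads are query heads in a standard grouped-query layout, that "8192 BPE" is the vocabulary size, that 14MB means decimal megabytes, and it ignores the gating weights and layer count, which the story does not give.

```python
# Figures quoted in the story; everything derived below is our arithmetic.
D_MODEL = 512
N_QUERY_HEADS = 8      # assumption: the "eight decoder heads" are query heads
N_KV_HEADS = 4
VOCAB_SIZE = 8192      # assumption: "8192 BPE" is the vocabulary size
N_PARAMS = 26_000_000
MODEL_BYTES = 14_000_000  # assumption: decimal megabytes

head_dim = D_MODEL // N_QUERY_HEADS                 # width of each head
queries_per_kv_head = N_QUERY_HEADS // N_KV_HEADS   # query heads sharing one KV head
kv_width = N_KV_HEADS * head_dim                    # width of the K and V projections

# Attention projection weights for one layer (no biases, no gating).
attn_params_per_layer = (
    D_MODEL * D_MODEL          # query projection
    + 2 * D_MODEL * kv_width   # key and value projections
    + D_MODEL * D_MODEL        # output projection
)
embedding_params = VOCAB_SIZE * D_MODEL
bits_per_param = MODEL_BYTES * 8 / N_PARAMS

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv_head}")
print(f"attention weights per layer: {attn_params_per_layer:,}")
print(f"embedding table: {embedding_params:,}")
print(f"average storage: {bits_per_param:.2f} bits per parameter")
```

Under those assumptions the file averages a little over four bits per parameter, which would be consistent with quantized weights, though the story does not say how the 14MB figure was reached.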

The details:

Runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices via the Cactus SDK (Hugging Face mirror live)
Outperforms FunctionGemma-270M, Qwen-0.6B, Granite-350M, and LFM2.5-350M on single-shot calls. Ndubuaku notes those rivals are conversational generalists, so the win is narrow-domain
Synthetic post-training data covers 15 auto-generated device categories, including timers, messaging, navigation, smart home, and gallery search
Weights, training code, and dataset-generation scripts ship MIT-licensed on GitHub
Ndubuaku, co-founder: “It is for building agentic capabilities into very small devices like phones, glasses, watches and more”

Why builders care: Drop the routing model directly into your iOS, Android, or wearable agentic app and skip the cloud call for actions like “set timer” or “message Maya”. The vendor lock-in argument for hosted tool-routing APIs just got a lot weaker. Contrarian read: benchmarks are self-reported, and Gemini Nano is suspiciously absent from the comparison.
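What "skip the cloud call" looks like on the app side, as a minimal sketch. The tool names, the `{"name": ..., "arguments": ...}` JSON shape, and the handlers are all hypothetical; Needle's real output format and the Cactus SDK calls are not shown in the source, so the model output here is just a string you would get from whatever runtime you use.

```python
import json

# Hypothetical local tool registry; names and arguments are illustrative only.
TOOLS = {
    "set_timer": {
        "required": {"minutes"},
        "handler": lambda args: f"timer set for {args['minutes']} min",
    },
    "send_message": {
        "required": {"recipient", "body"},
        "handler": lambda args: f"message to {args['recipient']}: {args['body']}",
    },
}


def dispatch(raw_model_output: str) -> str:
    """Validate a JSON tool call from the router and run the local handler."""
    try:
        call = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return "error: model output was not valid JSON"
    if not isinstance(call, dict):
        return "error: tool call must be a JSON object"

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return f"error: unknown tool {name!r}"

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"

    # Never trust a small model to fill every slot; check before acting.
    missing = TOOLS[name]["required"] - set(args)
    if missing:
        return f"error: missing arguments {sorted(missing)}"
    return TOOLS[name]["handler"](args)


print(dispatch('{"name": "set_timer", "arguments": {"minutes": 5}}'))
```

The validation step is the part worth copying: a 26M-parameter router will sometimes emit a malformed or incomplete call, and the app should fail closed instead of acting on it.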

STREISAND PRINTS HARDER

Bambu’s C&D handed Louis Rossmann’s Fulu Foundation the fork. Geerling’s takedown hit HN #1.

Fulu Foundation absorbed the OrcaSlicer fork Bambu Lab killed

The story: Jeff Geerling’s open-source critique landed yesterday and hit HN #1 within hours. The chain reaction: Louis Rossmann’s Fulu Foundation took custody of the killed OrcaSlicer-bambulab fork Pawel Jarczak shut down last week under C&D pressure. The mirror is two days old and already carries 4.6x the stars of Jarczak’s original repo. Bambu’s response argues the fork “injected falsified identity metadata” and risked overloading their cloud. Jarczak’s defense: the User-Agent string Bambu calls impersonation is copied verbatim from Bambu’s own AGPL-licensed BambuStudio source.

The details:

Geerling’s post reached 1,142 HN points and 373 comments in roughly 13 hours (HN thread)
Fulu mirror sits at 1,391 stars and 357 forks two days in; Jarczak’s original peaked at 299
Rossmann’s Fulu Foundation and GamersNexus pledged $20K combined toward Jarczak’s legal defense, with GamersNexus also mirroring the fork with his permission
2022 reverse incident: Bambu’s own fork accidentally routed BambuStudio telemetry to Prusa’s servers. Prusa never sent a C&D back
Jarczak is fundraising $500 for a Klipper-compatible printer to redirect his work to fully open hardware. Bambu acknowledges 734 forks of BambuStudio exist

Why builders care: AGPL protects code sharing, not access to a vendor’s cloud, and Terms of Service at the API layer can override license rights. That’s the lever Bambu pulled. Foundation custody is the new defensive move: shift legal exposure off an individual maintainer onto a well-funded nonprofit so future cease-and-desists become economically irrational.

QUACK ATE POSTGRES

DuckDB shipped Quack, a remote protocol that beat Postgres 32x on a 60-million-row transfer.

DuckDB Quack is a remote client-server protocol that beat Postgres 32x

The story: The Quack protocol runs on plain HTTP, talks DuckDB’s own internal serialization (the same format the WAL uses), and lets one instance act as a server with many clients writing concurrently. The team designed it for single-round-trip execution: after the handshake, a small query completes in one HTTP request and response. The Postgres wire protocol’s row-based encoding murders bulk performance; Arrow Flight SQL needs two round trips per query. Default port is 9494, localhost-only out of the box, SSL via a reverse proxy. Auth tokens are server-generated at startup and pluggable via a callback that can be a SQL macro.

The details:

60M-row (76GB CSV) transfer on AWS m8g.2xlarge: 4.94s Quack, 17.40s Arrow Flight, 158.37s Postgres
5,434 transactions per second at 8 parallel threads vs 4,320 for Postgres and 1,358 for Arrow Flight
DuckDB-Wasm in a browser tab can talk to a remote EC2 DuckDB instance because the wire is plain HTTP
Available now in DuckDB v1.5.2 via core_nightly (duckdb-quack repo); production release targeted for v2.0 in fall 2026
DuckLake catalog server integration is queued, so DuckDB itself can replace a Hive Metastore for indie-scale data stacks

Why builders care: If your stack is already DuckDB for local dev and you’ve been stuck between hosted MotherDuck and a self-managed Postgres replica for cross-process access, this is the zero-rewrite shared-database path. One DuckDB process on a Hetzner box, point your app servers and Wasm clients at it, get sub-second analytics across multiple writers without Kafka in front.
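Worth running the reported numbers yourself before repeating the headline. This sketch only does arithmetic on the figures quoted above; it does not touch the Quack extension or any DuckDB API.

```python
# Benchmark figures as reported in the details above.
ROWS = 60_000_000
TRANSFER_SECONDS = {"Quack": 4.94, "Arrow Flight": 17.40, "Postgres": 158.37}
TRANSACTIONS_PER_SECOND = {"Quack": 5434, "Postgres": 4320, "Arrow Flight": 1358}


def bulk_speedup(baseline: str, candidate: str = "Quack") -> float:
    """How many times faster the candidate finished the bulk transfer."""
    return TRANSFER_SECONDS[baseline] / TRANSFER_SECONDS[candidate]


def rows_per_second(name: str) -> float:
    """Bulk transfer throughput in rows per second."""
    return ROWS / TRANSFER_SECONDS[name]


def tps_ratio(baseline: str, candidate: str = "Quack") -> float:
    """Ratio of transactional throughput, candidate over baseline."""
    return TRANSACTIONS_PER_SECOND[candidate] / TRANSACTIONS_PER_SECOND[baseline]


for rival in ("Postgres", "Arrow Flight"):
    print(
        f"vs {rival}: {bulk_speedup(rival):.1f}x on bulk transfer, "
        f"{tps_ratio(rival):.2f}x on transactions per second"
    )
print(f"Quack bulk throughput: {rows_per_second('Quack'):,.0f} rows/s")
```

The 32x is a bulk-transfer number. On the transactional benchmark the same figures put Quack at roughly 1.26x Postgres, so size your expectations by workload.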

POINT, DON’T PROMPT

DeepMind rebuilt the 50-year-old mouse pointer as a Gemini agent, shipping fall 2026 on Googlebook.

DeepMind reimagined the mouse pointer as a Gemini agent

The story: DeepMind’s post frames the AI pointer around four ideas: keep the user in flow across apps, replace prompts with visual context, treat “this” and “that” as first-class commands, and convert pixels into structured entities the agent can act on. Two live demos already work in Google AI Studio: image editing without prompts, and a map-based location finder. The same pattern is shipping into Chrome as “Gemini pointer” and onto Google’s new Googlebook (the Android-based laptop replacing Chromebook) as “Magic Pointer” this fall, where it continuously reads cursor context and surfaces actions before you ask.

The details:

Two interactive demos live in AI Studio today (image edit + map find): the only hands-on deliverable from the announcement
The post shipped without a paper, benchmarks, a public SDK, or any latency numbers
HN reception was mixed at 165 points: commenters caught demos pointing at the wrong objects and noted text-based actions ran slower than typing
Prior art: Richard Bolt’s 1980 “Put That There” demo combined speech with pointing 45 years ago. HN flagged that immediately
Open-protocol equivalent today: AG-UI from CopilotKit lets you wire bidirectional agent-UI state into any web app right now

Why builders care: If you’re building agentic features in a web or desktop app, the principle to copy today is using cursor location and selection state as primary input instead of dragging users into a chat sidebar. AG-UI ships this pattern without waiting for Google’s SDK. Cursor 3 already proved the model works inside an IDE; it’s coming for every SaaS category.
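The "cursor as primary input" idea reduces to a hit test plus a structured payload. A minimal sketch follows; the `Entity` type, field names, and payload shape are hypothetical and are not taken from DeepMind's post or from AG-UI.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A hypothetical on-screen object the agent can act on."""
    kind: str
    label: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def resolve_this(entities: List[Entity], px: float, py: float) -> Optional[Entity]:
    """Return the topmost entity under the pointer (later items draw on top)."""
    for entity in reversed(entities):
        if entity.contains(px, py):
            return entity
    return None


def build_agent_input(command: str, entities: List[Entity],
                      px: float, py: float) -> dict:
    """Bundle the spoken or typed command with whatever 'this' points at."""
    target = resolve_this(entities, px, py)
    return {
        "command": command,
        "target": None if target is None
        else {"kind": target.kind, "label": target.label},
    }


screen = [
    Entity("image", "hero.png", 0, 0, 100, 100),
    Entity("text", "caption", 10, 10, 20, 20),
]
print(build_agent_input("summarize this", screen, 15, 15))
```

Resolving the target before the model sees the request is the design choice: the agent receives an entity, not pixels, so "this" never has to be guessed from a screenshot.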

THE 7-PERSON APP STORE

Obsidian’s tiny team automated plugin review at 4,000-plugin scale and made paid plugins first-class.

Obsidian rebuilt its plugin community site and legitimized paid plugins

The story: Obsidian’s Future of Plugins post launches a new Community site that runs automated security and code-quality scans on every plugin version, not just initial submissions. The system cleared 2,300+ queued submissions within days of launch. Three monetization labels are now first-class on the page: Free (donations OK), Optional Payments (feature unlocks or paid tie-ins), and Paid (payment required for primary features). Steph Ango (kepano) says the seven-person team has been on this for nearly a year.

The details:

120M total plugin downloads since the API launched in 2020
Safety scorecards render on every plugin detail page, including malware-scan output and severity flags
Developer dashboard auto-migrates GitHub-hosted projects and adds a sponsorship-link slot to profiles
Manual review continues for popular, featured, and community-flagged plugins. Automation runs the long tail
Sandboxing is still roadmap. Ango: plugin disclosure declarations are “the first step towards permissions”

Why builders care: Paid-plugin labels finally let developers charge money without gray-area workarounds, and the sponsorship-link profile is the closest Obsidian comes to an app-store revenue model without taking a cut. Tradeoff: every version you ship gets re-scanned, so a false positive can tank your safety scorecard. This is the same legitimization moment VS Code had when it green-lit paid extensions: the addon market stops being a hobbyist sideline.

TRENDING TODAY

🎮 Local LLMs keep leaking onto stranger hardware. Maddie Dreese got a Karpathy-derived TinyStories-260K transformer running on a stock Game Boy Color (442 upvotes on r/LocalLLaMA): INT8 fixed-point math, KV cache stashed in cartridge SRAM because the GBC’s 32KB work RAM is too small. Output is gibberish but the loop executes end to end. Same day: a graduation cap running Rust on an ATtiny85 with 48 WS2812B LEDs, built in two hours with avr-hal. The “will it run on a Pi” era is over.
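For anyone who has not touched INT8 math, the core trick is small. This is a generic symmetric-quantization sketch, not Dreese's code: the scaling scheme and function names are assumptions, and a real Game Boy port would keep the final rescale in fixed point as well.

```python
from typing import List, Tuple

INT8_MAX = 127


def quantize(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-vector quantization: floats to int8 codes plus one scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / INT8_MAX if max_abs > 0 else 1.0
    codes = [max(-INT8_MAX, min(INT8_MAX, round(v / scale))) for v in values]
    return codes, scale


def int8_dot(a_codes: List[int], a_scale: float,
             b_codes: List[int], b_scale: float) -> float:
    """Accumulate in integers, then rescale once at the end."""
    accumulator = sum(x * y for x, y in zip(a_codes, b_codes))
    return accumulator * a_scale * b_scale


weights = [0.5, -1.0, 0.25]
activations = [1.0, 0.5, -0.5]
w_codes, w_scale = quantize(weights)
x_codes, x_scale = quantize(activations)

exact = sum(w * x for w, x in zip(weights, activations))
approx = int8_dot(w_codes, w_scale, x_codes, x_scale)
print(f"exact={exact:.4f} int8={approx:.4f}")
```

The inner loop is pure integer multiply-accumulate, which is why this style of model can run at all on hardware with no floating-point unit.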

🥷 Everyone is rebuilding Claude Code from scratch. Since the TypeScript source got exposed on March 31, at least 8-10 OSS reimplementations have shipped: build-your-claude-code-from-scratch (Python, pip-installable), open-claude-code (nightly decompile), and the curated collection-claude-code-source-code directory. OpenClaw OS takes the other angle: a session/artifact/cron interface for agents that runs inside Telegram, Discord, or Slack.

💰 The AI micro-SaaS resale market is splitting in two. Danny at AutoText broke down how he got to $25K/mo at roughly 90% margin building unsexy QuickBooks Desktop integrations nobody wanted to touch, with zero paid acquisition. Meanwhile r/indiehackers has someone offloading a full AI restaurant SaaS (voice ordering, table booking, source + license) for $1,000 because they “don’t want it sitting unused”. Operators bootstrapping on dull tooling; sellers offloading skeletons.

DRAMA

TRUST FALL, NO MAT

💵 “My accountant stole $60,000+ and ran.”

u/ContactCold1075 spotted the gap while reviewing runway before a fundraising call. The accountant of 18 months had full account access and ran three overlapping schemes: unauthorized transfers, fictitious vendor invoices with suspiciously round numbers, and a fake contractor paid monthly. The accountant won’t pick up. Her rental was vacated a month earlier. Top comment is the grimmer benchmark: another founder lost $2M over three years to a similar trusted bookkeeper, prosecution dragged two years, the chase killed the company before any conviction landed.

Why builders care: If one human can move money out, eventually one will. Read-only access for everyone except the founder. Bank alerts on every transaction over a low threshold. Monthly statement review by someone outside the bookkeeper’s reporting line. None of it is heavy and it’s the difference between a $60K hit and a company-killer.

FIRST DOLLAR

SOFT-LAUNCH PAYDAY

📱 PostPeer hit $200 MRR four weeks in.

@Jonathan_Geiger is selling PostPeer, a social-media posting API aimed at devs who’d rather call an endpoint than ship a scheduling UI. 13 paying customers (nine subs, four one-time), 194 total users, three 5-star reviews. Polar handles payments. Distribution is SEO plus a free companion tool at socialkit.dev. No Product Hunt push yet, which means there’s still a launch spike in the tank.

FOURTEEN BEATS THE KYC WALL

🏋️ A 14-year-old wired up his first payment gateway.

A South African solo dev spent six months building FitTrack Elite (AI form correction via skeleton mapping plus budget nutrition plans). Major payment processors rejected him for age and KYC. Resolution: his father registered the business, then Dodo Payments wired up via a virtual USD account. Stack: Firebase, Vercel, React, AI-assisted debugging. No revenue yet. The milestone is infrastructure-live, which for a teenage solo founder is the harder unlock.

STACK OF THE DAY

📊 TabPFN-3 (free for research, paid for commercial)

TabPFN-3 from Prior Labs is a pretrained tabular foundation model that scales to 1M training rows on a single H100 via row-chunking. Hits 0.2s predictions on a million samples, claims a 93% win rate over classic ML (XGBoost/LightGBM/CatBoost) on TabArena, and runs roughly 20x faster than TabPFN-2.5. Free for research and internal eval; commercial use needs a license. Prior Labs is mid-acquisition by SAP with a four-year, €1B+ investment, so the IP has a deep-pocketed home.

Not sponsored. We just feature tools builders would actually use.

BOOKMARKED TODAY

🧠 Why senior devs fail to communicate their expertise. Tuhin Nair’s thesis: seniors frame problems as complexity management when the business cares about uncertainty reduction. 427 HN points. The reframe matters more than the wording.

💳 Copilot Max is $100/mo. Pro and Pro+ get flex allotments. Pro is $10 base + $5 flex; Pro+ is $39 + $31. Variable usage top-up that adjusts as model economics shift, effective June 1, auto-applied to existing subscribers. Copilot’s pricing now mirrors how Cursor and Claude Code already bill.

🛡️ CERT released six serious dnsmasq CVEs. Affects “pretty much all non-ancient versions” of dnsmasq, which lives inside home routers, Linux distros, Android, and most embedded boxes. Patched 2.92rel2 is out, stable 2.93 follows ASAP. Audit your home lab and self-hosted stack.
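A starting point for that audit, as a rough sketch. It shells out to `dnsmasq --version` and flags anything that does not clearly report 2.93 or newer. The cutoff is an assumption based only on the version numbers quoted above, and a patched 2.92rel2 build will still be flagged for a manual look because the parser reads only the major and minor numbers. Routers and embedded boxes need their vendor's firmware notes instead.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

# Assumption: treat the upcoming stable 2.93 as the first clearly fixed release.
FIXED_STABLE = (2, 93)


def parse_version(banner: str) -> Optional[Tuple[int, int]]:
    """Pull (major, minor) out of a 'Dnsmasq version X.Y ...' banner."""
    match = re.search(r"version\s+(\d+)\.(\d+)", banner, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_review(banner: str) -> bool:
    """True unless the banner clearly reports FIXED_STABLE or newer."""
    version = parse_version(banner)
    return version is None or version < FIXED_STABLE


def local_dnsmasq_banner() -> Optional[str]:
    """Return the local dnsmasq version banner, or None if it is not installed."""
    if shutil.which("dnsmasq") is None:
        return None
    result = subprocess.run(
        ["dnsmasq", "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    banner = local_dnsmasq_banner()
    if banner is None:
        print("dnsmasq not found on PATH")
    elif needs_review(banner):
        print("review needed:", banner.splitlines()[0] if banner else "(no output)")
    else:
        print("reports 2.93 or newer")
```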

Curated by AI, built by a human.
