#085 Alibaba drained 28.8M calls from Claude, OpenAI taped out a chip, Google halved computer use

Alibaba ran 28.8 million fake API calls through Claude to clone Anthropic’s most restricted model. When Anthropic caught it and told Congress, the government killed global access to two models in 48 hours. Legitimate users lost access overnight with no warning.

February’s distillation wave from three Chinese labs totaled 16.7 million exchanges. Alibaba alone nearly doubled that in a single six-week campaign.

In today’s indie hacker news:

Alibaba drained Claude through 28.8M fake API calls
OpenAI taped out its first chip in 9 months flat
Google baked computer use into Flash at half Claude’s price
Elastic cut 7% of engineering while revenue grew 17%
First Dollar: 5 months of nothing, then paying clients in one week

TOP STORIES

25,000 FAKE ACCOUNTS, ONE TARGET

🕵️ Alibaba ran the largest known distillation attack on Claude, and Anthropic went straight to Congress

Alibaba ran the largest known distillation attack on Claude, and Anthropic went straight to Congress

The story: Reuters obtained a letter Anthropic sent to the Senate Banking Committee on June 10 detailing the attack. Over 44 days, operators affiliated with Alibaba’s Qwen AI division created 25,000 fraudulent accounts and ran 28.8 million exchanges against Claude’s Mythos Preview. The method: adversarial distillation, querying the model with constructed prompts to train a cheaper clone. Two days after the letter, Commerce forced Anthropic to disable Mythos and Fable globally.

The details:

Alibaba’s account infrastructure alone matched what DeepSeek, Moonshot, and MiniMax needed three companies to build in February.
Anthropic now runs an ANTI_DISTILLATION_CC system that silently injects decoy tool definitions into API responses for flagged accounts.
No specific Qwen model has been named as the beneficiary. Alibaba has not commented.
The response followed the June 12 export-control pattern: models vanished globally with no advance notice.

Why builders care: If your pipeline makes high-volume, structured queries to Claude, Anthropic’s decoy injection system could flag you. The coordinated intelligence-sharing between Anthropic, OpenAI, and Google means flagged behavior now crosses lab boundaries.

NINE MONTHS FROM SKETCH TO SILICON

🔥 OpenAI and Broadcom taped out a custom inference chip in 9 months

OpenAI and Broadcom taped out a custom inference chip in 9 months

The story: OpenAI unveiled its first custom chip: a reticle-sized inference ASIC called Jalapeño. Broadcom handled the silicon design, TSMC fabricates, and tape-out took nine months. Engineering samples are running GPT-5.3-Codex-Spark at production frequency. Broadcom CEO Hock Tan told Bloomberg the chip delivers 50% cheaper inference per token versus current Nvidia GPUs. No independent benchmarks exist.

The details:

Richard Ho, who leads OpenAI’s hardware program, previously ran part of Google’s TPU program.
Prototype late 2026, production ramp 2027. The chip won’t be sold externally.
OpenAI committed to 10 gigawatts of deployment with Microsoft by 2029. One gigawatt supports roughly one million H100-equivalent GPUs.
OpenAI’s own AI models helped accelerate chip design, a loop that, if it holds, compresses future generations.

Why builders care: The 50% cost claim is 12-18 months from your API bill. But if it holds at scale, agent loops that multiply inference calls get fundamentally cheaper. Watch the 2027 API pricing cycle.

COMPUTER USE AT COMMODITY PRICES

🖱️ Google baked computer use into Gemini 3.5 Flash at half Claude’s price

Google baked computer use into Gemini 3.5 Flash at half Claude's price

The story: Computer use is now a built-in tool on Gemini 3.5 Flash, Google’s cheapest model. No separate model, no waitlist, no surcharge. Standard Flash pricing applies: $1.50/M input tokens, $9.00/M output, roughly half what Claude Sonnet 4 charges. Three environments: browser (17 actions), desktop (17 actions), and mobile/Android (9 actions). Google is treating computer use as a commodity feature.

The details:

Reference implementation on GitHub with Docker and Playwright. Prototype in an afternoon.
New safety features: user confirmation for irreversible actions, auto-stop on detected prompt injection.
Live demo at gemini.browserbase.com on Browserbase and Stagehand.
Google’s payments team says the predecessor auto-recovered 60% of failing UI tests that took days to fix manually.

Why builders care: Twice the agent steps for the same budget. Browser, mobile, and desktop in one model. Google treating this as table stakes puts pricing pressure on Claude.

THE AI-EFFICIENCY LAYOFF PLAYBOOK

📉 Elastic cut 281 engineers while posting $1.74B revenue and 17% growth

Elastic cut 281 engineers while posting $1.74B revenue and 17% growth

The story: CEO Ash Kulkarni announced a 7% reduction, cutting 281 employees from a headcount of 4,019. He framed it as “changing with” AI and automation. The same day, Chief Product Officer Ken Exner resigned. Despite the cuts, Elastic expects total headcount to grow by year-end: sales teams are exempt and expanding.

The details:

FY2027 guidance targets $1.985B-$2.000B, a slight deceleration from FY2026’s pace.
Net revenue retention sits at 112%, meaning existing customers keep spending more.
Engineering is consolidating into three core areas under the CEO. With the CPO gone, the dedicated product voice at the top disappears.

Why builders care: Elasticsearch powers millions of apps for full-text search, log aggregation, and increasingly vector search for RAG pipelines. The engineering consolidation could slow roadmap velocity. OpenSearch 3.6+ (Apache 2.0, Linux Foundation-backed) is the hedge if you’re concerned.

🤖 AI slop is drowning open source - Greptile published “PR spam today looks like email spam in the early 2000s” (191 HN pts, 112 comments). Jazzband (84 Python projects, 150M+ monthly downloads) shut down permanently in March. Ladybird browser halted all public PRs on June 5. One developer estimates reviewers spend 12x longer checking AI PRs than generating them.

🚀 Open-source models are beating closed-source on code - r/LocalLLaMA is on fire: Gemma4-26B uncensored shipped with MTP support (35-53% faster), Gefen claims 8x memory savings as an AdamW drop-in, and MiniMax M3 hit 59% on SWE-Bench Pro in early June, beating GPT-5.5.

📊 272 signups, zero activation - Multiple indie founders hit the same wall this week. One got 272 signups and struggled with everything after (r/indiehackers, top post). Another spent $677 on edtech ads and made $110 back. CAC has risen 60% since 2020 while retention costs rose only 12%.

FIRST DOLLAR

COLD CALLS KILLED THE ASSUMPTION

💰 5 months building the wrong thing, then $420 MRR in one week of actual selling

The Kubes team spent five months building AI lead generation. Zero revenue. Then they picked up the phone. A real estate prospect doing $100K/month told them: “We already get leads. The problem is follow-up.” They pivoted mid-week to lead follow-up. Four B2B clients in seven days. $420 MRR. Cold calling broke the streak.

STACK OF THE DAY

🛠️ RubyLLM - First serious Ruby framework for every major AI provider. Supports OpenAI, Anthropic, Gemini, chat, tools, agents, embeddings, and streaming. 3.7K GitHub stars. 359 HN points. If you’re a Rails shop that’s been duct-taping HTTP calls to Claude, this replaces the glue code. Free, open source.

Not sponsored. We just feature tools builders would actually use.

BOOKMARKED TODAY

📌 MCP Authentication: What I Got Wrong - Three days debugging why an MCP server worked in Claude Desktop but failed everywhere else. Practical auth lessons for anyone shipping MCP integrations.

📌 45C cooling design cuts data center water use to near zero - Nvidia’s liquid cooling approach for AI data centers. 234 HN points, 164 comments. Relevant if you’re thinking about infrastructure costs at scale.

📌 Ask HN: Where is our profession going? - A founder who ran a 3-person software company visited a 15-person shop and was shocked: the code is no longer the source of truth, and engineers ask Claude to explain what Claude wrote. The thread is part existential crisis, part roadmap.

Curated by AI, built by a human.

Alibaba drained 28.8M calls from Claude, OpenAI taped out a chip, Google halved computer use

TOP STORIES

25,000 FAKE ACCOUNTS, ONE TARGET

NINE MONTHS FROM SKETCH TO SILICON

COMPUTER USE AT COMMODITY PRICES

THE AI-EFFICIENCY LAYOFF PLAYBOOK

TRENDING TODAY

FIRST DOLLAR

COLD CALLS KILLED THE ASSUMPTION

STACK OF THE DAY

BOOKMARKED TODAY

Get the daily indie hacker digest

You're in.