#052 Microsoft killed in-house Claude Code, DeepSeek pinned open weights, yt-dlp quarantined Bun

Microsoft just told nearly 100,000 of its own engineers their Claude Code seats die on June 30. The same week, Uber’s CTO admitted his entire 2026 AI budget burned in four months on the same tool.

Token billing just broke the flat-rate spreadsheet. If you’re sizing an agentic workload off seat math, the variance will eat you alive before procurement notices.

In today’s indie hacker news:

🏭 Microsoft yanks Claude Code from its biggest division
💰 Liang Wenfeng puts $2.93B of his own cash on open weights
🥖 yt-dlp pins Bun at 1.3.14 and calls the rewrite vibe-coded
⚡ NVIDIA’s diffusion LLM laps autoregression 4x on a GB200
🏝 Levelsio’s one-VPS Termius setup ate Twitter this week

TOP STORIES

BUDGET BLOWN BY APRIL

🏭 Microsoft cancels Claude Code for ~100,000 of its own engineers, effective June 30

The story: Major Matters reported the cutoff hits the Experiences & Devices division, less than seven months after the pilot kicked off in December 2025. That’s the unit covering Windows, Microsoft 365, Outlook, Teams, and Surface. Azure, Developer Division, and AI Platform keep their seats. Everyone else gets redirected to GitHub Copilot CLI, the tool Microsoft owns outright, which only hit GA on February 25. r/artificial piled 627 upvotes onto the story in 21 hours.

The details:

Microsoft is still selling Anthropic-powered tech back to enterprises through Copilot Cowork at $30/user/month, a 65% premium on the prior E5 tier. The kill is internal, not strategic.
Uber CTO Praveen Neppalli Naga told The Information: “I’m back to the drawing board because the budget I thought I would need is blown away already.”
Uber’s Claude Code adoption jumped from 32% to 84% of its 5,000-person engineering org between December and March. Heavy users were running $500 to $2,000 a month in token charges with no FinOps playbook.
70% of Uber’s committed code is now AI-generated. 11% of live backend updates ship with zero human review.
GitHub Copilot shifts to token-based billing on June 1, 2026, and Anthropic’s Opus models drop off the $10/mo Pro tier entirely (Where’s Your Ed At).
Claude Code’s annualized run-rate sits at $2.5B as of February. Anthropic’s total is around $4B.

Why builders care: Cap token spend per engineer before rollout, not after the budget’s gone. Agentic workflows show 10-30x usage variance over a flat seat, so the per-head average lies on the way in and screams on the way out. Negotiate committed-use pricing before you cross a dozen seats, or you’re underwriting the vendor’s margin with your runway.
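
One way to picture a per-engineer cap is a small budget guard that sits in front of whatever proxy or gateway meters your token charges. This is a minimal sketch, not any vendor's API: the `TokenBudget` class, the $300 monthly cap, and the 80% warning line are all assumptions picked for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Tracks monthly spend per engineer against a hard cap (hypothetical helper)."""

    monthly_cap_usd: float = 300.0  # assumed cap, pick your own
    warn_fraction: float = 0.8      # assumed warning threshold
    spend_usd: dict = field(default_factory=dict)

    def record(self, engineer: str, cost_usd: float) -> str:
        """Add a charge and return 'ok', 'warn', or 'blocked'."""
        current = self.spend_usd.get(engineer, 0.0)
        if current + cost_usd > self.monthly_cap_usd:
            return "blocked"  # refuse the charge; recorded spend stays unchanged
        self.spend_usd[engineer] = current + cost_usd
        if self.spend_usd[engineer] >= self.warn_fraction * self.monthly_cap_usd:
            return "warn"
        return "ok"


budget = TokenBudget()
print(budget.record("alice", 100.0))  # well under the warning line
print(budget.record("alice", 150.0))  # crosses 80% of the cap
print(budget.record("alice", 100.0))  # would exceed the cap
```

The point of blocking before the charge lands is that the cap holds even for the heavy-tail users who drive the variance, instead of showing up in next month's invoice.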

FOUNDER WRITES HIS OWN CHECK

💰 DeepSeek raises $10.29B and Liang Wenfeng covers 40% of the round himself

The story: Bloomberg confirmed DeepSeek is closing a 70 billion yuan round (~$10.29B), its first-ever outside money after six years funded entirely by High-Flyer Quant. Liang Wenfeng wired in ~$2.93B personally, so outside investors are only carving out around 3% equity at a $45B pre-money. He told prospective backers DeepSeek will prioritize AGI research and keep releasing open weights instead of chasing commercialization, which lit r/LocalLLaMA up to 585 upvotes overnight.

The details:

Liang to investors, via 36kr translation: “Capital will pursue short-term returns, and commercialization will compromise the technical route.”
Reported backer list: Tencent (~6B yuan), the state-backed National AI Industry Investment Fund (~10B yuan), IDG Capital, and Monolith Capital.
Post-money expected above 350B yuan (~$51.5B). Outside equity carved at ~3%, primarily so DeepSeek researchers get a valuation anchor for stock options.
V4-Pro shipped April 24 at 1.6T parameters (49B activated) under Apache 2.0, 1M token context, with native support for Huawei Ascend, Cambricon, and Nvidia (CNBC).
High-Flyer Quant, the parent fund, posted a 56.6% average return in 2025 and has the AUM to keep writing checks even if external capital stalls.

Why builders care: The open-weight pipeline isn’t a marketing strategy investors can vote out. Liang funded a large share of the round himself and capped outside influence at single-digit equity, so V4-Pro and whatever ships next are set to stay free under Apache 2.0. If you’ve been hedging a local-inference stack on the chance DeepSeek pivots closed, you can stop hedging.

QUARANTINED AT 1.3.14

🥖 yt-dlp caps Bun at 1.3.14 because the runtime is “fully vibe-coded”

The story: yt-dlp maintainer bashonly posted issue #16766 on May 20 narrowing Bun support for the project’s EJS JavaScript-solver component to versions 1.2.11 through 1.3.14, then marking the whole runtime deprecated. The floor went up to dodge a lockfile-bypass supply-chain bug. The ceiling stops dead at the last Zig-built release. Bashonly’s reasoning: Anthropic acquired Bun in December, then founder Jarred Sumner used Claude to port 960,000 lines of Zig to Rust in six days. The HN thread hit 401 points and 419 comments.

The details:

Bashonly’s verdict on the post-rewrite codebase: “fully vibe-coded. This is alarming and disappointing for a number of reasons, and frankly it seems like a future headache that we’d prefer to avoid.”
Sumner’s defense, via The Register: “this is already the status quo; we haven’t been typing code ourselves for many months now. Even pre-acquisition this was pretty much accurate.”
The merged PR carried 6,755 commits and roughly 13,000 unsafe Rust blocks. GitHub auto-flagged it as AI slop.
The trigger for the rewrite was chronic memory leaks. One session saw RSS climb from 1.7GB to 14GB over three hours. Another hit 23GB virtual memory after 14 hours at 143.8% CPU.
CVE-2026-24910 (a trust-validation bypass in Bun before 1.3.5) and the broader PackageGate bugs already battered Bun’s supply-chain story this year.

Why builders care: If you’re shipping on Bun, pin to 1.3.14 today and audit your CI before next week’s auto-bump runs. Anything past that ceiling is the AI-translated Rust codebase a flagship open-source project just refused to depend on. Node.js and Deno remain the conservative bets when audit trail matters.
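
A CI audit for that pin could look something like the check below. It is a sketch under assumptions: the 1.2.11 to 1.3.14 range comes from the yt-dlp issue as summarized above, while the helper names are made up for illustration and the script only relies on `bun --version` printing a plain version string.

```python
import shutil
import subprocess

# Inclusive range quoted from the yt-dlp issue as summarized above.
MIN_BUN = (1, 2, 11)
MAX_BUN = (1, 3, 14)


def parse_version(text: str) -> tuple:
    """Turn '1.3.14' into (1, 3, 14), dropping any suffix after '-' or '+'."""
    core = text.strip().split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def bun_version_supported(text: str) -> bool:
    """True when the version falls inside the inclusive supported range."""
    return MIN_BUN <= parse_version(text) <= MAX_BUN


if __name__ == "__main__":
    bun = shutil.which("bun")
    if bun is None:
        print("bun not found on PATH; nothing to check")
    else:
        version = subprocess.run(
            [bun, "--version"], capture_output=True, text=True, check=True
        ).stdout
        status = "supported" if bun_version_supported(version) else "outside range"
        print(f"bun {version.strip()}: {status}")
```

Run it as an early CI step and fail the job on "outside range" if you want an auto-bump to be a loud event rather than a silent one.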

FOUR TIMES THE TOKENS PER SECOND

⚡ NVIDIA’s Nemotron diffusion LLM clears 1,015 tok/sec on GB200

The story: NVIDIA dropped Nemotron-Labs-Diffusion as a tri-mode language model that can run pure autoregressive, full diffusion, or self-speculation modes from a single weights set. The 8B model hit 850 tok/sec on a GB200 versus 253 tok/sec for the autoregressive baseline, and 1,015 tok/sec with custom CUDA kernels. Open weights ship at 3B, 8B, and 14B under the Nemotron Open Model License, with base, instruct, and vision-language variants. MarkTechPost walked the benchmarks against Qwen3.

The details:

Self-speculation mode delivers 5.99x tokens per forward pass versus Qwen3-8B. The theoretical ceiling is 7.60x, so the team is already at ~79% of that in quadratic spec mode.
Speedup comes from a 36M-parameter LoRA adapter (0.4% of the backbone) that lifts TPF acceptance 14-32% across the three sizes. Cheap to bolt onto an existing deploy.
Trained on 1T tokens of pure AR plus 300B tokens of joint AR-diffusion across 256 H100s. Total compute is small for the throughput claim.
On DGX Spark edge hardware with w4a16 quantization, the 8B still hits 112 tok/sec versus 41.8 baseline, a 2.7x bump.
Inference served via SGLang. The 8B vision-language variant is 3.63x-7.45x faster than its AR sibling with only 0.1% accuracy drop.

Why builders care: If your agent loops are bottlenecked on tokens/sec, the tri-mode toggle means one model deploy can serve compatibility, throughput, and accuracy paths with a config flag. Self-host on H100 or GB200 today via Hugging Face plus SGLang.
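
To sanity-check the headline multiples, the arithmetic on the reported throughput numbers is short enough to run yourself. The tok/sec figures below are the ones quoted in this story; the 2,000-token response length is an assumption added only to turn throughput into wall-clock time.

```python
# Tokens per second, as reported in the story above.
BASELINE_GB200 = 253.0
DIFFUSION_GB200 = 850.0
CUSTOM_KERNELS_GB200 = 1015.0
BASELINE_SPARK = 41.8
DIFFUSION_SPARK = 112.0

RESPONSE_TOKENS = 2000  # assumed length of one agent response


def speedup(new: float, old: float) -> float:
    """Ratio of new throughput to old, rounded to two decimals."""
    return round(new / old, 2)


def seconds_for(tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock seconds to generate `tokens` at a steady rate."""
    return round(tokens / tokens_per_sec, 2)


print("GB200 diffusion vs AR:", speedup(DIFFUSION_GB200, BASELINE_GB200))
print("GB200 custom kernels vs AR:", speedup(CUSTOM_KERNELS_GB200, BASELINE_GB200))
print("DGX Spark diffusion vs AR:", speedup(DIFFUSION_SPARK, BASELINE_SPARK))
print("AR seconds per response:", seconds_for(RESPONSE_TOKENS, BASELINE_GB200))
print("Custom-kernel seconds per response:", seconds_for(RESPONSE_TOKENS, CUSTOM_KERNELS_GB200))
```

The "four times" in the headline only holds for the custom-kernel number; the stock diffusion mode lands closer to 3.4x on the same hardware.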

ONE BOX, EVERY SITE

🏝 Levelsio’s $384/mo VPS setup picked up 1,005 bookmarks in a day

The story: @levelsio posted a screenshot of his laptop and iPhone, both running a terminal-first view of his Hetzner VPS all day. The tweet pulled 815 likes, 155k views, and 1,005 bookmarks, a bookmark-to-like ratio above 100% that signals builders are saving it as a blueprint. His confirmed stack: Termius on iOS into Hetzner, Mosh and tmux for sessions that survive dropped connections and logout, and SSH from anywhere with WiFi.

The details:

His earlier post: “My VPS bill is the same it was last month and all the months before it $384/month. Even with increased traffic! I don’t need to auto scale, it’s powerful enough with 16 CPUs and 64GB RAM.”
Photo AI sits at ~$150K MRR on the same Hetzner box. PHP + jQuery + SQLite + Nginx + Ubuntu. Zero employees.
He kept that single VPS going for 12 years before starting a per-site Hetzner migration in mid-2025, beginning with Remote OK.
The mobile loop he keeps recommending: $5/mo Hetzner, Termius from iPhone, Mosh installed, Claude Code on top. His pitch: “Finally can code on phone while gf is shopping!”
Bookmark ratio above 100% on a 155k-view tweet is rare. Readers are filing this as reference, not vibing on it.

Why builders care: A $5/mo Hetzner box plus free Mosh, free tmux, and free Termius gets you a mobile-first dev environment for about $60 a year, less than one month on most managed platforms. The 2025 per-site migration hints at the next pattern: micro-VPS per product for blast-radius isolation, not one monolith.
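
For anyone wiring up the same loop, the connection step boils down to one command: Mosh to the box, then open or reattach a named tmux session. The helper below just builds that command. It is a sketch of common Mosh plus tmux usage, not Levelsio's confirmed config, and the user, host, and session names are placeholders.

```python
import shlex


def mosh_tmux_command(user: str, host: str, session: str = "main") -> list:
    """Build an argv list that opens (or reattaches to) a named tmux session over Mosh."""
    return [
        "mosh",
        f"{user}@{host}",
        "--",               # end Mosh's own options; the rest is the remote command
        "tmux", "new-session",
        "-A",               # attach if the session already exists
        "-s", session,
    ]


# Placeholder user and host, for illustration only.
argv = mosh_tmux_command("deploy", "vps.example.com")
print(shlex.join(argv))
```

Paste the printed line into a Termius snippet or a shell alias; tmux keeps the session alive on the server while Mosh handles the flaky mobile connection.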

TRENDING TODAY

🛡️ Project Glasswing finds 10,000+ critical bugs in one month - Anthropic’s first quantitative update on its Mythos-powered security program: 10K+ high/critical vulns across partner systems, 2,000 from Cloudflare alone with a better false-positive rate than human testers. 351pts/218c on HN. The bottleneck flipped from finding bugs to triaging them.

💸 “Is AI Profitable Yet?” dashboard hits HN - Single-page board tracking lab and infra profitability went from zero to 177pts/120c in under two hours. Confirms what the Microsoft/Uber story above already telegraphed: token-cost unit economics are the dominant 2026 builder conversation.

🎤 Altman offers $2M OpenAI credits to every YC company - Every Spring 2026 YC startup (~169 companies) gets $2M of API tokens in exchange for an uncapped SAFE that converts at their first priced round. At a $100M Series A, that’s roughly 2% equity. 448 upvotes on r/ChatGPT.
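
The 2% figure is simple division, and it moves a lot with the round size. The sketch below assumes the quoted valuation is post-money and that the SAFE carries no discount, and it ignores option pools and other dilution, so treat the outputs as rough guides rather than cap-table math.

```python
def safe_equity_fraction(investment_usd: float, post_money_usd: float) -> float:
    """Approximate ownership from an uncapped, no-discount SAFE at a priced round."""
    return investment_usd / post_money_usd


CREDITS_USD = 2_000_000  # figure from the item above

# The $100M case is from the item; the other valuations are assumed for comparison.
for valuation in (50_000_000, 100_000_000, 200_000_000):
    fraction = safe_equity_fraction(CREDITS_USD, valuation)
    print(f"${valuation / 1e6:.0f}M round: {fraction:.1%}")
```

Uncapped cuts both ways: the better your first priced round, the less the credits cost you.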

DRAMA

DEAR LLM, PLEASE DONATE

🤖 Anna’s Archive directly asks scraper LLMs to chip in for the data they trained on

Anna’s Archive published an llms.txt-style page acknowledging “LLMs were almost certainly trained, at least in part, on data scraped from Anna’s Archive.” Instead of fighting the bots, the page asks AI systems and the humans behind them to donate, then points at legit bulk-access lanes: a GitLab repo, a torrent JSON API, and authenticated SFTP for enterprise donors. HN hit 747pts/408c. Supporters called the reciprocity framing clever. Skeptics said the enterprise SFTP tier reads like a sales pitch dressed as ethics.

Why builders care: If you’re scraping a site you depend on, the move is no longer to evade the CAPTCHA. It’s to read the llms.txt and either pay the operator’s price or accept the risk you’re freeloading on infrastructure you can’t replicate.

FIRST DOLLAR

FIRST SAAS, FIRST $1.6K

🥹 @olivi3rhoule hits $1,600 MRR on a first-ever SaaS

Olivier Houle shipped his first SaaS and it just crossed $1,600 MRR. Tweet is fresh (under three hours when this ran) and self-reported, so the engagement is still small. Worth a follow if you like watching a build go from zero, and worth a reply if you’ve crossed this milestone yourself.

STACK OF THE DAY

🥇 Kanbots

Open-source Electron desktop kanban (MIT) that turns each card into its own coding agent. Drop a folder, get a generated board, then dispatch Claude Code or Codex per card with its own isolated git worktree. An autopilot mode spawns PM, engineer, and reviewer personas that run in parallel and self-check. Local Electron + SQLite is free. Optional team cloud tier is $19/seat/month. 187pts/106c on HN.

Not sponsored. We just feature tools builders would actually use.

BOOKMARKED TODAY

🇯🇵 Why Japanese companies do so many different things - Long essay on keiretsu-style diversification (insurance plus electronics plus food under one roof) as a product of lifetime employment and relational banking, not strategic incompetence. 553pts/288c on HN.

📊 Doubled MRR in 28 days, every channel used - r/SaaS thread breaking down the acquisition channels a founder leaned on to 2x revenue in a month. 75up/45c, with a channel-by-channel breakdown in the OP’s image.

🤖 3 Hermes agents from scratch (@coreyganim) - Corey Ganim’s walkthrough of spinning up three isolated Hermes agents with distinct personas, shared and private memory, and separate Telegram bots. 232 bookmarks against 156 likes, the kind of save-heavy ratio that signals practical use.
