Your inference bill keeps refusing to fall, and Epoch AI just published the line item that explains why: high-bandwidth memory now makes up 63% of what an AI chip costs to build. The expensive part of a GPU stopped being the logic that does the math. It’s the memory bolted next to it.
Three companies make all of that memory, their 2026 capacity is already spoken for, and they pushed contract prices up about 20% anyway. So the floor under cloud GPU pricing is set by a supply chain you can’t route around, no matter how fast the chips themselves get cheaper.
In today’s indie hacker news:
- 💰 Memory is now 63% of an AI chip’s cost, and three firms make all of it
- 🤖 A solo dev’s DeepSeek agent claims a $1.38 day, HN isn’t buying the name
- 🧱 A benchmark says naming Postgres drops your agent 19 points
- 🐧 AMD walls Linux behind $1,200 on its free chip tool
- 🎙️ A free browser audio editor, and what Telegram Stars actually pay
TOP STORIES
THE PART THAT ISN’T THE CHIP
💰 Memory grew to nearly two-thirds of AI chip component costs

The story: Epoch AI’s component tracker weighs the bill of materials across the chips NVIDIA, AMD, Google, and Amazon actually shipped, by production volume. HBM’s share climbed from 52% in early 2024 to 63% by the end of 2025, while everything else got relatively cheaper: advanced packaging slid from 19% to 15%, the logic die TSMC etches held flat near 13%. The trend has one direction. Researcher Venkat Somala expects memory’s share to climb again this year “as memory supply remains tight and prices rise.”
The details:
- Total component spend across the four designers more than doubled in a year, from roughly $22B to $52B. HBM alone was about $20B of that $30B jump, per Epoch’s tracker.
- SK Hynix, Samsung, and Micron are the only firms that make HBM. SK Hynix holds about 62% of that market and told investors back in October that its memory lines were “essentially sold out” for 2026.
- The 2026 contract increase isn’t a blip either. It rides on top of an even steeper HBM3E surge in late 2025, with demand still outrunning what the three makers can pour into fabs.
- On an NVIDIA B200, the 192GB of memory runs about $3,000 of a ~$6,400 build cost. Roughly half the chip, in one component.
- The squeeze leaks downstream. One HN commenter paid $279 for 96GB of DDR5 in late 2023; the same kit runs over $1,000 now as AI demand eats consumer supply.
Why builders care: If you’re modeling inference costs for the year, this is the line that won’t budge. New memory fabs take 18 to 36 months to stand up, so supply can’t chase demand and renters feel it as GPU rates that refuse to fall. The contrarian read on HN is that there’s no physics stopping a 3x hardware cost drop, just that bottleneck, with Chinese maker CXMT cited as the wildcard if it ever cracks leading-edge stacking.
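If you want to sanity-check those figures yourself, here is a rough sketch of the arithmetic. It assumes the 52% and 63% HBM shares apply to the $22B and $52B totals respectively, which the tracker summary above does not state outright, so treat the output as a consistency check on rounded numbers rather than Epoch’s own calculation.

```python
# Rounded figures quoted above, in $ billions.
# ASSUMPTION: the 52% / 63% HBM shares line up with the $22B / $52B totals.
spend_start, spend_end = 22.0, 52.0
hbm_share_start, hbm_share_end = 0.52, 0.63

hbm_start = spend_start * hbm_share_start  # HBM dollars at the start
hbm_end = spend_end * hbm_share_end        # HBM dollars at the end

total_jump = spend_end - spend_start
hbm_jump = hbm_end - hbm_start

print(f"Total spend grew {spend_end / spend_start:.2f}x (+${total_jump:.0f}B)")
print(f"HBM grew from ${hbm_start:.1f}B to ${hbm_end:.1f}B (+${hbm_jump:.1f}B)")
print(f"HBM share of the increase: {hbm_jump / total_jump:.0%}")

# B200 figures quoted above: memory dollars vs. total build cost.
b200_memory, b200_build = 3_000, 6_400
print(f"B200 memory share of build cost: {b200_memory / b200_build:.0%}")
```

Under that assumption HBM accounts for a bit over $21B of the $30B increase, which squares with the “about $20B” in the tracker, and the B200’s memory lands just under half of its build cost.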
THE $1.38 DAY THAT HN PICKED APART
🤖 A solo dev’s DeepSeek-only agent claims 99.82% cache hits, and the name set HN off

The story: Reasonix is a terminal coding agent on GitHub from a developer who goes by “esengine.” It’s MIT-licensed, it runs only on DeepSeek’s API, and despite the “DeepSeek native” framing it has nothing to do with DeepSeek the company. That framing is exactly what the HN thread flagged first: the top comment, on a 470-plus-point post, reads “this is not an agent by DeepSeek, so the title is misleading.” The pitch underneath is real engineering, though. Most agent loops reshuffle the prompt every turn and blow their provider’s cache; Reasonix freezes a byte-identical prefix so DeepSeek’s automatic caching keeps hitting.
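To make the prefix idea concrete, here is a toy sketch. It is not Reasonix’s code: the prompt builders, the system text, and the history lines are all hypothetical, and it compares raw bytes, whereas a real provider cache works on tokens with its own granularity. It only shows why per-turn content at the top of a prompt ruins reuse while append-only history preserves it.

```python
def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that are identical in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical system prompt, for illustration only.
SYSTEM = "You are a coding agent. Tools: read_file, write_file, run_tests.\n"

def reshuffled_prompt(turn: int, history: list[str]) -> bytes:
    # Per-turn content at the very top changes the prompt from the first bytes.
    return (f"[turn {turn}]\n" + SYSTEM + "".join(history)).encode()

def append_only_prompt(turn: int, history: list[str]) -> bytes:
    # Fixed system text first; history is only ever appended at the end.
    # `turn` is deliberately unused so nothing per-turn leaks into the prefix.
    return (SYSTEM + "".join(history)).encode()

history = ["user: fix the failing test\n", "assistant: reading test_api.py\n"]
for build in (reshuffled_prompt, append_only_prompt):
    before = build(1, history[:1])  # prompt sent on turn 1
    after = build(2, history)       # prompt sent on turn 2
    reusable = shared_prefix_len(before, after) / len(after)
    print(f"{build.__name__}: {reusable:.0%} of turn 2 matches turn 1 from byte 0")
```

In the append-only version, the whole turn-1 prompt is a prefix of the turn-2 prompt, so everything already sent is eligible for reuse. In the reshuffled version the match ends at the turn counter.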
The details:
- The headline figures come from one user’s dashboard on a single day: 435M input tokens, a 99.82% cache rate, an actual bill of $1.38 against $61 at the uncached rate. The author labels it a case study, not a benchmark, and so should you.
- The discount isn’t Reasonix magic. DeepSeek’s V4 Flash charges $0.0028 per million cached tokens versus $0.14 uncached, a 50x gap any agent that holds its prefix steady can claim.
- HN commenters say OpenCode and Pi already hit 97 to 98.6% cache rates on the same models, which raises the open question: what does locking to one vendor actually buy you?
- The repo is 34 days old and already past 6,800 stars and 387 forks. It also carries 174 open issues, the usual tax on a viral launch.
- README and Discord ship bilingual in English and Simplified Chinese, so the likely core audience is Chinese-speaking devs.
Why builders care: The bet here is the opposite of model-agnostic. Give up portability, tune every layer to one provider’s caching quirk, and a $60 day of agent work collapses to pocket change. Whether that beats a tool you already run is unproven, but 6,800 stars in a month says the appetite for sub-cent coding loops is bottomless.
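The pricing math is easy to rerun for your own workload. The sketch below uses only the rates and token count quoted above and prices the input side of the bill. The function name is ours, and it assumes output tokens are billed separately, which is presumably why the dashboard’s $1.38 sits a little above what this produces.

```python
def input_cost(tokens_m: float, hit_rate: float,
               cached_per_m: float, uncached_per_m: float) -> float:
    """Blended input-token cost in dollars; tokens_m is in millions of tokens."""
    cached = tokens_m * hit_rate * cached_per_m
    missed = tokens_m * (1 - hit_rate) * uncached_per_m
    return cached + missed

TOKENS_M = 435     # input tokens for the day, in millions (from the case study)
CACHED = 0.0028    # $ per million cached input tokens
UNCACHED = 0.14    # $ per million uncached input tokens

# Compare no caching, the rates HN reported for other agents, and the case study.
for rate in (0.0, 0.97, 0.986, 0.9982):
    cost = input_cost(TOKENS_M, rate, CACHED, UNCACHED)
    print(f"{rate:.2%} cache hits -> ${cost:.2f} in input tokens")
```

At these rates the input bill is roughly $3 at a 97% hit rate, about $2 at 98.6%, and about $1.33 at 99.82%, against $60.90 with no caching. Most of the saving arrives with any agent that keeps its prefix stable. The last couple of points of hit rate are worth a dollar or two a day at this volume.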
THE DATABASE CLIFF
🧱 A new benchmark measures how hard AI agents fall once you add a real database

The story: Three researchers at EURECOM and the University of Basilicata built a backend-generation benchmark and named the effect they found “constraint decay.” They handed seven models 100 tasks built on the RealWorld Conduit API spec, then dialed up the requirements in layers: framework only, then add an architecture, then a database, then an ORM. Accuracy fell off a shelf as the constraints stacked, averaging a 30-point drop from the loosest setup to the strictest. Agents can still code. They just can’t reliably close the gap between “passes some tests” and “ships a correct system” once real-world structure shows up.
The details:
- Naming PostgreSQL as the backend was the single most damaging instruction, a 19-point average hit. SQLite cost 14, a clean-architecture rule cost 9, and ORM syntax barely registered at under 2.
- The best fully-constrained config passed 78.6% of individual assertions but only 8.3% of tasks end to end. Reading per-test pass rates as “it works” is how you get burned.
- Lightweight, explicit frameworks won big: Express, Koa, and Flask beat convention-heavy Django and FastAPI by 25 to 32 points. The frameworks that hide behavior cost agents the most.
- Data-layer mistakes, bad queries plus ORM runtime errors, drove about 47% of all logic failures. It’s the part to review by hand.
- HN practitioners recognized it instantly. One described Claude Opus “calcifying” on whatever architecture it picked first, which is why bolting constraints on mid-session goes badly.
Why builders care: This is a cheat sheet for where your agent breaks. Put the database and architecture in the spec up front instead of retrofitting them, because the model anchors hard on its first decision. Reach for explicit frameworks when you let an agent scaffold from scratch. And the most-quoted tip from the thread: hand it a code example in the style you want, since models pattern-match far better than they parse a written spec.
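The gap between 78.6% of assertions and 8.3% of tasks is easier to see with a toy scorer. The results below are made up for illustration and are not the benchmark’s data. The point is only that a task counts as passed when every one of its assertions passes, so a handful of scattered failures can sink most tasks while the per-assertion number still looks healthy.

```python
def pass_rates(tasks: list[list[bool]]) -> tuple[float, float]:
    """Return (assertion-level pass rate, task-level pass rate)."""
    all_assertions = [ok for task in tasks for ok in task]
    assertion_rate = sum(all_assertions) / len(all_assertions)
    # A task passes only if every assertion in it passes.
    task_rate = sum(all(task) for task in tasks) / len(tasks)
    return assertion_rate, task_rate

# HYPOTHETICAL results: 4 tasks with 5 assertions each, one failure in three of them.
results = [
    [True, True, True, True, True],
    [True, True, True, True, False],
    [True, True, True, False, True],
    [True, True, False, True, True],
]

assertion_rate, task_rate = pass_rates(results)
print(f"assertions passed: {assertion_rate:.0%}, tasks passed: {task_rate:.0%}")
```

Here 17 of 20 assertions pass but only 1 of 4 tasks does. If your own eval harness reports the first number, add the second before you trust it.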
YOUR OS IS NOW A PAYWALL
🐧 AMD drops Linux from Vivado’s free tier, and the cheapest way back is $1,200/year

The story: Starting with Vivado 2026.1, AMD’s free BASIC tier for its FPGA design suite runs on Windows only. Linux is gone from the free row, confirmed on AMD’s own licensing page. The question that hit #1 on HN at 307 points was blunt: why kill Linux on the free tier while keeping Windows? An AMD forum moderator answered that it’s “a marketing decision,” citing internal surveys where “close to 70% of the customers are still using Windows.” That’s the whole rationale: most people are on Windows, so the minority can pay.
The details:
- The cheapest paid tier that restores Linux is CORE at $1,200/year node-locked, or $1,800 for a floating seat.
- BASIC didn’t just lose Linux. It also drops to limited simulation and loses ChipScope debug, so it’s a weaker free tool than the old ML Standard edition on two fronts.
- The last free Linux version, 2025.2, keeps working and stays supported until 2026.3 ships. It just won’t get new device support after that.
- The new ladder runs five tiers, from the $0 BASIC up through a $10,000 GOLD perpetual license.
- Intel’s Quartus Prime Lite and Lattice’s tools still ship free on Linux, so AMD just handed competitors a recruiting pitch aimed at students and hobbyists.
Why builders care: This isn’t an FPGA story, it’s a free-tier-erosion story, and the move is the tell: your OS choice is now a billing lever. Anyone whose CI pipeline, course, or weekend project leans on a vendor’s free tier should clock the pattern, the same one Unity and HashiCorp ran. A commenter teaching FPGA design said he’s switching vendors, which is the real cost here. The damage isn’t this quarter’s revenue, it’s the next generation of engineers learning on someone else’s tools.
TRENDING TODAY
🖥️ Is NVIDIA still the default for local LLMs in 2026? - A 252-upvote r/LocalLLaMA thread answered with “best for the price, usually not.” The crowd’s value picks: Apple Silicon’s unified memory runs a 32B model on a ~$1,600 Mac Mini at 22W with no separate VRAM ceiling, and AMD’s R9700 at ~$1,100 keeps getting called the bang-for-buck card. NVIDIA still wins raw throughput, but commenters peg RTX 5090 street prices at $3,700-plus, which ties straight back to today’s memory squeeze. The default is cracking.
🥊 Qwen3.6 vs Gemma4, the weekend small-MoE bake-off - r/LocalLLaMA spent the weekend pitting two local models against each other, and there’s no clean winner per the thread. Commenters say Qwen holds up for coding and long tool-calling sessions while Gemma is faster and better at prose but starts hallucinating tool schemas past ~60K context. Treat the model names and numbers as community reports, not vendor benchmarks. The genuinely new thing in the threads is hipEngine, an open ROCm inference engine that fits a full 256K context into 24GB on a 7900 XTX.
FIRST DOLLAR
THE $9 SCREENSHOT NOBODY POSTS
🪙 “$9.1 MRR. single digits. but i cried when i saw it.”
That’s the actual title u/akhtar_btw gave their r/SaaS post, and it pulled 112 upvotes of pure solidarity. Three months from zero to $9.10 a month and $32.90 in lifetime revenue, posted with the dashboard to prove it. Nobody screenshots this number, which is exactly why it lands harder than the milestone brags. One stranger decided the work was worth paying for. Right behind it, another builder closed their first-ever customer in person, pitching an AI styling app while wearing an outfit it generated. Validation from a real human beat any analytics dashboard, twice in one day.
DRAMA
PICK A SIDE, APPARENTLY
⚔️ “I don’t understand why people say Claude is better than ChatGPT”
A non-coder’s r/ChatGPT post drew 277 comments on 219 upvotes, the ratio that means people came to argue. OP grants Claude the coding crown but says ChatGPT wins daily use on image generation and voice. The pushback wasn’t about features, it was about feel: “Claude sounds like a normal person, ChatGPT sounds like someone who’s patronizing me.” Others split it cleanly: one’s a tool for quick tasks, the other’s a thing you think with. The loyalty is dividing by job, not by leaderboard, and “which model is best” keeps losing to “best for what.”
STACK OF THE DAY
🎙️ AudioMass
A free, open-source audio editor that runs entirely in your browser, now with multitrack. Layer channels, drag clips, crossfade overlaps, record onto an armed track, and bounce to a single file, all in vanilla JavaScript on the Web Audio API with nothing uploaded anywhere. The author calls it “Photopea for audio,” and that’s the right frame: it’s built for quick podcast and video cuts, not to replace a full DAW. The whole thing is about 98KB of JS, no install, no account. Show HN put it at 226 points.
Not sponsored. We just feature tools builders would actually use.
BOOKMARKED TODAY
⭐ What Telegram Stars actually pay - A dev who built a paid Telegram bot published the real net math: after Apple or Google take 30% and Telegram takes its cut, you net about $0.013 per Star on Stars that retail near $0.02, or roughly 65 cents on the dollar. Minimum withdrawal is ~$13 in TON, with the first payout held 21 days. The transparency post you bookmark before pricing a bot.
🌿 Defeating Git rigour fatigue with Jujutsu - ikesau makes the case for jj over raw Git: commit messily in flow, then reorganize the whole history at the end in one pass with no merge-conflict risk. Easier than an interactive rebase, and it kills the mental tax of keeping commits clean while you’re still figuring out the code. A solid nudge to try jj this weekend.
🎧 Greg Brockman on The Knowledge Project - Shane Parrish interviews OpenAI’s president on the Napa offsite that set the company’s decade plan, the 72-hour Altman-firing crisis, and the “Phoenix” backup plan. The builder-relevant line is about OpenAI’s own code: Brockman says it’s now “hard to know what percent is NOT” AI-generated.