(Sigue Sigue) Sputnik!
DeepSeek and the Fiery End of the AI Era (or, the Totally Normal Evolution of Technology)
Postcards from the Future started as part of the process of creating a new class on the future of business models for the University of Michigan Center for Entrepreneurship, and is now where I spend time thinking about how the new generation of AI will change how we live and do business.

A couple of weeks ago the AI world was absolutely rocked when DeepSeek, a tiny startup based in China, published a white paper detailing a new open-source Large Language Model (LLM) that delivers performance comparable to the latest models from OpenAI, Meta, and others for a small fraction of the cost.
Financial markets went crazy as Nvidia (NVDA) posted the biggest one-day loss in market history. The end of the world as we knew it was nigh, and entire industries were holding their breath, watching their version of the asteroid overhead heading for their Yucatan.
Take a Breath
When I started reading the hyperbolic headlines around DeepSeek, I thought that this would be really fun to write about. AI, financial markets, pop culture: what could be better? Best to wait a couple of weeks, though, until the initial fires had a chance to burn out.
So much to unpack, so many angles.
But first …
Sigue Sigue
Seeing “Sputnik” flashing across the chyrons, of course, reminded me of new wave/synth pop trend jumper Sigue Sigue Sputnik, formed by the former bassist from Generation X (the punk band that predated the naming of the actual generation). And an older, wiser, and slightly more erudite me realized that those two weird words at the beginning of the band name that nobody really knew how to pronounce in 1986 were actual real words from a foreign language (in Spanish, “sigue sigue” translates to “go go”).
Never being one to shy away from a stretched-to-its-absolute-limits metaphor, I immediately realized that, if the talking heads (the pundits, not the band) want to go there, then, what the heck? What could be a more apropos rallying cry for the current AI insanity than Go, Go, Sputnik!
Warning: before you Google it, people of a certain age (and quite possibly only those in my friend group from the ’80s) may remember Sigue Sigue Sputnik’s synth-laden banger that dominated the dance floors for at least a week or two in 1986 (maybe?). This tune was a member of that unique class of songs where you only need to hear the first four notes to know the entire song. Really. Four. That’s it. Turn it off before the fifth, please, or it will never leave your head. I still carry its burden some 40 years later. You’re welcome.
Cutting to the Chase
Q: Was DeepSeek a fundamental shock to the progress of AI? Did everything change on January 22?
A: By all rational accounts, it was a completely normal (and expected) example of how technology evolves.
Q: Did the DeepSeek announcement destroy the financial markets?
A: No. NASDAQ and the S&P 500 are both up (~4%) year-to-date, and even the main focus of the market attention, Nvidia, is flat (actually, down a smidge, 0.03%) YTD, as I write this.
So, what happened and why? Let’s scroll back to late January …
Cue Chicken Little:
No more giant competitive moats made of cash to protect the megatech bros, absolutely everything has changed!
All the promises Sam and Zuck and Elmo have made that they’re the only ones rich enough to save us, now are … false?
<pause>
Hyperbolic Hyperbole
It’s a Sputnik Moment!
Sputnik moment:
Idiom: The moment when a country or a society realizes that it needs to catch up with apparent technological and scientific developments made by some other country .... [via Wiktionary]
So, DeepSeek is a giant wake-up call? Well, yes, I think it likely is, but maybe not in the way the prognosticators intended as their headlines flew around the web. And, unfortunately, it's a wake-up call that the world doesn’t seem ready to heed.
Attention! We Live in an Attention Economy
First of all, a major contributor to the firestorm is the fact that we are living in an attention economy. The incentives in our current information economy are structured towards attention and attention only, definitely not veracity, and certainly not value. Veracity is dead, hyperbole is the lingua franca.
Reason 1: Veracity is dead, hyperbole is the lingua franca.
So, anytime something new breaks, seemingly everyone with access to a microphone (or social media account) is jumping to Jim Cramer every tidbit of information:
Jim Cramer (see also Stephen A. Smith):
verb: to communicate ideas in such a way that elevated volume, not actual facts, drives the narrative
The first thing that I thought while digging into DeepSeek (and reading the white paper) was: “Wait, the global financial markets are reacting to a White Paper? A marketing document?” and, “didn’t China just self-report perfect financial performance for 2024?” Organizations exaggerate public results all the time, for a variety of reasons. I’m not claiming to debunk DeepSeek’s reported results, I’m just saying any rush to judgement is silly. Didn’t anyone read TechCrunch in the go-go days of the internet? Everyone knows the early marketing hype rarely matches the customer reality.
Everyone who has ever read TechCrunch knows that the early marketing hype rarely matches the customer reality.
Also, who benefits from this narrative? The funds that didn’t participate in OpenAI’s last round? Big funds trying to squash upstart new AI-only funds?
The Myth of the Tiny Startup
We all love a good David vs. Goliath story, but another part of the DeepSeek narrative that set off my spidey senses was the consistent emphasis on the fact that this earth-shattering work was done by a “tiny startup.” DeepSeek, as a company, was founded in December 2023, but it was spun out of an ongoing internal effort at China’s 4th largest hedge fund, $14B High-Flyer, and the white paper lists 192 individuals as contributors. Mmmkay. A technical team of ~200 people with billions behind it does not a “tiny startup” make.
Reason 2: Everyone loves a David vs. Goliath story
But is There any There, There?
One of the reasons I wanted to hold off on diving into this story was to give some calmer minds time to join the discussion. Sure, Sam quickly agreed that DeepSeek “did a couple of nice things” and the advancements were “not surprising at all,” but he just closed a $6.6B investment round that I’m sure was based on the huge financial moat he has built to deter competition, so I’m guessing he had some explaining to do when the news hit.
As far as I can discern, there continue to be significant questions about the actual cost of creating the R1 model, and there has yet to be any peer-reviewed performance data. One of the calmer voices throughout these turbulent times has been Dario Amodei, the CEO of Anthropic. Obviously, he has a significant vested interest in this market, but his analyses seem to be refreshingly balanced. In a piece Amodei wrote shortly after the big spike of news on DeepSeek, two things jumped out at me:
"DeepSeek produced a model close to the performance of US models 7-10 months older, for a good deal less cost (but not anywhere near the ratios people have suggested)".
and,
“DeepSeek-V3 is not a unique breakthrough or something that fundamentally changes the economics of LLMs; it’s an expected point on an ongoing cost reduction curve.”
“... an expected point on an ongoing cost reduction curve” aligns much better with my (decades of) experience in tech than Andreessen’s “SPUTNIK MOMENT” quip. He should really know better. Of course there is going to be a flood of innovations, breakthroughs, surprises, and outright lies. With a projected compound annual growth rate of almost 40% in the AI market over the next few years, the financial incentives are simply too great to think otherwise.
A Sputnik Moment? Andreessen should know better
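To make “an expected point on an ongoing cost reduction curve” a little more concrete, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the cost of reaching a given level of model performance falls by a steady factor every year. The 4x-per-year figure and the function name are my assumptions, not numbers from the DeepSeek white paper, so swap in your own.

```python
def expected_cost_ratio(annual_decline: float, months: float) -> float:
    """How many times cheaper a result should be after `months`,
    if cost falls by a steady `annual_decline` factor per year
    (e.g. 4.0 means "4x cheaper every 12 months")."""
    if annual_decline <= 0 or months < 0:
        raise ValueError("annual_decline must be positive and months non-negative")
    return annual_decline ** (months / 12.0)


# ASSUMPTION: an illustrative 4x-per-year decline, not a measured figure.
ASSUMED_ANNUAL_DECLINE = 4.0

# The quote above puts DeepSeek roughly 7-10 months behind comparable US models.
for months in (7, 10):
    factor = expected_cost_ratio(ASSUMED_ANNUAL_DECLINE, months)
    print(f"{months} months later: roughly {factor:.1f}x cheaper is simply 'on the curve'")
```

Under that assumption, a model that matches a 7 to 10 month old competitor at a few times lower cost is what the curve predicts, not an asteroid.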
With some time under our belts, consensus appears to center around a couple of things:
the $6 million training cost claimed by DeepSeek is very likely “far from the truth,” and,
the technical innovations demonstrated by DeepSeek can most fairly be characterized as incremental. There’s nothing that fundamentally changes our understanding of how these models can be created or run. Solid work, for sure, but no major trajectory changes for the overall technology.
Specifically the two main technical innovations that seem to drive DeepSeek’s success are:
Multi-head Latent Attention (MLA) – MLA is a logical extension of the original Transformer paradigm that addresses the memory challenges of running LLMs by compressing the key and value matrices into a much smaller latent representation (with positional information handled separately), dramatically reducing the memory footprint of an executing model (blah, blah, jargon, jargon. Net: learn linear algebra well, and you will go far in life).
Reason 3: Linear algebra FTW
and,
Distillation – distillation is a process by which the outputs of an existing LLM are used to train a new model, significantly reducing the overall cost of creating it. There continues to be discussion around if and how DeepSeek used distillation, and whether an existing publicly available model was used in the process (including use that may be considered a violation of IP rights).
Reason 4 (maybe): Stealing, er, copying someone else’s starting point reduces training cost
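Back to the linear algebra for a second. The memory argument behind MLA can be sketched with simple arithmetic: a standard Transformer caches a full key and a full value vector for every attention head in every layer, for every token, while a latent-style cache stores one smaller compressed vector per layer instead. Every number below (layer count, head count, dimensions) is a made-up illustration, not DeepSeek's actual configuration, and the sketch ignores the extra positional information a real design would also store.

```python
def full_kv_cache_bytes_per_token(n_layers: int, n_heads: int, head_dim: int,
                                  bytes_per_value: int = 2) -> int:
    """Standard attention: one key AND one value vector per head, per layer."""
    return 2 * n_layers * n_heads * head_dim * bytes_per_value


def latent_cache_bytes_per_token(n_layers: int, latent_dim: int,
                                 bytes_per_value: int = 2) -> int:
    """Latent-style cache: one compressed vector per layer, from which
    keys and values are reconstructed when needed."""
    return n_layers * latent_dim * bytes_per_value


# HYPOTHETICAL model shape, chosen only to make the arithmetic easy to follow.
N_LAYERS, N_HEADS, HEAD_DIM, LATENT_DIM = 32, 32, 128, 512

full = full_kv_cache_bytes_per_token(N_LAYERS, N_HEADS, HEAD_DIM)
latent = latent_cache_bytes_per_token(N_LAYERS, LATENT_DIM)

print(f"full KV cache:   {full / 1024:.0f} KiB per token")
print(f"latent cache:    {latent / 1024:.0f} KiB per token")
print(f"reduction ratio: {full / latent:.0f}x")
```

Multiply the per-token figure by a context of tens of thousands of tokens and the appeal is obvious: the saving comes from caching less, not from any change to what the model fundamentally is.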
Show Me the Money Perspective
So, “high net,” (as the cool kids say):
The financial world went temporarily nuts over a freakin’ marketing memo, but quickly reverted to “normal” (whatever “normal” means these days)
Innovation in AI continues to happen at a blistering pace
Things that sound too good to be true still are too good to be true
We still are not materially closer to AGI (because AGI isn’t actually a thing), but these systems will only continue to become more capable and performant
Oh look, a squirrel! AI ‘godfather’ predicts another revolution in the tech in next five years
First thing I’ve read all the way through. Love the perspective, and the writing :) kudos
have you ever considered writing for a living?
(I actually was able to follow most of your right-minded/justified snark)