Google officially declares war

By: rootdata|2026/05/21 22:10:32

Author of this article: Chengbei Xugong Data support: Gougu Big Data

The 2026 Google I/O Developer Conference leaves an impression that fits in a single word: arrogance.

Not only did Google embed AI agents into every core traffic entry point, including search, browsers, mobile phones, and smart glasses, but it also unveiled three major launches back to back: Gemini 3.5 Flash, the video model Omni, and the brand-new AI assistant Spark.

After showing off its strengths, it proudly announced that Gemini's monthly active users had surpassed 900 million; it also officially announced significant price cuts.

The meaning is straightforward: I am stronger than you, and I am cheaper than you.

Isn't this a declaration of war?

01

The most stunning reveal at the conference was undoubtedly the debut of Gemini 3.5 Flash.

Normally, "Pro" denotes the flagship tier, while "Flash" denotes the lightweight, fast one.

In parameter count, 3.5 Flash is indeed smaller than 3.1 Pro, but in almost all reasoning and coding benchmarks, the former performed surprisingly better:

In the GSM8K mathematical reasoning test, 3.5 Flash scored 95.8%, surpassing 3.1 Pro's 93.2%; on the full SWE-bench coding benchmark, 3.5 Flash achieved a resolution rate of 38.4%, far exceeding 3.1 Pro's 32.1%...

Why?

According to the "Gemini 3.5 Technical Report" released by DeepMind, two core techniques matter most.

Extreme Knowledge Distillation: Google did not simply stack computing power to train Flash; instead, it used the never-released "Gemini 3.5 Ultra" as a teacher model and distilled it down into the much smaller Flash.

According to an analysis tweeted by DeepMind's chief scientist Jeff Dean, the share of 3.5 Flash's fine-tuning done on high-quality chain-of-reasoning datasets rose by 400% compared to the previous generation.

This means it inherits the "logical brain" of a super-large model, rather than a rote "knowledge base."
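The article does not spell out the report's exact recipe, but the general idea of teacher-student distillation can be sketched. The example below is a minimal illustration of the classic soft-label formulation, a temperature-softened KL divergence between teacher and student outputs. It is not Google's training code; the function names, logits, and temperature are illustrative assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-softened softmax; higher temperatures flatten the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the textbook soft-label loss, assumed here for illustration only.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three candidate next tokens.
teacher = [2.0, 1.0, 0.1]
close_student = [1.8, 1.1, 0.2]
far_student = [0.1, 1.0, 2.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student whose output distribution matches the teacher's gets a loss of zero, and the penalty grows as the two diverge, which is how the smaller model is pushed to imitate the larger model's behavior rather than just memorize labels.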

Brand-new Mixture of Experts (MoE) architecture: Inside 3.5 Flash, Google adopted a finer-grained expert network.

A traditional MoE model may have only 8 or 16 experts and activate just 1-2 of them at a time, which is already enough to support trillion-parameter models.

According to an analysis in a16z's 2026 AI infrastructure investment memo, 3.5 Flash employs 256 micro-experts and activates at most 4 of the best-suited ones on each inference pass.

This lets it cover an extremely large multimodal feature space while keeping the number of active parameters extremely low.
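The sparse-activation idea can be illustrated with a toy top-k router. The sketch below assumes the figures quoted above (256 experts, 4 active) and uses random numbers as stand-ins for router scores; the `route` and `moe_layer` names and the trivial "experts" are hypothetical and say nothing about how Gemini is actually implemented.

```python
import math
import random
from typing import List, Tuple

NUM_EXPERTS = 256  # expert count quoted in the article
TOP_K = 4          # maximum experts activated per pass, per the article


def route(router_logits: List[float], k: int = TOP_K) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = router_logits[top[0]]  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x: float, router_logits: List[float]) -> float:
    """Toy layer: expert i just scales its input by (i + 1); only k experts run."""
    return sum(weight * (i + 1) * x for i, weight in route(router_logits))


rng = random.Random(0)  # fixed seed so the stand-in router scores are repeatable
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = route(logits)
active_fraction = TOP_K / NUM_EXPERTS  # 4 / 256 = 1.5625% of experts per pass

print(chosen)
print(f"{active_fraction:.4%} of experts active")
```

In a real model the router scores would come from a learned layer applied to each token, but the cost argument is the same: only the selected experts do any work, so compute per token scales with 4 experts rather than 256.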

In terms of TTFT (Time to First Token), 3.5 Flash already comes in under 65 milliseconds.

A human blink, by comparison, takes 100-150 milliseconds.

In short, when it operates as an agent, the pause is too brief for a human to physically notice.
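For readers who want to check a latency figure like this themselves, TTFT is simply the wall-clock time between sending a request and receiving the first streamed token. The sketch below shows the measurement against a hypothetical stand-in stream (`fake_stream`), since no real client API is described in the article; the 65 ms delay is simulated, not measured.

```python
import time
from typing import Iterable, Iterator, Sequence, Tuple


def fake_stream(first_token_delay_s: float, tokens: Sequence[str]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    time.sleep(first_token_delay_s)  # simulated wait before the first token
    for token in tokens:
        yield token


def measure_ttft(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first token arrived, that token)."""
    start = time.perf_counter()
    first = next(iter(stream))  # blocks until the first token is produced
    return time.perf_counter() - start, first


ttft_s, first_token = measure_ttft(fake_stream(0.065, ["Hello", " world"]))
print(f"TTFT: {ttft_s * 1000:.1f} ms, first token: {first_token!r}")
```

Swapping `fake_stream` for a real streaming response would give the end-to-end figure, which also includes network time that a vendor's headline number may or may not count.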

For developers whose agents call tools frequently, run multiple rounds of reflection, and need extremely low latency, this is an ideal foundation to build on.

Only with engineering optimization this extreme can a company come to dominate "edge deployment" in such a fiercely competitive market, and Google showed three such deployments.

The first is the natively multimodal Gemini Omni Flash.

Omni means "all," echoing the "o" in the earlier GPT-4o; the name alone signals how intense the competition is.

At least in terms of performance, Gemini Omni Flash has a far stronger claim to the "o" than GPT-4o.

Earlier systems such as Sora or Gemini 1.5 were essentially stitched-together pipelines, converting speech to text and then text to visuals.

But the Omni released this time is genuinely native, end-to-end multimodal. It not only understands the temporal coherence and physical laws in video natively but also cuts latency from the industry average of 400-600 milliseconds to 120 milliseconds.

In one demo at the conference, a user wearing a camera pours water, and just as the cup is about to overflow, Omni says "stop stop stop!" 0.5 seconds before the water spills.

This real-time inference about the physical state of the real world may seem simple, but it is significant: AI has officially evolved from a chatbot on the screen into an assistant in the real world.

Even if it is still in its early stages.

The second is the intelligent assistant Spark.

According to an interview in The Verge with the Vice President of Android Engineering, Spark has been granted control over the native APIs of the Android 17 system.

In short, complex processes that previously required opening many apps can now be completed without lifting a finger. Just instruct Spark, and it handles everything for you: sending messages, organizing emails, summarizing schedules, tracking updates on the web, spotting hidden charges on bills, batch processing documents, and more.

In other words, with the AI assistant, we will hardly need apps anymore; any complex operation is simplified into a single command.

The third is smart glasses.

Why glasses again?

At least from Google's perspective, a device with seamless access to sight and hearing is the ultimate host for large multimodal models.

These glasses have no flashy appearance, focusing entirely on practical capabilities:

Micro-OLED full-color waveguide lenses weighing only 4 grams, with a light transmittance of up to 85%;

Equipped with a self-developed lightweight Gemini edge chip, local inference latency ≤12ms, capable of real-time translation, image recognition, and scene analysis without needing to connect to the internet;

Natively linked to the Spark agent, synchronizing mobile and cloud data to provide personalized services such as schedule reminders, real-time translation, and environmental alerts.

In short, it bypasses the smartphone screen, integrating the agent into the human first-person perspective through glasses.

There is simply too much to cover; Google seems to have played all its trump cards at once to tell the market one thing:

An algorithm without an entry point is nothing.

The era of competing on parameter counts and benchmark scores is over; pure model providers no longer have a moat. The future is a four-front battle of "edge + cloud + ecosystem + hardware."

Bundling AI into its entire product suite actually reshapes the traffic distribution logic of the whole internet: from "users actively searching/clicking" to "AI agents actively distributing services."

For a vast number of developers and small to medium enterprises, this is excellent news, as the underlying computing power and models have become extremely cheap, allowing everyone to focus on innovation at the application layer.

But its competitors are probably cursing right now.

02

When they casually announced from the stage that "Gemini's monthly active users have officially surpassed 900 million," it caused quite a stir in the audience.

900 million is more than the combined MAUs of all competitors in the U.S.

How did they achieve this?

The answer is simple and brutal: force-feeding.

Unlike independent AI companies, Google does not need to spend advertising money to acquire users; it just needs to add an icon next to the address bar in the Chrome browser, put a shortcut in the bottom navigation bar of 3 billion Android phones, and push updates throughout Google Workspace...

The customer acquisition cost is essentially zero.

More critically, in the period ahead those 900 million active users will generate a massive amount of high-quality, multimodal, real-world feedback data: where their gaze lands when they look at products through smart glasses, the corrections they make while Spark handles tasks, and their interactions with the Omni visual model. All of it will feed Gemini 4.

This creates an extremely solid moat: the better the model is to use -> the more users it attracts -> the more data it generates -> the better the model becomes.

To strengthen this closed loop quickly, Google announced an outright price war against all competitors: the AI Ultra plan was slashed from $249.99/month to $99.9/month.

The input price for 3.5 Flash dropped to $0.02 per million tokens, and the output price to $0.08 per million tokens.

What kind of incredible price is this?

In comparison, the industry average prices for comparable models are around $0.15-0.2 per million input tokens and $0.6-1 per million output tokens.

Running the numbers: the top clients process about 1 trillion tokens daily. Shifting 80% of the workload to Gemini 3.5 Flash for a year could save over $1 billion.
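As a rough check on these figures, the sketch below multiplies out the quoted prices and volume. The article does not give the input/output token split, so it is a parameter here (an assumption), as is reading "1 trillion tokens daily" as the volume being shifted from. On these inputs alone the annual saving lands between roughly $38 million and $269 million, so the $1 billion figure presumably rests on a larger volume or a wider price gap than the ones quoted.

```python
MILLION = 1_000_000
TOKENS_PER_DAY = 1_000_000_000_000  # "about 1 trillion tokens daily"
SHIFTED_SHARE = 0.80                # share of workload moved to 3.5 Flash
DAYS_PER_YEAR = 365

FLASH_INPUT, FLASH_OUTPUT = 0.02, 0.08  # USD per million tokens, as quoted


def annual_savings(input_share: float,
                   industry_input: float,
                   industry_output: float) -> float:
    """Yearly USD saved by moving the shifted tokens to Flash pricing.

    input_share is the fraction of tokens that are input tokens; this split
    is an assumption, since the article does not state it.
    """
    shifted_millions = TOKENS_PER_DAY * SHIFTED_SHARE * DAYS_PER_YEAR / MILLION
    saved_per_million = (input_share * (industry_input - FLASH_INPUT)
                         + (1 - input_share) * (industry_output - FLASH_OUTPUT))
    return shifted_millions * saved_per_million


# Bounds: all-input traffic at the low industry price vs.
# all-output traffic at the high industry price.
low = annual_savings(input_share=1.0, industry_input=0.15, industry_output=0.60)
high = annual_savings(input_share=0.0, industry_input=0.20, industry_output=1.00)
print(f"${low / 1e6:,.1f}M to ${high / 1e6:,.1f}M per year")
```

Any realistic mix of input and output tokens falls between the two bounds, so the function can be rerun with a client's own split and volume to get a tighter estimate.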

Why dare to sell AI at such a low price?

Its biggest advantage is vertically integrated computing infrastructure.

Even giants like OpenAI and Anthropic, glamorous as they may seem, are essentially "computing power tenants": they have to buy computing power from Microsoft and Amazon, who in turn have to pay Nvidia's Jensen Huang.

Google has its own TPUs, and combined with the extremely efficient MoE sparse activation of 3.5 Flash, it has compressed computing costs to the extreme.

It can fully leverage its asset-heavy advantage to crush pure-algorithm companies.

The logic is clear.

Foundation models are rapidly becoming commodities. They are like water and electricity: have you ever seen a water company making exorbitant profits?

Google does not mind if the models themselves make no money, because it can earn it back through search ads, cloud services, and commissions from the Android ecosystem.

But for companies like OpenAI, Anthropic, Cohere, and Mistral that rely solely on selling large model APIs, this is impossible.

Investors probably now want to pin Sam Altman down and ask: "Google's API price is only one-tenth of yours, and its performance is better than yours. How do you expect your business model to work?"

The competitive landscape across multiple industries will thus enter an accelerated reshuffling period.

AI vendors must quickly find cheaper sources of computing power or start making chips themselves.

Next is Apple, which is still building behind closed doors.

The combination of smart glasses + Omni video large model + Spark's native system-level takeover undoubtedly threatens the iPhone.

According to Macquarie's "Consumer Electronics Trend Forecast Report": In the next three years, the proportion of screenless interactions based on vision/voice is expected to jump from the current 8% to 35%.

If users become accustomed to completing daily work and entertainment using glasses and voice, the usage time of screens will inevitably be significantly reduced.

If Apple cannot answer with sufficiently impressive wearable devices (Vision Pro is too heavy and expensive, destined to be a toy for a minority), its monopoly on the entry points of the mobile internet era will face unprecedented challenges.

This is not iteration; it is revolution.

Google has thrown down the gauntlet to all competitors, wielding three weapons: technology, traffic, and price.

At this moment, is anyone still mocking it for "big-company disease"?
