China's Great Firewall Broke VMess: Inside the ML-Powered DPI Upgrade of 2025-2026

2026-05-17 · 7 min read

China, DPI, Machine Learning, VMess, Great Firewall

In September 2025, China deployed a significant upgrade to the Deep Packet Inspection (DPI) infrastructure that underpins the Great Firewall (GFW). By May 2026, the consequences have become clear. A technical report released by RaccoonLine on May 15 states that the Great Firewall has achieved an 80% detection rate for VMess, a protocol designed specifically for censorship circumvention, by using machine learning models to identify its characteristic packet timing. This marks one of the most significant shifts in the censorship arms race in years.

The Machine Learning Turn: From Signature Matching to Behavioral Analysis

For decades, DPI systems relied on static signatures. A protocol like OpenVPN had a recognizable HARD_RESET opcode. WireGuard's handshake initiation produced a distinctive on-wire pattern. IKEv2's IKE_SA_INIT packet was trivially fingerprintable. These first-generation techniques match on packet contents, so they could be foiled by obfuscation layers that scrambled or disguised the packet payload.
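To make the idea concrete, here is a minimal Python sketch of first-packet signature matching. The byte offsets follow the publicly documented wire formats: OpenVPN's opcode sits in the top five bits of the first byte, a WireGuard handshake initiation is 148 bytes starting with type 1 and three zero bytes, and the IKEv2 header carries its version and exchange type at fixed offsets. The function name `classify_udp_payload` and the labels it returns are illustrative assumptions, not the logic of any real DPI product.

```python
OPENVPN_HARD_RESET_CLIENT_V2 = 7   # P_CONTROL_HARD_RESET_CLIENT_V2 opcode
WIREGUARD_INITIATION_LEN = 148     # fixed size of a handshake initiation
IKE_SA_INIT = 34                   # IKEv2 exchange type for IKE_SA_INIT


def classify_udp_payload(payload: bytes) -> str:
    """Label the first UDP payload of a flow by static signature (hypothetical helper)."""
    if not payload:
        return "unknown"
    # WireGuard: message type 1, three reserved zero bytes, fixed length.
    if len(payload) == WIREGUARD_INITIATION_LEN and payload[:4] == b"\x01\x00\x00\x00":
        return "wireguard"
    # IKEv2: 28-byte header, version 2.0 at offset 17, exchange type at offset 18.
    if len(payload) >= 28 and payload[17] == 0x20 and payload[18] == IKE_SA_INIT:
        return "ikev2"
    # OpenVPN over UDP: opcode is the high 5 bits of the first byte.
    if payload[0] >> 3 == OPENVPN_HARD_RESET_CLIENT_V2:
        return "openvpn"
    return "unknown"


wireguard_init = b"\x01\x00\x00\x00" + bytes(144)
scrambled = bytes(b ^ 0x5A for b in wireguard_init)  # trivial one-byte XOR "obfuscation"
print(classify_udp_payload(wireguard_init))
print(classify_udp_payload(scrambled))
```

The second call no longer matches any signature, which is why payload obfuscation alone was enough to defeat this generation of filters.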

China's September 2025 upgrade changed the game. Instead of merely inspecting packet contents, the new GFW middleboxes began analyzing packet timing, inter-packet arrival intervals, connection duration patterns, and traffic volume rhythms. Machine learning models, trained on vast datasets of legitimate and tunneled traffic, learned to distinguish VMess flows from ordinary HTTPS even when the payload itself looked innocuous.
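The kind of input such a model works from can be sketched as a feature-extraction step. The Python below turns a list of (timestamp, size) pairs into the timing, duration, and volume statistics named above. It is only an illustration of behavioral features in general: the function name `flow_features`, the choice of statistics, and the sample flow are assumptions, and nothing here reflects the GFW's actual feature set or classifier.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarize one flow. packets: list of (timestamp_seconds, size_bytes) in arrival order."""
    if len(packets) < 2:
        raise ValueError("need at least two packets to measure timing")
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-packet arrival intervals: the gap between each packet and the next.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    duration = times[-1] - times[0]
    return {
        "mean_gap": mean(gaps),
        "gap_stddev": pstdev(gaps),
        "mean_size": mean(sizes),
        "size_stddev": pstdev(sizes),
        "duration": duration,
        "bytes_per_second": sum(sizes) / duration if duration > 0 else 0.0,
    }


# A made-up flow with perfectly regular timing, the sort of rhythm a classifier could learn.
sample = [(0.0, 100), (0.5, 300), (1.0, 100), (1.5, 300)]
print(flow_features(sample))
```

None of these features depend on the payload bytes, which is why encrypting or scrambling the contents does not hide them.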

The results were dramatic. Within weeks of deployment, VMess — which had been the workhorse of censorship circumvention in China since the mid-2010s — saw its detection rate climb to 80%. Users who had relied on VMess for years suddenly found themselves disconnected. The GFW had learned to see not just what the packets contained, but how they behaved.

Active Probing: The GFW Fights Back

The upgrade went beyond passive traffic analysis. RaccoonLine's report documents aggressive active probing by the Great Firewall. When a suspicious connection is detected — unusual handshake timing, anomalous packet sizes, ML-flagged flow patterns — the GFW does not simply block it. It actively probes the destination server to determine whether it is a proxy or VPN endpoint.

Standard VPN servers respond to these probes in ways that confirm their function. An OpenVPN server sends a recognizable reset. A WireGuard endpoint replies with a distinctive handshake response. A Shadowsocks server reveals its proxy nature through its response pattern. Once confirmed, the server's IP address is added to the GFW blocklist — often within hours of going live. A freshly deployed VPN server in most configurations is blocked before the end of its first day of operation.

This active probing infrastructure is part of what the leaked Geedge Networks documents from September 2025 called the Tiangou Secure Gateway (TSG) — the flagship DPI and filtering product that forms the backbone of China's censorship apparatus. Geedge's products have already been exported to Kazakhstan, Pakistan, Myanmar, and Ethiopia, turning what began as a domestic censorship tool into a commercial export.

Why VLESS + REALITY Survived

While VMess crumbled, one protocol combination remained operational across all tested locations — Beijing, Shanghai, Shenzhen, and Chengdu, across China Telecom, China Mobile, and China Unicom. VLESS, paired with the REALITY transport, consistently survived GFW detection.

The secret lies in REALITY's approach. Unlike traditional TLS-based obfuscation, which presents its own certificate, REALITY borrows the TLS identity of a real, widely visited website. When the GFW's active probing system queries a VLESS+REALITY server, the server relays the handshake to that site, so the prober receives the same cryptographic response that the legitimate site would give. There is no synthetic certificate to fingerprint and no unusual handshake parameter to flag. From the perspective of the DPI system, the traffic is indistinguishable from HTTPS to a popular website.
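The routing decision behind this behavior can be illustrated with a toy model. The Python sketch below is not REALITY's actual handshake or cryptography. It only shows the general pattern in which a client that proves knowledge of a shared secret is proxied, while anything else (including a prober) is handed to the cover site. The names `make_token` and `route_connection`, the HMAC construction, and the 16-byte token are all assumptions made for illustration.

```python
import hashlib
import hmac


def make_token(secret: bytes, client_random: bytes) -> bytes:
    """Derive a short proof from the shared secret and a per-connection random value."""
    return hmac.new(secret, client_random, hashlib.sha256).digest()[:16]


def route_connection(secret: bytes, client_random: bytes, token: bytes) -> str:
    """Decide what to do with an incoming connection (toy model)."""
    expected = make_token(secret, client_random)
    # Constant-time comparison, so timing does not leak how close a guess was.
    if hmac.compare_digest(expected, token):
        return "proxy"
    # Unauthenticated peers, including active probes, only ever see the cover site.
    return "relay-to-cover-site"


secret = b"shared-secret-known-to-real-clients"
client_random = bytes(range(32))
print(route_connection(secret, client_random, make_token(secret, client_random)))
print(route_connection(secret, client_random, bytes(16)))  # a prober without the secret
```

Because the prober's connection is answered by the genuine site, the response it records carries nothing that belongs to the proxy itself.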

VLESS itself contributes to this invisibility by carrying no distinctive overhead pattern. Unlike VMess, which included protocol-specific headers that ML models could learn to detect, VLESS traffic contains no static markers. Correct configuration is essential: VLESS servers must use TLS transport, select an appropriate front domain for REALITY, and integrate with a CDN to handle IP-based blocking.

The Fixed-Endpoint Problem and Decentralized Routing

Even a perfectly configured VLESS+REALITY server on a fixed IP address accumulates behavioral signals over time. The same IP appears in connection logs across multiple users. Traffic volume patterns cluster around specific IP ranges. Packet timing signatures aggregate into recognizable profiles. Eventually, the GFW's behavioral models flag the IP, and it joins the blocklist.

This is where decentralized routing becomes critical. Wandering Flow routing — cycling connections through different peer-to-peer nodes rather than maintaining a fixed server endpoint — eliminates the single-IP target. There is no fixed endpoint for the GFW to probe. P2P residential node IPs provide an additional advantage: data center IP ranges are among the first entries on GFW blocklists, while residential IPs blend into the ocean of ordinary consumer internet traffic.

QUIC Censorship: A Parallel Front

The GFW's evolution extends beyond VPN detection. Since April 2024, the Great Firewall has been decrypting QUIC Initial packets at scale and matching Server Name Indication (SNI) values against its blocklist. When a forbidden SNI is detected, the GFW drops all subsequent UDP packets sharing the same source IP, destination IP, and destination port, a residual block that lasts for over 100 seconds. Independent research presented at USENIX Security 2025 documented this behavior and found that the QUIC blocklist is a distinct list, about 60% of the size of the DNS blocklist by domain count, which suggests it is maintained under a separate policy.
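Residual blocking of this kind behaves like a table of expiring entries. The Python sketch below models it under stated assumptions: the key is taken to be (client IP, server IP, server port), the 120-second default is an arbitrary illustrative value consistent with "over 100 seconds", and the class and method names are hypothetical. It is a simulation for reasoning about the behavior, not a description of the GFW's implementation.

```python
class ResidualBlockTable:
    """Toy model of residual blocking keyed on (source IP, destination IP, destination port)."""

    def __init__(self, block_seconds: float = 120.0):
        # Illustrative duration only; the text says only "over 100 seconds".
        self.block_seconds = block_seconds
        self._blocked_until = {}

    def on_forbidden_sni(self, src_ip, dst_ip, dst_port, now):
        """Start (or refresh) a residual block after a forbidden SNI is seen."""
        self._blocked_until[(src_ip, dst_ip, dst_port)] = now + self.block_seconds

    def should_drop(self, src_ip, src_port, dst_ip, dst_port, now):
        """Return True if a UDP packet falls under an active residual block."""
        # src_port is deliberately ignored: it is not part of the assumed key.
        key = (src_ip, dst_ip, dst_port)
        until = self._blocked_until.get(key)
        if until is None:
            return False
        if now >= until:
            del self._blocked_until[key]  # entry expired
            return False
        return True


table = ResidualBlockTable()
table.on_forbidden_sni("10.0.0.2", "203.0.113.5", 443, now=0.0)
print(table.should_drop("10.0.0.2", 50001, "203.0.113.5", 443, now=10.0))   # new source port
print(table.should_drop("10.0.0.2", 50001, "203.0.113.5", 8443, now=10.0))  # different server port
```

Under this model, retrying from a new source port does not escape the block, while a different destination port or waiting out the timer does.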

However, the QUIC censorship implementation has documented weaknesses. It does not track QUIC Connection IDs and instead identifies flows by the UDP 4-tuple with a 60-second timeout. It does not reassemble QUIC Initials split across multiple datagrams, a gap inadvertently exploited by Chrome's September 2024 changes. It also explicitly exempts QUIC connections that carry an Encrypted Client Hello (ECH) extension, creating a deliberate bypass that may reflect internal policy tensions within China's censorship apparatus.

What This Means for the Future

The September 2025 GFW upgrade and the subsequent VMess crackdown represent a paradigm shift. For the first time, behavioral ML analysis — not just signature matching — has been deployed at nation-state scale against censorship circumvention tools. The infrastructure behind this, built by Geedge Networks and the Chinese Academy of Sciences' MESA Lab, is being actively exported to other authoritarian states.

For users and circumvention tool developers, the lesson is clear: protocols must now be designed not only to hide what data they carry but to randomize how that data behaves. Fixed packet timing, predictable handshake sequences, and consistent traffic volume patterns are all exploitable signals. The next generation of circumvention tools will need to incorporate stochastic timing, connection pattern randomization, and decentralized endpoint rotation as first-class design principles, not afterthoughts.
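As a rough illustration of stochastic timing and size randomization, the Python sketch below draws send delays from an exponential distribution and adds a random amount of padding to each payload. The function names, the choice of distribution, and the parameter values are assumptions made for this example. Randomizing timing in this way is not guaranteed to defeat a trained classifier, and it trades latency and bandwidth for less regular traffic.

```python
import random


def jittered_delays(count: int, mean_delay: float, rng=None):
    """Return `count` send delays (seconds) drawn from an exponential distribution."""
    if mean_delay <= 0:
        raise ValueError("mean_delay must be positive")
    rng = rng or random.Random()
    # expovariate takes the rate (1 / mean), so delays average out to mean_delay.
    return [rng.expovariate(1.0 / mean_delay) for _ in range(count)]


def padded_length(payload_len: int, max_pad: int, rng=None) -> int:
    """Return the payload length plus a random pad of 0..max_pad bytes."""
    rng = rng or random.Random()
    return payload_len + rng.randint(0, max_pad)


rng = random.Random(42)  # seeded only so the example is repeatable
delays = jittered_delays(5, mean_delay=0.02, rng=rng)
lengths = [padded_length(1200, max_pad=64, rng=rng) for _ in range(5)]
print(delays)
print(lengths)
```

A sender would sleep for each delay before writing the next padded record, so that neither inter-packet gaps nor packet sizes repeat in a fixed pattern.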

The cat-and-mouse game continues. VLESS+REALITY works today — but as the GFW's ML models grow more sophisticated, the window for any single protocol's effectiveness narrows. The censorship arms race is far from over.

Source: RaccoonLine Technical Report Details Evolution of China's Great Firewall Following 2025 DPI Updates