
Source | Tech Planet
By Zhang Ningyi
In the era of large AI models, tokens have become the most important medium of exchange. Participants use them to write code, produce short videos, design promotional posters... every step requires tokens.
However, token prices are far from affordable. An AI SaaS entrepreneur could easily spend tens of thousands of yuan worth of tokens in a single month—a cost so high that ordinary entrepreneurs and small-to-midsize developers are crying out: 'We can’t take it anymore.'
Thus, token relay services have emerged. Acting as intermediaries for AI computing power, these token relays enable users to access mainstream large models, both domestic and international, such as GPT, Gemini, Claude, and DeepSeek, at significantly lower costs.
On May 1, Justin Sun, founder of TRON, launched B.AI, an AI model aggregator, promoting it with the slogan: 'One API key gives you access to Claude, GPT, Gemini, and the full suite of leading Chinese large models.' He also publicly announced daily subsidies of at least RMB 100,000 and 10 billion tokens. On May 7, Sun stated that the platform had surpassed 1.7 million registered users.
Joining him was Fu Sheng, founder of Cheetah Mobile, who created Easyrouter—an AI model aggregator that enables users to conveniently access over 40 major large models, including GPT, Claude, Gemini, and DeepSeek, through a unified interface, using discounts and reward points to attract new users.
The involvement of high-profile figures seems to further confirm the hype around token aggregators. But for ordinary people, is this really a golden opportunity for overnight riches? Is the myth of 'earning millions a month by running a token aggregator' a genuinely replicable path to wealth?
Amid a chaotic and unregulated market, the answer appears far from clear.
Is running a token aggregator a good business?
For individual users or small and medium-sized enterprises in the AI industry, token aggregators significantly reduce the cost of using large models. For example, even the cheapest ChatGPT Plus subscription costs around RMB 140 per month, yet many of these users consume very few tokens, and an aggregator can supply one million tokens for just a few RMB.
Moreover, it’s relatively difficult for these users to directly access foreign large models. On one hand, many are unfamiliar with exchange rate differences, price disparities across countries’ model subscriptions, and overseas official procurement policies, resulting in significant information asymmetry. On the other hand, as models continuously evolve and update, the optimal model for different tasks varies, forcing users to top up accounts across multiple platforms—an inconvenient and wasteful process.
Token aggregators solve these problems perfectly: users can recharge in one place and access multiple models, saving substantial time and effort.
Xiao Ping, an AI practitioner, told Tech Planet that official platforms like Google and OpenAI do not typically work directly with individual small users. Instead, they wholesale access to tier-one resellers—authorized local service providers—who obtain the latest model APIs and then supply downstream customers at a markup. Ordinary aggregators or developers connect through these tier-one resellers, securing lower costs than official retail pricing.
Beyond this traditional aggregator model, sourcing channels for token aggregators in the market are incredibly diverse.
Many large-model platforms offer free credits to businesses upon registering a corporate account. For example, platforms like OpenAI provide support to startups, allowing them to use the service for several months at no cost. Some individuals register accounts in bulk and pool them into an 'account pool,' so that when users consume tokens, they draw from the combined free quotas of all accounts in the pool.
Some employees at major tech firms, whose employers have already secured extremely low token prices or even unlimited usage, privately set up proxy services and plug in their own API keys. Their costs are very low, which lets them resell access through agents.
Others purchase large numbers of premium accounts on major model platforms and share each account among 20 users to split costs and earn a margin. However, these loophole-exploiting channels often face the risk of official account bans.
These proxy services can develop their own downstream resellers, distributing tokens at low prices across multiple tiers, with each layer profiting from the difference between procurement and resale prices. They can also recruit agents for promotion—providing a low-cost token API endpoint and issuing multiple sub-API keys to agents. Agents top up their accounts and then sell tokens, typically earning commissions of 20% to 40%.
Operating a proxy service involves no significant technical barriers; ready-made open-source frameworks are available, allowing a single individual to deploy such a setup without heavy capital investment, while targeting a highly motivated and precise customer base. On the surface, it appears to be a viable business opportunity for ordinary individuals.
Some people earn millions per month, while others shut down their operations within a week.
However, the current moment is not the optimal time for ordinary individuals to enter the token proxy business.
Coach Xiao Ping told Tech Planet that some individuals have indeed made substantial profits from this business, but most such cases occurred three months ago, when the sector was less crowded and token usage was experiencing explosive growth. 'I know someone who once achieved monthly earnings of one million RMB through token proxy services, but their revenue streams were diverse. They didn’t rely solely on token arbitrage. A significant portion came from charging students for training and offering coaching services.'
Operators in this business make money in three main ways. The first is arbitraging token prices, which can yield profit margins exceeding 50%. For instance, with 100 Claude Max accounts, each shared among 20 users paying $50 per month, monthly revenue reaches $100,000, translating to a net profit of approximately 560,000 RMB. If tokens are procured at a 30% corporate discount and then marked up for retail sale, processing hundreds of millions of tokens daily could generate daily profits in the tens of thousands.
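The account-sharing figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration. The article gives the number of accounts, the users per account, and the monthly fee, but not the cost of each account or the exchange rate. The $200 monthly cost per account and the rate of 7 RMB per US dollar are assumptions, chosen because together they reproduce the roughly 560,000 RMB figure. The function name is hypothetical.

```python
def shared_account_profit(accounts, users_per_account, fee_usd,
                          account_cost_usd, rmb_per_usd):
    """Return (revenue in USD, profit in USD, profit in RMB) for one month."""
    revenue_usd = accounts * users_per_account * fee_usd
    cost_usd = accounts * account_cost_usd
    profit_usd = revenue_usd - cost_usd
    return revenue_usd, profit_usd, profit_usd * rmb_per_usd


# From the article: 100 accounts, 20 users each, $50 per user per month.
# Assumed (not in the article): $200 per account per month, 7 RMB per USD.
revenue_usd, profit_usd, profit_rmb = shared_account_profit(
    accounts=100,
    users_per_account=20,
    fee_usd=50,
    account_cost_usd=200,
    rmb_per_usd=7.0,
)
print(revenue_usd, profit_usd, profit_rmb)
```

Under these assumptions the account fees take up a fifth of revenue, which is consistent with the margin of more than 50% cited for this method. The calculation leaves out server costs, agent commissions, and losses from banned accounts.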
The second method involves capturing idle prepaid funds. Many users top up on proxy platforms seeking lower prices but ultimately don’t fully consume their purchased tokens. For example, with a 199 RMB subscription plan, 90% of users fail to exhaust their allocated token quota, allowing the operator to retain the unspent balance as working capital.
The third approach is the one that truly enables rapid and substantial profits: enrolling students or offering AI coaching services. For example, operators post claims on social media such as 'Earn several thousand yuan a month through token reselling' to drive traffic into private channels. They first distribute free materials, then upsell tiered offerings in which students send red packets of 88 yuan, 199 yuan, and so on for 'one-on-one empowerment.' Meanwhile, enterprise-focused coaching services, such as helping companies deploy AI agents or use token relay stations, can fetch several thousand to even tens of thousands of yuan per deal.
Coach Xiao Ping told Tech Planet that because users place a high premium on reliability in token relay stations, individuals with an established private-domain audience are better suited for this business. One AI content creator on Twitter, who already had a solid follower base, launched a relay station and, within 24 hours of a single post, filled 11 groups capped at 200 members each (about 2,200 users in total), an exceptionally high conversion rate.
However, competition in this sector has intensified significantly, making it increasingly unsuitable for small players. With low technical barriers and ever-tightening pricing pressure, profit margins from legitimate sourcing channels now hover around just 30%. The arbitrage model is becoming harder to sustain. Those with technical capabilities, resources, or strong content creation skills may still find some room for profit—but for those merely seeking to act as middlemen capturing price differentials, opportunities have narrowed drastically. This space has effectively become dominated by large operators or those monetizing through student enrollment. An increasing number of ordinary participants are exiting the market; some even shut down their relay stations within a week due to slim margins and a lack of stable clientele.
Nevertheless, numerous small vendors continue to reap windfall profits through practices like exploiting promotional discounts or diluting service quality. According to observations by Tech Planet, one relay station quotes downstream clients 0.18 RMB per US dollar worth of tokens, selling 100 million tokens for 36 RMB—while its upstream supplier charges only 10 RMB per 100 million tokens at that tier, yielding a profit margin exceeding 70%. Such relay stations often suffer from serious compliance and quality issues and face a high risk of account suspension.
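The margin quoted for this relay station follows directly from the two prices in the article. A minimal sketch, assuming the margin is measured as gross profit divided by the sale price (the article does not state the formula); the function name is hypothetical.

```python
def gross_margin(sale_price, cost):
    """Gross margin as a fraction of the sale price."""
    return (sale_price - cost) / sale_price


# Prices from the article, both in RMB per 100 million tokens.
margin = gross_margin(sale_price=36.0, cost=10.0)
print(f"{margin:.1%}")  # prints 72.2%
```

Measured against cost instead of sale price, the same figures give a markup of 260%, so the claim of more than 70% holds under either reading.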
In May 2026, the operator of a token relay station named 'Xigua’s Pi' was criminally detained by Shanghai police and later released on bail after 37 days. This operator had mass-registered or purchased accounts for overseas large language models such as GPT and Claude on foreign websites, aggregated them, and then resold API access to domestic users via offshore servers, enabling Chinese users to access foreign AI services.
This practice essentially facilitates users’ circumvention of internet restrictions—a legal violation closely resembling unauthorized VPN operations. Both activities illegally establish cross-border communication channels, crossing clear regulatory boundaries. Many token relay stations employ this method, exposing themselves to significant legal risks.
Relay stations vary wildly in quality: widespread issues include substandard service and data leaks.
Due to diverse sourcing channels and the pursuit of higher profits, many individual-run relay stations deceive users by diluting token quality. Some even resell user data or abscond with customer payments outright.
Stars 404, a partner at hvoy.ai, told Tech Planet that finding a trustworthy relay station has become increasingly difficult.
Many relay stations offer 'diluted' tokens. 'Dilution' refers to secretly substituting cheaper or restricted models in the backend—for instance, replacing foreign models with lower-cost domestic alternatives. The most common example is using DeepSeek in place of overseas models.
A user shared on social media that when using Gemini 3.1 Pro via a reseller API, the model’s chain-of-thought reasoning disappeared entirely—yet other Gemini models used during the same period did not exhibit this issue. Meanwhile, when using Claude 4.6 Opus multiple times, the chain-of-thought outputs were in Chinese, despite the user never having set any preference for Chinese reasoning. This strongly suggests that low-cost, domestically developed models are being mixed in as substitutes.
Therefore, beyond just checking the model name, users should also compare the response style, context length, error messages, and billing details provided by the reseller against those of the original model to assess authenticity. Stars 404 has built a website, hvoy.ai, based on these metrics to help test token purity. Backend data from the site shows that among tested resellers, Claude models have the highest likelihood of being adulterated—around 60%. Generally, the cheaper the service, the less reliable the quality and the lower the purity.
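One of the signals described above, reasoning text that comes back in Chinese when the user never asked for it, can be turned into a rough automated check. The sketch below is a heuristic only and is not how hvoy.ai works, which the article does not describe. The function names and the 0.3 threshold are assumptions. A genuine model may also reason in Chinese for legitimate reasons, so a flag here is a prompt for closer inspection, not proof of substitution.

```python
def cjk_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def looks_substituted(reasoning_text: str, prompt_text: str,
                      threshold: float = 0.3) -> bool:
    """Flag a reply whose reasoning is largely Chinese although the prompt is not.

    The threshold is an arbitrary assumption, not a calibrated value.
    """
    return (cjk_ratio(reasoning_text) >= threshold
            and cjk_ratio(prompt_text) < threshold)


prompt = "Explain quicksort in one paragraph."
flag_chinese = looks_substituted("首先分析用户的问题，然后给出答案。", prompt)
flag_english = looks_substituted("First, recall how partitioning works.", prompt)
print(flag_chinese, flag_english)
```

A check like this covers only one symptom. A missing reasoning trace, a shorter context window, or unfamiliar error messages would each need a separate comparison against the official service.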
An even more serious issue is data leakage. One user discovered that while using an account provided by a reseller, the model could retrieve and reference files previously uploaded by other users sharing the same account—not the user’s own files. Moreover, due to potential AI hallucinations, the model might even directly provide links to other users’ files in new conversations.
Stars 404 told Tech Planet that many resellers have been approached with offers to purchase user data, primarily by third parties seeking data for model distillation. However, in his communications with resellers, he found that most avoid storing such data because they know selling it is illegal—though the risk of information leakage remains significant.
Additionally, it’s worth noting that many resellers simply disappear. An AI industry professional told Tech Planet that some small-scale resellers only offer GPT access; if OpenAI suspends a large number of accounts over a short period, a wave of these sites tends to vanish. Others abscond with funds after hitting peak revenue, leaving consumers unable to recover their money.
Beyond these issues, using token resellers carries numerous other risks—such as inflated token counts leading to excessive charges, truncated context lengths, shortened official memory windows, automatic downgrades during peak web-search usage, and switching to lower-tier models during high-traffic periods. Unscrupulous vendors exploit these tactics for exorbitant profits, making it extremely difficult for ordinary users to detect such practices.
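Inflated token counts are one of the few items on this list that a user can check numerically. The sketch below assumes the user has an independent token count for the same request, for example from the model vendor's own tokenizer or from the same prompt sent through the official API. How that count is obtained is outside this example. The function names and the 5% tolerance are hypothetical.

```python
def overbilling_ratio(billed_tokens: int, counted_tokens: int) -> float:
    """Relative excess of the reseller's billed count over an independent count."""
    if counted_tokens <= 0:
        raise ValueError("counted_tokens must be positive")
    return billed_tokens / counted_tokens - 1.0


def is_suspicious(billed_tokens: int, counted_tokens: int,
                  tolerance: float = 0.05) -> bool:
    """Flag a request billed more than `tolerance` above the independent count."""
    return overbilling_ratio(billed_tokens, counted_tokens) > tolerance


# Hypothetical numbers: billed for 1,300 tokens, independently counted 1,000.
ratio = overbilling_ratio(1300, 1000)
flagged = is_suspicious(1300, 1000)
within_tolerance = is_suspicious(1020, 1000)
print(f"{ratio:.0%}", flagged, within_tolerance)
```

Small differences are expected because counting methods vary, so the tolerance should be wide enough to avoid false alarms on ordinary requests.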
Currently, the token reseller market remains highly unregulated, making it hard to guarantee either product quality or user data security. Long-term, standardized regulation and market oversight will be essential.
Risk Disclaimer: The above content only represents the author's view. It does not represent any position or investment advice of Futu. Futu makes no representation or warranty.
