Teng Yan
November 7, 2024

DIN: Data Intelligence Network

Data collection & labelling is big business

Today’s deep dive is brought to you by DIN, a data intelligence network. Enjoy!

The Data Gold Rush

During the California Gold Rush of the mid-1800s, thousands of people chased the promise of untapped wealth in a new frontier.

People who had never been wealthy suddenly found themselves with fortunes, stories of rags to riches became commonplace, and entire industries and cities sprang up to support the rush. Infrastructure developed at a breathtaking pace, reshaping the American landscape.

The parallels with Crypto AI are hard to ignore.

Most Crypto AI products today are still in development or running on testnets—indicators that we’re firmly in the infrastructure build-out phase.

Investors and builders are laying the groundwork, positioning themselves for the potential growth surge. The tools, networks, and protocols being established now are the foundation for what could become a sprawling, decentralised AI ecosystem.

If the analogy holds, we’re witnessing the early stages of a digital gold rush—one that could be just as transformative as its 19th-century counterpart.

So imagine my surprise when I stumbled across a Crypto AI project claiming over 700,000 daily active users. Not monthly—daily. In a field as nascent as this, such user metrics are virtually unheard of. Naturally, I had to dig in and figure out what was actually happening under the hood.

This project? DIN, a “Data Intelligence Network.”

Crypto Supercharges Data Networks

Source: Andy Scherpenbeg

I’ve been closely watching data networks in Crypto AI, and it’s clear they’re addressing a critical pain point in the AI landscape: access to valuable datasets.

Today, many of the most valuable data sources are tightly controlled by centralized entities, who charge steep fees to access them.

For example:

  • Reddit inked a $60M/year licensing deal with OpenAI to provide access to its user-generated content.
  • X (formerly Twitter) no longer offers free API access to developers, now charging between $100 and $42,000 (no joke) per month for tweet data.

The message is clear: Corporates recognize that data is the new battleground, and they’re locking down control to maximize profit.

Crypto offers a potential solution—a way to break free from the centralized stranglehold over valuable datasets.

Crypto data networks take a fundamentally different approach, aiming to build high-quality, decentralized datasets without the bottlenecks of traditional models. By leveraging tokens, these networks can incentivize large-scale data labelling efforts, motivating individuals to contribute to mass data collection or even organising efforts to scrape the web for training data (did someone say… Grass?).

Meanwhile, blockchains provide transparency, creating a framework to track data ownership and provenance. This ensures that contributors are fairly compensated whenever their data is used, establishing a new paradigm where data value is shared rather than monopolized.

DIN’s Vision

DIN is one of the teams tackling the data problem head-on.

At its core, DIN is a data layer that collects and validates both on-chain and off-chain data—while using the blockchain as the settlement layer.

The big idea? Give ownership of data back to users and let them earn rewards for what they contribute to the system.

How Does DIN Work?

At first glance, this diagram might seem complex, but let’s break it down.

There are three main actors in the DIN network:

  1. Data Collectors
  2. Data Validators
  3. Computation Nodes

To get a better sense of how the data collectors and validators work, let’s dive into xData, DIN’s primary live product today.

#1: xData — Data Collection

xData is DIN’s flagship platform designed to collect, organize, and store data from social media platforms like X—without relying on the API. It operates on a decentralized network, ensuring ownership and privacy for users. It was launched in April 2024 on opBNB (a Layer-2 on BNB Chain).

xData’s Chrome Extension

xData makes data collection for users fun and rewarding through gamified mechanics. Here’s a quick look at how it works:

  1. Users install a browser plugin, sign in with their wallet, and link their X account.
  2. Users can tag interesting tweets by replying to the tweet and tagging the @din_lol account.
  3. Users earn “wafers” for tagging tweets, which are points that can be converted into tokens at TGE.
  4. There are several gamification mechanics. Each user has a limited number of tweets they can tag (storage), but they can increase their storage space by spending wafers. Users also have to spend wafers every 24 hours to keep their account “unlocked” and earn more wafers.
  5. DIN releases missions around specific keywords or labels, and community members search for tweets in real-time and tag them according to the specific label.
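The mechanics above can be sketched as a small state machine. This is a minimal illustration, not DIN's implementation: the class name `XDataAccount` and every number (unlock cost, tag reward, storage sizes, upgrade price) are hypothetical placeholders, since the article does not give the real parameters.

```python
from dataclasses import dataclass, field

# Hypothetical parameters, chosen only for illustration.
DAY = 24 * 60 * 60   # seconds in the 24-hour unlock window
UNLOCK_COST = 1      # wafers spent per unlock
TAG_REWARD = 2       # wafers earned per tagged tweet
UPGRADE_COST = 10    # wafers spent per storage upgrade
UPGRADE_SLOTS = 5    # extra tweet slots gained per upgrade


@dataclass
class XDataAccount:
    wafers: int = 0
    storage: int = 3               # max number of tweets this user can tag
    unlocked_until: float = 0.0    # time at which the current unlock expires
    tagged: list = field(default_factory=list)

    def unlock(self, now: float) -> None:
        """Spend wafers to keep the account active for the next 24 hours."""
        if self.wafers < UNLOCK_COST:
            raise RuntimeError("not enough wafers to unlock")
        self.wafers -= UNLOCK_COST
        self.unlocked_until = now + DAY

    def tag(self, tweet_id: str, label: str, now: float) -> None:
        """Tag a tweet and earn wafers, if unlocked and storage allows."""
        if now >= self.unlocked_until:
            raise RuntimeError("account locked: unlock it first")
        if len(self.tagged) >= self.storage:
            raise RuntimeError("storage full: upgrade to tag more")
        self.tagged.append((tweet_id, label))
        self.wafers += TAG_REWARD

    def upgrade_storage(self) -> None:
        """Spend wafers to raise the tagging limit."""
        if self.wafers < UPGRADE_COST:
            raise RuntimeError("not enough wafers to upgrade")
        self.wafers -= UPGRADE_COST
        self.storage += UPGRADE_SLOTS


account = XDataAccount(wafers=5)
account.unlock(now=0)
for i in range(3):
    account.tag(f"tweet-{i}", "bitcoin", now=60)
account.upgrade_storage()
print(account.wafers, account.storage)  # prints: 0 8
```

The point of the sketch is the loop it creates: wafers are both the reward and the cost of staying active, which is what keeps users coming back every 24 hours.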

The permissionless nature of xData means that any user worldwide can participate in data collection and annotation and earn rewards/income, regardless of nationality. For now, data collection happens off-chain, with tagged tweets stored on BNB Greenfield, a decentralized data layer on BNB Chain.

#2: Chipper Nodes — Data Validation

The next natural question is: how do you ensure the quality and integrity of user-submitted data? After all, someone could run an AI bot to randomly tag tweets that don’t match the specified labels just to maximize their wafer earnings.

Data labelling isn’t always straightforward either. Tweets often include nicknames, slang, and cultural references—for instance, Bitcoin is often referred to as “big biscuit” in Mandarin-language tweets.

This is where data validation comes into play.

Chipper nodes are DIN’s AI-driven data validation and processing nodes, responsible for validating and vectorizing data, while also enabling users to earn tokens ($xDIN and $DIN).

Behind the scenes, each user-operated node actually runs a small AI model locally to validate that the tweet’s content matches the attached label before storing it in the decentralized data layer. Users can operate these nodes on standard PCs, without the need for costly hardware setups.
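To make the validation step concrete, here is a toy stand-in. It is not how Chipper nodes work: the article says each node runs a small AI model, while this sketch swaps in a plain alias lookup so the check is easy to follow. The `ALIASES` table and the `matches_label` function are hypothetical names invented for this example.

```python
import re

# Hypothetical alias table. A real node would rely on a model, not a lookup.
ALIASES = {
    "bitcoin": {"bitcoin", "btc", "big biscuit"},
    "ethereum": {"ethereum", "eth"},
}


def matches_label(tweet_text: str, label: str) -> bool:
    """Return True if the tweet mentions any known alias of the label."""
    text = tweet_text.lower()
    for alias in ALIASES.get(label.lower(), ()):
        # Word boundaries stop "eth" from matching inside "something".
        if re.search(r"\b" + re.escape(alias) + r"\b", text):
            return True
    return False


print(matches_label("Stacking more big biscuit today", "bitcoin"))
print(matches_label("Something unrelated", "ethereum"))
```

A lookup like this breaks as soon as new slang appears, which is exactly why a learned model that improves with more validated data is the more plausible design.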

The AI models validators use continuously improve as they process more validated data, allowing the network to get smarter and more accurate over time.

DIN handles all data validation in-house today, but the goal is to decentralize the process. Node testing is underway: users can run the node software on their local devices to test the network, with bug bounties in place as DIN prepares for its mainnet and token launch in the coming weeks.

#3: Computation Nodes

Although not live yet, computation nodes represent DIN’s future plans for storing data privately and securely. Here’s how they’re planned to work:

  1. Vector Conversion: Computation nodes convert validated data into vectors.
  2. Privacy Processing: Vectors are processed through a ZK (Zero-Knowledge) processor to ensure privacy.
  3. Data Finalization: The finalized datasets and vectors are stored on IPFS, making them accessible to third parties.

A New L2 on BNB Chain?

No official announcement has been made, but in our research, we uncovered a DIN token on the BNB Chain testnet. This hints at future blockchain developments—potentially a sidechain or Layer 2 solution on BNB Chain.

A Brief History of DIN

DIN might feel like a new player, but the project’s origins trace back to late 2021. Initially launched as “Web3Go,” it began as an on-chain data analytics platform within the Polkadot ecosystem, securing grants from the Web3 Foundation and working with clients like Moonbeam and Oak Network.

In 2022, the team broadened its reach to the BNB Chain ecosystem, joining Binance Labs’ MVB incubator and securing investment to develop a “multi-chain open data analytics platform.”

By July 2023, they saw the writing on the wall: generative AI was booming, and the need for robust data infrastructure was becoming more pressing than ever. The team shifted gears to build a comprehensive “data intelligence layer for AI,” aligning their mission with the data demands of AI innovation. This evolution culminated in May 2024, when Web3Go officially rebranded to DIN, marking a bold new focus on data as the next wave of AI advancement.

DIN’s Traction — Good Momentum So Far

Daily Users on opBNB: ~700K+

Source: BNB Chain DappBay

DIN’s Daily Transactions on opBNB: ~1.2M+

Source: BNB Chain DappBay

According to DappBay, DIN averaged more than 700,000 daily users and more than 1.2M daily transactions over the month of October. Most of these transactions come from xData users, who have to make an on-chain transaction every 24 hours to activate their xData app and earn points.

Source: BNB Chain DappBay

DIN consistently ranks among the top 10 dApps on BNB Chain, and on many days, it’s the #1 app by users on the network. While I haven’t tracked the BNB Chain ecosystem as closely as chains like Solana and Base, this is no small feat—especially considering BNB Chain’s longevity and strong backing from Binance. 

To put things in context, I took a look at some of the other top-ranked apps on BNB Chain to see what’s driving engagement there:

  • Vooi (DeFi) is a perp DEX aggregator
  • Particle Network (Infra) is an omnichain protocol in testnet
  • Revox (Infra) is a modular on-chain network with a popular content app, ReadON
  • SERAPH (Game) is a Souls-like RPG
  • MyShell is a no-code AI app store ecosystem

According to the team, DIN has collected and labeled over 100M tweets so far, with a user base exceeding 30M across opBNB and Mantle.

What stands out here is DIN’s ability to generate a massive, real-time dataset of relevant tweets quickly, leveraging its substantial user base. This process doesn’t rely on the X API at all.

While xData currently focuses on Twitter, the team plans to expand the data collection and labelling platform to other sources like Reddit, Facebook, Instagram, and essentially any user data platform with high-value information. To me, this is where the real gold lies.

Side Quest: Reiki

Reiki is another product by DIN that ties neatly into the ongoing AI agent meta—in fact, DIN might have been ahead of its time, given the latent consumer interest in AI agents from what we saw with Truth Terminal and GOAT in recent weeks.

In January 2024, DIN launched Reiki, a platform where users could create AI agents (mainly chatbots) without coding experience. Users could also integrate their own knowledge base, allowing them to build engaging, personalized chatbots reminiscent of MyShell.

After launch, the platform quickly gained traction, becoming the #1 product on Product Hunt.

Reiki also gave creators several ways to monetize their bots, participate in reward programs, and even mint their bots as NFTs—adding a fun layer of ownership to the experience. Notably, BNB Chain’s Discord knowledge support bot is powered by Reiki.

Although the platform has been largely deprecated for now, the DIN team hasn’t ruled out bringing it back after their token launch. If revived, Reiki could provide an additional utility for the token and a way for AI agent creators to leverage the data collected by xData.

Token Design: xDIN, DIN and Node sales

In August and September 2024, DIN held a Chipper Node sale that raised $2.5M. These Chipper Nodes will allow users to run validation software on their local devices, using models to ensure data is labelled accurately. The sale was a success: all 25,112 Tier 2 nodes, priced at $99 each, sold out.
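As a quick sanity check on those figures, the Tier 2 numbers alone can be multiplied out. This assumes every Tier 2 node sold at the $99 list price with no discounts or rebates, which the article does not confirm; the variable names are mine.

```python
# Figures from the article; the list-price assumption is mine.
tier2_nodes_sold = 25_112
tier2_price_usd = 99

tier2_revenue_usd = tier2_nodes_sold * tier2_price_usd
print(f"${tier2_revenue_usd:,}")  # prints: $2,486,088
```

Under that assumption, Tier 2 by itself accounts for roughly the full $2.5M headline figure.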

Supply-side

Pre-TGE, xData users can convert their wafers (points) into xDIN—a pre-airdrop token. However, there will be a conversion fee ranging from 5–30%, with those fees distributed to Chipper Node owners. This conversion mechanism isn’t live yet but is expected to begin once node “pre-mining” is live later this month.
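Here is a rough sketch of what that conversion could look like. The 5–30% fee band is from the article; everything else is an assumption, including the 1:1 wafer-to-xDIN rate, the function name `convert_wafers`, and expressing the fee in basis points to keep the arithmetic in whole numbers.

```python
MIN_FEE_BPS = 500     # 5% fee, in basis points
MAX_FEE_BPS = 3_000   # 30% fee, in basis points


def convert_wafers(wafers: int, fee_bps: int) -> tuple[int, int]:
    """Return (xDIN received, fee routed to Chipper Node owners).

    Assumes a hypothetical 1:1 wafer-to-xDIN rate before fees.
    """
    if not MIN_FEE_BPS <= fee_bps <= MAX_FEE_BPS:
        raise ValueError("fee must be between 5% and 30%")
    fee = wafers * fee_bps // 10_000
    return wafers - fee, fee


print(convert_wafers(1_000, MIN_FEE_BPS))  # prints: (950, 50)
print(convert_wafers(1_000, MAX_FEE_BPS))  # prints: (700, 300)
```

How the fee is chosen within the band, and how the pool is split among node owners, is not described in the article, so the sketch stops at the pool.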

At TGE, users will receive an airdrop of DIN (the tradable token) in proportion to the xDIN they hold, fully released with no complex lock-up mechanisms.

After the TGE, 25% of the total DIN token supply will be reserved for Chipper Node rewards. Half of this allocation will unlock in the first year, with the remaining emissions halving yearly.

Note that this is a relatively fast unlock compared to other projects conducting node sales, where node rewards are distributed gradually over 3–4 years.
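To see how front-loaded this schedule is, it helps to run the numbers. The sketch below assumes one particular reading of the description: half of the 25% allocation is emitted in year one, and each later year emits half of the previous year's amount. The function name `node_reward_schedule` is mine, and the final tokenomics may differ.

```python
def node_reward_schedule(years: int, node_share: float = 0.25) -> list[float]:
    """Fraction of total DIN supply emitted to nodes in each year."""
    emission = node_share / 2  # half of the allocation unlocks in year one
    schedule = []
    for _ in range(years):
        schedule.append(emission)
        emission /= 2          # emissions halve every year after that
    return schedule


released = 0.0
for year, share in enumerate(node_reward_schedule(4), start=1):
    released += share
    print(f"Year {year}: {share:.4%} of supply, "
          f"{released / 0.25:.2%} of node allocation released")
```

Under this reading, 75% of the node allocation is out after two years and about 94% after four, with a long tail after that.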

Demand-side

Validator nodes will likely need to stake DIN tokens to participate in the network. In return, they’ll earn rewards for validating data, but they’ll face slashing penalties if their outputs are inaccurate.

On the other end, data consumers must spend DIN tokens to access the network’s data. Since most Web2 businesses are still hesitant to engage with crypto, the company will need to facilitate these transactions to bridge the gap between traditional enterprises and the decentralized network.

We’re still awaiting detailed DIN tokenomics, which should be released closer to the TGE.

Team & Fundraising

The core team at DIN brings together talent from Columbia University, University College London, and the University of Stuttgart, with a decade of expertise in AI and blockchain.

DIN’s founder, Hao Ding, holds a Master’s degree in Information Technology from the University of Stuttgart. Before diving into crypto, Hao served as Director of Research Development at the Suzhou Institute of Artificial Intelligence in China. He then took on the role of Vice-President at Litentry, an identity oracle network, before founding Web3Go.

I had the pleasure of meeting Hao in person, and we had great conversations about the future of AI. His belief? Data will be at the heart of it all. The DIN team currently consists of 16 members, primarily engineers.

DIN participated in Binance Labs’ MVB 5 accelerator program and raised $4M in a seed round in July 2023, led by Binance Labs, HashKey, NGC, and Shima Capital. In August 2024, DIN secured another $4M in funding, with participation from Manta Network, Moonbeam Network, Ankr, and Maxx Capital, bringing its total fundraising to $8M.

Our Thoughts

Thought #1: Building a decentralised Scale AI is a sexy story

Source: https://sacra.com/c/scale-ai/

Data collection and labelling is big business.

Scale AI is the best-known player in this space, reporting annual recurring revenue of ~$1B. This is fuelled by heavy demand from foundational AI model companies like OpenAI, Anthropic, and Cohere, which are Scale’s main customers. As of May 2024, it was valued at a whopping $14B.

Let’s take a closer look at Scale AI’s business model. 

Scale relies on a large, distributed workforce for its data labelling tasks, which involve manual tagging of videos, sorting photos, and transcribing audio.

It employs ~240,000 workers across several countries, actively recruiting in regions with high unemployment rates and lower costs of living. Kenya, for instance, has become a key recruitment hub in Africa, with in-person “boot camps” in Nairobi and targeted paid advertisements to attract workers.

The labelling process typically has two layers: a first layer of annotators who label data from scratch and a second layer of quality controllers who review the work, add missing annotations and correct errors. It’s very human-intensive, but it works because human labour costs are low, and its clients are willing to pay significant money.

Now, imagine scaling this model (pun intended) through decentralized networks. A global, permissionless workforce incentivized by tokens could allow anyone to participate, while a distributed network of validators ensures data accuracy and quality. Decentralization could open up new possibilities for scaling data labelling, turning it into a truly global, democratized process.

Thought #2: Large User Base = Good

DIN’s primary advantage today lies in its large, engaged community—built up over two years of focused community-building efforts. With this network, DIN can rapidly mobilize data collection based on specific criteria. The challenge, however, is identifying where the true data demand lies, directing its users to collect and label the right datasets, and building sustainable revenue streams to support long-term growth.

Thought #3: Incentives are a double-edged sword.

Right now, much of the user engagement is driven by anticipation of token rewards once the token goes live. But if the team can’t drive sufficient demand for the token later on, usage could drop off as initial interest fades. Creating this demand will require both speculative interest and an established market of data consumers eager to buy these datasets.

Thought #4: Data labelling is a highly competitive space

DIN isn’t the only crypto team vying for a share of this market—projects like Sapiens, Grass, and Masa are also in the race. But the pie is substantial. Take GRASS, for instance, which currently has a market cap of $2.5 billion, underscoring the scale of opportunity in this sector.

One path for DIN to differentiate and stand out could be training and deploying proprietary AI models for data validation, reducing dependence on human labour. This automation-first approach could streamline operations, enhance scalability, and give DIN an edge over competitors still relying heavily on manual processes.

Parting Thoughts

Data networks represent one of the most exciting frontiers at the intersection of AI and crypto. Unlike traditional centralized models, crypto-powered data networks leverage decentralized participation and incentives to build high-quality datasets at scale.

DIN is positioning itself as an early mover in this space, and it’ll be fascinating to see how the project develops. This is DIN’s opportunity to seize. I often tell people: data networks are one of the smartest areas to be building in right now.

Crypto is reshaping how data is collected, validated, and monetized—building the foundation for a new, decentralized data economy.

Cheers,

Teng Yan


This research was sponsored by DIN, with Chain of Thought receiving funding for this initiative. All insights and analysis are our own. We uphold strict standards of objectivity in all our viewpoints.

To learn more about our approach to sponsored Deep Dives, please see our note here.

We’re grateful to our research partners for helping us keep our Crypto AI research free and accessible to all.

