Web3’s need for decentralized storage
Decentralized storage is the infrastructure of Web3. It should meet the following
requirements:
1. The capacity should be high enough to serve a large number of users at the same time
Every storage, retrieval and download operation by a user is equivalent to a transaction,
and a decentralized storage service needs consensus to record it on the chain. If the
decentralized storage system serves a decentralized community or decentralized social
media, the volume of transactions will be very large. When the DAU of
a video website reaches 100 million, users will view more than 1.6 billion videos per day.
So, the blockchain for Web3 must have high TPS and low gas fees. At present, the
successful blockchains all provide financial services, so users expect to make money by
using them, and that is why they can tolerate high gas fees. Even so, high gas
fees have been criticized by users and DeFi projects in the ETH ecosystem.
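As a rough check on the scale implied by these figures, the average load can be worked out
directly. The sketch below assumes one on-chain transaction per video view and an evenly
spread load, both of which are simplifications; the function and variable names are
illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_tps(transactions_per_day: int) -> float:
    """Average transactions per second if the daily load were spread evenly."""
    return transactions_per_day / SECONDS_PER_DAY


# Figures quoted in the text.
daily_active_users = 100_000_000
daily_video_views = 1_600_000_000

views_per_user = daily_video_views / daily_active_users
required_tps = average_tps(daily_video_views)

print(f"views per user per day: {views_per_user:.0f}")
print(f"average TPS needed:     {required_tps:,.0f}")
```

Under these assumptions the chain would need to sustain about 18,500 transactions per second
on average, and more at peak hours.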
2. Low operating cost
The main cost of nodes participating in decentralized storage is the purchase of server
storage equipment, IDC hosting and bandwidth rental fees (or the cost of purchasing cloud
computing). Relatively speaking, bandwidth is more expensive. Can decentralized storage
cost less than centralized storage services with the same user experience?
3. Good user experience
Can users quickly retrieve and obtain stored data, even faster than with centralized storage?
The user experience mentioned here is mainly the speed of retrieval and download. Due to
the topology of the Internet, when users in different regions access the same server, the
variance in speed is very large. Centralized applications often need to purchase CDN
services, while a decentralized service can be composed of a large number of servers
distributed in different regions, in effect building its own CDN to give users a better
experience.
4. The storage node has the autonomy to reject certain content
Decentralized services will face the problem of dealing with illegal content. On the one
hand, the entire network cannot conduct content censorship for users, nor can there be a
unified standard; on the other hand, each node has to meet the legal restrictions on
content in the country where it is located.
At present, Arweave has only a very crude solution, and as its influence grows it will be
unable to meet the conflicting requirements of users and governments.
5. The storage service of the node can be verified
At present, most decentralized services can achieve verifiable storage services, but most of
them are complex and inefficient.
6. Can be integrated with existing web services through gateways
IPFS allows anyone to create a gateway, so that users can use a browser to access the
content stored on IPFS. This technology is not difficult to implement; the open question is
how to motivate a large number of users to build gateways.
Existing decentralized storage networks cannot meet all the needs of web3.
IPFS is a well-designed p2p decentralized storage and transmission network. It applies the
best of p2p technology in the past 20 years, including DHT, BitTorrent, Git, etc. The entire
IPFS protocol is divided into 7 layers: identity, network, routing, exchange, object, files
and naming. Three of these sub-protocols are briefly introduced here.
IPFS is based on content addressing. It uses a DSHT based on S/Kademlia and Coral.
Kademlia is widely used by p2p applications such as Gnutella and BitTorrent. It can find
files in a large-scale node network: a lookup takes on the order of log2(n) hops, so for a
network with 10 million nodes roughly 20 hops are enough to find a file.
Coral has improved Kademlia. When searching for a hash key, only part of the value can
be returned; Coral can also form a cluster by region, and the node first searches in the
cluster of its own region, which greatly increases the efficiency of search and transmission.
For unpopular content, searching through the DHT is still inefficient because it needs to go
through multiple hops. The actual network connection between hops may be very slow,
so the search can take a long time. It is difficult for applications with high real-time
requirements, such as short videos, to use a DHT to find content.
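The Kademlia notion of "distance" and the hop estimate can be illustrated with a small sketch.
It is not IPFS code: the node names are made up, SHA-1 is used only to obtain 160-bit
identifiers, and `expected_hops` is simply the log2(n) rule of thumb.

```python
import hashlib
import math


def node_id(name: str) -> int:
    """Derive a 160-bit identifier from a name (illustrative only)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the XOR of two identifiers, read as an integer."""
    return a ^ b


def closest_nodes(target: int, nodes: list[int], k: int = 3) -> list[int]:
    """Return the k known nodes whose identifiers are closest to the target."""
    return sorted(nodes, key=lambda n: xor_distance(n, target))[:k]


def expected_hops(network_size: int) -> int:
    """Rule-of-thumb upper bound on lookup hops: ceil(log2(n))."""
    return math.ceil(math.log2(network_size))


peers = [node_id(f"peer-{i}") for i in range(100)]  # hypothetical peers
key = node_id("some-content")                       # hypothetical content key
print(closest_nodes(key, peers))
print(expected_hops(10_000_000))
```

For 10 million nodes the rule of thumb gives a value in the low twenties, the same order as
the 20 hops quoted above. Note that XOR distance says nothing about geography: two nodes that
are "close" in identifier space can sit on different continents, which is why each hop may
be slow.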
IPFS distributes content with BitSwap, a protocol inspired by BitTorrent. A node maintains
a want_list and a have_list of data blocks, and exchanges data with other nodes. Unlike
BitTorrent, the blocks in the want_list and have_list are not limited to the blocks of one
file, so all nodes form a single data exchange market.
In order to prevent nodes from only requesting but not contributing, each BitSwap node keeps
a ledger that records credit for the other nodes it exchanges data with. If a node
repeatedly fails to respond and does not serve other nodes, other nodes will stop
responding to its data requests. Through such a credit system, IPFS encourages nodes to store
and actively serve the data needed by other nodes. However, storing a large amount of popular
content and responding to requests from a large number of nodes is very costly. It is
difficult for IPFS to provide large-scale, stable, high-quality services if it relies on
credit systems like BitSwap without economic incentives.
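A credit ledger of this kind can be sketched in a few lines. This is a simplified illustration,
not the BitSwap implementation: the class names and the fixed `max_debt_ratio` cutoff are
assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PeerLedger:
    bytes_sent: int = 0      # bytes we have sent to the peer
    bytes_received: int = 0  # bytes the peer has sent to us

    @property
    def debt_ratio(self) -> float:
        # The +1 avoids division by zero for a peer we have never heard from.
        return self.bytes_sent / (self.bytes_received + 1)


@dataclass
class Node:
    max_debt_ratio: float = 2.0  # assumed cutoff, not a protocol constant
    ledgers: dict[str, PeerLedger] = field(default_factory=dict)

    def ledger(self, peer: str) -> PeerLedger:
        return self.ledgers.setdefault(peer, PeerLedger())

    def record_received(self, peer: str, n_bytes: int) -> None:
        self.ledger(peer).bytes_received += n_bytes

    def serve(self, peer: str, n_bytes: int) -> bool:
        """Send data only while the peer's debt ratio is within the cutoff."""
        entry = self.ledger(peer)
        if entry.debt_ratio > self.max_debt_ratio:
            return False  # the peer takes much more than it gives back
        entry.bytes_sent += n_bytes
        return True


node = Node()
print(node.serve("peer-a", 1000))  # first request from a new peer is served
print(node.serve("peer-a", 1000))  # refused until peer-a contributes something
node.record_received("peer-a", 1000)
print(node.serve("peer-a", 1000))  # served again after peer-a reciprocates
```

The sketch also shows the limitation discussed above: the ledger only balances exchanges
between pairs of peers and does nothing to pay for the bandwidth of a node that serves
popular content to many peers.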
IPFS data objects are stored in a Merkle DAG structure. The Merkle DAG has many
advantages: 1) tamper-proofing: all content is verified by checksum, so any tampering or
damage is discovered immediately; 2) deduplication: identical blocks from different files, or
from different versions of the same file, are stored only once; 3) it is well suited to
file version management, since the IPFS object model itself is derived from Git; 4) if
decentralized storage with incentives and an economic model is adopted and the user pays for
storage, the node needs to prove that it has stored the user’s data the whole time. Because
the data object is a Merkle DAG, this storage proof is relatively simple to complete, as
will be explained in detail later.
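The first two advantages follow directly from addressing blocks by their hash, as the minimal
sketch below shows. It is a flat, in-memory block store rather than a real Merkle DAG, and the
class and function names are hypothetical; a tiny block size is used in the demo so that the
effect is visible.

```python
import hashlib

BLOCK_SIZE = 256 * 1024  # block size quoted in the text


class BlockStore:
    """In-memory content-addressed store: the key is the SHA-256 of the block."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data  # an identical block maps to the same key
        return key

    def get(self, key: str) -> bytes:
        data = self.blocks[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("block does not match its hash")
        return data


def add_file(store: BlockStore, content: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split content into blocks and return the ordered list of block keys."""
    return [store.put(content[i:i + block_size])
            for i in range(0, len(content), block_size)]


def read_file(store: BlockStore, keys: list[str]) -> bytes:
    return b"".join(store.get(k) for k in keys)


store = BlockStore()
v1 = add_file(store, b"AAAAAAAA" + b"BBBBBBBB", block_size=8)
v2 = add_file(store, b"AAAAAAAA" + b"CCCCCCCC", block_size=8)  # second version
print(len(store.blocks))  # the shared first block is stored only once
```

Two versions of a two-block file occupy three blocks in total, and any block whose bytes no
longer match its key is rejected on read.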
A block in the IPFS data object is set to 256 KB, which also brings some disadvantages: 1) For
large files, too many blocks are used, so too many queries and too many nodes are involved in
a download. If most of the nodes are unstable, the user experience will be very bad; 2)
It is more difficult for nodes to censor content. On the one hand, nodes cannot violate the
privacy of users. On the other hand, nodes must also comply with the laws of the country
where they are located. For example, nodes in the United States cannot serve child
pornography content. We will discuss this problem in detail in the solution.
IPFS inherited the spirit of one for all and all for one in the early days of the Internet.
Nodes contribute their own storage and bandwidth for free, and also use the storage and
bandwidth of other nodes for free. The advantage of being free is that there is basically no
requirement for participation, but there are also many defects: 1) Users often use their
own home PCs and home bandwidth as IPFS nodes, because home PCs have excess
storage space, and home bandwidth is paid monthly so it will not have extra costs.
However, users will not spend money to deploy servers in an IDC or to purchase bandwidth.
Home nodes are not stable enough, and home upstream bandwidth is much smaller
than downstream bandwidth, which greatly limits the bandwidth available for serving others;
2) Free services are easy to attack, and the cost of running malicious nodes is low, so IPFS
has to adopt more complex identification and mutual evaluation mechanisms
between nodes; 3) The user experience is not good enough: because many nodes are unstable
and DHT retrieval is inefficient, users cannot expect an efficient and stable
service.
After IPFS, Protocol Labs launched Filecoin, an incentivized decentralized storage, and
there are three markets in Filecoin. In the storage market, storage miners rent out the
available storage space, the storage is verified by the Filecoin network, and storage users
store data by paying Fil. In the retrieval market, users pay Fil to retrieval miners for the
data they provide. Finally, participants can conduct Fil transactions, and Fil circulates
among miners, users and other token holders.
Filecoin’s economic model is based on data storage. Storage miners play a central role in
providing storage services and ensuring consensus on the chain. Storage miners have four
types of income: 1) storage market income: Fil paid by users for storage services;
2) blockchain transaction fees: miners compete to create new blocks and collect the fees
of the transactions within each block; 3) retrieval market revenue: Fil paid by users for
retrieval services; 4) storage mining rewards: block rewards issued to reward those who
maintain the blockchain and run contracts, and to subsidize reliable and useful storage.
The upper limit of Fil minting is 2 billion: 10% is allocated to fundraising (7.5% was sold
in 2017); 15% is allocated to Protocol Labs; 5% is allocated to the Filecoin Foundation; the
remaining 70% is allocated to miners as mining rewards, of which 55% is the storage mining
reward and 15% is the mining reward reserve. Since most Fil is distributed to miners through
rewards, after the launch of the Filecoin mainnet in 2020 miners mainly focused on reward
mining, and most miners do not serve real users. The miners fill sectors with their own data
(copying it directly from hard disk), encapsulate (seal) the sectors, treat Fil purely as
a target of financial speculation, and rent only a small amount of bandwidth to meet the
basic requirements of mining.
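To make the weight of mining rewards concrete, the percentages above can be converted into
token amounts. The sketch only restates the figures given in the text; the dictionary keys
are descriptive labels chosen for the example.

```python
TOTAL_FIL = 2_000_000_000  # minting cap quoted in the text

# Percentages quoted in the text (the 7.5% sold in 2017 is part of fundraising).
ALLOCATION_PERCENT = {
    "fundraising": 10,
    "Protocol Labs": 15,
    "Filecoin Foundation": 5,
    "storage mining reward": 55,
    "mining reward reserve": 15,
}


def allocation_in_fil(total: int = TOTAL_FIL) -> dict[str, int]:
    """Convert each percentage into an absolute number of Fil."""
    return {name: total * pct // 100 for name, pct in ALLOCATION_PERCENT.items()}


amounts = allocation_in_fil()
miner_share = amounts["storage mining reward"] + amounts["mining reward reserve"]

for name, fil in amounts.items():
    print(f"{name:24s}{fil:>15,d}")
print(f"{'total for miners':24s}{miner_share:>15,d}")
```

Miners are allocated 1.4 billion of the 2 billion Fil, which is why reward mining rather
than serving users dominates miner behavior.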
What kind of users need decentralized storage? What kind of applications do they need?
Are they willing to pay for it? Filecoin has not yet been able to answer these questions very
well. Filecoin uses financial incentives to motivate miners to participate at large
scale, and it already has a huge amount of storage hardware. Can it really serve customers
in the end?
For a paid decentralized storage service, the node needs to prove to the customer that it
has stored the user’s data for a period of time. Filecoin adopts two methods:
Proof-of-Replication (PoRep) and Proof-of-Spacetime (PoSt). Proof-of-Replication ensures
that the node saves the user’s data in its own local physical storage. Storage proofs
generally use a challenge/response method: for example, the verifier randomly asks the node to
provide a part of the user data. However, such proofs are open to Sybil, outsourcing and
generation attacks. Filecoin’s PoRep introduces concepts such as sectors and encapsulation to
prevent these attacks. PoRep can only prove that the node stores the user data at the moment
when the verifier challenges it, so Proof-of-Spacetime requires nodes to 1) generate
sequential Proofs-of-Replication as a way to determine time; 2) recursively compose the
executions to generate a short proof. Filecoin uses zk-SNARKs to implement PoRep and PoSt,
which is quite complex.
If there is no storage mining reward, the storage proof does not need to be very strict. For
example, a Sybil attacker gains no benefit and so has no motivation to attack. Proof of
storage can then be implemented in a very simple way, with no need for PoSt. This greatly
simplifies the implementation of storage proofs, and we
will discuss it in detail in the solution.
The Arweave mainnet launched in June 2018, and it provides a “permanent storage”
service. Arweave observed that storage costs have declined at a rate of about 30% per year
over the past few decades, and its model estimates that prepaying the storage cost of at
least 200 years is not an alarming amount for users. Because of this feature, Arweave has
become the decentralized storage used by many DeFi and NFT projects, such as Mirror, Solana,
Uniswap, Yearn, MakerDAO, etc. Arweave uses a much more conservative estimate when it
actually calculates the cost of permanent storage, but will the cost of storage continue to
fall so rapidly over the next 200 years? Arweave also does not take into account the cost of
miner bandwidth, and it is questionable whether its economic model can truly support
permanent storage.
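The reasoning behind prepaid "permanent" storage is a geometric series: if the yearly cost
falls by a fixed fraction, the sum of all future yearly costs is finite. The sketch below
works this out. The 30% figure is the one quoted above; the 0.5% rate is an illustrative
assumption for a much more pessimistic scenario and is not taken from Arweave, and bandwidth
costs are ignored entirely.

```python
def prepaid_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """Sum of first_year_cost * (1 - annual_decline)**t for t = 0 .. years - 1."""
    if annual_decline == 0:
        return first_year_cost * years
    keep = 1.0 - annual_decline
    return first_year_cost * (1.0 - keep ** years) / annual_decline


def perpetual_cost(first_year_cost: float, annual_decline: float) -> float:
    """Limit of the series as years grows without bound (needs a positive decline)."""
    return first_year_cost / annual_decline


# Cost of 200 years of storage, in multiples of the first year's cost.
print(prepaid_cost(1.0, 0.30, 200))   # decline rate quoted in the text
print(prepaid_cost(1.0, 0.005, 200))  # assumed pessimistic rate, for comparison
print(perpetual_cost(1.0, 0.30))
```

At a 30% yearly decline, 200 years of storage cost only a little over three times the first
year's cost, and storing forever costs practically the same. The result is very sensitive to
the assumed rate, which is why the long-term trend of storage prices matters so much to this
model.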
Arweave stores user data in the transactions of its blocks, and it created the Blockweave and
BlockShadow technologies so that nodes do not need to store all the blocks of the chain.
Arweave introduces two new data structures: 1) the blocklist, a hash list of all
previous blocks, through which old blocks can be verified and potential new blocks
evaluated; 2) the walletlist, a list of all active wallet addresses, which enables nodes
to validate new transactions without owning the block that produced the previous
transaction. Since these blocklists and walletlists have already been fully verified in the
“continuous verification” process of each previous block, new miners do not have to
download and verify the entire blockchain. This design is presumably intended to reduce
costs: if each node had to save every complete block, all nodes would in effect save a copy
of all user data, and this cost would be too high. We estimate that Arweave is more suitable
for saving files with a small amount of data, such as web pages, but not for videos,
games and other content with a large amount of data.
Since user data is stored in transactions, Arweave introduced the BlockShadow technology to
decouple transactions from blocks and send only the “shadow” of a block between nodes. The
shadow contains only a walletlist and a block hash list, as well as a list of transaction
hashes, from which a node can reconstruct the complete block. This solution
addresses the cost and efficiency problems of block distribution and consensus, but we should
dig deeper: is it a good design to store user data in transactions? The drawbacks are obvious:
it greatly increases the complexity of the system and is very unfavorable for the
storage of large files.
Another important feature of Arweave is free data access, which uses the wildfire
mechanism to incentivize nodes to share data for free. In reality, servers are placed in
different IDCs, and the speed of users accessing servers in different IDCs is very different.
Generally speaking, the server in the IDC that is closer to the top router of the backbone
network responds faster to users, and the user experience is better; a server in an ordinary
IDC can only serve users in the same area. Buying bandwidth from a good IDC is very
expensive, far exceeding the cost of storage. Therefore, in order to provide a high-quality
user experience for Internet services, it is often necessary to purchase CDN services. Even
the cheapest P2P CDN costs far more than storage. We believe that accessing
data for free is not a good design choice; this will be discussed in more detail in the
solution.
Arweave’s wildfire is similar to IPFS’s Ledger. The node establishes a ranking system for
other nodes that exchange data with it. The ranking is based on the node’s response speed
to network requests and the speed of receiving data. The node will give priority to sending
transactions and blocks to the nodes according to the ranking to give these nodes an
advantage in mining, thereby incentivizing nodes to actively respond to requests from
other nodes. The distribution of user data across the network is extremely important. If
user data is always stored on the node “closest” to the user who needs it, then the cost
and user experience of the entire decentralized storage network will be optimal, with a lower
cost and a better user experience than centralized storage solutions. This is
the key to the success of a decentralized storage network. What kind of data distribution
mechanism comes closest to this ideal state? It will be discussed in detail in
the solution.
Arweave lets nodes refuse to serve certain content, and lets the network vote democratically
in its overall governance to refuse content that is condemned by the majority.
Although this mechanism exists, in current practice each node only maintains a blacklist
of hash values of data that it does not want to store. Because usage is still
small, this demand is not urgent for Arweave, and the current
mechanism, although simple, does not seem to have had any adverse effects. A decentralized
network serving a large number of users and applications will inevitably encounter this
important problem, which involves the laws of different countries, and we will also discuss
this in the solution.
Swarm was a project supported by the Ethereum Foundation in the early days and
operated independently after raising funds. Let’s take a look at the economic model of Swarm
first. The initial distribution of the BZZ token issued by Swarm is as follows: 7% for the
foundation, 42% for private placement, 8% for public offering, 10% for DApp subsidies, 20%
for the development team, and 13% for the ecosystem (funding for
participating nodes’ equipment and bandwidth costs), plus 1 million BZZ airdropped through
the testnet, for a total of 62.5 million. After the mainnet went online, BZZ follows a
bonding curve mechanism. When the BZZ price is higher than the public offering price, new
BZZ is issued in the same proportions as the above distribution; when the price falls, BZZ is
automatically burned to reduce the circulating supply, but it is not a stablecoin.
Storage nodes earn income through storage incentives, bandwidth incentives and
discovery incentives. 1) Storage incentives: users need to pay for “stamps” to upload data;
Swarm aggregates all stamp payments and redistributes them according to the storage size of
each node; 2) Bandwidth incentives: data upload and download between nodes usually involve
senders, forwarders and receivers, and both forwarders and receivers earn income. The
income comes from traffic fees that nodes pay one another. The upper limit of
the traffic fee that a node can receive depends on its bandwidth: the higher the
bandwidth, the higher the upper limit; 3) Node discovery incentives.
Swarm is different from Filecoin: the income of nodes mainly comes from the payments of
real users, not from mining rewards. Before Swarm went online in 2021, it attracted a large
number of miners hoping to mine, reaching more than 400,000 nodes, of which more than
200,000 were active. But after its launch, miners soon found that the income was very low
because Swarm could not attract many real paying users, and many miners could not keep
operating for long.
What is the real need for decentralized storage? How big is it? How should it develop and
compete with alternatives? These are basic questions that any decentralized project whose
purpose is to provide a service must answer, and we will discuss them in the solution. Swarm
encourages users to use it first. Its accounting protocol design is distinctive: the
bandwidth that nodes contribute to each other is accounted for, but nothing needs to be paid
at the beginning; payment is required only once the debt of one party reaches a certain
level, which encourages users to use it.
Swarm is similar to a simplified IPFS in retrieval and transmission, with an incentive
layer implemented on the Ethereum xDai sidechain. The basic unit of
storage in Swarm is called a chunk, and the maximum size of a chunk is 4 KB. Therefore, its
design goal does not include large files. The Swarm team may have chosen a strategy of
growing along with the demand for decentralized storage from decentralized
applications, rather than attracting users through a rising token price, because most
of the speculators attracted that way are not users of decentralized storage. Of these two
different strategies, which will bring decentralized storage to maturity?
CRUST introduces MPoW (Meaningful Proof-of-Work) based on TEE (Trusted Execution
Environment). TEE is a concept proposed by GlobalPlatform. Currently, the main
implementations are from Intel and ARM. We don’t know much about TEE technology. It requires
special hardware to achieve trusted computing, but trusted computing is not necessary for
the consensus of the blockchain or for the storage proofs of nodes. If the purpose of trusted
computing is to increase efficiency and security, then it is not the only choice, and good
architecture design can achieve the same purpose.
There are also other decentralized storage projects, such as Sia and Storj, but so far
no project can fully meet the needs of Web3 for decentralized storage networks, and
the market for decentralized storage as a whole is still very small. In this chapter, we
will propose solutions for Mises.
1. User needs
The decentralized storage network can serve both 2C users and 2B users, and the needs of
the two are very different.
2C users are willing to pay for decentralized storage services to save two types of data:
1) important personal data; 2) data that can make money for them. The data may be text,
pictures or videos. At present, centralized cloud drives (iCloud, etc.) have a large
number of paying users. A cloud drive gives users a certain amount of free storage
space and charges for the excess according to size and storage time.
Compared with centralized cloud drives, decentralized storage has the following
advantages: 1) the user’s account and content are not controlled by the service provider
and truly belong to the user; 2) security is better, because user data can be stored by
several different service providers in the network at the same time, which reduces the
probability of data being damaged. If the storage costs are
comparable, users will choose decentralized storage services.
There are many creators on the Internet who create novels, music and web dramas and
hope to earn income from their content. These creators are active on different platforms,
contribute content, and share the revenue with the platform. Due to the monopoly nature
of Internet platforms, creators are often in a weak position: most of the revenue is taken
by the platform, and creators have no bargaining power. Creators can instead put their
content on the decentralized storage network. The network generates URLs for the
content, including introduction, preview, price and other information, and the creators
spread these URLs through forums, communities and social media, and later through
decentralized social media or communities, where creators really own their
viewers/customers. No powerful platform takes most of the creator’s revenue; only a small
part of the revenue, determined by open source code, is distributed to the nodes and
development teams of the decentralized network.
2B users are more focused on using decentralized storage as a CDN service to provide their
users with a high-quality experience at a low cost. Decentralized storage always looks
for the closest and fastest node to serve each end user. This is achievable if there are
enough nodes and they are distributed evenly across the Internet, and it also reduces the
consumption of Internet backbone bandwidth. Implementation options are
discussed in Section 6.
The content of 2B users is often free to its end consumers; a small amount of premium
content is accessible only to members or requires a one-time payment. Since the decentralized
network treats each access to the content as a transaction, each access incurs a (very
small) gas fee and a bandwidth fee, which the enterprise needs to pay on behalf of its
users. A decentralized network does not by itself protect an enterprise against a malicious
attack in which a program accesses its content at scale, so businesses need to deploy their
own anti-spam strategies.
With the user needs of decentralized storage services clarified, the following sections give
Mises’s solutions. For specific system design, please refer to Chapter 3.
2. Economic Model
Principle 1: User data is not stored on the blockchain. The upload and download of user
data is a transaction between consumers and servers (storage nodes), and each transaction
is stored on the Mises chain. Different from the atomic properties of token transfer
transactions, the completion of upload and download transactions requires a process, and
its completion requires a mechanism to verify.
User data is not stored on the chain because:
1. It is not necessary: putting data on the chain is not a precondition
for decentralized storage;
2. If user data is stored on the chain as transactions, it greatly increases the
amount of data synchronized with each transaction. The storage
required by nodes and the time required to synchronize transactions greatly
reduce the efficiency of consensus and increase the cost of nodes;
3. Mitigating the problem in 2 requires many optimizations like those of Arweave,
which increase the complexity of the code and make the
entire network less robust.
Principle 2: The storage node is not a node of the Mises chain. Its revenue comes entirely
from the user’s payment and share, and there is no mining reward.
Mining rewards for storage nodes will bring many disadvantages:
1. They stimulate false storage demand: storage nodes construct fake transactions
to obtain rewards. Filecoin is currently facing such a situation;
2. To prevent nodes from fraudulently obtaining mining rewards, a strict storage
proof mechanism has to be introduced, and high gas fees are used to raise
the cost of cheating, which increases the complexity of the entire system;
3. A large number of tokens are paid out for behavior with no real value, so the
entire economic model is not optimized for providing the decentralized storage service
at the lowest overall cost. For example, the high gas fees charged are very
unfavorable to real demand;
4. A lot of money is invested in the early stage; if there is no real demand to support it,
the project will inevitably fail later, hurting the entire industry.
The user’s payment for the storage service is not paid to the storage node all at
once. Storage generally has a service period. After the user pays, the
MIS is held by the Mises chain. A Mises chain node periodically (once a month)
requests a storage proof from the storage node. After the node passes the proof, it
receives one month’s service fee, a small part of which goes into the storage node’s
staking pool until the staking requirement is met.
Content consumers, creators and storage nodes form a content market, which sets prices
for content. When consumers acquire content, they pay according to the price (or
a third party, often the operator of a certain
service, pays for them). After the storage node completes the service, it can obtain 25% of
the revenue, if this share of the revenue exceeds its minimum service fee. In this way,
storage nodes share the premium income of good content.
Principle 3: The Mises chain adopts the PoS (proof-of-stake) mechanism, and the chain nodes
need to stake MIS to obtain mining rewards and transaction gas income.
Please refer to the economic model section of the Mises project white paper.
Principle 4: The Mises chain gas fee should be kept low and stable relative to fiat currency.
The Mises chain mainly supports decentralized social relations, decentralized storage
services, and decentralized social media/communities. These are not financial services, and
users cannot tolerate high gas fees, so the chain should try to avoid excessive gas fees
caused by currency price fluctuations.
A simple solution is to consider the currency price over a period of time (3 months) each
time the gas fee is calculated. Let p be the average currency price at 9 time points in the
last three months, and g the target price of the gas fee in fiat currency. The number of MIS
to be paid is m = g/p. If the gas fee is paid by the community or social media operator on
behalf of the user, the node charges m MIS. If the user pays personally, the wallet
recommends m to the user, and the user can choose to increase or decrease the MIS paid, as
long as there are nodes willing to include the transaction in a block.
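The calculation m = g/p can be written down directly. The following is a minimal sketch; the
function name, the sample prices and the target fee are made-up values for illustration and
are not parameters of the Mises chain.

```python
from statistics import mean


def gas_fee_in_mis(target_fee_fiat: float, sampled_prices: list[float]) -> float:
    """Return m = g / p, where p is the mean of the sampled MIS prices in fiat."""
    if not sampled_prices:
        raise ValueError("at least one price sample is required")
    p = mean(sampled_prices)
    if p <= 0:
        raise ValueError("average price must be positive")
    return target_fee_fiat / p


# Hypothetical example: 9 price samples over three months, target fee of 0.001 in fiat.
prices = [0.40, 0.45, 0.50, 0.55, 0.50, 0.60, 0.50, 0.45, 0.55]
m = gas_fee_in_mis(0.001, prices)
print(f"recommended fee: {m:.6f} MIS")
```

Averaging over several samples means that a short-lived price spike or crash moves the
recommended fee only slightly, which is the stabilizing effect the principle aims for.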
3. Build and Cost
Let’s first look at two projects: Filecoin and Swarm.
Filecoin was launched by Protocol Labs, which developed IPFS and has a high reputation.
In 2017, the Filecoin ICO sold 7.5% of Fil and raised 250 million US dollars, prompting many
investors around the world to invest in mining. According to a rough estimate, by the time
the Filecoin mainnet went online in 2020, miners had invested billions of dollars in
hardware such as servers and storage. After the mainnet went online, due to miners’
staking, the burning of high gas fees and the 180-day mining release mechanism, coupled with
the crypto bull market in the first half of 2021, the price of Fil kept rising, attracting
more miners to invest, and the network’s storage power has now reached 15.53 EB
(as of 2022-02-16). With such a huge storage capacity, there is not much real demand. The
income of miners depends entirely on the rewarded Fil. If Filecoin does not use this time to
develop real decentralized storage users, it will either have to keep attracting more miners
or collapse, and miners will lose their huge early investment.
We believe that it is basically impossible to develop such a huge demand for decentralized
storage.
It was rumored that Swarm was a project promoted by Vitalik and Gavin Wood. It was
also popular in the first half of 2021, with a large number of miners participating. At one
time, there were more than 400,000 nodes, of which more than 200,000 were active. Apart
from the airdrop of a total of 1 million BZZ to the nodes participating in the test, the
BZZ rewards after the mainnet went online were very small, and there was very little usage
by actual users. A large number of miners found that they could only earn a small amount of
BZZ, which was not enough to pay for the cost of server hosting. The price of BZZ did not
skyrocket as expected, the popularity of Swarm ebbed, and a large number of nodes were
no longer active. This is a good thing for Swarm. By gradually developing users, reducing
usage costs, and optimizing user experience, it may find a way to gradually prosper.
Mises believes that the development of users cannot be achieved overnight, and that the
development of nodes should not create false prosperity by hyping tokens. Decentralized
storage should not be a financial service in nature. It is a key step for crypto to move
beyond the financial field, and whether it succeeds will determine whether the crypto
industry can grow by dozens of times in the next 20 years.
4. Storage Proof
The storage service has a service period and cannot be an instantaneous transaction.
During the service period, the storage node is generally required to provide verifiable
storage proofs. In particular, a network that gives nodes mining rewards based on storage
services requires strict storage proofs. Otherwise, a large number of reward tokens
will be obtained through cheating by reward farmers (the so-called “wool party”), and the
incentive will not achieve its purpose. Filecoin’s PoRep and PoSt proofs are quite complex
and use zk-SNARKs. However, if there is no mining reward, the incentive to cheat for rewards
disappears and the node gets more of its income from real access services, so the node does
not need a very strict storage proof at all.
Mises’s solution is that, during the storage service period, the nodes of the Mises chain
challenge the storage node every month at a time that is not fixed in advance. The
challenge asks the storage node for a randomly chosen 4 KB block of a certain content; the
chain node calculates the hash of the returned block and compares it with the hash of the
corresponding block in the Merkle tree built when the content was uploaded. If the hashes
match, the proof succeeds. The size of the data block is configurable. If the block is too
large, more data must be transmitted and more computation is required for each
verification; if the block is too small, the Merkle tree of the content becomes too large.
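The challenge described above can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the unspecified hash function, the chain node is assumed to keep the per-block leaf hashes of the Merkle tree, and the function names are hypothetical rather than part of any Mises API.

```python
import hashlib
import secrets

BLOCK_SIZE = 4096  # 4 KB, as in the text; the text says this is configurable


def split_blocks(content: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split content into fixed-size blocks (the last block may be shorter)."""
    return [content[i:i + block_size] for i in range(0, len(content), block_size)]


def leaf_hashes(content: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every block at upload time; these are the Merkle tree leaves."""
    return [hashlib.sha256(b).hexdigest() for b in split_blocks(content, block_size)]


def make_challenge(num_blocks: int) -> int:
    """Chain node picks an unpredictable block index to request."""
    return secrets.randbelow(num_blocks)


def verify_response(leaves: list[str], index: int, block: bytes) -> bool:
    """Chain node hashes the returned block and compares it with the stored leaf."""
    return hashlib.sha256(block).hexdigest() == leaves[index]
```

The block-size trade-off is visible in the number of leaves: a 1 GiB file split into 4 KB blocks yields 262,144 leaf hashes, while larger blocks shrink the tree but make every challenge response bigger.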
The storage fee paid by the user is not sent to the storage node all at once. Over the
service period, each time a storage proof succeeds, the network transfers part of the fee
to the node, until the storage service ends. When the storage node is initialized, it does
not need to provide staking tokens. Instead, each time the node receives a service fee, 20%
of that fee is withheld as the node’s stake until the stake reaches the required amount. If
the data is accessed by other users, that access is equivalent to a more complete storage
proof. If it is found that the node does not hold the complete data on its server, the
network deducts the node’s staking tokens to compensate the paying users. The node can no
longer provide services until it has replenished its stake.
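A minimal sketch of this payment and staking flow is shown below. The 20% rate comes from the text; the class name, the `required_stake` value, the integer token units and the `top_up` method are assumptions made for illustration only.

```python
from dataclasses import dataclass

STAKE_PERCENT = 20  # share of each service-fee payment withheld as stake (from the text)


@dataclass
class NodeAccount:
    required_stake: int      # hypothetical target stake, in smallest token units
    stake: int = 0
    balance: int = 0
    suspended: bool = False

    def pay_installment(self, amount: int) -> None:
        """Release one installment after a successful storage proof."""
        shortfall = max(self.required_stake - self.stake, 0)
        withheld = min(amount * STAKE_PERCENT // 100, shortfall)
        self.stake += withheld
        self.balance += amount - withheld

    def slash(self, penalty: int) -> int:
        """Deduct stake when data is found missing; return the compensation taken."""
        taken = min(penalty, self.stake)
        self.stake -= taken
        self.suspended = True  # no service until the stake is made whole again
        return taken

    def top_up(self, amount: int) -> None:
        """Node replenishes its stake; service resumes once the target is met."""
        self.stake += amount
        if self.stake >= self.required_stake:
            self.suspended = False
```

With a required stake of 30 and installments of 100, the first payment withholds 20, the second withholds the remaining 10, and later payments go to the node in full.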
5. Efficient retrieval
Decentralized storage is usually addressed by the hash of the content, and retrieval is
also decentralized. IPFS, Swarm and other projects have adopted Kademlia or improved
Kademlia-based protocols, but problems remain: 1) Kademlia calculates the “distance”
between nodes from their node IDs (i.e., hashes). Therefore, when a retrieval message is
relayed, the relaying nodes may be very far apart on the actual Internet, for example one
in China and one in the United States, and the network speed may be very slow. Even with
the improved Coral, this problem cannot be eliminated; 2) Because the retrieval speed
cannot be guaranteed, this approach is unsuitable for applications with strict real-time
requirements, such as TikTok.
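To make the first problem concrete, the sketch below computes Kademlia's XOR distance between IDs. The node names and the use of SHA-1 to derive 160-bit IDs are illustrative assumptions; the point is only that the metric looks at ID bits and carries no information about where a node sits on the Internet.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit ID by hashing an arbitrary node name (illustrative)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest(target: int, ids: dict[str, int]) -> str:
    """Return the name of the node whose ID is XOR-closest to the target."""
    return min(ids, key=lambda name: xor_distance(ids[name], target))
```

Because IDs are hashes, two nodes in the same data center can be far apart in XOR distance, while the XOR-closest node to a given key may be on another continent.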
Mises proposes a two-level index scheme. 1) All nodes of the Mises chain also serve as
index nodes; that is, they build an index of hash : urls keyed by the hash value of the
content. Index entries are established through the consensus mechanism of the Mises chain,
so the index is consistent across nodes and a user who connects to any index node retrieves
the same result. 2) Alternatively, the chain nodes keep only a simple hash-based index of
the content. For example, for content id1, the user first retrieves from a chain node the
index of the nodes responsible for it, and then connects to one of those nodes to query.
3) When uploading a file, the user can ask for a preview page to be generated. A URL that
accesses the file is embedded directly in the preview web page, so the file can be reached
without any retrieval by the user. Only if the URL is invalid are other copies in the
decentralized network retrieved.
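One way to read the first two lookup paths is sketched below. The class names, the in-memory dictionaries and the example URLs are hypothetical, and the choice to query only the first responsible node is an assumption; the sketch is meant only to show why a lookup costs one or two queries.

```python
from dataclasses import dataclass, field


@dataclass
class IndexNode:
    name: str
    urls_by_hash: dict[str, list[str]] = field(default_factory=dict)

    def query(self, content_hash: str) -> list[str]:
        """Second-level lookup: URLs this index node knows for the hash."""
        return self.urls_by_hash.get(content_hash, [])


@dataclass
class ChainNode:
    # Level 1: hash -> urls, replicated on every chain node through consensus.
    full_index: dict[str, list[str]] = field(default_factory=dict)
    # Level 2: hash -> index nodes responsible for that content.
    responsible: dict[str, list[IndexNode]] = field(default_factory=dict)

    def resolve(self, content_hash: str) -> tuple[list[str], int]:
        """Return (urls, number of queries made)."""
        if content_hash in self.full_index:
            return self.full_index[content_hash], 1
        nodes = self.responsible.get(content_hash, [])
        if not nodes:
            return [], 1
        return nodes[0].query(content_hash), 2  # connect to one responsible node
```

Content in `full_index` resolves with a single query to any chain node; everything else needs one query to find a responsible node and a second query to that node.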
Content indexed by all nodes is the most efficient to retrieve, since it needs only one
query. However, node resources are limited, so the content owner should pay a higher
retrieval cost for it; this option is especially suitable for applications that require
high real-time performance. Mises’ solution allows any content to be retrieved with at most
two queries, which greatly improves efficiency. In addition, Mises does not divide content
into many chunks for storage, which greatly reduces the size of the index, so fewer index
nodes can serve more. Finally, most access through the preview web page requires no
retrieval at all, which greatly reduces end users’ demand for retrieval. Mises is suitable
for services that demand a good user experience. In fact, it can achieve a better user
experience than centralized services. In the next section, we analyze this from the
perspective of download.
6. Content distribution and CDN
1) Network distance
The Internet connects different types of networks in different countries and regions
through routers. Generally speaking, accessing a server in another country or region is
slower than accessing a server in the same region. There are three reasons:
- 1. Accessing other countries and regions requires transit through more routers;
- 2. Backbone network bandwidth between countries and regions is limited;
- 3. Other non-technical reasons.
Routers also sit at different levels. A top-level router generally exchanges routing
information only with other top-level routers, through the BGP protocol. The routing
information of a server that connects directly to a top-level router is more easily known
to other routers. If we regard the entire Internet as a graph, with routers as nodes and
the access delay between two directly connected routers as the distance between them, we
can calculate the “distance” between any two computers on the Internet as the length of the
shortest path between them.
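The graph view above maps directly onto a shortest-path computation. The sketch below uses Dijkstra's algorithm over a hypothetical router graph whose edge weights are per-hop delays in milliseconds; the router names and delay values are made up for illustration.

```python
import heapq


def shortest_distance(graph: dict[str, dict[str, float]], src: str, dst: str) -> float:
    """Dijkstra's algorithm: smallest total edge weight from src to dst."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")  # dst is unreachable from src


# Hypothetical routers: the direct A-C link is slower than going through B.
routers = {
    "A": {"B": 5, "C": 20},
    "B": {"A": 5, "C": 8},
    "C": {"A": 20, "B": 8},
}
```

Here `shortest_distance(routers, "A", "C")` follows A to B to C, because 5 + 8 is less than the direct 20.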
IP addresses are allocated to Internet operators in advance, and each operator then assigns
them to the computers that connect through it. Therefore, knowing an IP address often
reveals its approximate location on the Internet. The Internet can be divided into
different AS (autonomous system) domains. Computers in the same domain are relatively close
to each other, so retrieval and storage services provided by servers in the user’s own AS
domain are much more efficient. The index and storage nodes of Mises will be grouped by AS
domain in order to provide a better user experience than centralized services.
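Grouping nodes by AS domain from their IP addresses could look like the sketch below. The prefix-to-AS table is entirely hypothetical (it uses address blocks and AS numbers reserved for documentation); a real deployment would need actual routing data, which the text does not specify.

```python
import ipaddress
from collections import defaultdict
from typing import Optional

# Hypothetical prefix -> AS number table built from documentation-only ranges.
PREFIX_TO_AS = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
}


def as_of(ip: str) -> Optional[int]:
    """Return the AS number whose prefix contains the address, if any."""
    addr = ipaddress.ip_address(ip)
    for network, asn in PREFIX_TO_AS.items():
        if addr in network:
            return asn
    return None


def group_by_domain(nodes: dict[str, str]) -> dict[Optional[int], list[str]]:
    """Group node names by the AS domain of their IP address."""
    groups: dict[Optional[int], list[str]] = defaultdict(list)
    for name, ip in nodes.items():
        groups[as_of(ip)].append(name)
    return dict(groups)
```

Nodes whose address matches no known prefix land in the `None` group and would need another way to determine their domain, such as measuring access speed to peers.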
The index nodes, storage nodes and users in the Mises network need to determine their
own domain. Secondary index nodes in a domain index only the content saved by the storage
nodes in that domain. When an index node responds to a user’s retrieval request, it always
returns the content saved by storage nodes in the same domain as the user. Domains are not
static. Nodes and users need to regularly check the access speed of their connected peers,
so as to make the domain division as accurate as possible.
2) Content distribution
Principle 1: Keep content closest to your visitors.
No single server on the Internet can give all visitors fast access, so applications with a
large number of users need to purchase CDN services to improve the user experience. The
essence of a CDN service is to keep content as close as possible to users. If the
decentralized storage network has enough nodes, it does not need a third-party CDN
service; it can act as its own CDN by adjusting how content is distributed across nodes.
Principle 2: Visitors to content are mostly in the same area as content creators.
Mises users first connect to chain nodes, index nodes and storage nodes in the same region
as themselves. When a user uploads content, storage nodes in the same area are selected
preferentially. The decentralized storage network also has a content “replication”
strategy. When access to a piece of content increases in a certain area, a storage node in
that area can actively request to store the content, and the uploading user does not need
to pay that node a storage fee. Users of the content benefit, and the node may stop storing
the content at any time.
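The replication strategy could be driven by a simple per-region access counter, as in the sketch below. The threshold, the class and its method names are all assumptions; the text does not say how a node decides that access has increased enough.

```python
from collections import Counter


class RegionalNode:
    """Storage node that volunteers to replicate content popular in its region."""

    def __init__(self, region: str, threshold: int = 1000) -> None:
        self.region = region
        self.threshold = threshold  # hypothetical access count that justifies a copy
        self.stored = set()
        self.access_counts = Counter()

    def record_access(self, content_id: str, user_region: str) -> bool:
        """Count a local access; return True if the node should request a replica."""
        if user_region != self.region:
            return False
        self.access_counts[content_id] += 1
        return (content_id not in self.stored
                and self.access_counts[content_id] >= self.threshold)

    def replicate(self, content_id: str) -> None:
        """Store a voluntary copy; the uploader pays no storage fee for it."""
        self.stored.add(content_id)

    def drop(self, content_id: str) -> None:
        """Voluntary copies can be discarded at any time."""
        self.stored.discard(content_id)
```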
2B (business) users who need CDN services can specify multiple areas for their content and
the number of storage nodes required in each area. 2B users pay the storage fees and
bandwidth fees for the content, and users who access the content do not need to pay.
Users (both content suppliers and content consumers), storage nodes, and index nodes
together form a storage market and a content consumption market.
Storage nodes quote a price per unit of storage space and time. Each node can see the
quotes of all storage nodes, and it can adjust its own offer at any time.
When a user stores personal data (call this user A), A can see the quotations of nodes in
the same area. A can choose a storage node directly, or simply hand the choice over to the
network, which matches a node according to the market price.
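Network-side matching might be as simple as picking the cheapest quote in the user's area, as sketched below. The `Quote` fields, the per-GB-per-month pricing unit and the cheapest-wins policy are assumptions; the text only says that matching follows the market price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    node: str
    area: str
    price_per_gb_month: int  # hypothetical unit, in smallest MIS units


def match_node(quotes: list[Quote], area: str, size_gb: int, months: int) -> tuple[Quote, int]:
    """Pick the cheapest quote in the user's area and compute the total fee."""
    local = [q for q in quotes if q.area == area]
    if not local:
        raise LookupError(f"no storage node is quoting in area {area!r}")
    best = min(local, key=lambda q: q.price_per_gb_month)
    return best, best.price_per_gb_month * size_gb * months
```

A user who prefers a specific node can skip `match_node` and pay that node's quote directly.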
When a user stores content to be disseminated (call this user B), in addition to
selecting a storage node, B also needs to set a price for consuming users to access the
content, and this price cannot be lower than the minimum price specified by the storage
node.
2B users are actually buying CDN services (suspended).
When a content-consuming user accesses paid content, the user first sees the preview page
of the content and then decides whether to pay to view the full content. Free content from
2B users can be browsed directly.
Does Decentralized Storage Not Moderate Content?
Is it impossible to delete user-uploaded content? What if a node or user violates the laws
of the country where they are located? These issues are very important. Kazaa and
BitTorrent have faced many lawsuits all over the world over the past 20 years due to
pirated content. Studying these lawsuits will help us find solutions that not only protect
users’ data, but also prevent nodes from breaking the law.
Principle 1: User data for personal use only is not reviewed and cannot be deleted unless
the user is in arrears.
Mises encrypts the user’s personal data before storing it, and the data can be decrypted
only by the user with the private key. Neither the node nor any third party knows what the
content is, so it is impossible to review it. Since viewing the content requires the user’s
private key, the content cannot be disseminated and cannot have any social impact, so there
is no need for censorship. The node also has no right to delete the user’s private content
unless the user defaults on the storage service fee.
Principle 2: Content that the user wants to disseminate can be accessed, through the URL
or preview web page generated at upload, with the Mises browser or any other browser.
When a user uploads content and wants third parties to access it, Mises generates a URL for
the content. The user can spread the URL through his own social media or community, and
other users can open it in a browser. If the user wants to charge for access to the
content, Mises generates a preview page URL for it, and other users decide whether to pay
the user MIS to view the full content after seeing the preview page. With the Mises
browser, if the URL is invalid, the Mises network can be queried by the hash id of the
content to find copies saved by other nodes, so the content can still be accessed.
Principle 3: The node can view the content (via URL or preview web page) stored by the
user for dissemination, and the node has the right to refuse to provide services for the
content.
Every country has laws that prohibit some content from being distributed on the Internet.
For example, viewing child pornography is illegal in the United States. Nodes that operate
for a long time cannot violate local laws, so nodes have the right to refuse to provide
services for certain content. The node can view the content stored by the user for
dissemination. If the content violates the law of the node’s location or the religious
beliefs of the node operator, the node can file a refusal of service on those grounds. The
decentralized network will then look for another node to provide the service (such as a
node in another country), taking the reason for the refusal into account. The refusing node
can then delete the content and return the MIS it has collected, and the index nodes will
be updated accordingly. If all nodes refuse to serve the content, the Mises network will
refund the MIS to the user after deducting the gas cost.
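The refusal and refund flow can be sketched as a loop over candidate nodes. The `Policy` callback, the function name and the integer MIS amounts are hypothetical; how the network actually uses the refusal reason to choose the next candidate is not specified in the text, so the sketch simply records it.

```python
from typing import Callable, Optional

# A node's policy returns a refusal reason, or None if it accepts the content.
Policy = Callable[[str], Optional[str]]


def place_content(
    nodes: list[tuple[str, Policy]], content_id: str, fee: int, gas_cost: int
) -> tuple[Optional[str], int, dict[str, str]]:
    """Try nodes in order; return (serving node, refund to user, refusal reasons)."""
    refusals: dict[str, str] = {}
    for name, policy in nodes:
        reason = policy(content_id)
        if reason is None:
            return name, 0, refusals  # a node accepted, nothing is refunded
        refusals[name] = reason
    # Every node refused: refund the fee minus the gas already spent.
    return None, max(fee - gas_cost, 0), refusals
```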
In Mises’ scheme, the node is the legally responsible party for the content. The node is not
outside the laws of the country where it is located, but there are more than 200 countries
in the world, and different countries have different laws. It is this diversity that allows
different content to have the opportunity to find the largest living space through the