The Oracle Problem
One of the major issues with blockchain technology and smart contracts is the inability to natively connect to external data sources. Because most of the world’s data remains off-chain, blockchain networks face an uphill battle in building real-world applications. The second part of the problem is that blockchains need to be able to trust the validity of off-chain data and know it hasn’t been tampered with.
Oracles alleviate the oracle problem by bridging the gap between the off-chain and on-chain worlds. Put differently, oracles are digital entities that connect blockchain networks to different data sources. These data sources can be on-chain or off-chain: traditional databases, application programming interfaces (APIs), or blockchain-native sources. Similar to APIs, blockchain oracles serve as a way for different data sources to communicate with each other. Decentralized oracle networks (DONs) enable the implementation of trustless smart contracts by providing data availability, data validity, and off-chain computation.
Importance of Decentralization
Centralized oracles pose issues with data accuracy and susceptibility to manipulation because they present a single point of failure. DONs spread that risk across multiple entities. They use multiple nodes and data sources to ensure data validity and reduce the chance that incorrect data points make it downstream to smart contracts and end-users. Centralized oracles are often a non-starter for smart contracts because a single malicious actor can send bad data downstream. Additionally, centralized oracles have lower data availability: their operator can easily shut them off or deny service to smart contracts. The combination of weak correctness guarantees and poor availability makes decentralized oracles the clear choice for smart contracts.
For decentralized oracle networks to work properly, an incentive framework needs to be designed to align the node operators’ incentives with those of data consumers. Operators need to be rewarded for providing accurate data and penalized for poor quality. A proper framework generally includes two aspects: attributability and accountability.
Attributability means that oracles are required to sign the data they provide to the network. This creates an immutable record on the blockchain, so third parties can easily determine which oracles have been providing good data and which have historically been incorrect. Future users can then choose to request data only from highly rated oracles.
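As a rough illustration of signed, attributable reports, the sketch below tags each report so that later tampering is detectable and the report can be traced back to the oracle that produced it. It is a simplification under stated assumptions: real oracle networks use public-key signatures recorded on-chain, whereas this example substitutes an HMAC from Python’s standard library as a stand-in, and the names `sign_report` and `verify_report` are hypothetical.

```python
import hashlib
import hmac
import json


def _payload(oracle_id: str, value: float) -> bytes:
    # Canonical byte encoding of the report so signer and verifier agree.
    return json.dumps({"oracle": oracle_id, "value": value}, sort_keys=True).encode()


def sign_report(secret_key: bytes, oracle_id: str, value: float) -> dict:
    """Attach an authentication tag to a report (HMAC stands in for a signature)."""
    tag = hmac.new(secret_key, _payload(oracle_id, value), hashlib.sha256).hexdigest()
    return {"oracle": oracle_id, "value": value, "tag": tag}


def verify_report(secret_key: bytes, report: dict) -> bool:
    """Return True only if the report is unchanged since it was signed."""
    expected = hmac.new(
        secret_key, _payload(report["oracle"], report["value"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, report["tag"])


report = sign_report(b"demo-key", "oracle-1", 1850.25)
tampered = dict(report, value=9999.0)  # same tag, altered value
```

Verifying `report` succeeds, while verifying `tampered` fails because the tag no longer matches the altered value.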
Accountability comes from staking. Decentralized oracle networks can require network participants to place a stake of confidence on the data they provide. If operators are highly certain the data is good, placing a stake on their claim carries minimal risk. If they are uncertain about the accuracy of the data, they will stake less and will likely be a lower priority for data requestors. Operators can additionally be rewarded for continually providing reliable data or have their stakes taken for providing inaccurate data. This holds oracle operators accountable and incentivizes proper behavior.
Implementing attributability and accountability structures should align the best interests of oracle operators and data consumers and reduce the rewards available to malicious actors. Finely tuned incentives combined with an increasing number of network participants minimize attack vectors and reduce the chances of bad data making it to end-users.
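The reward-and-penalty mechanics described here can be sketched as follows. This is a minimal illustration, not any specific network’s rules: the `Operator` class, the `settle_report` function, the 1% tolerance, the flat reward, and the 50% slash fraction are all assumed values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    stake: float         # tokens the operator has put at risk
    reputation: int = 0  # running score visible to data consumers


def settle_report(op: Operator, reported: float, reference: float,
                  tolerance: float = 0.01, reward: float = 1.0,
                  slash_fraction: float = 0.5) -> bool:
    """Reward an accurate report; slash part of the stake for an inaccurate one.

    Assumes `reference` is non-zero and is the value later accepted as correct.
    """
    relative_error = abs(reported - reference) / abs(reference)
    if relative_error <= tolerance:
        op.stake += reward
        op.reputation += 1
        return True
    op.stake -= op.stake * slash_fraction
    op.reputation -= 1
    return False


honest = Operator("honest", stake=100.0)
careless = Operator("careless", stake=100.0)
settle_report(honest, reported=100.5, reference=100.0)    # within 1%: rewarded
settle_report(careless, reported=150.0, reference=100.0)  # 50% off: slashed
```

After one round the honest operator holds 101.0 tokens and the careless operator holds 50.0, so repeated inaccuracy quickly erodes both stake and reputation.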
Types of Oracles
Grouped by the data service they provide, oracles fall into four broad categories:
- Input Oracles
- Output Oracles
- Cross-Chain Oracles
- Compute-Enabled Oracles
Input / Output Oracles
Input oracles are the most common category of oracles and are used to bring real-world data on-chain. Input oracles connect to existing databases or APIs and help bridge the gap between the on-chain and off-chain worlds. A simple example of an input oracle would be a lending/borrowing protocol using price data from a centralized exchange’s API to determine the value of collateral that a user is depositing.
Without oracles, manual processes would have to be introduced into on-chain interactions, defeating the whole purpose of trustless blockchain networks.
Output oracles work in the opposite direction from input oracles, bringing on-chain data off-chain. This enables off-chain computation using on-chain data and makes blockchain data accessible to various analytics programs. Additionally, output oracles are used to trigger off-chain actions. For example, an oracle can trigger a legacy system payment or prompt a database entry.
Cross-Chain Oracles
Cross-chain oracles read and write information across different blockchains. Digital assets are currently fragmented across various layer-1, layer-2, and side-chain networks. The ability for blockchains to communicate with each other improves interoperability and market efficiency. Cross-chain oracles provide a foundation for crypto bridges and omnichain protocols, alleviating some of the liquidity fragmentation problems.
Compute-Enabled Oracles
Compute-enabled oracles (CEOs) are the newest category. They use secure off-chain computation to perform calculations that would be impractical on-chain and deliver the results to decentralized blockchain services. These oracles are frequently used in zero-knowledge rollups, where computation is completed off-chain and only validated transaction batches are posted on-chain, reducing costs and increasing transaction throughput.
Together, the different types of oracles enable smart contracts to use data from various sources, whether on-chain or off-chain, while preserving secure, trustless operation.
Across the oracle categories described above, there are different design structures that are considered to best suit data needs. Considerations revolve around data availability, storage requirements, and aggregation techniques.
Immediate Read Oracles
Immediate read oracles are the simplest oracles and are used for quick data decisions. The data is often small and easy to query. Data responses are provided on a just-in-time basis, hence the name immediate read. An example of where an immediate read oracle is used would be a simple query of “is this number larger than 100?” in which case the oracle could quickly provide the necessary yes or no response.
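The “larger than 100” query above can be modeled in a few lines. This is an in-memory analogy rather than contract code, and the class name `ImmediateReadOracle` and helper `exceeds_threshold` are hypothetical.

```python
class ImmediateReadOracle:
    """Holds small values that callers read synchronously, with no request queue."""

    def __init__(self):
        self._values = {}

    def post(self, key, value):
        # The oracle operator writes the current value.
        self._values[key] = value

    def read(self, key):
        # Consumers read the stored value immediately.
        return self._values[key]


def exceeds_threshold(oracle, key, threshold=100):
    """Answer the yes/no question 'is this number larger than the threshold?'."""
    return oracle.read(key) > threshold


oracle = ImmediateReadOracle()
oracle.post("reading", 150)
answer = exceeds_threshold(oracle, "reading")
```

The consumer gets its answer in a single read, which is what distinguishes this pattern from the request-driven and feed-driven designs.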
Request-Response Oracles
Request-response oracles are one of the most common design patterns for Ethereum-based smart contracts. As their name implies, a smart contract requests data and the oracle responds with off-chain data. Request-response oracles are best suited to datasets too large to keep in a smart contract’s storage, where an end-user needs only a small portion of the data set at a time.
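A minimal in-memory sketch of the request-response flow is shown below. It is an analogy under stated assumptions: the class `RequestResponseOracle`, its method names, and the sample prices are hypothetical, and a Python callback stands in for the on-chain function an oracle would call when fulfilling a request.

```python
class RequestResponseOracle:
    def __init__(self, offchain_dataset):
        self._dataset = offchain_dataset  # large data that stays off-chain
        self._pending = []                # queued (request_id, key, callback)
        self._next_id = 0

    def request(self, key, callback):
        """Consumer step: queue a request and return its identifier."""
        request_id = self._next_id
        self._next_id += 1
        self._pending.append((request_id, key, callback))
        return request_id

    def fulfill_pending(self):
        """Oracle step: look up each requested key and deliver it via callback."""
        fulfilled = 0
        while self._pending:
            request_id, key, callback = self._pending.pop(0)
            callback(request_id, self._dataset.get(key))
            fulfilled += 1
        return fulfilled


received = {}
rr_oracle = RequestResponseOracle({"ETH/USD": 1850.0, "BTC/USD": 29000.0})
rid = rr_oracle.request("ETH/USD", lambda req_id, value: received.__setitem__(req_id, value))
pending_before = dict(received)  # nothing delivered until the oracle responds
count = rr_oracle.fulfill_pending()
```

Only the one requested value reaches the consumer, while the rest of the dataset never leaves the oracle.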
Publish-Subscribe Oracles
Publish-subscribe oracles differ from request-response oracles in that they provide a continuous data feed that other contracts can read. The data is more likely to change frequently, so consuming contracts must watch for changes to the oracle’s data storage.
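The publish-subscribe pattern can be sketched the same way. This is again an in-memory analogy with hypothetical names (`PublishSubscribeOracle`, `subscribe`, `publish`, `latest`). On an actual chain, consumers would typically read the feed’s stored value or watch for update events rather than receive direct callbacks.

```python
class PublishSubscribeOracle:
    def __init__(self):
        self._latest = {}       # most recent value per feed
        self._subscribers = {}  # callbacks to notify per feed

    def subscribe(self, feed, callback):
        self._subscribers.setdefault(feed, []).append(callback)

    def publish(self, feed, value):
        """Store a new value and notify subscribers; skip unchanged values."""
        if feed in self._latest and self._latest[feed] == value:
            return False
        self._latest[feed] = value
        for callback in self._subscribers.get(feed, []):
            callback(feed, value)
        return True

    def latest(self, feed):
        return self._latest[feed]


updates = []
feed_oracle = PublishSubscribeOracle()
feed_oracle.subscribe("ETH/USD", lambda feed, value: updates.append(value))
results = [
    feed_oracle.publish("ETH/USD", 1850.0),  # new value: subscribers notified
    feed_oracle.publish("ETH/USD", 1850.0),  # unchanged: no notification
    feed_oracle.publish("ETH/USD", 1860.0),  # changed: subscribers notified
]
```

Subscribers see only the two distinct values, and any other consumer can still read the latest value directly.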
As mentioned, two key aspects of oracles are data accuracy and availability. To meet both conditions, oracle networks apply several data aggregation techniques. The leading provider of oracles, Chainlink, uses three levels of aggregation for its price feed oracles: the data source level, the node operator level, and the oracle network level. At the data source level, oracles pull price data from independent data aggregators. These aggregators combine data from centralized and decentralized exchanges (CEXs and DEXs) to produce a volume-weighted price and filter out spoofed data. The first layer therefore already captures a wide collection of price data, and Chainlink vets data aggregators before approving them, with the aim of ensuring high-quality data practices.
At the node operator level, after retrieving data from multiple independent data aggregators, an oracle takes the median value across all data aggregators to mitigate outliers and API downtime. As a result, Chainlink oracles capture a wide spectrum of data across the whole market and smooth it into a single tamper-resistant value.
Lastly, at the oracle network level, the median of the values reported by multiple node operators is taken. The median is often taken after a predetermined number of nodes attest to a data point, but the aggregation rules can be customized. This adds a third layer of decentralization and validation to the process.
In summary, price data is pulled from independent data aggregators that collect data from multiple sources, then each individual node takes the median value across all data aggregators, and finally the median value across the node network is taken. In combination, these three levels of aggregation make it considerably harder for a malicious actor to manipulate data, and they are designed to give protocols a high level of confidence in the security and reliability of their data.
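The three aggregation levels can be sketched as three small functions. This is an illustrative model of the process as described in this section, not Chainlink’s actual implementation: the function names, the `min_responses` parameter, and all sample prices are assumptions made for the example.

```python
from statistics import median


def volume_weighted_price(trades):
    """Data source level: `trades` is a list of (price, volume) pairs."""
    total_volume = sum(volume for _, volume in trades)
    return sum(price * volume for price, volume in trades) / total_volume


def node_answer(aggregator_prices):
    """Node level: median across aggregators; None marks an unavailable API."""
    available = [price for price in aggregator_prices if price is not None]
    return median(available)


def network_answer(node_answers, min_responses=3):
    """Network level: median across nodes once enough of them have reported."""
    if len(node_answers) < min_responses:
        raise ValueError("not enough node responses to aggregate")
    return median(node_answers)


vwap = volume_weighted_price([(100.0, 1.0), (102.0, 3.0)])
# One aggregator is down (None) and one reports an outlier (250.0).
node_value = node_answer([vwap, 101.0, None, 250.0])
# One node reports a manipulated value (5.0).
final_value = network_answer([node_value, 101.4, 101.6, 5.0, 101.5])
```

In this run the outlier aggregator and the manipulated node both fall outside the median, so the final answer stays at 101.5. An attacker would need to corrupt a majority of inputs at some level to move it.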
Use Cases & Market Landscape
One of the largest use cases for oracles is Decentralized Finance (DeFi). All DeFi apps require price data to enable functions like swapping, borrowing, lending, and trading. As mentioned previously, Chainlink’s price feed oracles are essential to providing these services. DeFi has already blossomed into a huge industry with more than $50 billion in total value locked (TVL), down from a peak of nearly $180 billion in 2021. Without accurate price data delivered to dApps across multiple blockchains, the DeFi system can’t operate.
Another use case is cross-chain oracles connecting enterprises directly to blockchain networks, providing them with an innovative middleware solution. Oracle networks can help institutions connect their business to the blockchain economy and integrate smart contracts into their day-to-day operations. This can improve operational efficiency and remove manual processes from traditional workflows, and output oracles can connect to backend systems to keep off-chain systems in sync. For example, an insurance business could use smart contract functionality with input oracles to automatically verify whether an insurable event took place. After confirming the event, output oracles can send a signal to the business’s backend system to trigger a payment to the insured customer.
As modern society becomes more and more digitized, there is an increasing number of data points and sources that blockchains can connect to. DeFi and middleware are just two examples of oracle use cases, but the true number of applications is limited only by the number of real-world data sets. Below is a visualization of the extensive Chainlink network and the value it provided in 2022.
It’s not hard to imagine how big the oracle market could grow as the on-chain and off-chain worlds continue to intertwine. The real market growth will come as real-world assets (RWA) start to move on-chain. Stock trading, derivatives, real estate, insurance, and foreign trade are all industries that can move onto blockchains and will require oracle technology to operate. The combined market across these industries is in the hundreds of trillions of dollars.
The transition of these markets into cryptographic form will take years to complete but none of it is possible without a robust oracle network. Oracles are uniquely positioned to benefit from broad adoption of blockchain technology and should scale in lockstep with the industry as a whole.
The oracle token market currently has a value of $4.2 billion and is largely dominated by one player: Chainlink. Chainlink benefited from being one of the first players in the space and has since become the market standard for oracle solutions. Chainlink’s market cap currently sits at $3.3 billion, representing 78.5% of the total oracle market. The next two largest players are UMA and Band Protocol, which each have market caps of approximately $170 million, roughly one-nineteenth of Chainlink’s.
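The share and ratio quoted above can be checked with simple arithmetic. The sketch below uses only the rounded figures from this paragraph, and the variable names are illustrative. Because the inputs are rounded, the computed share comes out near 78.6%, which is in line with the quoted 78.5% if that figure was derived from unrounded market caps.

```python
# Rounded figures quoted in the text, in US dollars.
oracle_market_cap = 4.2e9
chainlink_market_cap = 3.3e9
runner_up_market_cap = 170e6  # approximate cap of UMA or Band Protocol

# Chainlink's share of the oracle token market, as a percentage.
chainlink_share_pct = chainlink_market_cap / oracle_market_cap * 100

# How many times larger Chainlink is than either runner-up.
size_multiple = chainlink_market_cap / runner_up_market_cap

print(f"Chainlink share: {chainlink_share_pct:.1f}%")
print(f"Chainlink vs. runner-up: {size_multiple:.1f}x")
```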
Oracle Token Use
As the market is largely dominated by one protocol, it makes sense to examine the uses of Chainlink’s token, $LINK. The LINK token has two main use cases:
- Medium of exchange
- Rewards token for node operators and alerters
Medium of Exchange
Most Chainlink oracles fall under the request-response design structure, meaning that smart contracts on blockchain networks request data from the oracle and the oracle node then sends data back to the contract. When a smart contract requests data from a Chainlink oracle, it’s required to pay for the data request in LINK. The fees are set by the operator and tend to vary according to the data request. LINK’s design as a medium of exchange begins to create a “data currency,” where any protocol requiring secure data needs to purchase LINK tokens so its smart contracts can request data from reputable nodes. As noted in the previous section, the number of use cases for oracles is enormous, and the LINK token is positioned to sit at the center of the market as the main data currency.
Rewards token for node operators and alerters
In addition to being a medium of exchange between data requestors and node operators, Chainlink requires operators to stake LINK tokens to help align the operator’s incentives with the oracle network. The more LINK a node provides, the more services it’ll be requested to complete, and therefore more fees accrue to that operator. Operators’ stakes are subject to forfeiture upon node failure or malicious behavior. Additionally, Chainlink has implemented what they call Future Fee Opportunity (FFO) which drives fees towards oracle nodes with strong performance histories. Misbehavior is met with a reputation downgrade, reducing an operator’s FFO. As mentioned previously, a proper incentive framework is paramount to the success of an oracle network. Chainlink’s staking deposits and FFO provide implicit and explicit incentives to operators, directly addressing the attributability and accountability aspects of an oracle incentive framework.
In a recent update, Chainlink enabled LINK staking for non-node operators to increase the security of the Chainlink network and provide holders with more utility. Non-operator stakers can raise alerts if network nodes do not meet predefined performance conditions. This helps decentralize network security and keeps nodes honest, as their stakes can be slashed if an alert is raised. In addition to helping secure the network, staking pays approximately 5% APR, giving general holders utility beyond simple token appreciation. In combination, the LINK token’s utility reinforces a strong network of trustless data.
As mentioned, important aspects of oracles are being tamper-resistant and providing accurate data. This is not always the case, as there have been several successful oracle attacks resulting in extreme losses for users. One of the most prominent examples is the Mango Markets hack which was executed via oracle manipulation.
Mango Markets, a DeFi platform on Solana, was exploited for $112 million. The attacker deposited collateral into Mango’s platform and then manipulated the price oracle for the collateral asset to artificially inflate its value. Because the collateral value appeared extremely high, the attacker was able to borrow large amounts of other assets while keeping the loan-to-value (LTV) ratio within normal parameters. Using over $10 million in total, the attacker pumped the price of $MNGO by over 300% and then borrowed against the artificially inflated $MNGO, getting away with $112 million.
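The mechanics of this kind of attack come down to simple arithmetic: borrowing capacity scales linearly with the oracle price of the collateral. The sketch below uses made-up numbers, not figures from the Mango Markets incident, and the function `max_borrow` and the 50% maximum LTV are assumptions for illustration.

```python
def max_borrow(collateral_units, oracle_price, max_ltv):
    """Largest loan a protocol allows against collateral valued at the oracle price."""
    return collateral_units * oracle_price * max_ltv


collateral_units = 1000
honest_price = 2.0
pumped_price = honest_price * (1 + 3.0)  # a 300% increase is a 4x price

capacity_before = max_borrow(collateral_units, honest_price, max_ltv=0.5)
capacity_after = max_borrow(collateral_units, pumped_price, max_ltv=0.5)
```

With the same collateral, borrowing capacity rises from 1,000 to 4,000 while the LTV ratio still looks normal to the protocol. Once the price falls back, the loan is left badly undercollateralized.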
Price manipulation is the single largest risk to oracles in their current state. Crypto is still a small asset class, and many assets are small and illiquid, making them susceptible to manipulation despite the layers of aggregation that oracle networks use. As the industry continues to grow, it will become harder for malicious actors to successfully attack oracles.
Blockchain networks at their core are closed-loop systems unable to connect to real-world data sets or other blockchain networks. Oracle technology alleviates this problem, bridging any data input onto a blockchain network, enabling the creation of trustless smart contracts and serving as a backbone for things such as DeFi and enterprise middleware. Oracles should continue to grow as more enterprises begin using blockchain technology and need secure and accurate data. The potential for oracle growth is extraordinary if real-world industries begin to move to a tokenized future, but that future is fully contingent on whether oracle networks can implement proper design structures to ensure data is readily available and tamper-resistant.
Sources: TVL data from DefiLlama as of 4/27/2023; market capitalization data from CoinGecko as of 5/11/2023.