Chainlink, The Ethereum Oracle

For months now 4chan’s /biz/ could easily be mistaken for /chainlink/, leading one to dismiss the project.

The astronomic recent rise of about 10x in weeks could also lead one to just call it a 4chan pump, but is there actually something here?

Chainlink’s astronomic rise, June 2019

Looked at more closely, the project’s whitepaper is almost unreadable, but thankfully Thomas Hodges, Integration Engineer for Chainlink, was kind enough to give us an extensive interview, found below.

Chainlink Birdseye View

Chainlink’s aim is to connect smart contracts to the real world through what are called oracles.

An easy example is a bet on England v Germany at the World Cup: who will win? A fancier one is replacing BitPay: you send eth to the smart contract, an exchange automatically turns it into fiat and pays the pizzeria, you get the pizza, and the exchange gets the eth.

The use cases are countless, including in machine to machine payments, insurance, and much more, with the innovation here being that instead of connecting to one server that feeds the data, you connect to numerous servers or better called nodes.

Underneath there is much complexity in the implementation, with the code all open sourced, but basically you choose the nodes (you can run one yourself too), their data is then aggregated, and the result is sent to the smart contract on-chain.

That in effect makes the blockchain aware of what’s going on outside, something which when connected with sensors and so on, can make the tech a lot more powerful than currently.

All of this was made possible by an ICO in September 2017 when $32 million was raised, hence perhaps explaining the 4channers.

The Link token is an ERC20. There is no blockchain here. This is just a middleware or something running on top of ethereum or other smart contract blockchains.

Finally of relevance is the potential use of Trusted Hardware or secure enclaves where the data goes through the enclave without the node operator even knowing what data is going through, with the security of data here significantly increased.

Chainlink, The Oracle

We spoke to Thomas Hodges, Integration Engineer for Chainlink, to gain a better understanding at a conceptual level of how all this works.

How does a smart contract get written to by the data feed?

Thomas Hodges: The Chainlink node responds through its oracle contract to the consuming contract.

How? As in let’s say we bet on who will win the world cup, we have all these feeds, they say Germany, how does the smart contract now react to this, as in how you writing to the smart contract?

Okay, that’s a different kind of question. The process is like an async request with two on-chain transactions: you create a transaction which requests data; the node responds with data.

When getting into how you accomplish that first step, you’ll have to have some understanding of how to create a request.

Basically you would tell the node where to get the data, like which API to hit, what field of that API’s response to parse and to what data type.
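
As a rough illustration of such a request, the Python sketch below names an API, a path into the JSON response, and a multiplier for converting the value to an integer. The parameter names (url, path, times), the example.com address and the canned response are assumptions made for illustration, not Chainlink’s actual request schema, and the HTTP call is stubbed so the sketch runs offline.

```python
import json
from decimal import Decimal

# Hypothetical request parameters; the field names are illustrative only.
request = {
    "url": "https://example.com/api/price?pair=ETHUSD",  # which API to hit
    "path": ["data", "price"],                            # which field to parse
    "times": 100000000,                                   # scale before casting to an integer
}

def fetch(url):
    """Stand-in for an HTTP GET; returns a canned JSON body (assumed shape)."""
    return '{"data": {"pair": "ETHUSD", "price": "250.5"}}'

def process(req):
    """Fetch the response, walk the path to the wanted field, return a scaled integer."""
    value = json.loads(fetch(req["url"]))
    for key in req["path"]:
        value = value[key]
    # Decimal avoids the rounding surprises of binary floats.
    return int(Decimal(value) * req["times"])

scaled = process(request)
```

The integer in scaled is the kind of value a node could then write on-chain, since smart contracts work with integers rather than decimal fractions.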

In the current architecture, you send requests to multiple distinct nodes, and receive the corresponding answers directly in your consuming smart contract.

That is what we’re doing with the reference data contracts to store the price of ETH/USD on-chain.

I understand how you get the data, I’m trying to understand how the smart contract is told to send say eth to the say England address or Germany address and how the smart contract is actually able to move the eth, I think Sergey Nazarov called it triggering?

That’s entirely up to you as the contract writer. You’re going to receive an answer back on your consuming contract, you can do whatever you want with that.

But how does it trigger it? It needs the private key right or isOwner permission?

The Chainlink node writes to its oracle contract, using its private key, calling the fulfillOracleRequest method. Within that method, the Chainlink node operator collects payment and triggers your contract.

That way, your contract only receives an answer from the oracle contract that you send the request to.

Right, so with chainlink the outside world can only respond to what happens onchain. It can’t tell onchain what to do?

The Chainlink node technically can, but it won’t receive payment for doing so. We’re creating a prepayment protocol that would allow the Chainlink node to do just that.

So how does the smart contract read the chainlink node?

The smart contract receives the answer from the oracle contract that it sent the request to.

How does it receive the answer? Like how you connecting to the smart contract, onchain, on let’s say eth?

The Chainlink node calls the fulfillOracleRequest method of its oracle contract, which then calls the consuming contract with the answer.

Right so you create a new contract with the data?

Not necessarily. You can provide the address and function for the contract that you want the answer to be sent to.

So how does the oracle contract, which is outside eth right, how does it call eth?

The oracle contract is on-chain. The Chainlink node is off-chain.

Well then one step up, how does a chainlink node call the oracle contract?

When the request is sent to the oracle contract, an event is logged which contains the data about the request. The Chainlink node subscribes to these events and responds after processing the Job ID that was requested. Within that job, there is a task called EthTx which allows the node to write the answer back to the oracle contract.
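
The round trip described here can be modelled in a few lines of Python. This is a toy, in-memory simulation and not Chainlink’s code: the class names, the list standing in for the event log, and the poll method are assumptions, while fulfillOracleRequest, the Job ID and the EthTx step are the pieces named in the interview.

```python
class OracleContract:
    """On-chain side: logs requests as events and forwards answers."""
    def __init__(self):
        self.events = []      # stand-in for the on-chain event log
        self.callbacks = {}   # request_id -> function to call with the answer
        self.next_id = 0

    def oracle_request(self, job_id, callback):
        request_id = self.next_id
        self.next_id += 1
        self.callbacks[request_id] = callback
        self.events.append({"request_id": request_id, "job_id": job_id})
        return request_id

    def fulfill_oracle_request(self, request_id, answer):
        # Forward the answer to the consuming contract that asked for it.
        callback = self.callbacks.pop(request_id)
        callback(request_id, answer)

class Node:
    """Off-chain side: reads the oracle contract's events and answers them."""
    def __init__(self, oracle, jobs):
        self.oracle = oracle
        self.jobs = jobs      # job_id -> function producing an answer
        self.seen = 0         # number of events already processed

    def poll(self):
        for event in self.oracle.events[self.seen:]:
            answer = self.jobs[event["job_id"]]()
            # Last task of the job (the EthTx step): write the answer back.
            self.oracle.fulfill_oracle_request(event["request_id"], answer)
        self.seen = len(self.oracle.events)

class Consumer:
    """Consuming contract: receives the answer and can act on it."""
    def __init__(self):
        self.answer = None

    def fulfill(self, request_id, answer):
        self.answer = answer

oracle = OracleContract()
node = Node(oracle, {"eth-usd": lambda: 28246698320})
consumer = Consumer()
oracle.oracle_request("eth-usd", consumer.fulfill)  # transaction 1: the request
node.poll()                                         # transaction 2: the response
```

After node.poll() runs, consumer.answer holds the value the job produced. What the consuming contract does with it, such as paying out a bet, is up to whoever wrote that contract.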

By eth tx do u mean an ethereum transaction?

Yes, it creates an Ethereum transaction, with the data containing the answer.

To write to the smart contract, presumably you need some sort of authorization otherwise anyone can write to it.

That’s why you use the recordChainlinkFulfillment modifier. So that you only receive an answer back from the same oracle contract address that you sent the request to.
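
The idea behind that check can be sketched in Python as follows. This illustrates the principle only, not the actual Solidity modifier: the consuming contract remembers which oracle each request went to and rejects an answer from any other sender. The class, addresses and error message are hypothetical.

```python
class Consumer:
    """Toy consuming contract that only accepts answers from the oracle it asked."""
    def __init__(self):
        self.pending = {}   # request_id -> oracle address the request was sent to
        self.answer = None

    def send_request(self, request_id, oracle_address):
        self.pending[request_id] = oracle_address

    def fulfill(self, sender, request_id, answer):
        # Reject unknown requests and callers other than the requested oracle.
        if self.pending.get(request_id) != sender:
            raise PermissionError("source must be the oracle of the request")
        del self.pending[request_id]   # each request can be fulfilled only once
        self.answer = answer

contract = Consumer()
contract.send_request(1, "0xOracleA")
contract.fulfill("0xOracleA", 1, 42)   # accepted: same oracle the request went to
```

A call such as contract.fulfill("0xMallory", 2, 7) would raise PermissionError, because no request with that ID was ever sent to that address.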

And then how do you manage the aggregation. Going to the simple Germany (G) England (E), one node says G another E and I understand there’s the reputation aspect, but am trying to understand the aggregation first, how you’re kind of putting all this data together to come with presumably one answer or do you send all the answers in that transaction?

Currently, the consuming contract has to do aggregation of distinct answers that it receives from each oracle that it sent requests to. You can see an implementation of this here, and that contract is in use as our reference data contract and can be monitored here.

In the service agreements protocol, this process would be simplified by sending a single request and receiving n answers from the oracle nodes you selected in the agreement. The coordinator contract associated with service agreements is still in development.

Let’s say we have 4 Gs and 5 Es, does it just go with E?

So the Aggregator contract allows its owner to define which oracles to send requests to, what jobs to trigger, and how much to pay them. Then it also handles responses, including m of n, to store the median value of the answers received and record the block at which that value was calculated.

So like, say you send 10 requests to 10 distinct oracles, the creator of the Aggregator can define that when 7 oracles respond, take the median instead of waiting for all 10.
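
A minimal Python sketch of this m of n behaviour follows. It is not the deployed Aggregator contract: the class, method and field names are assumptions, and the integer median (rounding down when the count is even) is one plausible choice rather than a confirmed detail.

```python
def median_int(values):
    """Median of integers; for an even count, the floor of the two middle values' mean."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) // 2

class Aggregator:
    def __init__(self, oracles, minimum_responses):
        self.oracles = set(oracles)                 # n: oracles chosen by the owner
        self.minimum_responses = minimum_responses  # m: answers needed before updating
        self.responses = {}                         # oracle -> answer
        self.current_answer = None
        self.updated_height = None

    def fulfill(self, oracle, answer, block_height):
        # Ignore oracles that were not asked, and repeat answers from the same oracle.
        if oracle not in self.oracles or oracle in self.responses:
            return
        self.responses[oracle] = answer
        if len(self.responses) >= self.minimum_responses:
            self.current_answer = median_int(self.responses.values())
            self.updated_height = block_height

oracles = [f"oracle{i}" for i in range(10)]
aggregator = Aggregator(oracles, minimum_responses=7)
for i, answer in enumerate([100, 101, 99, 102, 98, 500, 100]):
    aggregator.fulfill(f"oracle{i}", answer, block_height=8000000 + i)
```

With seven of ten answers in, the stored value is the median, 100, so the single outlier of 500 has no effect on the result. That is the practical advantage of a median over a mean here.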

And if I understand correctly, in this example above, we have 9 nodes, so 9 sets of data through a tx, and this 9 set is now onchain?

Pretty much yeah, so we’d send 9 requests to 9 different oracles and receive 9 responses.

Do you have an etherscan contract that has data received from an oracle?

Yes of course, you can even see the current monitoring of our reference data contract here. To take an actual response as an example, see this transaction.

So that transaction is to a very busy contract with algorithms and no data.

It has data, it stores the current ETH/USD rate and the block height at which that rate was last updated.

Where do I click on etherscan to see that? I’m looking at this https://etherscan.io/address/0x89f70fa9f439dbd0a1bc22a09befc56ada04d9b4#contracts

That is actually an oracle contract, not the Aggregator. You would want to look at this: https://etherscan.io/address/0x79fEbF6B9F76853EDBcBc913e6aAE8232cFB9De9#readContract

Chainlink aggregator contract, July 2019

Right, and just so that it’s very clear, what does 9240 (now 9476) uint256 mean here? I know about uint, but more what answer did this contract want and thus what 9240 mean?

9240 would be the grouping of when requests were sent. So as a security precaution, we track the groupings of when requests went out to oracles so that they don’t get responses mixed in with different requests. You’re actually going to want to look at the values for currentAnswer and updatedHeight.

Technically speaking, latestCompletedAnswer could be made an internal or private variable since the requester doesn’t really have any concern over it.

Right so current answer is 28246698320 int256. What that mean? ETH’s price? In satoshis I guess?

CurrentAnswer is the current price of ETH, multiplied by 100000000.
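
As a worked example of that scaling, the snippet below converts the stored integer back to a dollar price. The variable names are illustrative; the value and the multiplier are the ones quoted above.

```python
from decimal import Decimal

SCALE = 100000000              # multiplier applied before the price is stored
current_answer = 28246698320   # value read from the contract at the time

# Divide by the multiplier to recover the price in USD.
price_usd = Decimal(current_answer) / SCALE
```

For the value shown, that comes to roughly 282.47 USD per ETH.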

Interesting. If you have the time, if I can complete the picture. The idea is that someone like me can be a data source right and run a node etc, where is my data as a node source stored?

More like you would be, as a node operator, a data mover/facilitator/transporter. The data is stored at its source, be it some API, you are simply retrieving that data and writing it to smart contracts which requested it.

The node itself is not really concerned with storing that data that it retrieved.

So I can’t manually enter say 420 for eth’s price?

What do you mean? Like, as in a scenario where you have a malicious oracle? The answer for that is decentralization. So that your contract isn’t triggered by a single oracle’s incorrect/malicious response.

No I mean honestly, like I have a chainlink node, they want me to say what x is, and I just enter x.

Well, for one thing the Chainlink node doesn’t take manual input. You would have to write your own client to do that anyway.

So what does the node do?

The node retrieves the answer from the requested API and responds on-chain. All automated, without the need for intervention from the operator.

So who is running the API?

That depends on the API. A requester could request data for Kaiko’s API, for example.

Well that’s just one step up. I can run the API and the node.

Sure. If you’re a data provider yourself. Again Kaiko is a good example of this. They provide an API and they also run a Chainlink node.

Not familiar with them

They’re an API service that provides data about cryptocurrency prices.

I think what you’re saying is that the node operator is trusted to choose the API?

Not exactly. The requester can also choose the API. Either by supplying it as a parameter in their request, or by using a Job ID specific to an API.

Right so the aim here then is to ensure the data feeder doesn’t lie about what the API says?

Correct, by utilizing multiple nodes to retrieve data from either the same API or similar APIs, like what we do with cryptocurrency prices.

So I can’t interfere with the API unless I am the API? And by can’t obviously I mean… hopefully can’t I guess.

Right.

Interesting, and I suppose I can’t just do this all directly to the API myself or multiple APIs in the aggregator contracts I template?

You need to have some client interface between your smart contract and the API. Smart contracts can’t make any external references on their own.

Why not, you said you just ethtx.

That is from the opposite direction: the Chainlink node writing to the blockchain. A smart contract cannot reach out to external systems, like an API, without some middleware, which is what Chainlink is.

Well chainlink is a smart contract.

We have smart contracts as part of the protocol, but it is also an off-chain client.

A smart contract reader. So you see what’s happening in the smart contract, but I get what you say, here you have many servers seeing it rather than one.

Right. On-chain, the oracle contract receives requests and emits events, which are read by the off-chain Chainlink node.

Last one, in regards to secure enclaves, how does one know they are running a secure enclave or trusted hardware etc?

SGX offers remote attestation for that assurance.

What does that mean? So like I’m here, let’s say I have copy pasted a template or whatever and want to ensure the chainlink node is running trusted hardware, can I be ensured?

Yes, you can obtain a cryptographically proven assurance that the remote node is running SGX. https://software.intel.com/en-us/sgx/attestation-services

Is there someone who can speak to the business aspect? Like you guys working/met Intel, they developing this stuff for you or is it generalized hardware adaptable to what you’re doing?

SGX specifically is an open source SDK available for anyone to utilize.

So I think all this chainlink stuff is open sourced, you want to comment on why wouldn’t I just copy it and remove the token aspect and replace it with eth?

That’s not really my concern. Any open source project faces the same problem. It basically comes down to trust in the forking team’s ability to maintain their version of the software long-term.

FIN

The Link token is used to pay node operators for fulfilling requests. Such payment can obviously be made in eth, but a new team that wants to fork the project would face two problems.

First, funding. Chainlink has raised millions, so they have the funding to keep the software secure and maintained long term. A forked team might instead be just a hobby or side project.

The second challenge they face is skills. The Chainlink team seems to be very knowledgeable in what they do, so competing and keeping up might not be easy.

Competition however would presumably always be welcomed in the open space, with a lot that can be done in this specific niche, including marketplaces where a fee can be taken for their usage.

That makes this a very interesting project that can develop in many ways beyond API authentication and connection.

An interesting one might be allowing an individual, like a police officer, to enter say “accident” and rate its severity.

You can likewise do the same with court judgments: a simple “win” and “1 million dollars” is entered, with that amount then released automatically, potentially triggering insurance and a very complicated web of payments, especially in negligence or tort, all without human involvement except for the “win” bit.

One can easily see here potential for abuse. The court officer might lie. But that would then just create a bit more work to manually send the payment to the correct place and obviously would get the officer fired and perhaps imprisoned.

Because although these things happen automatically, we can see them and obviously we do know, especially in this simple example of the court order, that the input was incorrect.

For while machines or the blockchain might facilitate a better handling of complexity, humans would still rule by the exercise of judgment.

So there would be oversight to ensure the system is running fine, which it probably would be maybe 99% of the time. In the other 1% of cases, it’s just a bit more work to correct the mistake, which in this case is just about money. It’s not as if the current system doesn’t have the same problem of the court officer.

This merely has the potential to make back-end work more efficient by automating it. Currently that court order goes to some lawyer’s admin, who sends it to an insurance admin, who spends however long clicking all the boxes and sends it back to the lawyer with the payment, which eventually reaches the winner after months. Here you turn a lot of that into code.

Then there’s the more “digital” aspect of connecting the blockchain to sensors and wifi so that it is aware of what’s going on and thus lets us know what’s going on.

That potentially makes it a powerful middleware that extends the capabilities of the blockchain.