Greenfield Executable: Unlocking the Open Data Economy

This April, BNB Chain launched the BNB Greenfield Testnet called “Congo”. What sets BNB Greenfield apart from existing Decentralized Storage Networks (DSNs) are three vital features.

With the proposal of Greenfield Executable, Greenfield can also solve existing issues of large-scale data processing to unlock the potential of an open data economy. This article explains the design and use case of Greenfield Executable and why it’s important for a sustainable data economy.

Issues of large-scale data processing

The main difficulty in processing large-scale datasets lies in the complexity and scale of the data. Traditional data processing methods and technologies struggle to efficiently handle the volume, velocity, and variety of data in large-scale datasets. Some key challenges include:

Ownership: Consumer data is at risk of being controlled or manipulated by a single entity. With Greenfield, users have greater control over their data. They can choose how their data is stored, who has access to it, and under what conditions it can be accessed. Encryption ensures that only the data owners or authorized users have access to their information, enhancing privacy and reducing the chances of unauthorized access or data breaches.
Scalability: As datasets grow in size, traditional systems face difficulties in scaling horizontally to accommodate the increased workload and storage requirements.
Speed: Processing large-scale datasets in a timely manner can be challenging. The time it takes to extract, transform, and load (ETL) the data for analysis, as well as the processing time for complex computations, can be significant.
Data privacy and security: Large-scale datasets can contain sensitive or personally identifiable information. Ensuring the privacy and security of the data while allowing for efficient processing poses a significant challenge.

Why Greenfield Executable

To address the issues above, the community proposed Greenfield Executable. It is designed to transform data processing for large-scale datasets by reducing cost and improving efficiency. The goal is to create an open, collaborative, and trusted computing ecosystem. NodeReal, one of the core contributors to BNB Chain, has implemented Greenfield Executable and open-sourced the codebase.

DSNs are competing to introduce “execution logic” to data storage blockchains:

EXM is a language-agnostic serverless environment powered by Arweave that enables developers to create permanent, serverless functions on the blockchain.
IPVM, or the InterPlanetary Virtual Machine, is IPFS’s effort to bring computation to IPFS using Wasm, SPKI, and object capabilities.
The Gensyn network focuses on machine learning.

Compared with these solutions, Greenfield Executable has the following strengths:

Privacy and Security: Greenfield was designed from the beginning for “data as asset” and permission control, so the execution subsystem was designed with the “privacy and security” principle in mind. This means the execution is not public, and the data and algorithms remain secure during execution.
Decentralization: The whole execution part is decentralized. Greenfield has designed the execution service provider subsystem to guarantee not only execution security but also the integrity and benefit of the execution service providers.
Flexibility: Although Greenfield Executable feels like a FaaS service such as Lambda, it doesn’t limit the developer’s programming model. With Lambda, developers have to follow the platform’s programming rules, such as how to write the function entry point. With Greenfield Executable, developers just write their program and supply a config file that specifies the ABI, the capabilities, the input/output data URIs, and so on, which is more straightforward than following a prescribed programming model.
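
To make the config-file idea concrete, here is a minimal sketch in Python of what such a config and a basic validity check might look like. The field names (`abi`, `capabilities`, `inputs`, `outputs`), the capability names, and the `gnfd://` URI prefix are assumptions made for illustration only; the article does not specify the actual schema.

```python
from typing import List

# Hypothetical field names; the real Greenfield Executable schema may differ.
REQUIRED_FIELDS = ("abi", "capabilities", "inputs", "outputs")


def validate_config(config: dict) -> List[str]:
    """Return a list of problems found in the config (empty when it looks valid)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in config]
    for key in ("inputs", "outputs"):
        for uri in config.get(key, []):
            # Assumed URI scheme for objects stored on Greenfield.
            if not uri.startswith("gnfd://"):
                problems.append(f"{key} URI must start with gnfd://: {uri}")
    return problems


# Illustrative config: which ABI the binary targets, what it may do,
# and which objects it reads and writes.
example_config = {
    "abi": "wasi_snapshot_preview1",
    "capabilities": ["read_object", "write_object", "log"],
    "inputs": ["gnfd://example-bucket/dataset.csv"],
    "outputs": ["gnfd://example-bucket/result.json"],
}

print(validate_config(example_config))
```

The point of the sketch is that the program itself stays free-form, while everything the platform needs to know is declared up front in one place.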

How does Greenfield Executable work?

New Concepts

Execution Service Provider

Similar to Greenfield Storage Providers (SPs) for data storage, Greenfield Execution Service Providers (execution SPs) are dedicated to providing the execution environment and resources to support Greenfield executables.

To become an execution SP, providers must register themselves by depositing BNB tokens on Greenfield as their “service staking”. Greenfield validators then go through a dedicated governance procedure to vote on which execution SPs to elect. Execution SPs are encouraged to advertise their information and prove their capability to the community, as they must provide a professional execution environment with quality and security assurance.

The challenge system for storage SPs also applies to execution SPs. Users, validators, storage SPs, other execution SPs, and BNB Greenfield itself may challenge an execution SP over data integrity, resource availability, and security breaches, among other issues. The challenger needs to provide “proof”, and the validators verify it and vote. If the challenge succeeds, the challenger and the validators are rewarded, whereas the challenged SP is punished by having part or all of its stake slashed (depending on the severity of the issue).
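
The stake-and-slash flow above can be sketched as follows. This is an illustration only: the two-thirds vote threshold, the basis-point severity scale, and the 50% reward share are hypothetical parameters, not values taken from Greenfield.

```python
from dataclasses import dataclass


@dataclass
class ExecutionSP:
    name: str
    stake: int  # staked amount in the token's smallest unit


def resolve_challenge(sp: ExecutionSP, votes_for: int, votes_total: int,
                      severity_bps: int, reward_bps: int = 5_000) -> dict:
    """Slash part of the SP's stake if validators uphold the challenge.

    severity_bps: fraction of stake to slash, in basis points (10_000 = all).
    reward_bps:   fraction of the slashed amount paid out as rewards (assumed).
    """
    if votes_total <= 0:
        raise ValueError("at least one validator vote is required")
    if not 0 <= severity_bps <= 10_000:
        raise ValueError("severity_bps must be between 0 and 10_000")
    # Hypothetical rule: more than two thirds of the votes must uphold the challenge.
    if votes_for * 3 <= votes_total * 2:
        return {"upheld": False, "slashed": 0, "reward": 0}
    slashed = sp.stake * severity_bps // 10_000
    sp.stake -= slashed
    return {"upheld": True, "slashed": slashed, "reward": slashed * reward_bps // 10_000}


sp = ExecutionSP(name="example-sp", stake=1_000)
print(resolve_challenge(sp, votes_for=7, votes_total=10, severity_bps=5_000))
print(sp.stake)
```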

Proof of Execution

A proof of execution verifies that an execution was carried out correctly and can be used to reproduce the execution steps. There are two potential solutions:

Interactive Fraud Proof: This solution relies on the assumption that calls outside WebAssembly (Wasm) are secure due to sandboxing mechanisms. The focus is on verifying the integrity of the Wasm bytecode during execution. The challenger traces the inputs/outputs of out-call operations and uses them as part of the “proof.” By replaying the execution with the recorded native-call traces, the result can be compared against the claimed one to validate the execution. This approach allows for interactive verification and agreement on execution results.
Validity (ZK) Proof: This solution leverages zero-knowledge proof techniques based on the Wasm stack. Projects like zkwasm have been developed to generate zero-knowledge circuits for Wasm execution. However, the current implementation has limitations that could restrict the flexibility of executable code and introduce performance issues during execution.

Both methods suit different scenarios. Developers can choose the preferred “proof-of-execution” approach based on the complexity of the executable and the importance of performance. For simpler executables where the performance cost is acceptable, a ZK proof can be used to avoid the complexity of the interactive challenge. For performance-critical or complex executable logic, the interactive proof method can be employed.
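
The replay idea behind the interactive fraud proof can be sketched in a few lines of Python. This is a toy model, not the Greenfield protocol: the executable is a plain Python function standing in for Wasm bytecode, and the names `run_recording`, `replay`, `verify`, and the `read_object` out-call are all hypothetical.

```python
def run_recording(program, host):
    """Run the program and record every out-call as (name, argument, result)."""
    trace = []

    def recording_host(name, arg):
        result = host(name, arg)
        trace.append((name, arg, result))
        return result

    return program(recording_host), trace


def replay(program, trace):
    """Re-run the program, answering out-calls from the recorded trace only."""
    remaining = iter(trace)

    def replay_host(name, arg):
        recorded = next(remaining, None)
        if recorded is None or recorded[:2] != (name, arg):
            raise ValueError("out-call diverges from the recorded trace")
        return recorded[2]

    return program(replay_host)


def verify(program, trace, claimed_result):
    """A claimed result is accepted only if the replay reproduces it."""
    try:
        return replay(program, trace) == claimed_result
    except ValueError:
        return False


def word_count(host):
    # Stand-in for sandboxed code: its only contact with the outside is host().
    text = host("read_object", "gnfd://example-bucket/doc.txt")
    return str(len(text.split()))


result, trace = run_recording(word_count, lambda name, arg: "a b c")
print(result, verify(word_count, trace, result), verify(word_count, trace, "4"))
```

Because the sandboxed code is deterministic given its out-call results, anyone holding the code and the trace can check a claimed result without access to the original environment.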

WebAssembly

WebAssembly is not a programming language, but a binary format that can be generated from languages like C, C++, Rust, and Go. In the browser, WebAssembly code runs alongside JavaScript and can reach the same web APIs and resources through it.

In our case, we chose WebAssembly to achieve the following goals:

Security: We can have a fully controlled, sandboxed environment to execute payload code.
Portability: We can allow developers to choose any Wasm-supporting language they are comfortable with. All the code will be compiled into WebAssembly to be executed.
Easy gas calculation: We can use WebAssembly opcodes to calculate gas easily.
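
As a rough illustration of opcode-based gas calculation, the sketch below charges a fixed price per opcode and stops when a limit is exceeded. The opcode names are real WebAssembly instructions, but the prices in `GAS_TABLE` and the default cost are made-up values, not Greenfield's actual gas schedule.

```python
# Hypothetical per-opcode prices; a real schedule would cover every opcode.
GAS_TABLE = {
    "i32.const": 1,
    "local.get": 1,
    "i32.add": 1,
    "i32.mul": 2,
    "call": 10,
}
DEFAULT_GAS = 1  # assumed price for opcodes missing from the table


class OutOfGas(Exception):
    """Raised when execution would exceed the caller's gas limit."""


def meter(opcodes, gas_limit):
    """Sum the gas for a sequence of executed opcodes, enforcing the limit."""
    used = 0
    for op in opcodes:
        used += GAS_TABLE.get(op, DEFAULT_GAS)
        if used > gas_limit:
            raise OutOfGas(f"limit {gas_limit} exceeded at opcode {op}")
    return used


print(meter(["i32.const", "i32.const", "i32.add"], gas_limit=10))
```

Since every executable compiles down to the same small instruction set, one price table covers code written in any source language.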

Greenfield Executable workflow

Key Steps:

Executables are objects that are stored and operated on Greenfield by normal transactions.
A new type of `invoke` transaction is created to call the executable.
A permission check runs before execution to guarantee that the executable has the correct permissions for the data it will access at runtime, and that the user is authorized to call the executable.
A new role in Greenfield, the execution SP, takes responsibility for setting up the execution environment, preparing data, and conducting the execution.
During execution, both data and code are protected by a sandboxed runtime. Currently they are processed by a Wasm runtime (the Wasm closed world), and capability-based security technologies are introduced to protect the runtime from data/code leakage and unexpected system behaviors.
The Greenfield framework is a component of the execution environment that provides access to resources in the outside (non-Wasm) world, such as the logging system, the file system, the Greenfield API, and the retrieval service.
After execution completes, a `receipt` transaction is generated and submitted to Greenfield. It contains runtime information such as gas consumption, logging object ID, result object ID, resource usage, and the execution proof.
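
The steps above can be sketched end to end as follows. All type and field names here (`Invoke`, `Receipt`, `handle_invoke`, the permission set of `(principal, object)` pairs) are hypothetical and chosen only to mirror the workflow; they are not the Greenfield transaction formats.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Invoke:
    caller: str
    executable_id: str
    input_ids: List[str]


@dataclass
class Receipt:
    executable_id: str
    success: bool
    gas_used: int
    result_object_id: Optional[str]


def handle_invoke(tx: Invoke,
                  permissions: Set[Tuple[str, str]],
                  run: Callable[[Invoke], Tuple[str, int]]) -> Receipt:
    """Check permissions, run the executable, and build a receipt.

    permissions holds (principal, object id) pairs that are allowed.
    run stands in for the execution SP's sandboxed runtime and returns
    (result object id, gas used).
    """
    caller_ok = (tx.caller, tx.executable_id) in permissions
    data_ok = all((tx.executable_id, obj) in permissions for obj in tx.input_ids)
    if not (caller_ok and data_ok):
        # Nothing is executed when the permission check fails.
        return Receipt(tx.executable_id, False, 0, None)
    result_id, gas_used = run(tx)
    return Receipt(tx.executable_id, True, gas_used, result_id)


perms = {("alice", "exe-1"), ("exe-1", "obj-1")}
tx = Invoke(caller="alice", executable_id="exe-1", input_ids=["obj-1"])
print(handle_invoke(tx, perms, run=lambda t: ("result-1", 42)))
```

Note that the check covers two separate grants: the caller's right to invoke the executable, and the executable's right to read each input object.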

Future work

Greenfield Executable is currently in the proof-of-concept (PoC) phase. Right now, Greenfield can issue a “test binary” to an execution SP and verify the results and gas consumption. Users can also challenge the execution SP to verify the execution results. This demonstrates Greenfield’s potential to store and invoke executables while ensuring the integrity of execution. It’s important to note that the PoC does not yet incorporate the permission and proof-of-execution implementations.

As the dev team continues to develop and refine Greenfield Executable, it looks forward to engaging in discussions and welcomes your ideas and code contributions. Collaborative input and knowledge-sharing will be instrumental in enhancing the functionality and security of the Greenfield blockchain. Together, we can build a robust and reliable solution.

Conclusion

This article explored the design and implementation of Greenfield Executable, a new initiative aimed at creating an open, collaborative, and trusted computing ecosystem. BNB Greenfield is a decentralized storage blockchain that offers various opportunities. Its technology is still in its early stages, and the possibilities are vast. Join the Greenfield developer community today and be a part of the journey toward a decentralized future.
