Core principles and patterns of working with collections in Noir.
As we dive deep into a fast-growing ecosystem and develop our own codebase in Noir, we recognize the need to define and use idiomatic Noir.
Idiomatic code follows the conventions and best practices of a particular programming language; writing it has a big impact on readability and maintainability.
Noir is inspired by Rust, allowing us to borrow some of its idioms. However, the Noir community will still need to develop certain language constructs and patterns.
This article is just the beginning. We will share more insights from our development team here. Follow us on X so you do not miss the next parts.
It is hard to overemphasize the value of readable and maintainable code, and that value is only amplified in the context of Noir.
Below is an analysis of a fictional JSON parser implementation, highlighting idioms, patterns, and refactorings encountered while working with collections in Noir.
Consider the following JSON parsing function:
global MAX_INPUT_LEN: u32 = 1024;
global MAX_JSON_NODE_LEN: u32 = 1024;

struct JsonNode {
    offset: u64,
    len: u64,
}

fn parse_json(input: [u8; MAX_INPUT_LEN], input_len: u64) -> ([JsonNode; MAX_JSON_NODE_LEN], u64) {
    // ...
}
Note one important limitation that Noir inherits from its PLONK-based proving system: array sizes must be fixed at compile time, so length information usually comes in pairs. First we have the array's MAXIMUM length, and then its actual length. Therefore, we need two arguments: the fixed-size input array and the dynamic input_len. For the same reason, the return type has to be a tuple: the fixed-size array of nodes plus the number of nodes actually produced.
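To make this concrete, here is a minimal sketch of a hypothetical caller; the names payload and payload_len and the final assert are ours, not part of the parser:

fn main(payload: [u8; MAX_INPUT_LEN], payload_len: u64) {
    // The circuit always processes all MAX_INPUT_LEN bytes, even when
    // only the first `payload_len` of them carry real data.
    let (nodes, nodes_len) = parse_json(payload, payload_len);

    // `nodes` always has MAX_JSON_NODE_LEN slots; entries past
    // `nodes_len` are padding that every caller must remember to ignore.
    assert(nodes_len <= MAX_JSON_NODE_LEN as u64);
}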
We can improve the code above by using BoundedVec from the standard library instead of plain arrays. It stores both the maximum length (as a generic parameter) and the actual length.
Below is the updated function:
fn parse_json(input: BoundedVec<u8, MAX_INPUT_LEN>) -> BoundedVec<JsonNode, MAX_JSON_NODE_LEN> {
    // ...
}
There is another problem to notice in this code: it only works up to the specific input sizes defined in the globals, and regardless of the actual input size, proving always consumes the same, maximum, amount of computational power.
To solve this problem, we will introduce generic parameters instead of constants.
After the next iteration of improvement, the code could look like this:
struct JsonNode {
    offset: u64,
    len: u64,
}

fn parse_json<let MAX_INPUT_LEN: u32, let MAX_NODES: u32>(input: BoundedVec<u8, MAX_INPUT_LEN>) -> BoundedVec<JsonNode, MAX_NODES> {
    // ...
}
However, a careful reader might sense something is still off. Let's examine an example of how to use the function above:
let nodes = parse_json(input);
for i in 0..MAX_NODES {
    if i < nodes.len() {
        do_sth(nodes.get(i));
    }
}
This is a familiar pattern in circuit code that uses loops: we iterate over a fixed, predefined bound, and a nested if-statement restricts the work to the dynamic size.
It is a leaky abstraction: a simple dynamic loop takes five lines of code. It is neither DRY nor readable. Let's improve that.
A popular way to abstract iteration is to introduce an Iterable trait. Here is the boilerplate needed to apply the pattern:
trait Iterable<T, let MAX: u32> {
    fn len(self) -> u32;
    fn each(self, f: fn(T) -> ());
    fn map<R>(self, f: fn(T) -> R) -> BoundedVec<R, MAX>;
    // ...
}

impl<T, let MAX: u32> Iterable<T, MAX> for BoundedVec<T, MAX> {
    // ...
}

fn parse_json<let MAX_INPUT_LEN: u32, let MAX_NODES: u32, In>(input: In) -> BoundedVec<JsonNode, MAX_NODES>
where
    In: Iterable<u8, MAX_INPUT_LEN>,
{
    // ...
}
And example usage:
let nodes = parse_json(input);
nodes.each(|node| {
    do_sth(node)
});
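For completeness, here is one possible sketch of the each implementation for BoundedVec, assuming the trait shape above. The point is that the fixed-bound loop and the dynamic-length guard now live in exactly one place:

impl<T, let MAX: u32> Iterable<T, MAX> for BoundedVec<T, MAX> {
    fn each(self, f: fn(T) -> ()) {
        // Still a fixed-bound loop with a dynamic guard, as the proving
        // system requires, but hidden behind the abstraction.
        for i in 0..MAX {
            if i < self.len() {
                f(self.get_unchecked(i));
            }
        }
    }
    // ...
}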
Now, let's put our JSON parser into production. Suppose we need to parse content coming from a user, consume API content from a browser, handle OAuth data, or perform similar tasks. Often, this data comes in widely varying sizes.
Creating a single large circuit capable of handling all sizes would be inefficient. It would require paying the computational costs of processing the largest possible input, even for the smallest practical input sizes.
Instead, we can generate several provers of different sizes, such as S, M, and L.
For example, the distribution of input data from users might be heavily skewed toward one end, with most payloads small and only a few large ones. Whatever the exact shape, we do not want to use an L-sized prover for S-sized input data.
Hence, we need to generate three provers. The only way to do this today is to create a library exposing the core functionality through generic parameters, and then create three projects with a main function, each containing a single line that calls the generic functionality with different parameters.
Lib:
fn do_json_stuff<let MAX_INPUT_LEN: u32, let MAX_NODES: u32>(input: BoundedVec<u8, MAX_INPUT_LEN>) {
    // ...
}
Main 1:
fn main(input: [u8; 1000]) {
    do_json_stuff::<1000, 10>(BoundedVec::from_array(input));
}
Main 2:
fn main(input: [u8; 10000]) {
    do_json_stuff::<10000, 50>(BoundedVec::from_array(input));
}
Main 3:
fn main(input: [u8; 100000]) {
    do_json_stuff::<100000, 100>(BoundedVec::from_array(input));
}
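A convenient way to organize the library and the three main projects is a Nargo workspace. Here is a hypothetical layout; the package paths are placeholders of our own choosing:

# Nargo.toml at the workspace root
[workspace]
members = ["lib/json_core", "provers/json_s", "provers/json_m", "provers/json_l"]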
Note that we have assumed a correlated distribution of the two variables MAX_INPUT_LEN and MAX_NODES. In practice, those distributions can vary independently, and I could easily imagine a 5x5 matrix of generic parameters (i.e., (XS, XS), (XS, S), (S, XS), ..., (XL, XL)). That would produce 25 provers, which could still be a reasonable strategy for optimizing computational resources.
We propose implementing a tool, and in the future a language feature, that would allow us to generate multiple provers more simply. It would leverage annotations to generate multiple subprojects.
Here is an example:
#[prover(MAX_INPUT_LEN = 1000, MAX_NODES = 10)]
#[prover(MAX_INPUT_LEN = 10000, MAX_NODES = 50)]
#[prover(MAX_INPUT_LEN = 100000, MAX_NODES = 100)]
fn do_fancy_json_stuff<let MAX_INPUT_LEN: u32, let MAX_NODES: u32>(input: BoundedVec<u8, MAX_INPUT_LEN>) {
    // ...
}
In this case, Nargo would generate multiple provers, one per generated subproject directory.
Ideally, an external tool could be built first to generate the subprojects as a proof of concept. If the community finds the tool valuable, the functionality could then be moved into Nargo.
This approach does mean a developer has to manage multiple verification keys, which I believe is an acceptable short-term trade-off. It would be great to have a single "master" verification key in the future.
The reasoning presented above encapsulates the core principles of working with collections in Noir. We hope it demonstrates how to work efficiently with Noir today and highlights important directions for the language's future development.
What are your thoughts on our approach? Do you have any suggestions or insights to share? Let us know on X.