feat(compiler): support registered host-await builtins for natural function call syntax #667

Open
kusha wants to merge 2 commits into microsoft:main from kusha:markbirger/registered-host-await-builtins

@kusha (Contributor) commented Apr 8, 2026

Summary

Extends the RVM compiler to accept a list of host function names at compile time. Calls to registered names are compiled directly into HostAwait instructions, allowing policy authors to write lookup(input.account_id) instead of __builtin_host_await(input.account_id, "lookup").

Motivation

  1. No compile-time validation of host-supported identifiers: The identifier passed to __builtin_host_await is a runtime string. Typos like "lokup" instead of "lookup" are silently accepted by the compiler and only discovered at runtime when the host rejects the identifier.

  2. Wrapper workaround breaks caller identification: Defining a wrapper lookup(x) := __builtin_host_await(x, "lookup") compiles the HostAwait instruction inside the wrapper module, not the calling module — breaking caller identification for authorization and auditing.

  3. Host can override standard builtins: Registering a name that matches a standard Rego builtin (e.g. time.parse_duration_ns) lets the host intercept that call and provide its own implementation.

Changes

Compiler (src/languages/rego/compiler/)

  • register_host_await_builtin() on Compiler to declare host function names. Validates arg_count == 1 and rejects the reserved name __builtin_host_await.
  • compile_from_policy_with_host_await() entry point that accepts the builtin list. Existing compile_from_policy() delegates with an empty list.
  • determine_call_target() extended with resolution order: explicit __builtin_host_await → registered host-await → user-defined → standard builtin.
  • Both explicit and registered paths emit identical HostAwait { dest, arg, id } bytecode.
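
The registration rules and resolution order listed above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `CallTarget` shape, the `user_functions` and `standard_builtins` fields, the `explicit_id` parameter, and the `String` error type are all assumptions made to keep the example self-contained. Only the method names, the `host_await_builtins` map, and the ordering come from the description.

```rust
use std::collections::{HashMap, HashSet};

/// Name of the explicit two-argument form; it cannot be registered.
const RESERVED_NAME: &str = "__builtin_host_await";

/// Simplified stand-in for the compiler's call-target type (hypothetical shape).
#[derive(Debug, PartialEq)]
enum CallTarget {
    /// Compile the call into a HostAwait instruction carrying this identifier.
    HostAwait { id: String },
    UserDefined(String),
    StandardBuiltin(String),
}

/// Toy compiler state; only `host_await_builtins` is named in the PR.
#[derive(Default)]
struct Compiler {
    host_await_builtins: HashMap<String, usize>,
    user_functions: HashSet<String>,
    standard_builtins: HashSet<String>,
}

impl Compiler {
    /// Declare a host function name, enforcing the two registration rules.
    fn register_host_await_builtin(&mut self, name: &str, arg_count: usize) -> Result<(), String> {
        if name == RESERVED_NAME {
            return Err(format!("{} is reserved and cannot be registered", RESERVED_NAME));
        }
        if arg_count != 1 {
            return Err(format!("{} must take exactly 1 argument, got {}", name, arg_count));
        }
        self.host_await_builtins.insert(name.to_string(), arg_count);
        Ok(())
    }

    /// Resolve a call by name: explicit > registered > user-defined > standard builtin.
    /// `explicit_id` models the string literal passed as the second argument of
    /// `__builtin_host_await` (an assumption of this sketch).
    fn determine_call_target(&self, name: &str, explicit_id: Option<&str>) -> Result<CallTarget, String> {
        if name == RESERVED_NAME {
            let id = explicit_id.ok_or_else(|| "missing identifier argument".to_string())?;
            return Ok(CallTarget::HostAwait { id: id.to_string() });
        }
        if self.host_await_builtins.contains_key(name) {
            // For registered names, the function name itself becomes the identifier.
            return Ok(CallTarget::HostAwait { id: name.to_string() });
        }
        if self.user_functions.contains(name) {
            return Ok(CallTarget::UserDefined(name.to_string()));
        }
        if self.standard_builtins.contains(name) {
            return Ok(CallTarget::StandardBuiltin(name.to_string()));
        }
        Err(format!("unknown function {}", name))
    }
}
```

In this sketch a registered `lookup` resolves to `HostAwait` even when a user-defined `lookup` exists, which mirrors the shadowing behavior exercised by the YAML cases.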

Documentation (docs/rvm/)

  • instruction-set.md: Registered builtin syntax, resolution order, argument handling.
  • architecture.md: Updated to describe both emission paths.

Tests (tests/rvm/rego/)

  • Extended test harness with HostAwaitBuiltinSpec and argument assertion.
  • 9 YAML test cases: suspend/resume, run-to-completion, multiple names, queue, shadowing, object packing, arg_count rejection, reserved name rejection, standard builtin override.

Constraints

  • arg_count must be 1: The HostAwait instruction carries a single argument register. Use object packing to pass multiple values: lookup({"key1": v1, "key2": v2}). This can be lifted in the future by having the compiler auto-pack multiple arguments into an array or object.
  • __builtin_host_await is reserved: Attempting to register this name produces a compile-time error. It is handled by a dedicated code path (explicit 2-argument form) and cannot be overridden.

Limitations

  • No conflict detection: A registered name that collides with a standard builtin silently wins. No warning is emitted.
  • __builtin_host_await still exists: The raw builtin remains functional. Policy authors who discover it can use arbitrary identifier strings. However, the host controls how identifiers are resolved and can reject or error on unexpected identifiers at runtime.

Allow hosts to register function names at compile time so that calls to
those names emit HostAwait instructions directly, enabling natural syntax
like fetch(x) instead of __builtin_host_await(x, "fetch").

- Add host_await_builtins map and register_host_await_builtin() to Compiler
- Validate arg_count == 1 and reject reserved __builtin_host_await name
- Extend determine_call_target() resolution: explicit > registered > user > builtin
- Both explicit and registered paths emit identical HostAwait bytecode
- Add compile_from_policy_with_host_await() entry point in rules.rs
- Extended test harness with HostAwaitBuiltinSpec and args assertion
- 9 YAML test cases: suspend/resume, run-to-completion, multiple names,
  queue, shadowing, object packing, arg_count rejection, reserved name
  rejection, standard builtin override
- Documentation: instruction-set.md, architecture.md

Copilot AI left a comment

Pull request overview

This PR extends the Rego→RVM compiler to support “registered” host-await builtins, allowing natural function-call syntax (e.g. lookup(x)) to compile directly into HostAwait instructions, while retaining the explicit __builtin_host_await(arg, id) path.

Changes:

  • Add Compiler::compile_from_policy_with_host_await(...) and Compiler::register_host_await_builtin(...) to configure host-await builtin names (with arg_count == 1 enforced).
  • Extend call target resolution to prioritize explicit __builtin_host_await, then registered host-await names, then user-defined functions, then standard builtins.
  • Update the YAML-based RVM Rego test harness to pass registered builtins and (in suspendable mode) assert the host-await argument payload; add a new YAML suite covering registered host-await scenarios.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/languages/rego/compiler/mod.rs | Adds compiler state + API to register host-await builtins and validate registration constraints. |
| src/languages/rego/compiler/rules.rs | Adds a new compile entry point that accepts registered host-await builtins and wires it into compilation. |
| src/languages/rego/compiler/function_calls.rs | Implements resolution order + emits HostAwait for registered names by auto-loading the identifier literal. |
| tests/rvm/rego/mod.rs | Extends harness to pass registered builtin specs into compilation and validate suspendable host-await arguments. |
| tests/rvm/rego/cases/registered_host_await.yaml | Adds test coverage for registered host-await behavior (shadowing, queueing, override, validation errors). |
| docs/rvm/instruction-set.md | Documents registered host-await builtin syntax, resolution order, and argument handling constraints. |
| docs/rvm/architecture.md | Updates architecture walkthrough to describe both explicit and registered HostAwait emission paths. |

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Mark Birger <birgerm@yandex.ru>

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.


Comment on lines +315 to +319
assert_eq!(
    actual, expected,
    "HostAwait argument mismatch for {:?}: expected {:?}, got {:?}",
    identifier, expected, actual
);
Copilot AI Apr 9, 2026

assert_eq! will panic on a HostAwait argument mismatch, which bypasses this test harness’s normal error propagation (and can skip listing dumps / want_error handling). Prefer returning an anyhow::Error (or using the existing bail-with-listing pattern by threading case context) so failures are reported consistently.

Suggested change
- assert_eq!(
-     actual, expected,
-     "HostAwait argument mismatch for {:?}: expected {:?}, got {:?}",
-     identifier, expected, actual
- );
+ if actual != expected {
+     return Err(anyhow::anyhow!(
+         "HostAwait argument mismatch for {:?}: expected {:?}, got {:?}",
+         identifier,
+         expected,
+         actual
+     ));
+ }
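
To illustrate why returning an error matters here, the following sketch shows a mismatch travelling back to a case runner that can still apply expected-error handling. It is an assumption-laden illustration: `check_host_await_arg` and `run_case` are hypothetical names, payloads are plain strings, and `Result<(), String>` stands in for `anyhow::Result` so the example has no external dependencies.

```rust
use std::fmt::Debug;

/// Hypothetical helper: compare the payload and report a mismatch as an
/// error value instead of panicking.
fn check_host_await_arg<T: PartialEq + Debug>(
    identifier: &str,
    expected: &T,
    actual: &T,
) -> Result<(), String> {
    if actual != expected {
        return Err(format!(
            "HostAwait argument mismatch for {:?}: expected {:?}, got {:?}",
            identifier, expected, actual
        ));
    }
    Ok(())
}

/// Hypothetical case runner: because the mismatch is an `Err`, the runner can
/// still match it against an expected-error substring (`want_error`).
fn run_case(
    identifier: &str,
    expected: &str,
    actual: &str,
    want_error: Option<&str>,
) -> Result<(), String> {
    let outcome = check_host_await_arg(identifier, &expected, &actual);
    match (outcome, want_error) {
        (Ok(()), None) => Ok(()),
        (Ok(()), Some(want)) => Err(format!(
            "expected an error containing {:?}, but the case passed",
            want
        )),
        // The error matches what the case expected, so the case succeeds.
        (Err(e), Some(want)) if e.contains(want) => Ok(()),
        // Any other error is reported through the normal path.
        (Err(e), _) => Err(e),
    }
}
```

With `assert_eq!`, the third and fourth match arms would never run, because the panic unwinds past the runner before it can inspect the failure.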

result := kv_store(input.key, input.value)
query: data.demo.result
# Registration panics because arg_count must be 1.
Copilot AI Apr 9, 2026

The comment says registration "panics" when arg_count is invalid, but the compiler path returns a compile-time error (register_host_await_builtin returns Err). Consider rewording to avoid implying a panic.

Suggested change
- # Registration panics because arg_count must be 1.
+ # Registration fails with an error because arg_count must be 1.

@anakrish (Collaborator) left a comment

Overall looks good.

@@ -0,0 +1,240 @@
cases:

More test cases to lock down behavior:

  • Out-param syntax, e.g. lookup(in, out)
  • A test case that uses __builtin_host_await and registered builtins in the same policy
  • An empty host-await builtin list

pub fn compile_from_policy_with_host_await(
    policy: &CompiledPolicy,
    entry_points: &[&str],
    host_await_builtins: &[(&str, usize)],

Scenarios to define behavior for:

  • Duplicate entries in host_await_builtins
  • Empty host_await_builtins
  • A name is empty or whitespace
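
One possible way to pin these scenarios down is a validation pass over the list before registration. The sketch below is only one candidate policy, not what the PR does: `validate_host_await_builtins` is a hypothetical helper, rejecting duplicates and blank names is an assumed choice, and an empty list is accepted because the description says `compile_from_policy()` delegates with an empty list.

```rust
use std::collections::HashSet;

/// Hypothetical pre-registration check over `(name, arg_count)` pairs.
/// Assumed policy: empty list is fine, blank names and duplicates are errors.
fn validate_host_await_builtins(builtins: &[(&str, usize)]) -> Result<(), String> {
    let mut seen: HashSet<&str> = HashSet::new();
    for &(name, arg_count) in builtins {
        // Covers both "" and names made only of whitespace.
        if name.trim().is_empty() {
            return Err("host-await builtin name is empty or whitespace".to_string());
        }
        // Same constraint the PR enforces at registration time.
        if arg_count != 1 {
            return Err(format!(
                "host-await builtin {:?} must take exactly 1 argument, got {}",
                name, arg_count
            ));
        }
        // `insert` returns false when the name was already present.
        if !seen.insert(name) {
            return Err(format!("duplicate host-await builtin {:?}", name));
        }
    }
    Ok(())
}
```

A more lenient alternative would be to accept duplicates when the `arg_count` values agree; either way, a test case per scenario would lock the chosen behavior in.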

(arg_regs[0], arg_regs[1])
} else {
// Registered host-awaitable builtin — identifier is the function name
if arg_regs.len() != 1 {

Need to define the behavior when a registered host-await builtin shadows a user-written policy rule (both regular rules and functions).

}

// Check registered host-awaitable builtins
if let Some(&arg_count) = self.host_await_builtins.get(original_fcn_path) {

We need to detect shadowing here.
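
A shadowing check along these lines could look like the sketch below. It is purely illustrative: `detect_shadowing` and the `Shadowed` enum are hypothetical, and the sets of user-defined and standard builtin names are assumed inputs rather than the compiler's real data structures. What to do with the findings (warn, error, or allow) is left open, as in the review.

```rust
use std::collections::HashSet;

/// What a registered host-await name collides with (hypothetical type).
#[derive(Debug, PartialEq)]
enum Shadowed {
    UserDefined(String),
    StandardBuiltin(String),
}

/// Report every registered name that also exists as a user-defined rule or
/// function, or as a standard builtin. Results follow the order of `registered`.
fn detect_shadowing(
    registered: &[&str],
    user_defined: &HashSet<&str>,
    standard_builtins: &HashSet<&str>,
) -> Vec<Shadowed> {
    let mut found = Vec::new();
    for &name in registered {
        if user_defined.contains(name) {
            found.push(Shadowed::UserDefined(name.to_string()));
        }
        if standard_builtins.contains(name) {
            found.push(Shadowed::StandardBuiltin(name.to_string()));
        }
    }
    found
}
```

Returning the collisions rather than failing immediately lets the caller decide per category, for example treating a standard builtin override as intentional while rejecting a collision with a user-defined rule.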


// Check registered host-awaitable builtins
if let Some(&arg_count) = self.host_await_builtins.get(original_fcn_path) {
return Ok(CallTarget::HostAwait {

It might be clearer to distinguish between the builtin and the custom registered builtins via a new variant, say RegisteredHostAwait. That way the rest of the code doesn't need to depend on the builtin's name (__builtin_host_await) to tell the two apart.
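
A rough sketch of that suggestion, under stated assumptions: the variant payloads, the `IdSource` type, the `u8` register type, and `host_await_operands` are all hypothetical and only loosely modelled on the `arg_regs` fragment quoted above. The point is that downstream code matches on the variant instead of comparing against the string `__builtin_host_await`.

```rust
/// Where the HostAwait identifier comes from (hypothetical type).
#[derive(Debug, PartialEq)]
enum IdSource {
    /// Explicit form: the identifier lives in the second argument register.
    Register(u8),
    /// Registered form: the compiler loads the function name as a literal.
    Literal(String),
}

/// Call targets with the two host-await paths kept apart (assumed shape).
#[derive(Debug, PartialEq)]
enum CallTarget {
    HostAwait,
    RegisteredHostAwait { name: String },
}

/// Pick the argument register and identifier source for a host-await call.
fn host_await_operands(target: &CallTarget, arg_regs: &[u8]) -> Result<(u8, IdSource), String> {
    match target {
        CallTarget::HostAwait => match arg_regs {
            [arg, id] => Ok((*arg, IdSource::Register(*id))),
            _ => Err(format!(
                "__builtin_host_await expects 2 arguments, got {}",
                arg_regs.len()
            )),
        },
        CallTarget::RegisteredHostAwait { name } => match arg_regs {
            [arg] => Ok((*arg, IdSource::Literal(name.clone()))),
            _ => Err(format!("{} expects 1 argument, got {}", name, arg_regs.len())),
        },
    }
}
```

Both arms can still lower to the same `HostAwait { dest, arg, id }` instruction; only the way the identifier is obtained differs.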

3 participants