feat(compiler): FFI and C# bindings for RVM, incl. host-await built-ins #672

Draft
kusha wants to merge 3 commits into microsoft:main from kusha:markbirger/host-await-ffi-csharp-bindings

@kusha kusha commented Apr 9, 2026

Summary

Exposes the registered host-await builtins, Program compilation, and RVM runtime accessors through the FFI layer and C# bindings. This is the companion to #667 (registered host-await builtins in the compiler/VM), making the feature usable from C# consumers.

Motivation

The previous PR added registered host-await builtins to the Rust compiler and VM. However, the FFI boundary and C# bindings only exposed the raw __builtin_host_await path. This PR bridges the gap so C# consumers can:

  1. Register host-await builtins at compile time — pass builtin names to Program.CompileFromModules, which emits HostAwait instructions directly.
  2. Pre-load responses for run-to-completion mode — call Rvm.SetHostAwaitResponses to queue responses before execution.
  3. Inspect suspension state — call Rvm.GetHostAwaitIdentifier() and Rvm.GetHostAwaitArgument() to determine which builtin suspended and with what argument.
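To make the two modes in the list above concrete, here is a minimal, self-contained Rust model. It is not the regorus API: `ToyVm`, `Outcome`, `preload`, and `host_await` are hypothetical names, and the `MissingResponse` behaviour is an assumption. Only the two mode names come from this PR.

```rust
use std::collections::{HashMap, VecDeque};

/// Mode names taken from the PR; everything else in this sketch is hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ExecutionMode {
    RunToCompletion,
    Suspendable,
}

/// What happens when the toy VM reaches a host-await call.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// A pre-loaded response was consumed and execution continued.
    Completed(String),
    /// Execution paused; the host can read which builtin suspended and its argument.
    Suspended { identifier: String, argument: String },
    /// Assumption: run-to-completion with no queued response is an error.
    MissingResponse(String),
}

#[derive(Default)]
pub struct ToyVm {
    responses: HashMap<String, VecDeque<String>>,
}

impl ToyVm {
    /// Queue JSON responses for one identifier before execution.
    pub fn preload(&mut self, identifier: &str, values: &[&str]) {
        self.responses
            .entry(identifier.to_string())
            .or_default()
            .extend(values.iter().map(|v| v.to_string()));
    }

    /// Model a single host-await call under the given mode.
    pub fn host_await(&mut self, mode: ExecutionMode, identifier: &str, argument: &str) -> Outcome {
        match mode {
            ExecutionMode::Suspendable => Outcome::Suspended {
                identifier: identifier.to_string(),
                argument: argument.to_string(),
            },
            ExecutionMode::RunToCompletion => {
                match self.responses.get_mut(identifier).and_then(|q| q.pop_front()) {
                    Some(value) => Outcome::Completed(value),
                    None => Outcome::MissingResponse(identifier.to_string()),
                }
            }
        }
    }
}
```

In this model, a host in suspendable mode would read `identifier` and `argument` from `Outcome::Suspended`, compute a response, and resume; in run-to-completion mode it calls `preload` first and never sees a suspension.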

Changes

Rust VM (src/rvm/vm/machine.rs)

  • get_host_await_argument() and get_host_await_identifier(): const fn accessors that return the argument/identifier when the VM is in a HostAwait-suspended state.

FFI (bindings/ffi/src/rvm.rs)

  • RegorusHostAwaitBuiltin: #[repr(C)] struct for passing builtin registrations across FFI.
  • regorus_compile_from_modules — extended with optional host-await builtin array.
  • regorus_rvm_set_host_await_responses — pre-load response queue for run-to-completion mode.
  • regorus_rvm_get_host_await_argument / regorus_rvm_get_host_await_identifier — JSON accessors for suspension state.
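As a rough illustration of passing registrations across the boundary, the sketch below reads an optional array of `#[repr(C)]` entries on the Rust side. The struct name `HostAwaitBuiltinRaw`, the helper `read_builtins`, and the exact field types are assumptions; the two fields mirror the `name` and `arg_count` members that appear in the C# marshalling code under review, but the real definition in bindings/ffi/src/rvm.rs may differ.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical layout: a NUL-terminated UTF-8 name plus an argument count.
#[repr(C)]
pub struct HostAwaitBuiltinRaw {
    pub name: *const c_char,
    pub arg_count: usize,
}

/// Copy registrations out of a caller-owned array.
///
/// # Safety
/// `ptr` must either be null or point to `len` valid entries whose `name`
/// fields are valid NUL-terminated strings for the duration of the call.
pub unsafe fn read_builtins(
    ptr: *const HostAwaitBuiltinRaw,
    len: usize,
) -> Result<Vec<(String, usize)>, String> {
    // The array is optional: a null pointer or zero length means "no registrations".
    if ptr.is_null() || len == 0 {
        return Ok(Vec::new());
    }
    let entries = unsafe { std::slice::from_raw_parts(ptr, len) };
    let mut out = Vec::with_capacity(len);
    for (i, entry) in entries.iter().enumerate() {
        if entry.name.is_null() {
            return Err(format!("builtin {i}: name pointer is null"));
        }
        // Copy the name so nothing borrows caller memory after the call returns.
        let name = unsafe { CStr::from_ptr(entry.name) }
            .to_str()
            .map_err(|err| format!("builtin {i}: name is not valid UTF-8: {err}"))?;
        out.push((name.to_owned(), entry.arg_count));
    }
    Ok(out)
}
```

Copying each name into an owned `String` keeps the function free of lifetime coupling to the caller's pinned buffers, at the cost of one allocation per registration.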

C# Bindings (bindings/csharp/Regorus/)

  • HostAwaitBuiltin — readonly struct wrapping a builtin name and argument count.
  • ExecutionMode — enum (RunToCompletion, Suspendable).
  • Program.CompileFromModules — new overloads accepting HostAwaitBuiltin[].
  • Rvm.SetHostAwaitResponses — queue JSON responses for a given identifier.
  • Rvm.GetHostAwaitArgument() / GetHostAwaitIdentifier() — read suspension state.
  • ModuleMarshalling: PinnedUtf8Strings and PinnedHostAwaitBuiltins helpers for safe FFI marshalling.

Documentation

  • bindings/csharp/README.md — added suspendable and run-to-completion examples with registered builtins.
  • bindings/csharp/API.md — added Program, Rvm, HostAwaitBuiltin, ExecutionMode API reference.
  • docs/rvm/vm-runtime.md — documented the new VM accessors.

Tests (bindings/csharp/Regorus.Tests/RvmProgramTests.cs)

  • RegisteredHostAwait_Suspendable_SuspendAndResume — registers get_account, suspends, verifies identifier and argument, resumes with a response.
  • RegisteredHostAwait_RunToCompletion_WithPreloadedResponses — registers translate, pre-loads a response, verifies end-to-end execution.

Notes

  • Entry point marshalling reused for JSON strings: The existing PinnedUtf8Strings helper (originally written for pinning entry point string arrays across the FFI boundary) is repurposed to also marshal the JSON response strings in SetHostAwaitResponses. Same pattern: pin an array of null-terminated UTF-8 pointers, pass pointer + length to Rust.

  • CompileFromEngine does not support host-await builtins: Only CompileFromModules accepts HostAwaitBuiltin[]. This is a scoping choice to keep the PR smaller — there is no technical constraint preventing it. The engine-based path can be extended in a follow-up if needed.
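The "pointer + length" pattern from the first note can be sketched from the caller's side in Rust. This is an analogue of the C# PinnedUtf8Strings helper, not a port of it: `OwnedUtf8Array` and its methods are hypothetical names, and the sketch relies on `CString` owning a heap buffer whose address does not change when the `CString` value itself is moved.

```rust
use std::ffi::{CString, NulError};
use std::os::raw::c_char;

/// Owns NUL-terminated copies of the strings plus a parallel pointer array.
/// The pointers stay valid for as long as this value is alive.
pub struct OwnedUtf8Array {
    _strings: Vec<CString>, // keeps the string buffers alive
    ptrs: Vec<*const c_char>,
}

impl OwnedUtf8Array {
    /// Fails if any input contains an interior NUL byte.
    pub fn new(items: &[&str]) -> Result<Self, NulError> {
        let strings = items
            .iter()
            .map(|s| CString::new(*s))
            .collect::<Result<Vec<CString>, NulError>>()?;
        // Each pointer targets a CString's heap buffer, which does not move
        // when `strings` is moved into the struct below.
        let ptrs = strings.iter().map(|s| s.as_ptr()).collect();
        Ok(Self { _strings: strings, ptrs })
    }

    /// Pointer to the first element of the pointer array.
    pub fn as_ptr(&self) -> *const *const c_char {
        self.ptrs.as_ptr()
    }

    /// Number of strings in the array.
    pub fn len(&self) -> usize {
        self.ptrs.len()
    }
}
```

A caller would pass `as_ptr()` and `len()` to the foreign function and drop the `OwnedUtf8Array` only after the call returns, which plays the same role as disposing the pinned handles on the C# side.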

Mark Birger and others added 3 commits April 9, 2026 16:27
Allow hosts to register function names at compile time so that calls to
those names emit HostAwait instructions directly, enabling natural syntax
like fetch(x) instead of __builtin_host_await(x, "fetch").

- Add host_await_builtins map and register_host_await_builtin() to Compiler
- Validate arg_count == 1 and reject reserved __builtin_host_await name
- Extend determine_call_target() resolution: explicit > registered > user > builtin
- Both explicit and registered paths emit identical HostAwait bytecode
- Add compile_from_policy_with_host_await() entry point in rules.rs
- Extended test harness with HostAwaitBuiltinSpec and args assertion
- 9 YAML test cases: suspend/resume, run-to-completion, multiple names,
  queue, shadowing, object packing, arg_count rejection, reserved name
  rejection, standard builtin override
- Documentation: instruction-set.md, architecture.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Mark Birger <birgerm@yandex.ru>

Expose registered host-await builtins, Program compilation, and RVM
accessors through the FFI layer and C# bindings.

FFI (bindings/ffi/src/rvm.rs):
- compile_from_modules with host-await builtin registration
- set/get host-await responses, argument, identifier
- RegorusHostAwaitBuiltin C struct

C# (bindings/csharp/Regorus/):
- Program.CompileFromModules overloads with HostAwaitBuiltin[]
- Rvm.SetHostAwaitResponses, GetHostAwaitArgument, GetHostAwaitIdentifier
- HostAwaitBuiltin readonly struct, ExecutionMode enum
- ModuleMarshalling: PinnedUtf8Strings, PinnedHostAwaitBuiltins

Rust (src/rvm/vm/machine.rs):
- get_host_await_argument() and get_host_await_identifier() accessors

Docs: README examples, API.md reference, vm-runtime.md accessors
Tests: suspendable + run-to-completion C# scenarios
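
The first commit message states the registration rules (arg_count == 1, reserved name rejected) and the resolution order explicit > registered > user > builtin. A toy model of those rules might look like the following. `Resolver`, `CallTarget`, and both method names are hypothetical and are not the compiler's actual types; the real logic lives in determine_call_target() and register_host_await_builtin().

```rust
use std::collections::HashSet;

/// Which path a call by name would take in this toy model.
#[derive(Debug, PartialEq)]
pub enum CallTarget {
    ExplicitHostAwait,
    RegisteredHostAwait,
    UserFunction,
    Builtin,
    Unknown,
}

const RESERVED: &str = "__builtin_host_await";

#[derive(Default)]
pub struct Resolver {
    pub registered: HashSet<String>,
    pub user: HashSet<String>,
    pub builtins: HashSet<String>,
}

impl Resolver {
    /// Mirrors the two validation rules from the commit message.
    pub fn register_host_await(&mut self, name: &str, arg_count: usize) -> Result<(), String> {
        if name == RESERVED {
            return Err(format!("{RESERVED} is reserved"));
        }
        if arg_count != 1 {
            return Err(format!("{name}: arg_count must be 1, got {arg_count}"));
        }
        self.registered.insert(name.to_string());
        Ok(())
    }

    /// Resolution order: explicit > registered > user > builtin.
    pub fn resolve(&self, name: &str) -> CallTarget {
        if name == RESERVED {
            CallTarget::ExplicitHostAwait
        } else if self.registered.contains(name) {
            CallTarget::RegisteredHostAwait
        } else if self.user.contains(name) {
            CallTarget::UserFunction
        } else if self.builtins.contains(name) {
            CallTarget::Builtin
        } else {
            CallTarget::Unknown
        }
    }
}
```

Under this ordering a registered name shadows both a user-defined function and a standard builtin of the same name, which matches the "shadowing" and "standard builtin override" cases listed among the YAML tests.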
@kusha kusha force-pushed the markbirger/host-await-ffi-csharp-bindings branch from fa62435 to 54fbb82 Compare April 9, 2026 20:31
@anakrish anakrish requested a review from Copilot April 9, 2026 21:41

Copilot AI left a comment

Pull request overview

This PR extends the RVM host-await feature end-to-end by wiring “registered host-await builtins” through the Rust compiler, VM accessors, the FFI layer, and the C# bindings, with accompanying docs and tests.

Changes:

  • Add compiler support to register host-awaitable builtin names at compile time and emit HostAwait for natural function-call syntax.
  • Expose host-await suspension inspection (identifier/argument) and run-to-completion response preloading via FFI + C# APIs.
  • Add Rust YAML regression cases plus new C# binding tests and documentation updates.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/rvm/rego/mod.rs | Extends YAML harness to pass registered host-await builtins into compilation and (optionally) assert host-await arguments in suspendable mode. |
| tests/rvm/rego/cases/registered_host_await.yaml | Adds regression coverage for registered builtins (shadowing, queueing, arg-count validation, overrides). |
| src/rvm/vm/machine.rs | Adds VM accessors for host-await identifier/argument when suspended. |
| src/languages/rego/compiler/rules.rs | Adds compile_from_policy_with_host_await and wires registrations into compilation. |
| src/languages/rego/compiler/mod.rs | Adds storage + registration API for host-awaitable builtins (with validation). |
| src/languages/rego/compiler/function_calls.rs | Updates call resolution/emission to support registered host-await builtins and emit identifier literals. |
| docs/rvm/vm-runtime.md | Documents new VM host-await accessors. |
| docs/rvm/instruction-set.md | Documents registered host-await builtin behavior and resolution order. |
| docs/rvm/architecture.md | Describes explicit vs registered HostAwait emission paths. |
| bindings/ffi/src/rvm.rs | Adds FFI structs/APIs for compile-with-builtins, response preloading, and suspension JSON accessors. |
| bindings/csharp/Regorus/Rvm.cs | Adds C# wrappers for host-await inspection + response preloading. |
| bindings/csharp/Regorus/Program.cs | Adds CompileFromModules overload supporting host-await builtin registrations; refactors compile paths. |
| bindings/csharp/Regorus/NativeMethods.cs | Adds P/Invoke declarations and FFI struct for host-await builtins + new VM functions. |
| bindings/csharp/Regorus/ModuleMarshalling.cs | Generalizes UTF-8 string pinning helper and adds host-await builtin marshalling. |
| bindings/csharp/Regorus/Compiler.cs | Introduces public HostAwaitBuiltin struct for registration. |
| bindings/csharp/Regorus.Tests/RvmProgramTests.cs | Adds suspendable + run-to-completion tests for registered host-await builtins. |
| bindings/csharp/README.md | Adds usage examples for registered host-await in both execution modes. |
| bindings/csharp/API.md | Documents new/expanded Program/Rvm/HostAwaitBuiltin APIs. |

Comment on lines +708 to +713
values.push_back(val);
}
}

guard.set_host_await_responses(core::iter::once((id_value, values)));
Ok(())

Copilot AI Apr 9, 2026

regorus_rvm_set_host_await_responses calls RegoVM::set_host_await_responses, which clears the entire host-await response map. As a result, calling this function multiple times (e.g., to preload responses for multiple identifiers) will drop previously set responses for other identifiers, making run-to-completion with multiple different host-await builtins impossible via this API. Consider changing the VM API usage to clear/replace only the queue for the specified identifier (or provide an FFI that accepts multiple identifiers in one call).
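
The difference between the two behaviours discussed in this comment can be shown with a small stand-alone model. `ResponseStore`, `set_all`, `set_for`, and `pending` are hypothetical names chosen for illustration; `set_all` imitates the clear-then-insert semantics the comment attributes to set_host_await_responses, and `set_for` is one possible per-identifier alternative, not an existing VM method.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Default)]
pub struct ResponseStore {
    queues: HashMap<String, VecDeque<String>>,
}

impl ResponseStore {
    /// Replace-all semantics: every previously queued identifier is dropped first.
    pub fn set_all<I>(&mut self, entries: I)
    where
        I: IntoIterator<Item = (String, Vec<String>)>,
    {
        self.queues.clear();
        for (identifier, values) in entries {
            self.queues.insert(identifier, values.into());
        }
    }

    /// Per-identifier semantics: only the named queue is replaced.
    pub fn set_for(&mut self, identifier: &str, values: Vec<String>) {
        self.queues.insert(identifier.to_string(), values.into());
    }

    /// Number of responses still queued for an identifier.
    pub fn pending(&self, identifier: &str) -> usize {
        self.queues.get(identifier).map_or(0, |q| q.len())
    }
}
```

Calling `set_all` once per identifier leaves only the last identifier populated, which is the failure mode described above; either a single `set_all` call carrying every identifier or repeated `set_for` calls avoids it.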

Comment on lines +224 to +227
/// Clears any previously configured responses, then queues the
/// provided values for the given identifier.
/// </summary>
/// <param name="identifier">The builtin identifier.</param>

Copilot AI Apr 9, 2026

The XML doc says this method "clears any previously configured responses" but the underlying FFI currently clears all identifiers’ queues, not just the specified identifier. Either clarify in the docs that this overwrites the entire response set, or adjust the native API so callers can preload responses for multiple identifiers without losing earlier ones.

Suggested change
- /// Clears any previously configured responses, then queues the
- /// provided values for the given identifier.
- /// </summary>
- /// <param name="identifier">The builtin identifier.</param>
+ /// Replaces the entire previously configured HostAwait response set,
+ /// then queues the provided values for the given identifier.
+ /// This method does not preserve responses queued for other identifiers.
+ /// </summary>
+ /// <param name="identifier">The builtin identifier whose responses will be queued after the reset.</param>

Comment on lines +207 to +210
buffer[i] = new RegorusHostAwaitBuiltin
{
name = namePinned.Pointer,
arg_count = (UIntPtr)builtins[i].ArgCount,

Copilot AI Apr 9, 2026

Casting builtins[i].ArgCount (an int) to UIntPtr will reinterpret negative values as very large unsigned values. Consider validating ArgCount is non-negative (and within a reasonable range) before casting, or changing HostAwaitBuiltin.ArgCount to an unsigned type to match the FFI surface.

Suggested change
- buffer[i] = new RegorusHostAwaitBuiltin
- {
- name = namePinned.Pointer,
- arg_count = (UIntPtr)builtins[i].ArgCount,
+ var argCount = builtins[i].ArgCount;
+ if (argCount < 0)
+ {
+ throw new ArgumentOutOfRangeException(nameof(builtins), $"Host await builtin at index {i} has a negative {nameof(HostAwaitBuiltin.ArgCount)}.");
+ }
+ buffer[i] = new RegorusHostAwaitBuiltin
+ {
+ name = namePinned.Pointer,
+ arg_count = (UIntPtr)argCount,
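
The same signed-to-unsigned hazard exists on the Rust side of the boundary and can be demonstrated in a few lines. Both function names below are hypothetical; the sketch only contrasts an unchecked `as` cast with a checked conversion via the standard `TryFrom` trait.

```rust
use std::convert::TryFrom;

/// Unchecked cast: a negative count silently becomes a very large unsigned value.
pub fn wrapping_arg_count(arg_count: i32) -> usize {
    arg_count as usize
}

/// Checked conversion: negative counts are rejected with an error message.
pub fn checked_arg_count(arg_count: i32) -> Result<usize, String> {
    usize::try_from(arg_count)
        .map_err(|_| format!("arg_count must be non-negative, got {arg_count}"))
}
```

With these definitions `wrapping_arg_count(-1)` yields `usize::MAX`, while `checked_arg_count(-1)` returns an error, which is the behaviour the suggested C# validation aims for.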


result := kv_store(input.key, input.value)
query: data.demo.result
# Registration panics because arg_count must be 1.

Copilot AI Apr 9, 2026

Comment says registration "panics", but the compiler path returns a compilation error (want_error) rather than panicking. Adjust the comment to reflect that this is an expected compile-time failure.

Suggested change
- # Registration panics because arg_count must be 1.
+ # Registration is rejected because arg_count must be 1.
