Simulated · UNU Ad Hoc Consultation Workshop on the Digital Future

Governance of Autonomous Agentic AI Systems

Interoperability, Accountability, and Guidelines for Agent Harnesses

Background Guide for the ReModelUN Pilot Program · Paris–Geneva 2026

Important

This Is Not a Traditional Background Guide

In most Model UN conferences, the Background Guide is an exhaustive document that covers history, context, key actors, past UN actions, and detailed analysis of every sub-topic. Delegates read it, absorb it, and prepare positions based primarily on what the organizers have provided.

ReModelUN takes a deliberately different approach. What you are reading is a concise framing of the workshop topic—the core concepts, the direction of inquiry, and the governance questions we believe matter most. It is intentionally not comprehensive. The gaps are not oversights; they are your research space.

We believe this is a more honest and productive way to prepare for a policy debate on a fast-moving frontier topic. No pre-written guide can keep up with the pace of agentic AI development. What matters is your ability to find, evaluate, synthesize, and apply information from primary sources—and to do so with the help of the most powerful research tools available today.

Your Job: Research

Use this guide as a starting point. Go deeper on your own. Read the technical papers, UN frameworks, and policy analyses linked in the Resources page—and find sources we haven’t listed.

Your Tool: AI

You are actively encouraged to use AI tools—LLMs, frontier agents, retrieval-based research platforms, and any other tools you find useful. The quality of your AI-assisted research process is itself part of the learning.

Pre-Conference Submission Requirement

Before the conference, every delegate must submit a Research Process Document alongside their Position Paper. This document should record how you conducted your research: the AI tools you used, the prompts you wrote, the agent workflows or agent skills you designed, the LLM dialogue traces you found useful, your search strategies, and how you evaluated and compared sources. This documentation is part of the final evaluation because, in ReModelUN, the quality of your reasoning process matters as much as the quality of your conclusions.

I

Why We Designed This Workshop

The UNU Ad Hoc Consultation Workshop on the Digital Future is a fictional body designed specifically for the ReModelUN pilot. It does not exist as a real convening within the United Nations system. However, the problem it addresses is entirely real, and the reason we created this simulation is that the underlying policy challenge is no longer well served by a conventional Model UN format. The issue on the agenda is not only politically sensitive; it is technically fast-moving, operationally complex, and deeply entangled with questions of standards, infrastructure, security, human rights, and institutional legitimacy.

The simulated workshop sits inside the UNU Macau Digital Future Summer Training Program, a flagship initiative by UNU Macau and the Learning Planet Institute. Delegates are not only simulating diplomacy; they are acting as youth policy experts in a high-rigor environment designed to bridge the gap between youth advocacy and institutional policy-making.

The rapid transition from generative AI to agentic AI makes this especially urgent. Unlike passive assistants that merely generate text, agentic systems can plan, call tools, maintain memory, coordinate with other agents, and act across digital environments with limited human intervention. That shift means governance can no longer focus only on model outputs. It must also examine runtime control, tool permissions, evaluation layers, logging, escalation paths, and operational safeguards.

For this reason, the simulated workshop is tasked with exploring standards and policy guidance for agentic harnesses: the socio-technical runtime environments that constrain, monitor, and enable autonomous agents. The exercise moves beyond abstract AI ethics and into a more difficult question: how should global governance respond when the practical control surface of AI increasingly sits in the harness layer rather than in the model alone?

II

The Real-World Pilot Context

The structure of the program matters. Delegates begin with a full-day ReModelUN session in Paris at the Learning Planet Institute and then continue to Geneva for the ITU AI for Good Summit. This sequencing is deliberate. Paris is the site of structured deliberation; Geneva is the site of observation, comparison, and institutional exposure.

That means your preparation should be different from standard MUN preparation. You are not preparing to deliver generic speeches about innovation and ethics. You are preparing to hold up your claims against a live ecosystem of researchers, regulators, implementers, startups, standards actors, and UN-affiliated initiatives working on real deployments of AI.

01

Paris: Deliberation

Workshop negotiation, problem framing, and the first attempt to produce structured policy language at the Learning Planet Institute.

02

Geneva: Observation

Exposure to the AI for Good ecosystem, targeted note-taking, and comparison between workshop assumptions and expert practice.

03

Synthesis

Converting youth deliberation, evidence use, and structured outputs into a UNU-oriented policy artifact for dissemination.

This is why evidence trails, prompt strategies, and research workflows matter. In the ReModelUN logic, the quality of reasoning is not separate from the quality of the output. Delegates are encouraged to document how they searched, compared sources, tested assumptions, and refined claims.

III

The “Harness” Paradigm

From Prompts to Infrastructure

In the context of Agentic AI, a Harness is not a set of rules written in a prompt; it is the runtime environment in which the agent operates. It is the “tack” that connects the raw power of a Large Language Model (LLM) to the real world.

This distinction matters because many public discussions still talk about governing AI as if the main object of control were the model response itself. In practice, frontier labs and developers are increasingly discovering that reliability, safety, and usefulness depend heavily on the architecture around the model: retrieval systems, memory policies, step limits, sandboxing, evaluator agents, permissioning, observability, and human approval gates.

Technical Components of a Harness

Context Assembly

Dynamically filtering what the agent “knows” at any given moment to prevent hallucinations and ensure relevance.

Tool Orchestration

Defining strict boundaries around which APIs and digital tools the agent can call, following “Least Privilege” access principles.

Verification Loops

Automated “evaluator” agents that check the output of “generator” agents before any action is taken in the real world.

Operational Governors

Hard-coded limits on token spend, execution steps, and recursive depth to prevent infinite loops or resource exhaustion.
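To make these four components concrete, here is a minimal, illustrative sketch of a harness loop. Every name and limit here (the `call_model` stub, the tool names, `MAX_STEPS`) is a hypothetical placeholder, not any vendor's actual implementation; real harnesses are considerably more elaborate.

```python
# Illustrative harness loop: context assembly, a least-privilege tool
# allowlist, a verification gate, and a hard operational governor.
# All names and limits are hypothetical placeholders.

MAX_STEPS = 8                                       # operational governor
ALLOWED_TOOLS = {"search_documents", "read_file"}   # least-privilege allowlist

def assemble_context(task, memory, max_items=5):
    """Context assembly: expose only the most relevant memory items."""
    relevant = [m for m in memory if task.lower() in m.lower()]
    return relevant[:max_items]

def verify(proposed_action):
    """Verification loop: an evaluator check before any real-world effect."""
    return proposed_action.get("tool") in ALLOWED_TOOLS

def run_agent(task, memory, call_model):
    """Drive the model through bounded, verified steps."""
    for step in range(MAX_STEPS):            # governor: no unbounded loops
        context = assemble_context(task, memory)
        proposed = call_model(task, context)  # e.g. {"tool": ..., "args": ...}
        if proposed.get("done"):
            return proposed.get("answer")
        if not verify(proposed):
            memory.append(f"rejected: {proposed.get('tool')}")
            continue                          # rejected actions never execute
        memory.append(f"executed: {proposed['tool']}")
    return None  # governor tripped: escalate to a human rather than loop on
```

Note how the control surface lives entirely outside the model: the model only proposes actions, and the harness decides whether, and under what limits, they execute.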

Conceptual Model of an Agentic Harness

graph TD
    subgraph HarnessLayer[The Harness Layer]
        Context[Context Assembly]
        Tools[Tool Orchestration]
        Verify[Verification Loop]
    end
    User[User Intent] --> HarnessLayer
    HarnessLayer --> LLM[Foundation Model]
    LLM --> Verify
    Verify -->|Approved| Action[Real World Action]
    Verify -->|Rejected| LLM
IV

Global Governance & the UN Problem Space

The governance challenge is not simply that agentic systems may fail. It is that they may fail through layered infrastructures with distributed responsibility. Models, application developers, integrators, institutions, users, and regulators all shape the final behavior. The harness becomes the operational site where accountability can either become more concrete or more diffuse.

From a UN and UNU perspective, the key question is how to make these infrastructures not just efficient, but legible, auditable, contestable, and aligned with public values. Delegates in this simulated workshop should therefore think in terms of governance design, not just safety slogans.

1. Interoperability Standards

How do harnesses from different providers and deployment stacks communicate? Without interoperability, we risk fragmented “agentic silos” where policy rules, audit methods, and safety signals cannot travel across platforms. Interoperability is therefore both a technical standards problem and a governance coordination problem.

2. The Accountability Gap

When an agent makes a harmful or consequential decision, who is answerable: the model developer, the application team, the harness engineer, the deploying institution, or the end user? The workshop should define a chain of provenance for agentic actions so that responsibility is not lost between model output and downstream execution.
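One way to make such a chain concrete is a structured provenance record attached to every agentic action. The fields below are illustrative assumptions only, not an existing standard; a real schema would have to be negotiated across model developers, harness operators, and deployers.

```python
from dataclasses import dataclass, field
import datetime

@dataclass
class ProvenanceRecord:
    """Illustrative audit-trail entry for one agentic action.

    All field names here are hypothetical, sketched to show how
    responsibility could stay traceable from model proposal to execution.
    """
    action_id: str
    model_id: str       # which foundation model produced the proposal
    harness_id: str     # which harness mediated it
    deployer: str       # the institution operating the system
    tool_called: str
    verified_by: str    # evaluator agent or human reviewer
    approved: bool
    timestamp: str = field(
        default_factory=lambda: datetime.datetime.now(
            datetime.timezone.utc).isoformat())

def accountability_chain(record: ProvenanceRecord) -> list[str]:
    """Every actor who shaped this action, in execution order."""
    return [record.model_id, record.harness_id,
            record.verified_by, record.deployer]
```

If every harness emitted records like this in an interoperable format, responsibility would no longer be lost between model output and downstream execution, which is exactly the gap the sub-issue names.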

3. Alignment with UN Values

Harnesses should incorporate human-in-the-loop or human-on-the-loop requirements for high-stakes domains affecting rights, safety, due process, public services, peace, and security. But the workshop should also ask a harder question: where must human intervention be mandatory, and where is structured oversight sufficient without rendering the system unusable?
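A tiered policy is one hedged way to frame this trade-off. The sketch below maps action domains to oversight modes; the tier assignments and domain names are purely illustrative assumptions for debate, not a proposed standard.

```python
# Illustrative tiered oversight policy: which actions need a human
# *in* the loop (pre-approval), *on* the loop (monitored, interruptible),
# or only automated checks. Tier assignments here are hypothetical.

HIGH_STAKES = {"medical_triage", "benefit_denial", "use_of_force"}
MEDIUM_STAKES = {"financial_transfer", "content_takedown"}

def oversight_mode(action_domain: str) -> str:
    if action_domain in HIGH_STAKES:
        return "human-in-the-loop"    # mandatory pre-approval
    if action_domain in MEDIUM_STAKES:
        return "human-on-the-loop"    # monitored, interruptible
    return "automated"                # evaluator agents suffice
```

The policy question for delegates is then precise: which domains belong in each tier, who decides, and whether the tiers themselves should be harmonized internationally or left to regional variation.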

V

Workshop & Topic Details

For the purposes of this simulation, delegates convene as the UNU Ad Hoc Consultation Workshop on the Digital Future—a fictional body designed to address frontier issues in technology governance. This workshop does not exist in the real UN system, but its mandate mirrors genuine governance gaps. Within the simulation, the ad hoc workshop is empowered to move rapidly between technical standards and high-level policy recommendations.

Three Sub-Issues

A

Interoperability

Establishing common protocols so that safety signals and audit trails can be shared across different agentic platforms. How do we prevent fragmentation while respecting commercial competition and national sovereignty?

B

Accountability

Defining the legal and operational responsibility for actions taken by autonomous agents, especially in cross-border contexts. Who is liable when an AI agent operating in one jurisdiction causes harm in another?

C

Guidelines for Agentic Harnesses

Creating a “Gold Standard” for the runtime environments that constrain agentic AI, ensuring they are secure by design, interoperable, and aligned with human rights and UN values.

Workshop Logistics

Rules of Procedure

Modified UNA-USA rules with an emphasis on technical consensus. See MUN 101 for procedural details.

Speakers List

Delegates will be added to the primary speakers list upon request. Speaking time is set at 90 seconds.

Working Papers

Papers must be sponsored by at least 3 delegates and signed by 5 to be introduced for debate.

Final Output

The workshop aims to adopt a single, comprehensive UNU Policy Brief by consensus.

VI

Guiding Questions for Delegates

As you prepare, consider these questions to deepen your analysis and shape your country’s position:

On Interoperability: Should there be a universal standard for how agent harnesses share safety signals, or should regional variation be allowed?

On Accountability: If an autonomous agent causes financial harm while operating across three jurisdictions, which country’s courts should have jurisdiction?

On Harness Design: Should governments certify or audit agent harnesses the way they audit financial institutions or medical devices?

On Human Oversight: Where is human-in-the-loop control essential, and where is automated oversight sufficient? How do you balance safety with usability?

On Development: How should governance frameworks differ between wealthy nations with advanced AI ecosystems and developing countries seeking to adopt these technologies?

On Innovation: How do you prevent governance from stifling beneficial innovation while still protecting against catastrophic risks from ungoverned agentic systems?

VII

What Your Output Should Be

Your goal in this workshop is not to pass a generic ceremonial resolution. You are working toward a UNU-style policy brief: a short, evidence-aware, policy-facing document that clearly defines the problem, identifies governance options, and proposes actionable recommendations for institutions.

Evidence-Based

Use traceable sources, standards, and concrete governance mechanisms rather than broad claims about innovation or ethics.

Actionable

Propose harness-oriented recommendations that an engineer, regulator, or standards body could actually implement.

Future-Proof

Address how governance should evolve as systems move from narrow assistants to more autonomous and networked agents.

VIII

Documenting Your Research Process

ReModelUN treats the research process as a first-class deliverable. In traditional MUN, what matters is the speech you give and the resolution you help pass. Here, we also want to understand how you arrived at your positions—what you searched for, how you used AI, what worked, what failed, and how you refined your approach.

This is not busywork. It is a core part of the ReModelUN research design. UNU Macau is studying how young participants learn, reason, and collaborate with AI. Your research documentation contributes to that study and helps us understand the real literacy frontier.

What to Document

AI Tools & Platforms

Which LLMs, agents, or retrieval tools did you use? (e.g., ChatGPT, Claude, Perplexity, custom agents, RAG pipelines). Note the version or model when possible.

Prompts & Queries

Record the key prompts you used. What did you ask? How did you refine your questions when results were unsatisfactory? Include examples of effective and ineffective prompts.

Agent Workflows & Skills

If you used agentic tools—multi-step workflows, agent skills, tool-calling agents, or multi-agent setups—describe the architecture. What did the agent do? What tools did it call? How did you constrain or direct it?

Source Evaluation

How did you verify AI-generated claims? Did you cross-reference with primary sources? Note instances where AI was wrong or misleading and how you caught it.

Submission & Evaluation

Your Research Process Document should be submitted alongside your Position Paper before the conference begins. There is no strict format requirement—it can be a structured report, an annotated log, or even a curated collection of screenshots and notes—as long as it is honest, specific, and reflective.

This document will be reviewed as part of the overall delegate evaluation. We are not looking for perfection. We are looking for genuine engagement with the research process: evidence that you explored, experimented, made choices, and learned something about how to use AI as a serious research tool for policy work.