Research Agenda — Functional Consciousness

Project ARIA: Agent World Simulation

A project proposal for an LLM-native cognitive architecture that maximizes Functional Consciousness (FC), together with a multi-agent simulation testbed, built on top of Nanobot.


Theoretical Development

Deepening the relationship with consciousness theories

FC captures the functional substrate that major consciousness theories treat as necessary for consciousness: Integrated Information Theory (IIT), Global Workspace Theory (GWT), Higher-Order Thought (HOT) theories, Predictive Processing, and Attention Schema Theory. A forthcoming companion paper will work through, theory by theory, what FC covers and what “sticks out” beyond its scope, and will include a formal derivation of FC's engineering analogue of integrated information (Φ_FCS). This is not a claim that FC resolves the hard problem (it does not); it is a precise mapping of where FC's functional commitments overlap with, and diverge from, each theoretical tradition.

Formal derivation of the cross-model aggregation formula

The current multiplicative formula for reasoning power under perfect cross-reasoning (P_agent = ∏_j P(m_j)) is acknowledged on the updates page as lacking a formal derivation and violating slice-invariance. A short technical paper will derive the correct formula from first principles using joint predictive mutual information, proving slice-invariance and establishing the boundary conditions under which the approximation holds.

Metric Validation

A metric proposal is not a validated metric. FC requires a systematic validation program before its scores can be interpreted with confidence. The following studies are needed, roughly in order of urgency and tractability:

Uncertainty and sensitivity analysis [low effort, high priority]

Systematically varying the key parameters of the FCS calculation across all benchmark systems to establish confidence intervals and minimum detectable differences. The current benchmark table uses point estimates; a revised version with uncertainty ranges is needed for scientific credibility.
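
As a minimal sketch of what such an analysis could look like, the Python snippet below propagates parameter uncertainty through a scoring function by Monte Carlo sampling and reports a 95% percentile interval. The function `toy_score`, its parameters `weight` and `coverage`, and their ranges are placeholders invented for illustration. They are not the FCS formula, which would have to be substituted in.

```python
import random
import statistics


def score_interval(score_fn, param_ranges, n_samples=2000, seed=0):
    """Sample each parameter uniformly from its range and return
    (2.5th percentile, median, 97.5th percentile) of the resulting scores."""
    rng = random.Random(seed)  # fixed seed so the analysis is reproducible
    scores = []
    for _ in range(n_samples):
        params = {
            name: rng.uniform(lo, hi) for name, (lo, hi) in param_ranges.items()
        }
        scores.append(score_fn(**params))
    scores.sort()
    lower = scores[int(0.025 * (n_samples - 1))]
    upper = scores[int(0.975 * (n_samples - 1))]
    return lower, statistics.median(scores), upper


def toy_score(weight, coverage):
    # Placeholder scoring function (assumption): NOT the real FCS calculation.
    return weight * coverage


# Hypothetical parameter ranges for one benchmark system.
low, mid, high = score_interval(
    toy_score, {"weight": (0.8, 1.2), "coverage": (0.4, 0.6)}
)
print(f"score = {mid:.3f} (95% interval {low:.3f} to {high:.3f})")
```

Running the same procedure for every benchmark system would give one interval per system. A crude first reading of "minimum detectable difference" is then whether two systems' intervals overlap, although a proper analysis would need a more careful statistical treatment.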

Variable individuation protocol [medium effort, high priority]

A formal checklist specifying exactly how to identify, bound, and count self-model variables in a white-box system. Currently the largest reproducibility gap in the metric.

Inter-rater reliability of FSMA [medium effort]

Having multiple independent analysts apply FSMA to the same text using the same framework, and computing inter-rater agreement statistics. A necessary precondition for FSMA being treated as a reproducible scientific methodology.
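
For the two-analyst case, one standard agreement statistic is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a plain-Python illustration. The label scheme (1 = "self-model identified", 0 = "not identified") and the example ratings are invented for demonstration and do not come from any actual FSMA study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of ten text passages by two analysts.
analyst_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
analyst_2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(analyst_1, analyst_2)
print(f"kappa = {kappa:.3f}")
```

With more than two analysts, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the more natural choice.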

Criterion validity study [high effort, most decisive]

Designing a battery of tasks specifically requiring self-modeling capacity and testing whether FCS predicts performance on these tasks independently of general capability measures. This is the study that would transform FC from a theoretically motivated proposal into an empirically validated instrument.
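
One simple way to state "predicts performance independently of general capability" statistically is a partial correlation: the correlation between FCS and task performance after the linear effect of a general capability score has been removed from both. The snippet below is a sketch under that assumption. The variable names and all numbers are hypothetical, and a real study would likely use regression with several control variables and significance testing.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical data for six systems (illustration only, not measured values).
fcs = [0.2, 0.5, 0.4, 0.8, 0.6, 0.9]              # FCS per system
capability = [50, 55, 70, 65, 80, 85]             # general capability score
task = [0.30, 0.45, 0.50, 0.70, 0.65, 0.85]       # self-modeling task score

r_partial = partial_correlation(fcs, task, capability)
print(f"partial correlation (FCS, task | capability) = {r_partial:.3f}")
```

A partial correlation clearly above zero would indicate that FCS carries predictive information beyond general capability, whereas a value near zero would suggest that it does not.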

FSMA Methodology

Cross-text and cross-framework validation

The current FSMA demonstration uses a single text (Virginia Woolf) analyzed through a single framework (SBR). Applying FSMA to additional texts (Descartes' Meditations, published Descriptive Experience Sampling transcripts, Anne Frank's diary) and through additional frameworks (BDI, Metzinger's PSM, GWT's vocabulary) would test whether the self-model catalog is genuinely source- and framework-independent.

FSMA applied to AI reasoning traces

The most direct extension of FSMA is applying it to its actual target domain: AI systems. Published chain-of-thought reasoning traces and agentic system logs provide behavioral evidence from which self-models can be abductively inferred. This would close the loop from literary proof-of-concept to operational AI evaluation tool.

Annotation tooling

A structured software tool guiding analysts through the FSMA process — presenting self-model candidates systematically, requiring explicit justification for each identification, and computing inter-rater agreement automatically — would lower the barrier for independent replication and distributed contribution.

Application and Positioning

FC as a component of AGI evaluation

Current AGI benchmarks measure what systems know and can do. FC measures what systems know about themselves. As agentic AI becomes more autonomous, self-modeling capacity becomes a safety-relevant property that existing benchmarks do not capture. A position paper proposing FCS as a standard evaluation component — alongside capability benchmarks — is in preparation.

FC-based design guidelines

A practical engineering document translating the self-model catalog into architectural design decisions: given a target application, which self-models are required? Which are prerequisite for which others? This turns FC from an evaluation instrument into a design tool.
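
The prerequisite question can be pictured as a dependency graph over self-models. The sketch below assumes such a graph exists and uses Python's standard `graphlib` module to list everything a target self-model requires, in an order that respects the prerequisites. The names `model_a` to `model_d` and their dependencies are purely hypothetical placeholders, not entries from the actual self-model catalog.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each key requires the self-models in its set.
PREREQUISITES = {
    "model_a": set(),
    "model_b": {"model_a"},
    "model_c": {"model_a"},
    "model_d": {"model_b", "model_c"},
}


def required_self_models(targets, prerequisites):
    """Return the targets plus all transitive prerequisites, in build order."""
    needed = set()
    stack = list(targets)
    while stack:  # walk the prerequisite graph from the targets outward
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(prerequisites.get(name, ()))
    # Restrict the graph to the needed nodes and sort prerequisites first.
    subgraph = {name: prerequisites.get(name, set()) & needed for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


# Which self-models does a (hypothetical) application needing model_d require?
print(required_self_models(["model_d"], PREREQUISITES))
```

`TopologicalSorter` raises `graphlib.CycleError` if the prerequisite map contains a cycle, which would itself flag an inconsistency in the design guidelines.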

Get Involved

These research directions are open for collaboration. They span philosophy of mind, cognitive science, AI engineering, and formal measurement theory — and several are well-scoped for bachelor's or master's theses. If you are interested, please contact fraber@fraber.de.
