What are the main challenges and open difficulties for Functional Consciousness?

The Short Answer

FC trades one set of hard problems for another. It sidesteps the metaphysical implications and the mathematical intractability of theories like IIT by anchoring on observable behavior and information theory, but this move introduces its own measurement difficulties: how to correctly and completely identify a system's self-models, how to draw clean boundaries between them, and how to distinguish reasoning outputs that genuinely predict the future from outputs that merely look like reasoning. These are hard problems, but they are engineering problems rather than conceptual dead ends, and several of them have principled partial solutions. Ongoing revisions and technical issues are tracked on the Updates & Errata page.

The Longer Discussion

1. Self-model identification: completeness

The FCS depends on enumerating a system's self-models — but there is no procedure that guarantees you have found all of them. A system may possess latent self-models that are never selected by the current attention mechanism, or self-models that only emerge under conditions not present in the evaluation. The 46-model SBR catalog introduced in this paper is a useful starting point derived from a single text (Virginia Woolf's The Mark on the Wall), not a closed ontology. The paper explicitly acknowledges that "choosing different base texts will naturally yield different catalogs of self-models."

A score computed over an incomplete enumeration is a lower bound, not a true FCS. In practice this means FC scores should be reported with an explicit accounting of which self-model domains were examined — as the paper does when it describes the Waymo evaluation in terms of specifically kinematic, actuator, and task/trajectory self-models, and not others. The problem is analogous to incomplete feature coverage in software testing: systematic coverage is vastly better than none, and the catalog is explicitly designed to grow through scientific discourse, but completeness cannot be assumed.
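As a minimal sketch of why a partial enumeration yields a lower bound, assume the additive, no-cross-reasoning case and non-negative per-model scores. The numeric values and the `latent_social` entry below are hypothetical illustrations, not figures from the paper.

```python
# Hypothetical per-self-model scores in bits; values are illustrative only.
full_catalog = {
    "kinematic": 12.0,
    "actuator": 7.5,
    "task_trajectory": 20.0,
    "latent_social": 4.0,  # assumed self-model the evaluation never surfaced
}

def additive_fcs(scores, examined):
    """Sum the scores of the self-models that were actually examined."""
    return sum(scores[name] for name in examined)

examined = ["kinematic", "actuator", "task_trajectory"]
reported = additive_fcs(full_catalog, examined)
true_total = additive_fcs(full_catalog, full_catalog)

# With non-negative scores, omitting a self-model can only lower the total.
assert reported <= true_total
print(f"reported {reported} bits over {examined}; full catalog gives {true_total}")
```

Publishing the `examined` list next to the score is what lets a reader interpret the number as a bound over named domains rather than as a complete FCS.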

2. Self-model boundary: separability and the slicing problem

Self-models overlap. The paper directly acknowledges this: "a robot's spatial self-model may be indistinguishable from its body model or an external 3D map." Where one self-model ends and another begins is often not crisp, and different evaluators applying different taxonomies may produce different decompositions of the same system.

FC partially mitigates this through the additive structure of the no-cross-reasoning case: if boundaries are drawn differently but domain coverage is equivalent, the total score FCS_agent = ∑_j FCS(m_j) should converge to approximately the same value, since the underlying predictive information is fixed by the system's actual states. The score is, in this limit, invariant to how the pie is sliced as long as all the slices are included.

However, this reassurance only holds cleanly at the no-cross-reasoning extreme. In the cross-reasoning case — which is where the most interesting and highest-scoring systems live — reasoning power becomes a product rather than a sum: P_agent = ∏_j P(m_j). Here, how you draw the boundaries between self-models can affect the product structure, and therefore the total score. This is an area where FC's current formalization is principled but requires more careful operationalization in the cross-reasoning regime.
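The contrast between the two regimes can be illustrated with a toy calculation. This sketch assumes, purely for illustration, that each slice's value is the sum of some underlying "atomic" contributions; the paper does not specify how P decomposes inside a slice, so the `atoms` values and both partitions are hypothetical.

```python
from math import prod

# Hypothetical atomic contributions; how they combine inside a slice is an assumption.
atoms = {"a": 2.0, "b": 3.0, "c": 4.0}

def slice_totals(partition):
    """Value of each slice, assumed here to be the sum of its atoms."""
    return [sum(atoms[key] for key in group) for group in partition]

# Two evaluators draw the boundaries differently over the same coverage.
partition_1 = [("a", "b"), ("c",)]
partition_2 = [("a",), ("b", "c")]

additive_1 = sum(slice_totals(partition_1))
additive_2 = sum(slice_totals(partition_2))
product_1 = prod(slice_totals(partition_1))
product_2 = prod(slice_totals(partition_2))

print("additive:", additive_1, additive_2)  # same total under both slicings
print("product: ", product_1, product_2)    # total depends on the slicing
```

Under these assumptions the additive total is the same for both partitions, while the product differs, which is the sensitivity the cross-reasoning regime has to address.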

3. Reasoning quality: separating signal from noise

FC's reasoning power term P measures state-space expansion under inference: specifically, the growth of the conclusion manifold, meaning the distinguishable and reachable states produced through reasoning. But a system can expand its output state-space dramatically while doing very little genuine prediction. Fluency, verbosity, and confident confabulation all increase apparent output without improving predictive accuracy.

FC's definition, grounded in Bialek et al.'s predictive information framework, requires that P capture conclusions that actually reduce uncertainty about future states — not merely conclusions that are numerous or syntactically complex. The paper operationalizes this through information-theoretic state-space expansion, but measuring this in practice — especially for black-box systems — requires a ground-truth model of future states against which predictions can be evaluated. For the Waymo taxi, this is tractable (MPC and Monte Carlo simulations over a known physical state-space). For LLMs or humans evaluated through FSMA, it requires behavioral proxies that are necessarily coarser. This is an area where FC's intent is clear but its black-box operationalization remains an open research problem.
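To make the distinction concrete, here is a minimal sketch that scores outputs by their empirical mutual information with a known future state rather than by how many distinct outputs they contain. The two toy systems, the `futures` sequence, and the function name are hypothetical; this is a plug-in estimator on made-up data, not the paper's measurement procedure.

```python
from collections import Counter
from math import log2

def mutual_information_bits(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    return sum(
        (count / n) * log2((count / n) / ((left[x] / n) * (right[y] / n)))
        for (x, y), count in joint.items()
    )

# Hypothetical ground-truth future states.
futures = ["stop" if i % 2 == 0 else "go" for i in range(100)]

# A terse system whose output matches the future state.
accurate_outputs = list(futures)

# A fluent system with five distinct outputs that ignore the future state.
fluent_outputs = [f"elaborate conclusion {(i // 2) % 5}" for i in range(100)]

accurate_bits = mutual_information_bits(list(zip(accurate_outputs, futures)))
fluent_bits = mutual_information_bits(list(zip(fluent_outputs, futures)))

print(len(set(accurate_outputs)), "distinct outputs,", accurate_bits, "bits")
print(len(set(fluent_outputs)), "distinct outputs,", fluent_bits, "bits")
```

In this toy setup the fluent system has the larger output space but carries no information about the future, which is the gap a ground-truth model of future states is needed to expose.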

4. Measuring reasoning power: the P operationalization problem

Even setting aside the signal-vs-noise problem, measuring state-space expansion requires knowing the system's state before and after a reasoning cycle. The paper is explicit about the methodological boundary here: for white-box agents (Waymo, Roomba, ACT-R), P can be calculated from architectural specifications. For black-box systems — Generative Agents, LLMs, humans — FSMA can only estimate P through behavioral proxies, yielding "conceptual breadth B rather than absolute, rigorous scores." The radar charts introduced for black-box benchmarking represent this coarser mode of analysis explicitly.

This is a limitation FC shares with most empirical approaches to cognition in systems with opaque internals. The paper's response is honest: these estimates are described as order-of-magnitude approximations with wide confidence intervals, intended to demonstrate the framework's discriminatory power rather than to produce precise constants. The companion repository is where tighter estimates, produced by domain experts with architectural access, would live.

5. The attention mechanism: black box or mutual information filter?

FC relies on an attention mechanism to select which self-model is currently active and make its contents available to global reasoning. A natural objection is that this mechanism is simply assumed — an unexplained precondition that FC borrows without grounding. On closer inspection, however, the objection is weaker than it appears.

Research on attention consistently finds that it is not allocated arbitrarily: it is drawn to signals that carry high information relative to the system's current predictive model — stimuli that are surprising but learnable. This is the core insight behind saliency models (Itti & Koch), precision-weighted prediction error in active inference (Friston), and curiosity-driven learning in information-theoretic frameworks (Schmidhuber, Bialek et al.). Attention, in other words, operates as a mutual information filter: a self-model is illuminated when its current state carries high mutual information with the system's future states, or when incoming signals deviate substantially from that self-model's predictions.

This is not incidental to FC — it is continuous with FC's own foundations. The same predictive information framework used to define R(m) also provides a principled account of why attention selects one self-model over another at a given moment. The attention filter need not be treated as a black box bolted onto FC from outside; it can be understood as the system's ongoing computation of where the highest-value self-model update currently lies.

The honest qualification is that this account is better established for perceptual attention — saliency, orienting responses, surprise-driven focus — than for metacognitive attention specifically, where the selection is between internal self-models rather than external stimuli. Extending the mutual information account fully to metacognitive selection remains an open empirical question, but the theoretical continuity is already there.
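The surprise-driven part of this account can be pictured as a simple selection rule. The sketch below assumes each self-model exposes the probability it assigned to what was actually observed, and selects the model whose prediction was most violated. The probabilities and function names are hypothetical, and the rule ignores the "learnable" half of the criterion; it is an illustration, not the mechanism specified in the paper.

```python
from math import log2

def surprisal_bits(probability):
    """Surprisal of an observation that a model assigned this probability."""
    return -log2(probability)

def select_self_model(assigned_probabilities):
    """Return the self-model whose prediction deviated most from the observation."""
    return max(
        assigned_probabilities,
        key=lambda name: surprisal_bits(assigned_probabilities[name]),
    )

# Hypothetical probability each self-model gave to the incoming signal.
assigned = {"kinematic": 0.9, "actuator": 0.05, "task_trajectory": 0.5}

selected = select_self_model(assigned)
print(selected, round(surprisal_bits(assigned[selected]), 2), "bits of surprise")
```

Whether an analogous rule governs selection among internal self-models, rather than among external stimuli, is exactly the open empirical question noted above.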

6. Cross-system comparability

The FCS is internally consistent: a higher score within a given system means richer self-models and more powerful reasoning over them. It is less clear that absolute FCS values are directly comparable across radically different architectures. The paper itself is forthright about this: scores are presented as "order-of-magnitude estimates to illustrate the discriminatory power of the FCS metric" rather than precise cross-system constants, and the evaluations of different systems rest on different boundary assumptions, with confidence intervals of roughly plus or minus one order of magnitude.

Whether the "bits" of predictive mutual information in a biological neural system are commensurable with those estimated from transformer output distributions is an open question. For now, cross-system comparisons should be treated as revealing broad differences in scale and cognitive shape — which is already a significant advance over purely qualitative comparison — rather than as precise numerical claims.
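A short sketch shows what intervals of roughly one order of magnitude imply for comparison: two estimates are only clearly separated when they differ by more than two orders of magnitude. The scores used here are hypothetical placeholders, and the non-overlap rule is one simple reading of "broad differences in scale", not a criterion taken from the paper.

```python
from math import log10

def interval(estimate, orders=1.0):
    """Interval spanning `orders` orders of magnitude on each side of the estimate."""
    return (estimate / 10**orders, estimate * 10**orders)

def clearly_different(a, b, orders=1.0):
    """True when the two intervals do not overlap."""
    return abs(log10(a) - log10(b)) > 2 * orders

# Hypothetical scores for three systems.
system_a, system_b, system_c = 1e3, 1e4, 1e9

print(interval(system_a))
print(clearly_different(system_a, system_b))  # one order apart: intervals overlap
print(clearly_different(system_a, system_c))  # six orders apart: clearly separated
```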

7. The scope of FSMA inferences: how diverse is diverse enough?

FSMA infers the presence of a self-model from behavioral evidence using an abductive logic: if a system consistently produces outputs that can only be explained by positing a self-model as a necessary precondition, then that self-model exists functionally. If the system consistently computes the correct output across diverse inputs, the functional model is present by definition.

This is a principled and deliberate move, grounded in functionalism and access consciousness. But it shifts the difficulty rather than eliminating it: the inference is only as strong as the diversity of the behavioral evidence on which it rests. "Consistently produced across diverse outputs" is the operative phrase in Definition 4 — and how diverse is diverse enough is not yet formally specified.

A system might produce correct self-referential outputs across all observed cases while actually relying on a much shallower representation that happens to generalize within the test distribution but would fail outside it. FSMA's identification of the minimal self-model required for the observed outputs is designed to guard against overclaiming, but the inverse problem — underclaiming because the test inputs were not sufficiently varied — is harder to guard against. This is not a flaw in FSMA's logic; it is an inherent challenge of abductive inference from finite behavioral evidence, and it is what makes the design of evaluation suites (like the stream-of-consciousness corpus) a non-trivial methodological choice. Developing principled criteria for input diversity — analogous to coverage criteria in software testing — is an important open research direction for FSMA.
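By analogy with coverage reports in software testing, an evaluation suite could state which input conditions its probes actually exercised. The sketch below is one hypothetical way to do that; the condition names, the probe structure, and the function are assumptions made for illustration, and no such criterion is defined in the paper.

```python
def diversity_coverage(required_conditions, probes):
    """Return (fraction of required conditions exercised, conditions still missing)."""
    required = set(required_conditions)
    exercised = set()
    for probe in probes:
        exercised.update(probe["conditions"])
    covered = required & exercised
    return len(covered) / len(required), sorted(required - covered)

# Hypothetical input conditions an evaluator wants the suite to span.
required_conditions = [
    "novel_context",
    "adversarial_phrasing",
    "long_horizon",
    "conflicting_cues",
]

# Hypothetical probes, each tagged with the conditions it exercises.
probes = [
    {"id": "p1", "conditions": ["novel_context"]},
    {"id": "p2", "conditions": ["novel_context", "long_horizon"]},
]

fraction, missing = diversity_coverage(required_conditions, probes)
print(f"coverage {fraction:.0%}; not yet exercised: {missing}")
```

A report like this does not say how much diversity is enough, but it makes visible which conditions an FSMA inference has not yet been tested against.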

How FC's difficulties compare to those of the Big Five

It is worth situating these challenges against the open problems faced by the major theories FC draws on. The comparison is instructive because it clarifies what kind of difficulty FC faces.

Theory	Core open difficulty	Nature of the problem
IIT	Computing Φ is NP-hard in the general case; exponential in the number of elements [1]	Mathematical — there is no known tractable algorithm; FC's Φ_FCS analogue is directly computable for white-box systems
GWT	No agreed operationalization of "global broadcast" outside specific neural architectures	Architectural — the theory is coupled to a substrate FC does not require
HOT	No agreed criterion for when a higher-order representation is "actual" vs. "dispositional"	Conceptual — the theory is underdetermined at its core
PP	The boundary between "active inference" and generic optimization is disputed	Theoretical — the framework is almost too general to falsify
AST	The causal claim (attention schema causes the illusion of experience) resists direct empirical verification	Empirical — the explanatory mechanism is not independently testable
FC	Completeness of self-model enumeration; P operationalization for black-box systems; FSMA input diversity	Measurement — hard in practice, tractable in principle

The key observation is that FC's difficulties are primarily measurement difficulties — problems of empirical access, boundary-drawing, and evaluation design. They are the difficulties of a methodology, not of a theory. IIT's NP-hardness is a mathematical result that no engineering effort will dissolve — and FC's own Φ_FCS analogue is offered precisely because it trades formal minimality for computational directness. HOT's dispositional ambiguity is a conceptual gap at the theory's core. GWT's substrate-dependence is an architectural commitment.

FC's difficulties, by contrast, are the kind that improve with better tools: richer and more diverse evaluation corpora for FSMA, tighter architectural access for white-box scoring, and more refined criteria for what counts as genuinely predictive state-space expansion. That is a research program, not a philosophical impasse, and the confidence intervals FC already reports are an honest reflection of where that program currently stands.
