NeurIPS 2026 Workshop

Structuring Reasoning for Interpretability and Control

Bringing together efficient reasoning, mechanistic interpretability, and AI safety to build principled structure into Large Reasoning Models.

Acronym: StRICt
Venue: NeurIPS 2026
Format: Full-day Workshop
Location: Paris, France

The reasoning process is opaque.

Large Reasoning Models generate long, unstructured traces — and we can't see inside them.

Large Reasoning Models (LRMs) solve complex problems by allocating substantial computation before answering. Rather than producing a response directly, they generate a long reasoning trace and condition the final answer on it. These models have achieved strong performance gains in mathematical reasoning, code generation, and multi-step inference.

Yet they introduce a new and largely unaddressed challenge: their reasoning process is unstructured, and thus neither interpretable nor controllable. Models output monolithic sequences of thousands of tokens and do not explicitly segment or annotate reasoning steps during generation, and their internal representations do not expose clean boundaries between reasoning stages.

We argue these challenges share a common root: the absence of principled structure in the reasoning process. Revealing structure enables observation — and observation enables control.

Inefficiency

Reasoning traces are verbose and redundant, leading to substantial computational waste that scales with model capability.

Unguided Exploration

Solution-space exploration is unmonitored and unguided, limiting adaptability and robustness across problem types.

Unfaithful Generation

Generated text is sometimes unfaithful to the model's internal computation, undermining trust in chain-of-thought explanations.

Novel Attack Surface

The unstructured thinking process makes LRMs more vulnerable than standard LLMs to jailbreaking and adversarial manipulation.

Why a workshop? Work on these problems is scattered across ML communities: efficient reasoning, mechanistic interpretability, and AI safety each have their own vocabulary and venues. No single community owns the question of reasoning structure. StRICt is designed to build the shared vocabulary these efforts currently lack, with a program centred on cross-community interaction rather than back-to-back talks.

Topics of Interest

StRICt welcomes submissions exploring the structure of reasoning traces at both the text level and the hidden-state level, and how that structure can enhance the controllability, safety, and efficiency of LRMs.

01 Reasoning Step Segmentation

Step granularity, taxonomies of reasoning behaviours, and evaluation of step-level decomposition methods.

02 Observability of LRM Generation

Online monitoring of reasoning traces, faithfulness of chain-of-thought, detection of redundant or unproductive reasoning.

03 Hidden-State Structure of Reasoning

Probing methods for reasoning dynamics, alignment between text-level and latent-level structure.

04 Structure-Informed Control

Early-exit strategies grounded in reasoning structure, guided exploration, diversity of reasoning search, interpretable exploration.

05 Security & Safety of Reasoning

Online monitoring for unsafe behaviour before output is produced; latent-space probing for adversarial states; monitorability and CoT obfuscation; attack surfaces specific to LRM reasoning traces.

06 Benchmarks & Evaluation

Datasets and metrics for reasoning step quality, control quality, and faithfulness of reasoning chains.

Questions StRICt Will Address

Q1 Representation of computation. How should reasoning traces be segmented? What is the right level of abstraction, and does it align with the model's internal computation?
Q2 Faithfulness of generation. Does the generated text align with the model's internals? Which parts of the reasoning process are most — and least — faithful to the underlying computation?
Q3 Controllability of the reasoning chain. How can we monitor models effectively? Which control methods work best — test-time scaling, activation steering, structured prompting? Does segmentation affect controllability?
Q4 Safety of the reasoning chain. Can unsafe reasoning behaviour be detected before output is produced, using step-level or hidden-state monitoring? Is the reasoning trace a faithful safety signal, or can it be obfuscated? What does robust online monitoring look like under adversarial pressure?

Call for Papers

We invite submissions on all topics listed above. We welcome both novel research contributions and position papers that articulate new perspectives on structuring, interpreting, or controlling LRM reasoning.

July 18, 2026 (AoE): Submission portal opens
August 29, 2026 (AoE): Paper submission deadline
September 29, 2026 (AoE): Author notification
October 20, 2026 (AoE): Camera-ready deadline
NeurIPS'26: Workshop day

Submission Format

  • NeurIPS 2026 paper format
  • Anonymised double-blind review
  • Non-archival proceedings

Tracks

  • Short paper track: 4-page short papers (+ unlimited references)
  • Full paper track: 9-page full papers (+ unlimited references)

Review Criteria

  • Relevance to workshop themes
  • Technical quality and rigour
  • Potential to stimulate cross-community discussion
  • Novelty and impact

Submit via OpenReview

Keynote Speakers

To be announced.

Organizers

Yannis Belkhiter, IBM Research Dublin & Trinity College Dublin
Ibrahim Malik, IBM Research Dublin & Trinity College Dublin
Lisa Alazraki, Imperial College London
Greta Dolcetti, University of Venice
Additional organizers to be announced.

Program Committee Members

To be announced.
