Call for Papers
We welcome submissions from across domains that engage with the question: how should we evaluate the understanding achieved by AI methods? Across all topics, we welcome papers that assess the validity of existing evaluation methods or propose new ones. Topics of interest include (but are not limited to):
Mechanistic interpretability as a way to measure model understanding (e.g. methods applicable to LLMs, code models, image models, video models, etc.).
Theory-driven evaluation metrics and frameworks (e.g. motivated by causal inference) for assessing a model's understanding of its domain.
Human-computer interaction (HCI) perspectives on evaluating world models (e.g. human-in-the-loop evaluation).
Evaluating understanding in reinforcement learning agents (e.g. by assessing whether agents' internal representations encode real-world mechanisms).
Cognitive science-inspired evaluations (e.g. drawing upon insights from human cognitive psychology and neuroscience).
Applications in scientific domains (e.g. measuring whether foundation models trained on physics data recover physical laws).
Benchmarks and datasets developed specifically to assess a generative model's understanding of real-world principles.
Position papers and philosophical perspectives addressing what it means for algorithms to genuinely understand the real world.
Important note on scope: Our workshop is focused on techniques for evaluating understanding. Therefore, submissions focused solely on applying existing evaluation methods, without clearly engaging with their validity or proposing new ways to measure understanding, are outside the scope of the workshop. Examples of submissions that would not fit:
Papers focused on improving predictive performance of models without addressing how these improvements relate to understanding real-world mechanisms.
Reinforcement learning papers about world models that take common evaluation measures (e.g. planning or decision-making abilities) as given, without critically assessing whether these correspond to an understanding of real-world mechanisms.
Papers that use existing metrics to build more coherent world models without engaging with the validity of these metrics or proposing new ones.
Submission instructions
Submission: Submissions can be made via OpenReview here: https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models. We welcome submissions of original, unpublished material, as well as work that is currently under review (i.e. has been submitted but not yet accepted elsewhere). It is okay to submit a paper to multiple workshops.
Page limit: Papers should be up to 4 pages, excluding references and supplementary materials. A fifth page can be added upon acceptance.
Formatting: Please use the style files here.
Double-blind reviews: Authors should anonymize their submissions to ensure a double-blind review process.
Dual-submission policy: Accepted papers will be considered non-archival. We welcome ongoing and unpublished work, and papers accepted at this workshop can be considered for future publication at other conferences. We welcome submissions that are currently under review at other venues, including NeurIPS 2025. While we cannot consider work that has already been accepted for publication at another venue, we will allow papers that are accepted elsewhere after being submitted to our workshop.
Timeline
Submission open: April 16, 2025
Submission deadline: May 20, 2025, AoE
Reviews due: June 3, 2025, AoE
Decision notification: June 9, 2025, AoE