Program (March 12, Wednesday, Executive Theatre)

9:00 - 9:20am Welcome coffee
9:20am Opening Remarks
Timothy Baldwin (MBZUAI)
9:30 am Gradient Equilibrium in Online Learning: Theory and Applications
Michael I. Jordan (UC Berkeley)
We present a new perspective on online learning that we refer to as gradient equilibrium: a sequence of iterates achieves gradient equilibrium if the average of gradients of losses along the sequence converges to zero. In general, this condition neither implies nor is implied by sublinear regret. It turns out that gradient equilibrium is achievable by standard online learning methods such as gradient descent and mirror descent with constant step sizes (rather than decaying step sizes, as is usually required for no regret). Further, as we show through examples, gradient equilibrium translates into an interpretable and meaningful property in online prediction problems spanning regression, classification, quantile estimation, and others. Notably, we show that the gradient equilibrium framework can be used to develop a debiasing scheme for black-box predictions under arbitrary distribution shift, based on simple post hoc online descent updates. We also show that post hoc gradient updates can be used to calibrate predicted quantiles under distribution shift, and that the framework leads to unbiased Elo scores for pairwise preference prediction. Joint work with Anastasios Angelopoulos and Ryan Tibshirani.
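As a toy illustration of the gradient-equilibrium idea (a hand-made sketch under a squared loss, not the speakers' implementation), the following runs post hoc online gradient descent with a constant step size on a scalar offset to debias a stream of systematically biased predictions; the average gradient, here proportional to the average debiased residual, tends to zero:

```python
import random

def debias_online(preds, ys, eta=0.1):
    # Post hoc debiasing: online gradient descent on a scalar offset b with a
    # CONSTANT step size.  Gradient equilibrium means the average gradient
    # (here proportional to the average debiased residual) tends to zero.
    b, grad_sum = 0.0, 0.0
    for f, y in zip(preds, ys):
        resid = y - (f + b)      # debiased residual
        g = -2.0 * resid         # d/db of the squared loss (y - (f + b))**2
        grad_sum += g
        b -= eta * g
    return b, grad_sum / len(ys)

random.seed(0)
# Black-box predictions that are systematically biased low by 1.5.
ys = [random.gauss(0.0, 1.0) for _ in range(5000)]
preds = [y - 1.5 + random.gauss(0.0, 0.5) for y in ys]
b, avg_grad = debias_online(preds, ys)
print(round(b, 2), round(avg_grad, 4))
```

The learned offset hovers near the true bias of 1.5, and no decaying step-size schedule is needed for the average gradient to vanish.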
10:00am Fair Allocation in Dynamic Mechanism Design
Alireza Fallah (UC Berkeley)
We consider a dynamic mechanism design problem where an auctioneer sells an indivisible good to groups of buyers in every round, for a total of T rounds. The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group. We begin by studying the static case (T=1) and establish that the optimal mechanism involves two types of subsidization: one that increases the overall probability of allocation to all buyers, and another that favors the groups which otherwise have a lower probability of winning the item. We then extend our results to the dynamic case by characterizing a set of recursive functions that determine the optimal allocation and payments in each round. Notably, our results establish that in the dynamic case, the seller, on the one hand, commits to a participation bonus to incentivize truth-telling, and on the other hand, charges an entry fee for every round. Moreover, the optimal allocation once more involves subsidization, whose extent depends on the difference in future utilities for both the seller and buyers when allocating the item to one group versus the others. Finally, we present an approximation scheme to solve the recursive equations and determine an approximately optimal and fair allocation efficiently. Based on joint work with Annie Ulichney and Michael I. Jordan.
10:30am Coffee Break
11:00am Anytime-valid off-policy inference for contextual bandits
Ian Waudby-Smith (UC Berkeley)
Contextual bandit algorithms are ubiquitous tools for active sequential experimentation in healthcare and the tech industry. They involve online learning algorithms that adaptively learn policies over time to map observed contexts to actions in an attempt to maximize stochastic rewards. This adaptivity raises interesting but hard statistical inference questions, especially counterfactual ones: for example, it is often of interest to estimate the properties of a hypothetical policy that is different from the logging policy that was used to collect the data -- a problem known as "off-policy evaluation" (OPE). Using modern martingale techniques, we present a comprehensive framework for OPE inference that relaxes unnecessary conditions made in some past works, significantly improving on them both theoretically and empirically. Importantly, our methods can be employed while the original experiment is still running (that is, not necessarily post hoc), when the logging policy may itself be changing (due to learning), and even if the context distributions form a highly dependent time series (such as if they are drifting over time). More concretely, we derive confidence sequences for various functionals of interest in OPE. These include doubly robust ones for time-varying off-policy mean reward values, but also confidence bands for the entire cumulative distribution function of the off-policy reward distribution. All of our methods (a) are valid at arbitrary stopping times, (b) only make nonparametric assumptions, (c) do not require importance weights to be uniformly bounded (and if they are, the bounds need not be known), and (d) adapt to the empirical variance of our estimators. In summary, our methods enable anytime-valid off-policy inference using adaptively collected contextual bandit data. This is joint work with Lili Wu, Aaditya Ramdas, Nikos Karampatziakis, and Paul Mineiro.
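The talk's anytime-valid machinery goes well beyond it, but the basic off-policy evaluation setup can be illustrated with the classical inverse-propensity-weighting (IPW) estimator. Everything below is a hypothetical toy (two actions with fixed rewards), not the speakers' method:

```python
import random

def ipw_estimate(logs, target_probs):
    # Inverse propensity weighting: reweight each logged reward by
    # pi_target(a|x) / pi_logging(a|x) to get an unbiased estimate of the
    # target policy's value from data collected under the logging policy.
    total = 0.0
    for a, r, p_log in logs:
        total += (target_probs[a] / p_log) * r
    return total / len(logs)

random.seed(3)
# Logging policy picks action 0 w.p. 0.8; rewards: action 0 -> 0.2, action 1 -> 0.9.
logs = []
for _ in range(20000):
    a = 0 if random.random() < 0.8 else 1
    p_log = 0.8 if a == 0 else 0.2
    r = 0.2 if a == 0 else 0.9
    logs.append((a, r, p_log))
target = {0: 0.5, 1: 0.5}   # uniform target policy; true value = 0.55
v = ipw_estimate(logs, target)
print(round(v, 3))
```

The estimate concentrates near the target policy's true value of 0.5*0.2 + 0.5*0.9 = 0.55; the talk's confidence sequences wrap estimators like this with guarantees valid at any stopping time.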
11:30am On the Identifiability of ODEs/SDEs for Causal Inference
Mingming Gong (MBZUAI & Unimelb)
ODEs/SDEs have recently gained significant attention in machine learning and causal inference. However, theoretical aspects such as identifiability and the asymptotic properties of statistical estimation remain obscure. In this talk, I will present our recent results on the identifiability of linear ODEs/SDEs from observational data. These identifiability conditions are crucial for causal inference with linear ODEs/SDEs, as they enable the identification of post-intervention distributions from the observational distribution.
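As a minimal numerical illustration of why identifiability can fail (a hand-picked toy, not from the talk): two different linear systems x' = Ax can generate exactly the same trajectory when the initial state lies in a shared invariant subspace, so A cannot be recovered from that trajectory alone.

```python
# Two different system matrices that produce identical trajectories from a
# degenerate initial state, so A is not identifiable from this data alone.
A1 = [[1.0, 0.0], [0.0, 2.0]]
A2 = [[1.0, 0.0], [1.0, 2.0]]
x0 = [0.0, 1.0]   # first coordinate is zero, masking the difference in A

def simulate(A, x, dt=0.001, steps=1000):
    # Forward-Euler integration of x' = Ax.
    for _ in range(steps):
        x = [x[0] + dt * (A[0][0] * x[0] + A[0][1] * x[1]),
             x[1] + dt * (A[1][0] * x[0] + A[1][1] * x[1])]
    return x

t1, t2 = simulate(A1, x0), simulate(A2, x0)
print(t1, t2)   # identical endpoints despite A1 != A2
```

Identifiability conditions of the kind presented in the talk rule out such degenerate configurations.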
12:00pm Lunch
2:00pm Improving conditional coverage of conformal prediction methods
Maxim Panov (MBZUAI)
We present and compare two new methods for generating prediction sets within the conformal prediction framework, each addressing the limitations of traditional approaches by targeting improved conditional coverage. The first method builds upon quantile regression to estimate the conditional quantile of conformity scores, which are then adjusted to account for local data structure. The second method integrates the flexibility of conformal methods with estimates of the conditional label distribution. By extending the framework of probabilistic conformal prediction, this approach achieves approximate conditional coverage through prediction sets that adapt effectively to the behavior of the predictive distribution, even under high heteroscedasticity. Non-asymptotic bounds are derived to quantify conditional coverage error for both approaches. Extensive simulations demonstrate that each method significantly improves over traditional techniques, paving the way for more robust and adaptable prediction set generation across diverse applications.
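For background (this is the standard split-conformal baseline, not either of the two new methods above), a minimal sketch of the marginal-coverage guarantee that the talk's methods aim to strengthen to conditional coverage:

```python
import math
import random

def split_conformal_radius(cal_scores, alpha=0.1):
    # Standard split conformal: the finite-sample-corrected (1 - alpha)
    # empirical quantile of calibration conformity scores gives prediction
    # sets with MARGINAL coverage >= 1 - alpha (on average over X, not per X).
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[k - 1]

random.seed(1)
# Toy setup: the model predicts 0, true y ~ N(0, 1), score = |y - prediction|.
cal = [abs(random.gauss(0, 1)) for _ in range(1000)]
q = split_conformal_radius(cal, alpha=0.1)
test = [abs(random.gauss(0, 1)) for _ in range(5000)]
coverage = sum(s <= q for s in test) / len(test)
print(round(q, 2), round(coverage, 3))
```

The guarantee holds only on average over the covariates; under heteroscedasticity the same radius q over- and under-covers in different regions, which is the gap conditional-coverage methods target.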
2:30pm Leveraging Optimization for Adaptive Attacks against Content Watermarks
Nils Lukas (MBZUAI)
Large Language Models (LLMs) can be misused to spread online spam and misinformation. Content watermarking deters misuse by hiding a message in model-generated outputs, enabling their detection using a secret watermarking key. Robustness is a core security property, stating that evading detection requires (significant) degradation of the content’s quality. Many LLM watermarking methods have been proposed, but robustness is tested only against non-adaptive attackers who lack knowledge of the watermarking method and can find only suboptimal attacks. We formulate the robustness of LLM watermarking as an objective function and propose preference-based optimization to tune adaptive attacks against the specific watermarking method. Our evaluation shows that (i) adaptive attacks substantially outperform non-adaptive baselines; (ii) even in a non-adaptive setting, adaptive attacks optimized against a few known watermarks remain highly effective when tested against other, unseen watermarks; and (iii) optimization-based attacks are practical, requiring less than seven GPU hours. Our findings underscore the need to test robustness against adaptive attackers.
3:00pm Coffee Break
3:30pm Graph Neural Networks for Materials Discovery
Martin Takac (MBZUAI)
Machine learning methods—particularly graph neural networks (GNNs)—have emerged as powerful tools for accelerating the search for novel catalytic materials. By representing each material as a graph where atoms are nodes and bonds are edges, GNNs can capture local atomic interactions while still scaling to realistic systems. However, designing a GNN architecture that both assimilates global context and respects essential physical symmetries is nontrivial. In this talk, we provide an overview of GNNs for materials discovery and explain how deeper message-passing architectures can improve expressiveness yet risk “over-smoothing”—a phenomenon where atom-level embeddings become indistinguishable. We then highlight PaiNN as one promising approach that ensures rotational and translational equivariance. By incorporating these symmetries directly into the model, PaiNN preserves physically meaningful geometric information and yields more robust and interpretable predictions. Finally, we focus on predicting the Density of States (DOS), a property that reveals the distribution of electronic states and is crucial for identifying active sites in catalytic reactions. Accurate DOS predictions help pinpoint where electrons can be exchanged during catalysis, guiding experimental efforts toward more efficient and sustainable catalysts. Through concrete examples, we illustrate how GNN-based DOS modeling can bridge the gap between quantum-mechanical theory and large-scale materials discovery.
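The over-smoothing phenomenon mentioned above can be seen even in a toy, weight-free message-passing sketch (purely illustrative; PaiNN and real GNNs use learned, equivariant updates): repeated neighbourhood averaging collapses node features toward a common value, making atoms indistinguishable.

```python
# Toy ring "molecule" of 4 atoms with scalar features (illustrative only).
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
h = {0: 1.0, 1: 0.0, 2: 0.5, 3: -0.5}

def mp_step(h):
    # One mean-aggregation message-passing step: each node averages its own
    # feature with its neighbours' (no learned weights in this sketch).
    return {v: (h[v] + sum(h[u] for u in adj[v])) / (1 + len(adj[v]))
            for v in adj}

def spread(h):
    return max(h.values()) - min(h.values())

s0 = spread(h)            # 1.5 before any message passing
for _ in range(10):
    h = mp_step(h)
s_final = spread(h)
print(s0, s_final)        # features collapse toward a common value
```

After only ten rounds the feature spread is essentially zero, which is why deeper message passing must be balanced against expressiveness in practice.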
4:00pm GFlowNets: An Introduction and Recent Advances
Salem Lahlou (MBZUAI)
Generative Flow Networks (GFNs) offer a framework for sampling from reward-proportional distributions in combinatorial and continuous spaces. By leveraging flow conservation principles, GFNs enable diverse exploration where traditional methods like MCMC struggle. This talk introduces the theoretical foundations of GFNs and highlights their practical applications in molecular design, protein structure prediction, and Bayesian network discovery. Special emphasis will be placed on recent advances in applying GFNs to improve the systematic exploration capabilities of large language models.
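A minimal numerical illustration of the flow-conservation idea (a toy with exact flows computed by recursion; real GFlowNets learn these flows with neural networks): on a prefix tree over binary strings, setting each state's flow to the sum of its children's flows (with terminal flow equal to the reward) and sampling children proportionally to flow yields terminal samples with probability proportional to reward.

```python
import random
from itertools import product

# Rewards over terminal binary strings of length 3 (arbitrary toy values 1..8).
R = {"".join(bits): float(i + 1)
     for i, bits in enumerate(product("01", repeat=3))}

def flow(s, L=3):
    # Flow conservation: an internal state's flow is the sum of its
    # children's flows; a terminal state's flow equals its reward.
    if len(s) == L:
        return R[s]
    return flow(s + "0") + flow(s + "1")

def sample(L=3):
    # Forward policy: move to each child with probability proportional
    # to that child's flow.
    s = ""
    while len(s) < L:
        f0, f1 = flow(s + "0"), flow(s + "1")
        s += "0" if random.random() < f0 / (f0 + f1) else "1"
    return s

random.seed(2)
Z = flow("")                          # total flow = sum of all rewards = 36
counts = {}
for _ in range(20000):
    x = sample()
    counts[x] = counts.get(x, 0) + 1
# Empirical frequency of "111" should approach R["111"] / Z = 8/36.
print(round(counts["111"] / 20000, 3), round(R["111"] / Z, 3))
```

MCMC would need a well-mixed chain to achieve the same target distribution; the flow construction samples it directly, which is the property GFNs train neural flows to approximate in large spaces.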
4:30pm-5:30pm Panel Discussion
Michael I. Jordan, Ian Waudby-Smith, Fakhri Karray, Michalis Vazirgiannis, Martin Takac, Samuel Horvath, Mingming Gong