Cost-optimal Sequential Testing via Doubly Robust Q-learning

Published: 2026-04-22

Topic: Cost-optimal Sequential Testing via Doubly Robust Q-learning

Time: May 19, 2026, 10:00-11:00 AM

Venue: Classroom 2, Management Science Building (管科楼)

Speaker: Doudou Zhou, National University of Singapore

Bio: Doudou Zhou is an Assistant Professor of Statistics & Data Science at the National University of Singapore. His research lies at the intersection of statistics, machine learning, and artificial intelligence, with a focus on statistical learning theory, multimodal data integration, electronic health records, and the evaluation of large language models. He develops principled methods for learning from noisy, heterogeneous, and partially observed data, with applications in biomedicine and modern AI systems.

Abstract:

Clinical decision-making often involves selecting tests that are costly, invasive, or time-consuming, motivating individualized, sequential strategies for what to measure and when to stop ascertaining. We study the problem of learning cost-optimal sequential decision policies from retrospective data, where test availability depends on prior results, inducing informative missingness. Under a sequential missing-at-random mechanism, we develop a doubly robust Q-learning framework for estimating optimal policies. The method introduces path-specific inverse probability weights that account for heterogeneous test trajectories and satisfy a normalization property conditional on the observed history. By combining these weights with auxiliary contrast models, we construct orthogonal pseudo-outcomes that enable unbiased policy learning when either the acquisition model or the contrast model is correctly specified. We establish oracle inequalities for the stage-wise contrast estimators, along with convergence rates, regret bounds, and misclassification rates for the learned policy. Simulations demonstrate improved cost-adjusted performance over weighted and complete-case baselines, and an application to a prostate cancer cohort study illustrates how the method reduces testing cost without compromising predictive accuracy.
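
For readers unfamiliar with the double robustness property mentioned in the abstract, the following is a minimal single-stage sketch, not the speaker's sequential method. It assumes one test result Y that is observed only when R = 1, with acquisition probability depending on a covariate X, and uses the standard augmented inverse-probability-weighted (AIPW) pseudo-outcome m(X) + R / pi(X) * (Y - m(X)). The data-generating model, the function name `dr_pseudo_outcome`, and all variable names are hypothetical choices made for illustration.

```python
import numpy as np

def dr_pseudo_outcome(y, r, pi_hat, m_hat):
    """AIPW pseudo-outcome: m_hat + r / pi_hat * (y - m_hat).

    Entries of y where r == 0 are never used, so they may be NaN.
    """
    resid = np.where(r == 1, y - m_hat, 0.0)
    return m_hat + resid / pi_hat

# Hypothetical simulated data: the true mean of Y is 1.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Acquisition depends on x only (missing at random given x).
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = rng.binomial(1, pi_true)
y_obs = np.where(r == 1, y, np.nan)

m_true = 1.0 + 2.0 * x        # correctly specified outcome model
m_wrong = np.zeros(n)         # misspecified outcome model
pi_wrong = np.full(n, 0.5)    # misspecified acquisition model

# Correct acquisition model, wrong outcome model.
est_pi_ok = dr_pseudo_outcome(y_obs, r, pi_true, m_wrong).mean()
# Wrong acquisition model, correct outcome model.
est_m_ok = dr_pseudo_outcome(y_obs, r, pi_wrong, m_true).mean()
# Both models wrong: no robustness guarantee.
est_both_wrong = dr_pseudo_outcome(y_obs, r, pi_wrong, m_wrong).mean()
# Complete-case baseline: average over observed outcomes only.
est_complete_case = np.nanmean(y_obs)

print(est_pi_ok, est_m_ok, est_both_wrong, est_complete_case)
```

In this toy setting, the two estimates that use at least one correct model should land close to the true mean of 1, while the estimate with both models wrong and the complete-case average should not. The method described in the abstract extends this idea to multiple stages with path-specific weights, which this sketch does not attempt to reproduce.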
