Time | Event | Venue
14:30-20:00 | Delegate registration | 中州颐和酒店
18:00-20:00 | Banquet | 中州颐和酒店
July 8
Opening Ceremony
Time | Chair | Title
08:30-08:50 | 解俊山 | Opening ceremony
08:50-09:00 | Group photo
Academic Talks
Time | Chair | Speaker | Title
09:00-09:40 | 梁汉营 | 何书元 | Probability, Statistics, and the Quality of the Citizenry (概率统计与国民素质)
09:50-10:30 | 梁汉营 | 王启华 | A robust fusion-extraction procedure with summary statistics in the presence of biased sources
10:30-10:40 | Tea break
10:40-11:20 | 王海斌 | 邓柯 | Model-Based Spatial Reconstruction of Large-Scale Biomolecules via Bayesian Inference of a Hierarchical Spatial Model
11:20-12:00 | 王海斌 | 朱文圣 | Augmented Concordance Matched Learning for Estimating Optimal Individualized Treatment Regimes
12:00-14:00 | Lunch (中州颐和酒店)
15:00-17:30 | Second-floor conference room
18:00-20:00 | Dinner
July 9: Academic Talks
Time | Chair | Speaker | Title
08:30-09:10 | 郑晨 | 王德辉 | Modeling Methods for Several Autoregressive Processes (若干自回归过程的建模方法研究)
09:10-09:50 | 郑晨 | 薛留根 | Two-stage estimation and bias-corrected empirical likelihood in a partially linear single-index varying-coefficient model
09:50-10:00 | Tea break
10:00-10:30 | 薛留根 | 杨晓慧 | Multivariate Cyclic Double Machine Learning Modeling and Its Application to Heterogeneous Causal Effect Estimation (多元循环双重机器学习建模及在异质性因果效应估计中的应用)
10:30-11:00 | 薛留根 | 韩子非 | Approximate reference priors for Gaussian random fields
11:00-11:30 | 薛留根 | 李哲源 | Automatic Search Intervals for the Smoothing Parameter in Penalized Splines
12:00-14:00 | Lunch (中州颐和酒店)
Delegates depart on the afternoon of July 9.
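Abstract: This talk discusses how the basic theorems of probability and statistics can guide people's behavior, how to overcome greed and anger, and how to avoid the harm caused by small-probability events.
Speaker bio: 何书元 is a professor at Capital Normal University, with research interests in time series analysis and the statistical analysis of incomplete data. Previous positions include professor of mathematics at Peking University, vice chair of the Ministry of Education's Teaching Steering Committee for Statistics, and president of the Probability and Statistics Society of the Chinese Mathematical Society.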
A robust fusion-extraction procedure with summary statistics in the presence of biased sources
Abstract: Information from multiple data sources is increasingly available. However, some data sources may produce biased estimates due to biased sampling, data corruption, or model misspecification. This calls for robust data combination methods with biased sources. In this paper, a robust data fusion-extraction method is proposed. In contrast to existing methods, the proposed method can be applied to the important case where researchers have no knowledge of which data sources are unbiased. The proposed estimator is easy to compute and only employs summary statistics, and hence can be applied to many different fields, e.g., meta-analysis, Mendelian randomization, and distributed systems. The proposed estimator is consistent even if many data sources are biased and is asymptotically equivalent to the oracle estimator that only uses unbiased data. Asymptotic normality of the proposed estimator is also established. In contrast to the existing meta-analysis methods, the theoretical properties are guaranteed even if the number of data sources and the dimension of the parameter diverge as the sample size increases. Furthermore, the proposed method provides a consistent selection for unbiased data sources with probability approaching one. Simulation studies demonstrate the efficiency and robustness of the proposed method empirically. The proposed method is applied to a meta-analysis data set to evaluate the surgical treatment for moderate periodontal disease and to a Mendelian randomization data set to study the risk factors of head and neck cancer.
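Research fellow and doctoral supervisor at the Academy of Mathematics and Systems Science; recipient of the National Science Fund for Distinguished Young Scholars; Distinguished Professor under the Ministry of Education's Changjiang Scholars Program; selected for the Chinese Academy of Sciences "Hundred Talents Program". Previously taught at Peking University and the University of Hong Kong, and has visited more than ten world-class universities in Canada, the United States, Germany, and Australia. Research focuses on empirical likelihood inference for complex data, missing data analysis, high-dimensional data analysis, and large-scale data analysis. Has published three monographs and more than 150 papers in leading international journals including The Annals of Statistics, JASA, and Biometrika; some of this work has had lasting academic influence. Has led a National Science Fund for Distinguished Young Scholars project, a Key Program project, and several General Program projects of the National Natural Science Foundation of China (NSFC), and has taken part as a core member in two NSFC Innovative Research Group projects and one national major R&D program project. Serves as president of the High-Dimensional Statistics Branch, vice president of the Survival Analysis Branch, executive council member of the Chinese Association for Applied Statistics, and executive council member of the Chinese Society of Probability and Statistics. Current or former editorial board member of 《中国科学》 (Chinese and English editions, 2005-2012), Electronic Research Archive, Ann. Inst. Stat. Math, Biostatistics & Epidemiology, the English edition of 《应用数学学报》, and the book series 《现代数学基础丛书》 and 《统计与数据科学丛书》.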
Model-Based Spatial Reconstruction of Large-Scale Biomolecules via Bayesian Inference of a Hierarchical Spatial Model
Abstract: Revealing the spatial organization of biomolecules and characterizing their spatial distribution in cells and tissues have long been recognized as important problems in biomedical research. With rapid advances of DNA sequencing technologies in recent years, creative sequencing-based experimental assays, e.g., Hi-C and DNA microscopy, have been invented to reveal the spatial properties of large-scale biomolecules in a high-throughput and high-resolution manner. A typical experiment based on these technologies produces a count matrix to record the contact frequencies among molecules of interest, which are closely associated with their spatial distances, allowing us to reconstruct the spatial organization of large-scale biomolecules via data analysis. There is a great appeal to develop statistically rigorous and computationally scalable methods for this important problem. In this study, we fill in this gap with a novel method named HiSpa. Equipped with a hierarchical spatial model, HiSpa utilizes the idea of multi-scale modelling to reduce the computational complexity from O(n^2) to O(n^(3/2)) with little loss in the quality of the reconstructed spatial structure. Advanced Monte Carlo strategies are developed for efficient Bayesian inference of HiSpa. Superiority of HiSpa over existing methods is demonstrated by simulation studies and real data applications.
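Speaker bio: 邓柯 received a bachelor's degree in applied mathematics from Peking University in 2003 and a Ph.D. in statistics from Peking University in 2008, then joined the Department of Statistics at Harvard University in the same year, serving successively as postdoctoral fellow and research associate. Joined Tsinghua University in 2013, serving successively as assistant professor, associate professor, and tenured associate professor. Research focuses on Bayesian statistics and statistical computing, with a commitment to advancing interdisciplinary work between statistics and biomedicine, artificial intelligence, and the humanities and social sciences. Has published more than 40 papers in journals and conferences including JRSS-B, JASA, Biometrika, AoAS, PNAS, Nature Communications, Bioinformatics, IEEE Transactions on Signal Processing, and ACL, and has led or co-led several national-level research projects in the United States and China. Selected for a national-level young talent program in 2014; elected vice chair of the Intelligent Healthcare Committee of the Chinese Association for Artificial Intelligence in 2015; received the "科学中国人年度人物" honor in 2016; elected president of the Computational Statistics Branch of the Chinese Association for Applied Statistics in 2017; elected committee member of the Asia-Pacific regional section of the International Association for Statistical Computing and vice president of 中国青年统计学家协会 in 2018; became a researcher at the Beijing "智源人工智能研究院" in 2019; and in 2020 received best paper awards from "世界华人数学家国际联盟" and "中国数字人文大会". Currently associate editor of the international statistics journal Statistica Sinica and editorial board member of the Chinese journals 《应用数学与力学》, 《应用概率统计》, 《统计与精算》, and 《数字人文》.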
Augmented Concordance Matched Learning for Estimating Optimal Individualized Treatment Regimes
Abstract: Personalized medicine has recently received increasing attention because of the significant heterogeneity of patient responses to the same medication. The estimation of the optimal individualized treatment regime, or individualized treatment rule, is an important part of personalized medicine. Individualized treatment regimes are designed to recommend treatment decisions to patients based on their individual characteristics and to maximize the overall clinical benefit to the patient. However, most existing statistical methods are mainly concerned with estimating optimal individualized decision rules for two-category treatment options and rely heavily on data from randomized controlled trials. There has been a relative lack of research on the selection of multicategory treatment options in real-world settings. We address this challenge and propose a machine learning approach (ACML) to estimate optimal multicategory treatment regimes. This new learning approach allows for more accurate assessment of individual treatment response and alleviation of confounding. More importantly, ACML is doubly robust, efficient, and easy to interpret. We first introduce the concordance-based value function that measures weighted concordance for each patient by matching imputation. We then propose a novel surrogate loss and employ an angle-based method to maximize the concordance-based value function that directly handles the problem of optimization with multicategory treatment options. Furthermore, an extension of ACML can be applied to ordinal treatment settings. The theoretical results show that the proposed method is doubly robust. We further show that the resulting estimator of the treatment rule is consistent. Through a large number of simulation studies, we demonstrate that ACML outperforms existing methods. Lastly, the proposed method is illustrated in an analysis of AIDS clinical trial data.
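Professor. Postdoctoral researcher at Yale University from 2008 to 2010; visited the University of North Carolina at Chapel Hill from 2015 to 2017. Currently also serves as vice president of the Computational Statistics Branch of the Chinese Association for Applied Statistics, secretary-general of its Data Science and Artificial Intelligence Branch, deputy secretary-general of the Chinese Society of Probability and Statistics, and secretary-general of 吉林省现场统计研究会. Works mainly on statistical methods and applications, with research interests in biostatistics and bioinformatics. Has published papers in the Journal of the American Statistical Association (JASA), a top international statistics journal, and in NeuroImage, a well-known medical imaging journal, among others. Has led and completed National Natural Science Foundation of China projects.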
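Abstract: Integer-valued data have attracted considerable attention from researchers in recent years. This talk covers modeling and parameter estimation for integer-valued linear autoregressive processes, threshold integer-valued autoregressive processes, and random coefficient autoregressive processes. First, for the integer-valued autoregressive (INAR) process, we estimate the parameters by quantile regression and discuss the asymptotic properties of the estimators. Numerical simulations verify the effectiveness and robustness of the estimation method, and an application to unemployment data further confirms its reliability. Second, based on the binomial thinning operator and the negative binomial thinning operator, we propose a first-order threshold integer-valued autoregressive process, BNBTINAR(1). We discuss its statistical properties, including strict stationarity, ergodicity, and the existence of third-order moments. We give conditional least squares and conditional maximum likelihood estimators of the model parameters, together with their consistency and asymptotic normality. For estimating the threshold variable, we give a new algorithm (the SIS algorithm). Numerical simulations and COVID-19 data demonstrate the effectiveness of the estimators and the advantages of the model. Finally, we propose a random coefficient autoregressive process, RCAR(1), whose random coefficient is driven by data and covariates, and give conditions for its ergodicity. We also study the conditional least squares, conditional maximum likelihood, and conditional quantile regression estimators of the RCAR(1) process, along with the consistency and asymptotic normality of all three. Numerical simulations verify the effectiveness and robustness of the estimators, and a fit to a set of Hang Seng Index data confirms the validity of the model.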
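Dean, professor, and doctoral supervisor; recipient of the State Council Special Government Allowance; winner of the Baosteel Outstanding Teacher Award; selected for the Ministry of Education's New Century Excellent Talents program; Changbai Mountain Scholar Distinguished Professor of Jilin Province (2015); member of the Teaching Steering Committee for Statistics Programs in Higher Education (2013-2022); member of the fourth cohort of Jilin Province Senior Experts; among the first cohort of Discipline-Leading Professors in Jilin Province universities; first-tier selectee in the sixth cohort of Jilin Province Top Innovative Talents; and member of the twelfth cohort of Jilin Province Young and Middle-Aged Professionals with Outstanding Contributions.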
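Professor 王德辉 works mainly on time series analysis, risk theory, and actuarial science. Has published more than 30 SCI-indexed papers, and has led (including completed projects) seven National Natural Science Foundation of China (NSFC) General Program projects (including one subproject of an NSFC Key Program project) and one project of the Specialized Research Fund for the Doctoral Program. Awards include a second prize in natural science of the 2015 Ministry of Education Outstanding Scientific Research Achievement Award for Higher Education Institutions (Science and Technology), a second prize of the 11th National Statistical Research Outstanding Achievement Award, a second prize and a third prize of the Jilin Province Natural Science and Technology Achievement Award, and a second prize of the 2019 Jilin Province Science and Technology Award (Natural Science).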
Two-stage estimation and bias-corrected empirical likelihood in a partially linear single-index varying-coefficient model
Speaker: 薛留根 (Henan University)
Abstract: In this talk, we study the estimation and empirical likelihood (EL) of the parameters of interest in a partially linear single-index varying-coefficient model. A two-stage method is presented to estimate the regression parameters and the coefficient functions. The asymptotic distributions of the proposed estimators are obtained. Meanwhile, a bias-corrected EL ratio for the regression parameters is proposed. It is shown that the ratio is asymptotically standard chi-squared. The result can be directly used to construct the EL confidence regions of the regression parameters. Simulation studies are carried out to evaluate the finite sample behavior of the proposed method. An application to a real data set is given.
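Speaker bio: 薛留根 is a professor at Beijing University of Technology, a distinguished professor at Henan University, and a doctoral supervisor, and currently also serves as vice president of the Survival Analysis Branch of the Chinese Association for Applied Statistics. Research area: nonparametric statistics and data analysis. Main research interests include statistical inference for nonparametric and semiparametric models, statistical analysis and modeling of complex data, and empirical likelihood. Has led 15 national and provincial or ministerial research projects, including five consecutive grants from the National Natural Science Foundation of China. Has published eight books, three of them monographs, and more than 260 papers in journals including the Journal of the American Statistical Association, the Journal of the Royal Statistical Society, Series B, The Annals of Statistics, and Biometrika, three of which are highly cited papers. As first contributor, received one second prize of the Ministry of Education Natural Science Award and one first prize of the National Statistical Science Research Outstanding Achievement Award. Has supervised 20 doctoral students and 45 master's students; among them, one received the Beijing Outstanding Doctoral Dissertation award and a nomination for the National Outstanding Doctoral Dissertation award, and one received a second prize for doctoral dissertations in the National Statistical Science Research Outstanding Achievement Award.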
Approximate reference priors for Gaussian random fields
Abstract: Reference priors are theoretically attractive for the analysis of geostatistical data since they enable automatic Bayesian analysis and have desirable Bayesian and frequentist properties. But their use is hindered by computational hurdles that make their application in practice challenging. In this work, we derive a new class of default priors that approximate reference priors for the parameters of some Gaussian random fields. It is based on an approximation to the integrated likelihood of the covariance parameters derived from the spectral approximation of stationary random fields. This prior depends on the structure of the mean function and the spectral density of the model evaluated at a set of spectral points associated with an auxiliary regular grid. In addition to preserving the desirable Bayesian and frequentist properties, these approximate reference priors are more stable, and their computations are much less onerous than those of exact reference priors. Unlike exact reference priors, the marginal approximate reference prior of the correlation parameter is always proper, regardless of the mean function or the smoothness of the correlation function. This property has important consequences for covariance model selection. An illustration comparing default Bayesian analyses is provided with a dataset of lead pollution in Galicia, Spain.
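Associate professor. Graduated from the University of Texas at San Antonio in 2017 and worked for three years as a senior statistician at Vertex Pharmaceuticals in Boston. Main research areas are Bayesian statistics, spatial statistics, and biostatistics. Has published more than ten papers in journals including American Statistician, Journal of Statistical Software, Scandinavian Journal of Statistics, and Statistics in Medicine, and leads one National Natural Science Foundation of China Young Scientists Fund project.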
Automatic Search Intervals for the Smoothing Parameter in Penalized Splines
Abstract: The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) and restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome such difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps (>= version 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines.
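Speaker bio: 李哲源 holds a master's degree from the University of Bristol and a Ph.D. from the University of Bath in the UK, was a postdoctoral researcher at Simon Fraser University in Canada, and is a lecturer at Henan University. Research focuses on regression models based on penalized splines and the associated statistical computing, with applications to ecological and environmental data and public health data. Has published five articles in journals including JASA, Statistics and Computing, BMC Medical Research Methodology, and Science of the Total Environment, leads one National Natural Science Foundation of China project, and has four R packages on CRAN and GitHub.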

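1. On registration day, delegates can enter the campus through 中州颐和酒店. The hotel entrance is about 100 meters north of the west gate of Henan University's Jinming Campus.
2. During the conference, delegates can enter and leave the campus with their delegate badge.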
1. Xinzheng Airport to 中州颐和酒店
2. Kaifeng North Railway Station to 中州颐和酒店
3. Songcheng Station to 中州颐和酒店
