1. Reliable AI
2. Interpretable, Robust AI for discovery of physical laws
3. Interpretable, reliable AI for health
4. Physics-informed AI
5. Fair AI
6. Big data analysis and statistical machine learning
7. Predictive modeling and uncertainty quantification
8. Scientific computing and computational fluid dynamics
9. Stochastic multiscale modeling
My research interests span diverse topics in artificial intelligence, machine
learning, data science, computational and predictive science, and statistical
learning, covering both algorithms and applications. A main current thrust is
stochastic simulation (in the context of uncertainty quantification,
statistical learning and beyond) and multiscale modeling of physical and
biological systems (e.g., blood flow). My research goal is to develop
high-order numerical algorithms with significant potential impact and to
design highly scalable numerical solvers for petascale supercomputers that
enable knowledge discovery and predictive modeling for critical decision
making in complex physical and biological systems.
Data Science Case Studies
· Improving the quality of Chrysler crossmember castings
Summary:
A crossmember is a structural component that undergoes strict X-ray inspection
to ensure its quality. The optimal environmental and operational parameter
settings are identified for making quality Chrysler crossmember castings
through a novel optimization algorithm.
Y. Sun, G. Lin, Q. Han, D. Yang, C. Vian, Exploratory data analysis for
achieving optimal environmental and operational parameter settings for making
quality crossmember castings, Die Casting
Congress & Exposition, 2019.
· Classification of machine functions
Summary:
Growing automation and data collection on mobile machines have pushed mobile
equipment manufacturers to equip their machines with machine learning
algorithms that assist in condition monitoring of the system. Advanced machine
learning algorithms are employed to classify the machine functions on a Bobcat
435 mini excavator.
N.J. Keller, A. Vacca, Y. Sun, Y. Zuo, G. Lin, Classification of Machine Functions:
A Case Study, the 16th Scandinavian International Conference on Fluid Power,
May 22-24, 2019, Tampere, Finland.
· Multifidelity learning for material properties prediction
Summary:
We develop multi-fidelity, model-based machine learning tools for empirical
potential development for Si:H nanowires. Calculations with the developed
empirical potentials are fast compared with first-principles calculations
while retaining very good accuracy.
· Deep learning for power system state estimation
Summary:
The complexity of distribution power grids is increasing due to the widespread
deployment of renewable resources and power electronic devices. We employ a
deep belief network with non-Gaussian uncertainties for probabilistic state
estimation of distribution power systems.
Y. Huang, Q. Xu, C. Hu, Y. Sun, G. Lin, Probabilistic state estimation approach for AC/MTDC Distribution system using deep belief network with non-Gaussian uncertainties, IEEE Sensors Journal, 19(20): 9422 - 9430, 2019. https://doi.org/10.1109/JSEN.2019.2926089
· Design of optimal control strategies for Ebola outbreaks
Summary:
The 2014-15 Ebola outbreak in West Africa was a serious threat to global public
health. To design and evaluate different control strategies for Ebola outbreaks,
we employ machine learning, sensitivity analysis and parameter estimation to
analyze the observation dataset. The results indicate that simultaneously
strengthening contact tracing and the effectiveness of isolation in hospitals
would be the most effective control strategy.
J. Ponce, Y. Zheng, G. Lin*, Z. Feng, Assessing the effects of modeling the
spectrum of clinical symptoms on the dynamics and control of Ebola, Journal of
Theoretical Biology, 467:111-122, 2019.
https://www.sciencedirect.com/science/article/abs/pii/S002251931930013X
· Robust data-driven discovery of physical laws
Summary:
We develop a new machine learning approach for data-driven discovery of
physical laws in implicit form from noisy datasets. This approach is effective,
robust and able to quantify uncertainties by providing an error bar for each
discovered candidate equation.
S. Zhang, G. Lin*, Robust data-driven discovery of governing physical laws with error bars, Proceedings of the Royal Society of London. Series A, mathematical, physical and engineering sciences, A 474: 20180305, 2018. https://doi.org/10.1098/rspa.2018.0305
Research Highlight
Dr. Guang Lin’s research spans several interconnected fields in
computational and applied mathematics: machine learning, artificial
intelligence, big data analysis, numerical methods for stochastic differential
equations and uncertainty quantification (UQ), modeling and simulation of
complex systems, higher-order numerical methods, data assimilation, stochastic
inverse problems, design and optimization under uncertainty, and numerical
methods for rare events.
Research Highlight Summary:
Reliable AI with impact applications
Interpretable Robust AI: Discovery of physical laws
Interpretable, reliable AI for Health
Physics-informed AI for COVID-19
Fair AI: Fair supervised learning for sensitive attributes
Second-order stochastic asymptotic analysis
Tackling the Curse of Dimensionality Challenge in Uncertainty Quantification
Tackling Big Data Challenge in Data Analysis and Uncertainty Quantification of Ultra-Large Systems
Tackling Multimodal Distribution Challenge in Large-Scale Bayesian Inverse Problems
Numerical Methods for Rare Events
Reliable AI with Impact Applications
Lin’s group has developed a series of stochastic gradient Markov chain Monte
Carlo algorithms and replica exchange algorithms for quantifying the
uncertainty in deep neural networks. Dr. Lin developed a novel Bayesian sparse
deep neural network that achieves accurate prediction with a much smaller
number of neurons and also evaluates the uncertainties in the network’s
predictions. This work was published at one of the top machine learning
conferences, the 2019 Conference on Neural Information Processing Systems
[15]. In addition, Lin has developed a novel adaptive replica-exchange
stochastic-gradient Markov chain Monte Carlo algorithm that accelerates the
convergence of conventional Markov chain Monte Carlo algorithms for non-convex
deep learning. This work was published at one of the top machine learning
conferences, the 2020 International Conference on Machine Learning [17]. His
methods provide a new approach to estimating the posterior of deep neural
networks, with applications in various fields, including improving the
accuracy of nano-3D printing and predicting phonon scattering rates and
lattice thermal conductivity. His work has led to the development of highly
effective tools to improve reliability and quantify the uncertainties in deep
neural networks, and has been documented in 7 prestigious Nature family
journals [1-7] and 17 top AI conference and journal papers.
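The replica-exchange idea described above can be illustrated with a minimal sketch. This is not the group's published algorithm: it uses exact gradients rather than stochastic minibatch gradients, and the one-dimensional double-well energy, the two temperatures, the step size, and all function names are assumptions chosen for illustration. A cold chain, which alone would stay trapped in one well, exchanges positions with a hot chain using the standard Metropolis swap rule.

```python
import numpy as np

def energy(x):
    # Hypothetical double-well energy with minima at x = -1 and x = +1.
    return (x**2 - 1.0) ** 2

def grad_energy(x):
    return 4.0 * x * (x**2 - 1.0)

def replica_exchange_langevin(n_steps=20000, step=1e-2, temps=(0.05, 1.0), seed=0):
    """Two Langevin chains (cold, hot) with Metropolis swaps; returns cold samples."""
    rng = np.random.default_rng(seed)
    t_cold, t_hot = temps
    x_cold = x_hot = -1.0          # both chains start in the left well
    samples = np.empty(n_steps)
    n_swaps = 0
    for k in range(n_steps):
        # Euler-Maruyama Langevin update for each replica at its own temperature.
        x_cold += -step * grad_energy(x_cold) + np.sqrt(2 * step * t_cold) * rng.standard_normal()
        x_hot += -step * grad_energy(x_hot) + np.sqrt(2 * step * t_hot) * rng.standard_normal()
        # Swap with probability min(1, exp((1/T_cold - 1/T_hot) * (U(x_cold) - U(x_hot)))).
        log_accept = (1.0 / t_cold - 1.0 / t_hot) * (energy(x_cold) - energy(x_hot))
        if np.log(rng.random()) < log_accept:
            x_cold, x_hot = x_hot, x_cold
            n_swaps += 1
        samples[k] = x_cold
    return samples, n_swaps

samples, n_swaps = replica_exchange_langevin()
print("swaps:", n_swaps, "fraction of cold samples in right well:", np.mean(samples > 0))
```

The hot chain crosses the energy barrier easily and hands its position to the cold chain through swaps, which is why the cold samples can cover both modes.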
Interpretable Robust AI: Discovery of physical laws with error bars
Discovering governing physical laws from noisy
data is a grand challenge in many science and engineering research areas. Lin
has developed a new approach to data-driven discovery of ordinary differential
equations (ODEs) and partial differential equations (PDEs), in explicit or
implicit form with error bars. The new algorithm is effective, robust, and
able to quantify uncertainties by providing an error bar for each discovered
candidate equation. This work is published in Proceedings of the Royal
Society A [11].
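As a rough illustration of what "an error bar for each candidate term" can mean, the sketch below combines sparse regression over a small library of candidate terms with bootstrap resampling. It is not the algorithm of [11]: the thresholded least squares, the candidate library (1, x, x^2), the toy law dx/dt = -2x, the assumption that noisy derivative measurements are available, and all names are illustrative choices.

```python
import numpy as np

def thresholded_least_squares(library, target, threshold=0.3, n_iter=5):
    """Least squares, then repeatedly zero small coefficients and refit the rest."""
    coef = np.linalg.lstsq(library, target, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(coef) >= threshold
        coef[~keep] = 0.0
        if not keep.any():
            break
        coef[keep] = np.linalg.lstsq(library[:, keep], target, rcond=None)[0]
    return coef

def discover_with_error_bars(x, dxdt, n_boot=200, seed=0):
    """Bootstrap the sparse fit; return mean and standard deviation per candidate term."""
    rng = np.random.default_rng(seed)
    library = np.column_stack([np.ones_like(x), x, x**2])   # candidate terms: 1, x, x^2
    coefs = np.empty((n_boot, library.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, x.size, size=x.size)           # resample with replacement
        coefs[b] = thresholded_least_squares(library[idx], dxdt[idx])
    return coefs.mean(axis=0), coefs.std(axis=0)

# Toy data from dx/dt = -2 x with noisy derivative measurements (assumed setup).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 200)
x = np.exp(-2.0 * t)
dxdt = -2.0 * x + 0.01 * rng.standard_normal(t.size)

mean, std = discover_with_error_bars(x, dxdt)
for name, m, s in zip(["1", "x", "x^2"], mean, std):
    print(f"{name:>4}: {m:+.3f} +/- {s:.3f}")
```

The bootstrap spread of each coefficient plays the role of the error bar; terms that are consistently thresholded away receive a zero coefficient with zero spread.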
Interpretable, reliable AI for Health
With the explosive growth of biomarker data in
Alzheimer’s disease (AD) clinical trials, numerous mathematical models have
been developed to characterize disease-relevant biomarker trajectories over
time. While some of these models are purely empirical, others are causal, built
upon various hypotheses of AD pathophysiology, a complex and incompletely
understood area of research. One of the most challenging problems in
computational causal modeling is using a purely data-driven approach to derive
the model’s parameters and the mathematical model itself, without any prior
hypothesis bias. Lin and his collaborators developed an innovative
data-driven modeling approach to build and parameterize a causal model that
characterizes the trajectories of AD biomarkers. This framework integrates
causal model learning, population parameterization, parameter sensitivity
analysis, and personalized prediction. Applying this integrated approach to
a large multicenter database of AD biomarkers, the Alzheimer’s Disease
Neuroimaging Initiative, reveals several causal models for different AD
stages. This research will benefit early diagnosis of Alzheimer’s disease
and personalized prediction. This work is published in npj Digital
Medicine [5].
Physics-informed AI for COVID-19
Integer- and fractional-order epidemiological models using physics-informed
neural networks
By incorporating physical information into machine learning, Lin has pioneered
advances in physics-informed deep learning, enabling integer- and
fractional-order epidemiological models to be developed with physics-informed
neural networks. Lin analyzed several epidemiological models through the lens
of physics-informed neural networks, which make it possible to identify
time-dependent parameters and data-driven fractional differential operators.
The models considered include several variations of the classical
susceptible-infectious-removed model, obtained by introducing more
compartments as well as fractional-order and time-delay models. This work has
been published in the prestigious Nature Computational Science [7].
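The core ingredient of a physics-informed loss is a residual that measures how well a candidate trajectory and candidate parameters satisfy the governing equations. The sketch below evaluates such a residual for an integer-order SIR model with a time-dependent transmission rate. It is only an illustration under stated assumptions: there is no neural network, time derivatives come from finite differences instead of automatic differentiation, fractional-order and time-delay variants are not covered, and the rate functions, parameter values, and names are hypothetical.

```python
import numpy as np

def sir_rhs(t, y, beta, gamma):
    """Right-hand side of the SIR model with a time-dependent transmission rate beta(t)."""
    s, i, r = y
    b = beta(t)
    return np.array([-b * s * i, b * s * i - gamma * i, gamma * i])

def rk4(rhs, y0, t, *args):
    """Classical fourth-order Runge-Kutta integration on the time grid t."""
    y = np.empty((t.size, y0.size))
    y[0] = y0
    for k in range(t.size - 1):
        h = t[k + 1] - t[k]
        k1 = rhs(t[k], y[k], *args)
        k2 = rhs(t[k] + h / 2, y[k] + h / 2 * k1, *args)
        k3 = rhs(t[k] + h / 2, y[k] + h / 2 * k2, *args)
        k4 = rhs(t[k] + h, y[k] + h * k3, *args)
        y[k + 1] = y[k] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

def physics_residual(t, y, beta, gamma):
    """Mean squared mismatch between dy/dt (finite differences) and the SIR right-hand side."""
    dydt = np.gradient(y, t, axis=0)
    model = np.array([sir_rhs(tk, yk, beta, gamma) for tk, yk in zip(t, y)])
    return float(np.mean((dydt - model) ** 2))

def true_beta(t):
    return 0.4 * np.exp(-0.02 * t)     # hypothetical decaying transmission rate

def constant_beta(t):
    return 0.4                          # candidate that ignores the time dependence

gamma = 0.1
t = np.linspace(0.0, 100.0, 1001)
y = rk4(sir_rhs, np.array([0.99, 0.01, 0.0]), t, true_beta, gamma)

print("residual, time-dependent beta:", physics_residual(t, y, true_beta, gamma))
print("residual, constant beta:      ", physics_residual(t, y, constant_beta, gamma))
```

A physics-informed network would minimize a residual of this kind jointly with a data-misfit term, so that a parameter such as beta(t) is pushed toward the function that makes the residual small.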
Fair AI: Fair supervised learning for sensitive attributes
While our societies use increasingly diverse and complex data for
decision-making, most existing studies in algorithmic fairness are limited to
simple machine-learning tasks such as binary classification with a discrete
sensitive attribute. Only a few fairness-aware machine learning methods handle
more general situations, such as a mix of continuous and discrete sensitive
attributes. An example is controlling the effects of age, height, and race
simultaneously in predictive modeling. To adapt fair supervised learning to
diverse sensitive attributes, Lin designed a novel fairness penalty for fair
classification and regression that is applicable to any format of sensitive
attributes. The proposed penalty covers the two major notions of fairness:
independence between the model’s predictive outcome and subgroup labels; and
invariance of the conditional distributions of the model’s predictive outcomes
given the true response outcome across subgroups. Lin mathematically
characterized the estimation error and loss of utility for the empirical model
estimated with the fairness penalty. Under various simulation scenarios, Lin
demonstrated that the proposed framework involves relatively simple
optimization and generally achieves better fairness improvement and accuracy
preservation than competing methods. This work is documented in a top AI
conference, AISTATS [23].
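To make the notion of a fairness penalty concrete, here is a minimal sketch for the independence notion only. It is not the penalty proposed in [23]: as a stand-in it penalizes the squared covariance between a linear model's predictions and each sensitive attribute, which handles a continuous and a binary attribute in the same way and has a closed-form solution. The synthetic data, the penalty weight, and the function names are assumptions.

```python
import numpy as np

def fit_fair_linear(X, y, A, lam):
    """Least squares with penalty lam * sum_j cov(X @ w, A[:, j])**2 (closed form)."""
    n = X.shape[0]
    A_centered = A - A.mean(axis=0)
    C = X.T @ A_centered / n                 # C[:, j] @ w equals cov(X @ w, A[:, j])
    gram = X.T @ X / n + lam * (C @ C.T)
    return np.linalg.solve(gram, X.T @ y / n)

def dependence(pred, A):
    """Absolute covariance between predictions and each sensitive attribute."""
    A_centered = A - A.mean(axis=0)
    return np.abs(A_centered.T @ (pred - pred.mean()) / pred.size)

# Synthetic data (assumed): one continuous and one binary sensitive attribute.
rng = np.random.default_rng(0)
n = 2000
age = rng.standard_normal(n)
group = rng.integers(0, 2, size=n).astype(float)
A = np.column_stack([age, group])
X = np.column_stack([
    np.ones(n),
    age + 0.5 * rng.standard_normal(n),      # feature correlated with age
    group + 0.5 * rng.standard_normal(n),    # feature correlated with group
    rng.standard_normal(n),                  # feature unrelated to A
])
y = X[:, 1] + X[:, 2] + X[:, 3] + 0.1 * rng.standard_normal(n)

for lam in (0.0, 1e4):
    w = fit_fair_linear(X, y, A, lam)
    pred = X @ w
    print(f"lam={lam:g}  mse={np.mean((y - pred) ** 2):.3f}  dependence={dependence(pred, A)}")
```

Increasing the penalty weight trades prediction accuracy for lower dependence on the sensitive attributes, which is the fairness and utility trade-off the paragraph above refers to. Zero covariance is weaker than full statistical independence, so this stand-in only captures linear dependence.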
Reference:
1. Jason E. Johnson, Ishat Raihan Jamil, Liang Pan, Guang Lin, Xianfan Xu, Bayesian Optimization with Gaussian-Process based Machine Learning for Improvement of Geometric Accuracy in Projection Multi-photon 3D Printing, Light Science & Applications (Nature, Impact Factor 20.6), 14, 56, 2025.
2. Y. Zhang, S. Zhang, H. Wu, J. Wang, G. Lin, A.P. Zhang, Miniature computational spectrometer with a plasmonic nanoparticles-in-cavity microfilter array, Nature Communications, 15, 3807, 2024.
3. Ziqi Guo, Zherui Han, Dudong Feng, Guang Lin, Xiulin Ruan, Sampling-accelerated prediction of phonon scattering rates for converged thermal conductivity and radiative properties, npj Computational Materials, 10, 31, 2024.
4. Ziqi Guo, Prabudhya Roy Chowdhury, Zherui Han, Yixuan Sun, Dudong Feng, Guang Lin*, and Xiulin Ruan, Fast and Accurate Machine Learning of Phonon Scattering Rates and Lattice Thermal Conductivity, npj Computational Materials, 9, 95, 2023. https://doi.org/10.1038/s41524-023-01020-9
5. Haoyang Zheng, Jeffrey Petrella, P. Murali Doraiswamy, Guang Lin*, Wenrui Hao, Data-driven causal model discovery and personalized prediction in Alzheimer’s disease, npj Digital Medicine, 5, 137, 2022.
6. Yixuan Sun, Surya Mitra Ayalasomayajula, Abhas Deva, Guang Lin, R. Edwin Garcia, Artificial Intelligence Inferred Microstructural Properties from Voltage-Capacity Curves, Scientific Reports, 12, 13421, 2022. https://doi.org/10.1038/s41598-022-16942-5
7. Ehsan Kharazmi, Min Cai, Xiaoning Zheng, Guang Lin, George Karniadakis, Identifiability and predictability of integer- and fractional-order epidemiological models using physics-informed neural networks, Nature Computational Science, 1-10, 2021.
8. Guang Lin, Chau-Hsing Su and George E. Karniadakis, The stochastic piston problem, Proceedings of the National Academy of Sciences of the United States of America, 101(45):15840-15845, 2004. https://doi.org/10.1073/pnas.0405889101
9. Guang Lin, Chau-Hsing Su and George E. Karniadakis, Random Roughness Enhances Lift in Supersonic Flow, Physical Review Letters, 99:104501, 2007. https://doi.org/10.1103/PhysRevLett.99.104501
10. Sheng Zhang, Guang Lin, Samy Tindel, Two-dimensional signature of images and texture classification, Proceedings of the Royal Society of London. Series A, mathematical, physical and engineering sciences, A 478: 20220346, 2022. https://royalsocietypublishing.org/doi/abs/10.1098/rspa.2022.0346
11. Sheng Zhang, Guang Lin*, Robust data-driven discovery of governing physical laws with error bars, Proceedings of the Royal Society of London. Series A, mathematical, physical and engineering sciences, A 474: 20180305, 2018. https://doi.org/10.1098/rspa.2018.0305
12. Yifan Du, Guang Lin*, Turbulence Generation from a stochastic wavelet model, Proceedings of the Royal Society of London. Series A, mathematical, physical and engineering sciences, 474(2217):20180093, 2018.
13. Bledar A. Konomi, Georgios Karagiannis, Kevin Lai, Guang Lin, Bayesian treed Calibration: an application to Carbon capture with AX sorbent, Journal of the American Statistical Association, 112(517): 37-53, 2017. https://doi.org/10.1080/01621459.2016.1190279
14. F. Liang, Y. Cheng, and G. Lin, Simulated Stochastic Approximation Annealing for Global Optimization with a Square-Root Cooling Schedule, Journal of the American Statistical Association, 109(506): 847-863, 2014.
15. Wei Deng, Xiao Zhang, Faming Liang, Guang Lin, An adaptive empirical Bayesian method for sparse deep learning, 2019 Conference on Neural Information Processing Systems (NeurIPS), accepted, Dec. 8 – Dec. 14, 2019, Vancouver, Canada. (Tier 1 AI conference)
16. Wei Deng, Faming Liang, Guang Lin, A contour stochastic gradient Langevin dynamics algorithm for simulations of multi-modal distributions, 2020 Conference on Neural Information Processing Systems (NeurIPS), Dec. 5 – Dec. 12, 2020, virtual meeting. (Tier 1 AI conference)
17. Wei Deng, Qi Feng, Liyao Gao, F. Liang, G. Lin, Non-convex learning via replica exchange stochastic gradient MCMC, 2020 International Conference on Machine Learning (ICML), accepted, Jul 12 - 18, 2020, virtual meeting. (Tier 1 AI conference)
18. Wei Deng, Qi Feng, Georgios Karagiannis, Guang Lin, Faming Liang, Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction, The Ninth International Conference on Learning Representations (ICLR), May 4th-7th, 2021, accepted, virtual meeting. (Tier 1 AI conference)
19. Wei Deng, Siqi Liang, Botao Hao, Guang Lin, Faming Liang, Interacting Contour Stochastic Gradient Langevin Dynamics, The Tenth International Conference on Learning Representations (ICLR), Apr 25th – 29th, 2022, accepted, virtual meeting. (Tier 1 AI conference)
20. Wei Deng, Qian Zhang, Qi Feng, Faming Liang, Guang Lin, Non-reversible Parallel Tempering for Uncertainty Approximation in Deep Learning, Thirty-seventh AAAI Conference on Artificial Intelligence, accepted, 2023. (Oral accepted talk)
21. Haoyang Zheng, Wei Deng, Christian Moya, Guang Lin*, Accelerating approximate Thompson sampling with underdamped Langevin Monte Carlo, The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), May 2nd – 4th, 2024, Valencia, Spain, PMLR 238:2611-2619, 2024.
22. Haoyang Zheng, Hengrong Du, Qi Feng, Wei Deng, Guang Lin, Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics, ICML 2024.
23. Jinwon Sohn, Guang Lin, Qifan Song, Fair Supervised Learning with A Simple Random Sampler of Sensitive Attributes, The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), May 2nd – 4th, 2024, Valencia, Spain, PMLR 238:1594-1602, 2024.
24. Wei Deng, Yian Ma, Zhao Song, Qian Zhang, Guang Lin, On Convergence of Federated Averaging Langevin Dynamics, 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024), Oral presentation, 2024.
Second-Order Stochastic Asymptotic Analysis
1.a Stochastic Piston Problem
Fig. 1 Left: Sketch of shock paths induced by random piston motions; Right:
Normalized variance of the perturbed shock paths
Motivation: This research is motivated by studying how small random piston
motions affect the shock paths.
Methods: A second-order stochastic perturbation analysis algorithm for the
stochastic piston problem is developed.
Results: Lin's work on the stochastic piston problem is a reformulation,
within the stochastic framework, of a classical aerodynamics benchmark problem
that studies how small random piston motions affect shock paths. A
second-order asymptotic analytical solution of the linearized stochastic Euler
equations for the stochastic piston problem is derived. Asymptotic results for
the perturbed shock paths at early and longer times are provided. This study
reveals that the variance of the location of the perturbed shock paths
initially grows quadratically with time and switches to linear dependence for
longer times.
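A quadratic-to-linear crossover of this kind is typical when a random velocity with a finite correlation time is integrated in time. The sketch below illustrates that generic mechanism only; it is not the asymptotic solution derived in the paper. It assumes a stationary velocity v with exponential covariance sigma^2 exp(-|tau|/A), for which the variance of the integrated position is 2 sigma^2 A [t - A (1 - exp(-t/A))], and it estimates the local power-law exponent of that variance. The covariance model, sigma, A, and the function names are assumptions.

```python
import numpy as np

def integrated_variance(t, sigma=1.0, corr_time=1.0):
    """Variance of X(t) = integral_0^t v(s) ds for stationary v with
    covariance sigma^2 * exp(-|tau| / corr_time)."""
    a = corr_time
    return 2.0 * sigma**2 * a * (t + a * np.expm1(-t / a))

def local_exponent(t, eps=1e-2):
    """Logarithmic slope d log Var / d log t, estimated by a finite difference."""
    v0, v1 = integrated_variance(t), integrated_variance(t * (1.0 + eps))
    return np.log(v1 / v0) / np.log1p(eps)

for t in (1e-3, 1e-1, 1e1, 1e3):
    print(f"t / corr_time = {t:g}: exponent ~ {local_exponent(t):.3f}")
```

For times much shorter than the correlation time the exponent is close to 2 (quadratic growth), and for times much longer it approaches 1 (linear growth).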
Why it Matters: The developed work provides insight into the effect of random
piston motion on the shock paths. In addition, it will have a significant and
broad impact on UQ algorithm development, as it sets the foundations of
second-order stochastic asymptotic analysis for uncertainty quantification,
which is useful for predictive modeling in many applications of interest to
NSF, DOE, AFOSR, ONR, ARL and DARPA.
Reference:
1. G. Lin, C.-H. Su and G.E. Karniadakis, The stochastic piston problem,
Proceedings of the National Academy of Sciences of the United States of
America, 101(45):15840-15845, 2004.
2. Z. Zhang, X. Yang, G. Lin, G. Karniadakis, Numerical solution of the
Stratonovich- and Ito-Euler equations: Application to the stochastic piston
problem, Journal of Computational Physics, 236: 15-27, 2013.
1.b Random Roughness Problem
Fig. 2 Left: Pressure contour for oblique shock problem with random
rough surface; Right: Scaling laws for mean of the perturbed lift with respect to
the correlation length and amplitude of the random roughness
Motivation: This research is motivated by studying how random surface
roughness interacts with the shock and affects the aerodynamics of aircraft.
Methods: A second-order stochastic perturbation analysis algorithm for the
random roughness problem is developed. An integrated framework combining
second-order stochastic perturbation methods and high-order stochastic
numerical methods is developed for uncertainty propagation.
Results: Lin's work on the random roughness problem answers how random
roughness can affect the shock paths and the drag and lift forces in
supersonic flow. A second-order asymptotic analytical solution for the
perturbed lift and drag forces as a function of the random roughness is
derived for the two-dimensional random roughness problem. This study reveals
that random roughness can actually enhance the lift for supersonic aircraft.
Why it Matters: The results are useful not only in evaluating the effects of
roughness in high-speed flight but also in designing novel enhanced-lift
aerodynamic surfaces using rough skin concepts. The developed work will have a
significant and broad impact as it sets the foundations for combining
second-order stochastic asymptotic analysis and high-order stochastic
numerical methods for uncertainty quantification, which is useful for
predictive modeling in many applications of interest to NSF, DOE, AFOSR, ONR,
ARL and DARPA, such as predicting how ice on the aircraft surface will affect
the dynamics of the aircraft.
Reference:
1. G. Lin, C.-H. Su and G.E. Karniadakis, Random Roughness Enhances Lift in
Supersonic Flow, Physical Review Letters, 99:104501, 2007.
2. G. Lin, C.-H. Su and G.E. Karniadakis, Stochastic modeling of random
roughness in shock scattering problems: Theory and simulations, Comput. Methods
Appl. Mech. Eng., 197(43-44): 3420-3434, 2008.
3. G. Lin, X. Wan, C.-H. Su and G.E. Karniadakis, Stochastic fluid mechanics,
IEEE Computing in Science and Engineering (CiSE), 9:21-29, 2007.
4. G. Lin, C.-H. Su and G.E. Karniadakis, Predicting shock dynamics in the
presence of uncertainties, Journal of Computational Physics, special issue in
stochastic uncertainty prediction, 217(1): 260-276, 2006.
Tackling the Curse of Dimensionality Challenge in Uncertainty Quantification
Fig. 3 Sketch of Curse of Dimensionality: More Data are Needed as Dimension
Increases
Motivation: This research is motivated by a critical challenge, the so-called
“curse of dimensionality”, in quantifying high-dimensional uncertainties in
complex stochastic partial differential systems.
Methods: To tackle the curse of dimensionality, several advanced high-order
stochastic numerical algorithms have been developed:
High dimensions: adaptive analysis of variance (ANOVA) algorithms - For
stochastic problems with high stochastic input dimensions, an adaptive
ANOVA-based generalized polynomial chaos (gPC) method based on three different
adaptive criteria for solving high-dimensional stochastic PDE systems [4] was
developed as a dimension-reduction technique. It decomposes the original
high-dimensional stochastic problem into a set of low-dimensional sub-problems
in stochastic space, which can be efficiently solved by the sparse-grid
stochastic collocation method. This is motivated by the observation that for
many real physical systems, only a relatively small number of stochastic
dimensions are important and significantly impact the stochastic systems’
outputs. To speed up the computation, a reduced basis ANOVA method is
developed in [9]. In addition, to model high-dimensional stochastic multiscale
problems, adaptive ANOVA-based data-driven stochastic methods are developed in
[8] and a variance-based mixed multiscale finite element method is proposed in
[5]. A random domain decomposition method is introduced in [10]. To solve
high-dimensional inverse problems, an adaptive ANOVA-based probabilistic
collocation Kalman filter method is developed in [11].
High dimensions: Compressive sampling algorithms - To address the “curse
of dimensionality” issue, a careful model reduction can be performed through
the evaluation of a gPC expansion that contains a smaller subset of
significant gPC bases. Compressive sensing-based numerical methods for
selecting such a smaller subset of significant gPC bases have been developed.
To further improve the efficiency and accuracy of the compressive
sensing-based uncertainty quantification methods, new bases for random
variables are identified in [1, 2] through linear mappings such that the
representation of the quantity of interest is sparser with the new basis
functions associated with the new random variables.
High dimensions: Bayesian model selection algorithms - Bayesian model
selection-based numerical methods for selecting a smaller subset of
significant gPC bases have also been developed. In particular, the Bayesian
model uncertainty methods [7] and the Bayesian mixture prior procedure [6]
have been developed by Lin and his coworkers. In this work, a fully Bayesian
stochastic procedure is employed to perform gPC basis selection and
coefficient evaluation simultaneously. It recovers possible sparse structures
in both stochastic and spatial domains.
High dimensions:
Inverse regression-based algorithms - Many high-dimensional UQ problems are
intrinsically low-dimensional, because the variation of the quantity of
interest is often caused by only a few latent parameters varying within a
low-dimensional subspace, known as the sufficient dimension reduction subspace
in the statistics literature. Motivated by this observation, two inverse
regression-based UQ algorithms are developed in [3] for high-dimensional
problems. Both algorithms use inverse regression to convert the original
high-dimensional problem to a low-dimensional one, which is then efficiently
solved by building a response surface for the reduced model, for example via
the polynomial chaos expansion.
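The dimension-reduction idea behind the adaptive ANOVA approach can be sketched as follows. This is a simplified illustration, not the method of [4]: it computes only the variances of the first-order terms of an anchored ANOVA decomposition and keeps the dimensions that carry most of that variance, a single criterion standing in for the three adaptive criteria mentioned above. The anchor at the origin, the inputs uniform on [-1, 1], the 99% cutoff, the toy model, and all names are assumptions.

```python
import numpy as np

def first_order_variances(f, dim, n_quad=8):
    """Variance of each first-order anchored-ANOVA term f(c_1..x_i..c_d) - f(c),
    for independent inputs uniform on [-1, 1] and anchor c = 0."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    weights = weights / 2.0                       # normalize to the uniform density
    anchor = np.zeros(dim)
    f_anchor = f(anchor)
    variances = np.empty(dim)
    for i in range(dim):
        values = np.empty(n_quad)
        for q, node in enumerate(nodes):
            x = anchor.copy()
            x[i] = node                           # vary one input, freeze the others
            values[q] = f(x) - f_anchor
        mean_i = weights @ values
        variances[i] = weights @ (values - mean_i) ** 2
    return variances

def active_dimensions(variances, fraction=0.99):
    """Smallest set of dimensions whose first-order variances reach `fraction` of their sum."""
    order = np.argsort(variances)[::-1]
    cumulative = np.cumsum(variances[order]) / variances.sum()
    n_keep = int(np.searchsorted(cumulative, fraction)) + 1
    return np.sort(order[:n_keep])

def model(x):
    # Hypothetical 20-dimensional model dominated by inputs 0 and 3.
    return np.exp(x[0]) + 2.0 * x[3] ** 2 + 1e-3 * x.sum()

variances = first_order_variances(model, dim=20)
print("active dimensions:", active_dimensions(variances))
```

The first-order scan costs only dim * n_quad model evaluations. In this toy case the criterion keeps inputs 0 and 3, so higher-order interaction terms would need to be resolved only among those two dimensions rather than all twenty.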
Results: The developed advanced high-order stochastic numerical algorithms
take advantage of special properties of the stochastic partial differential
systems, such as sparsity or the sufficient dimension reduction subspace
property, which enable us to reduce the high-dimensional space to a
low-dimensional manifold so that we can quantify the high-dimensional
uncertainties in complex stochastic partial differential systems.
Why it Matters: The developed work will have a significant and broad impact as
it sets the foundations of high-order stochastic numerical methods for
high-dimensional uncertainty quantification problems, which are useful for
predictive modeling in many applications of interest to NSF, DOE, AFOSR, ONR,
ARL and DARPA.
Reference:
1. X. Yang, H. Lei, N. Baker, G. Lin*, Enhancing sparsity of Hermite
polynomial expansions by iterative rotations, Journal of Computational Physics,
307: 94-109, 2016.
2. H. Lei, X. Yang, B. Zheng, G. Lin, N. Baker, Constructing Surrogate
Models of Complex Systems with Enhanced Sparsity: Quantifying the Influence of
Conformational Uncertainty in Biomolecular Solvation, SIAM Multiscale Modeling
and Simulation, 13(4): 1327-1353, 2016.
3. W. Li, G. Lin*, B. Li, Inverse regression-based uncertainty quantification
algorithms for high-dimensional models in theory and practice, Journal of
Computational Physics, 321:259-278, 2016.
4. X. Yang, M. Choi, G. Lin, G.E. Karniadakis, Adaptive ANOVA Decomposition
of Incompressible and Compressible Flows, Journal of Computational Physics, 231(4):
1587–1614, 2012.
5. J. Wei, G. Lin*, L. Jiang, Y. Efendiev, Analysis of Variance-based Mixed
Multiscale Finite Element Method and Applications in Stochastic Two-Phase
Flows, International Journal for Uncertainty Quantification, 4(6): 455-477,
2014.
6. G. Karagiannis, B. Konomi, G. Lin*, Mixed shrinkage prior procedure for
basis selection and global evaluation of gPC expansions in Bayesian framework:
Applications to elliptic SPDEs, Journal of Computational Physics, 284: 528-546,
2015.
7. G. Karagiannis, G. Lin*, Selection of Polynomial
Chaos Bases via Bayesian Model Uncertainty Methods with Applications to Sparse
Approximation of PDEs with Stochastic Inputs, Journal of Computational Physics,
259: 114–134, 2014.
8. Z. Zhang, X. Hu, T.Y. Hou*, G. Lin*, P. Yan, An adaptive ANOVA-based
data-driven stochastic method for elliptic PDE with random coefficients,
Communications in Computational Physics, 16: 571-598, 2014.
9. Q. Liao, G. Lin*, Reduced basis ANOVA method for partial differential
equation with high-dimensional random inputs, Journal of Computational Physics,
317: 148-164, 2016.
10. G. Lin*, D. M. Tartakovsky, and A. M. Tartakovsky, Uncertainty
quantification via random domain decomposition and probabilistic collocation on
sparse grids, Journal of Computational Physics, 229(19): 6995-7012, 2010.
11. W. Li, G. Lin*, D. Zhang, An Adaptive-ANOVA-based PCKF for High-Dimensional
Nonlinear Inverse Modeling, Journal of Computational Physics, 258: 752–772, 2014.
12. G. Lin* and A. M. Tartakovsky, Numerical studies of three-dimensional
stochastic Darcy's equation and stochastic advection-diffusion-dispersion
equation, Journal of Scientific Computing, 43(1): 92-117, 2010.
Tackling Big Data Challenge in Data Analysis and Uncertainty Quantification of
Ultra-Large Systems
Fig. 4 Big data challenge
Motivation: This research is motivated by studying how to quantify uncertainty
for stochastic partial differential systems with large amounts of data. Such
data could be generated either from the stochastic partial differential
systems or from observation.
Methods: Using a Gaussian process for large data sets requires inverting a
large-scale covariance matrix. Scalable approaches have been developed to
invert such large-scale matrices efficiently. If the covariance matrix is
separable, the separable covariance function approach developed in [5] can be
used; alternatively, the covariance function can be approximated based on a
modified version of the linear model of coregionalization [3,4]. The
coregionalization approximation provides more accurate results than the
separable approach. In situations where the separability assumption does not
hold, a new effective method with linear computational cost, termed the
full-scale approximation approach with block modulating function, has been
developed in [2]. Model calibration based on a multi-fidelity computer model
mixture is developed in [1]. These approaches enable us to obtain accurate
results for Bayesian inference in linear time in big data analysis. Guang Lin
received the 2010 Department of Energy Advanced Scientific Computing Research
Leadership Computing Challenge award in recognition of his work in analyzing
big climate data using extreme-scale supercomputers.
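The computational benefit of a separable covariance can be seen in a short sketch. When the covariance over a grid factors as a Kronecker product of two smaller covariance matrices, a linear solve with the full matrix reduces to solves with the factors. This is a generic linear-algebra illustration, not the implementation of [5]; the squared-exponential kernel, the nugget added to each factor for conditioning, the grid sizes, and the function names are assumptions.

```python
import numpy as np

def rbf_kernel(x, length_scale, nugget=0.1):
    """Squared-exponential covariance on 1-D inputs, plus a nugget for conditioning."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2) + nugget * np.eye(x.size)

def kron_solve(K_a, K_b, Y):
    """Solve (K_a kron K_b) x = Y.ravel() without forming the Kronecker product.
    Y has shape (n_a, n_b); the result is returned in the same shape."""
    tmp = np.linalg.solve(K_a, Y)              # K_a^{-1} Y
    return np.linalg.solve(K_b, tmp.T).T       # K_a^{-1} Y K_b^{-T}

rng = np.random.default_rng(0)
K_space = rbf_kernel(np.linspace(0.0, 1.0, 30), length_scale=0.3)
K_time = rbf_kernel(np.linspace(0.0, 1.0, 20), length_scale=0.2)
Y = rng.standard_normal((30, 20))

fast = kron_solve(K_space, K_time, Y)
dense = np.linalg.solve(np.kron(K_space, K_time), Y.ravel()).reshape(Y.shape)
print("max abs difference from the dense solve:", np.abs(fast - dense).max())
```

The factored solve costs on the order of n_a^3 + n_b^3 operations instead of (n_a n_b)^3 for the dense matrix, which is what makes the separable approach attractive when the separability assumption is reasonable.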
Results: To tackle the big data challenge, we developed multi-fidelity models
to handle stochastic problems with big data generated by numerical models. For
big data from observation, we developed advanced algorithms to approximate the
covariance matrix. Numerical examples have demonstrated that the developed
methods can handle stochastic partial differential systems with big data.
Why it Matters: The developed work will have a significant and broad impact as
it sets the foundations for developing efficient numerical methods for
handling big data in uncertainty quantification, which is useful for
predictive modeling in many applications of interest to NSF, DOE, AFOSR, ONR,
ARL and DARPA.
Reference:
1. G. Karagiannis, G. Lin*, On the design of a predictive model of computer
model mixtures and their calibration through experimental data, Technometrics, in review.
2. B. Zhang, B. Konomi, H. Sang, G. Karagiannis, G. Lin*, Full scale
multi-output Gaussian process emulator with nonseparable auto-covariance
functions, Journal of Computational Physics, 300: 623–642, 2015.
3. B. Konomi, G. Lin*, Low-Cost Multi-output Gaussian Process with
Application to Uncertainty Quantification, International Journal for
Uncertainty Quantification, 5(4): 375-392, 2015.
4. B. Konomi, G. Karagiannis, G. Lin, On the Bayesian Treed Multivariate
Gaussian Process with Linear Model of Coregionalization, Journal of Statistical
Planning and Inference, 157-158: 1-15, 2015.
5. I. Bilionis, N. Zabaras, B. Konomi, G. Lin, Multi-output separable Gaussian
process: Towards an efficient, fully Bayesian paradigm for uncertainty
quantification, Journal of Computational Physics, 241: 212-239, 2013.
Fig. 5 Adaptive Mesh Refinement for
UQ Problem with Discontinuities
Motivation: This research is
motivated by studying how to quantify the uncertainty for complex stochastic
partial differential systems with local
feature/discontinuities/non-stationarity/low regularity.
Methods: The stochastic behavior of real-world complex
systems is inevitably highly non-stationary, with local features and
discontinuities, due primarily to the relatively large number of heterogeneous
sub-systems. Hence, it is crucial to build advanced numerical methods for
non-stationary systems with local features and discontinuities. In particular, a
Bayesian treed multivariate Gaussian process model [1-3] and an adaptive WENO
collocation method [4] have been developed to tackle this challenge. Both
adaptively partition the stochastic space into multiple elements. The size of
each element is adaptively adjusted based on the location of the local features
or discontinuities.
Results: We have developed two efficient numerical algorithms that
adaptively partition the stochastic space into multiple elements. The size of
each element is adaptively adjusted based on the location of the local features
or discontinuities.
Why it Matters: The developed work will have a significant and broad impact,
as it lays the foundation for developing advanced stochastic numerical methods
for uncertainty propagation in stochastic problems with local features,
discontinuities, non-stationarity, or low regularity, which is useful for
predictive modeling in many applications of interest to NSF, DOE, AFOSR, ONR,
ARL and DARPA.
Reference:
1. B. Konomi, G. Karagiannis, A. Sarkar, X. Sun, G. Lin*, Bayesian Treed
Multivariate Gaussian Process with Adaptive Design: Application to a Carbon
Capture Unit, Technometrics, 56(2): 145–158, 2014.
2. B. Konomi, G. Karagiannis, K. Lai, G. Lin, Bayesian treed Calibration: an
application to Carbon capture with AX sorbent, Journal of the American
Statistical Association, ISSN: 0162-1459 (Print) 1537-274X (Online), 2016.
DOI: 10.1080/01621459.2016.1190279
3. B. Konomi, G. Karagiannis, G. Lin, On the Bayesian Treed Multivariate
Gaussian Process with Linear Model of Coregionalization, Journal of Statistical
Planning and Inference, 157-158: 1-15, 2015.
4. W. Guo, G. Lin*,
A. J. Christlieb, J. Qiu, An adaptive WENO collocation method for differential
equations with random coefficients, Mathematics, 4(2), 29; doi:10.3390/math4020029, 2016.
5. G. Lin, X. Wan,
C.-H. Su and G.E. Karniadakis, Stochastic fluid mechanics, IEEE Computing in Science and Engineering (CiSE),
9:21-29, 2007.
Tackling Multimodal
Distribution Challenge in Large-Scale Bayesian Inverse Problems
Fig. 6 Adaptive importance sampling
from the posterior PDF
Motivation: A notable error source in modeling physical
systems is parametric uncertainty, where the values of model parameters that
characterize the system are not known exactly due to limited data or incomplete
knowledge. In this situation, a data assimilation algorithm can improve
modeling accuracy by quantifying and reducing such uncertainty. However, the
performance of these algorithms, such as the Kalman filter, degrades if the
parametric uncertainty has a non-Gaussian distribution, and they fail for
multimodal distributions. Quantifying uncertainties with multimodal
distributions in data assimilation is notoriously difficult. In addition, these
algorithms often require a large number of repetitive model evaluations that
incur significant computational costs, in particular for complex systems such
as global climate models.
Methods: In response to these issues, an adaptive
importance sampling algorithm is developed that alleviates the burden caused by
computationally demanding models.
Two key techniques
implemented in this algorithm are: 1) a Gaussian mixture (GM) model adaptively
constructed to capture the distribution of uncertain parameters and 2) a
mixture of polynomial chaos (PC) expansions built as a surrogate model to
alleviate the computational burden caused by forward model evaluations. These
techniques afford the algorithm great flexibility to handle complex multimodal
distributions and strongly nonlinear models while keeping the computational
costs at a minimum level.
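The first of the two techniques, a Gaussian mixture proposal that is adapted from importance-weighted samples, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the target is a hypothetical one-dimensional bimodal density evaluated directly (no polynomial chaos surrogate), the number of mixture components is fixed at two, and the initial proposal parameters are arbitrary guesses. The function names are also hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def target(x):
    # Hypothetical bimodal posterior density with modes at -2 and +2.
    return 0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 2.0, 0.5)


def adaptive_importance_sampling(n_samples=2000, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Initial Gaussian mixture proposal (assumed starting guess).
    weights = np.array([0.5, 0.5])
    means = np.array([-3.0, 3.0])
    stds = np.array([2.0, 2.0])
    for _ in range(n_iters):
        # Draw samples from the current mixture proposal.
        labels = rng.choice(len(weights), size=n_samples, p=weights)
        x = rng.normal(means[labels], stds[labels])
        # Per-component densities, shape (n_samples, n_components).
        comps = weights * normal_pdf(x[:, None], means, stds)
        q = comps.sum(axis=1)
        # Self-normalized importance weights target / proposal.
        w = target(x) / q
        w /= w.sum()
        # Importance-weighted EM update of the mixture parameters.
        rw = (comps / q[:, None]) * w[:, None]
        mass = rw.sum(axis=0)
        weights = mass / mass.sum()
        means = (rw * x[:, None]).sum(axis=0) / mass
        var = (rw * (x[:, None] - means) ** 2).sum(axis=0) / mass
        stds = np.sqrt(np.maximum(var, 1e-6))
    return weights, means, stds, x, w


weights, means, stds, samples, iw = adaptive_importance_sampling()
print("mixture means:", means, "posterior mean estimate:", np.sum(iw * samples))
```

After a few iterations the proposal concentrates on both modes, so the importance weights become nearly uniform and few samples are wasted. In the setting described above, each call to `target` would be an expensive forward model run, which is where a surrogate would be substituted.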
Results: Three test cases demonstrated that the developed
algorithm can effectively capture the complex posterior parametric
uncertainties for the specific problems being examined while also enhancing
computational efficiency.
Why it Matters: Parametric uncertainty often arises in models of physical
systems because of incomplete knowledge of the system being simulated,
resulting in models that deviate from reality. The algorithm
developed in this research provides an effective means to infer model
parameters from direct and/or indirect measurement data through uncertainty
quantification, improving model accuracy. This algorithm has many potential
applications. For example, it can be used to estimate the unknown location of
an underground contaminant source and to improve the accuracy of the model that
predicts how groundwater is affected by this source.
Reference:
1. W. Li, G. Lin, An adaptive importance sampling algorithm for Bayesian
inversion with multimodal distributions, Journal of Computational Physics,
294: 173-190, 2015. DOI:10.1016/j.jcp.2015.03.047.
2. W. Li, D. Zhang, G. Lin, A surrogate-based
adaptive sampling approach for history matching and uncertainty quantification,
SPE Reservoir Simulation Symposium, SPE 173298, Houston, Texas, USA,
Feb. 23-25, 2015.
Numerical Methods for Rare Events
Fig. 7 Sketch of a transition
pathway of a rare event
Motivation: Dynamical systems are often subject to random
perturbations or noise. Even when the noise amplitude is very small, it has a
profound influence on the dynamics on the appropriate time-scale. When the
noise is small, which is the case of interest here, the classic methods, such
as Monte Carlo or direct simulation of Langevin equations, become prohibitively
expensive, due to the presence of two disparate time-scales: the time-scale of
the deterministic dynamics and the time-scale between the rare events caused by
the noise.
Methods: An asymptotic analysis and an efficient rare event
simulation method for the stochastic Korteweg-de Vries equation
have been developed in [2]. To study the transition behavior induced by small
noise and the structure of the phase space for nonlinear dynamical systems,
hp-adaptive parallel minimum action methods [1] have been developed. In
addition, an efficient Bayesian experimental design method is developed in [3]
for failure detection.
Results: The hp-adaptive parallel minimum action method (MAM) employs a
multigrid technique and hp-adaptive algorithms [1], which further improve the
efficiency of MAM by replacing the global reparametrization with
hp-adaptivity and a parallel implementation.
Why it Matters: Rare events play a critical role in nature. Nucleation events
during phase transitions, chemical reactions, conformation changes of
biomolecules, bistable behaviors in genetic switches, and regime changes in
climate are just a few examples of rare events among many others. The developed
work will have a significant and broad impact, as it lays the foundation for
developing efficient adaptive algorithms for rare events, which is useful for
predictive modeling of rare events in many applications of interest to NSF,
DOE, AFOSR, ONR, ARL and DARPA.
Reference:
1. X. Wan, G. Lin, Hybrid parallel computing of minimum action method,
Parallel Computing, 39: 638-651, 2013.
2. G. Xu, G. Lin*, J. Liu, Rare Event Simulation for Stochastic Korteweg-de Vries Equation, SIAM/ASA Journal on Uncertainty
Quantification, 2 (1): 698-716, 2014.
3. H. Wang, G. Lin, J. Li, Gaussian process surrogates for failure
detection: a Bayesian experimental design approach, Journal of Computational
Physics, 313: 247-259, 2016.
Fig. 8 Regional climate model
parameter estimation using Simulated Stochastic Approximation Annealing
algorithm [1]
Motivation: This research is motivated by the question of how to efficiently
solve large-scale stochastic inverse problems or parameter estimation problems
with computationally expensive models. In practice, the computationally
expensive model is most often given as a black box, and the mathematical models
inside it are unknown.
Methods: We treat such a large-scale inverse problem or
parameter estimation problem as a global optimization problem. Two advanced
numerical methods have been developed as follows:
1. Simulated Stochastic Approximation Annealing for Global Optimization with a
Square-Root Cooling Schedule, published in the Journal of the American
Statistical Association [1]
2. Parallel Interactive Stochastic Approximation
Annealing for Global Optimization [2]
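To make the notion of a square-root cooling schedule concrete, the sketch below runs a plain Metropolis-type annealing loop with temperature T_t = T_0 / sqrt(t) on a black-box objective. It is a simplified stand-in and not the stochastic approximation annealing algorithms of [1] or [2], which contain additional adaptive components; the function names, the Rastrigin test objective, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def anneal_sqrt_cooling(objective, x0, n_iters=5000, t0=1.0, step=0.5, seed=0):
    """Random-walk annealing with temperature t0 / sqrt(t) at iteration t."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = objective(x)
    best_x, best_f = x.copy(), fx
    for t in range(1, n_iters + 1):
        temperature = t0 / np.sqrt(t)  # square-root cooling schedule
        candidate = x + step * rng.standard_normal(x.shape)
        fc = objective(candidate)
        # Always accept improvements; accept uphill moves with
        # probability exp(-(fc - fx) / temperature).
        if fc <= fx or rng.random() < np.exp(-(fc - fx) / temperature):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x.copy(), fx
    return best_x, best_f


def rastrigin(x):
    # Standard multimodal test objective; global minimum 0 at the origin.
    return 10.0 * x.size + np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x))


start = np.array([3.0, -2.5])
best_x, best_f = anneal_sqrt_cooling(rastrigin, start)
print("start value:", rastrigin(start), "best value found:", best_f)
```

The objective is only ever called as a black box, which matches the setting described above; in a calibration problem, `objective` would measure the mismatch between model output and observations.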
Results: The convergence of the two global optimization algorithms has been
demonstrated through benchmark examples. In addition, we have employed the two
algorithms to improve the predictive skill of both regional [3,5] and global
[4,6] climate models by tuning the uncertain parameters inside the
convection scheme using available satellite datasets. This study reveals
that tuning the uncertain parameters not only improves the capability to
predict precipitation, but also corrects non-physical phenomena,
e.g., the double ITCZ in global climate modeling, that have troubled climate
modelers for a long time. Guang Lin received the Ronald L. Brodzinski Award for
Early Career Exceptional Achievement from the Department of Energy's Pacific
Northwest National Laboratory in 2012 in recognition of his work on developing
advanced optimization algorithms to calibrate complex global and regional
climate models.
Why it Matters: The developed work will have a significant and broad impact,
as it lays the foundation for advanced computational stochastic methods for
large-scale inverse problems or parameter estimation problems with
computationally expensive models, which is useful for improving model
predictivity in many critical applications of interest to NSF, DOE, AFOSR, ONR,
ARL and DARPA.
Reference:
1. F. Liang, Y. Cheng, and G Lin, Simulated Stochastic Approximation
Annealing for Global Optimization with a Square-Root Cooling Schedule, Journal
of the American Statistical Association, 109(506): 847-863, 2014.
2. G. Karagiannis, B. Konomi, F. Liang, G. Lin, Parallel Interactive
Stochastic Approximation Annealing for Global Optimization, Journal of
Computational and Graphical Statistics, 1-19, doi:10.1007/s11222-016-9663-0, 2016.
3. H. Yan, Y. Qian, G. Lin, L.R. Leung, B. Yang, Q.
Fu, Parametric Sensitivity and Calibration for the Kain-Fritsch Convective
Parameterization Scheme in the WRF Model, Climate Research, 59: 135-147, 2014.
4. C. Zhao, X. Liu, Y. Qian, J. Yoon, Z. Hou, G. Lin, S. McFarlane, H. Wang,
B. Yang, P.-L. Ma, H. Yan, J. Bao, A Sensitivity Study of Radiative Fluxes at
the Top of Atmosphere to Cloud-Microphysics and Aerosol Parameters in the
Community Atmosphere Model CAM5, Atmos. Chem. Phys., 13: 10969-10987, 2013.
5. B. Yang, Y. Qian, G. Lin, R. Leung, Y. Zhang, Some
issues in uncertainty quantification and parameter tuning: a case study of
convective parameterization scheme in the WRF regional climate model,
Atmospheric Chemistry and Physics, 12(5): 2409-2427, 2012.
6. B. Yang, Y Qian, G Lin, LYR Leung, PJ Rasch, GJ Zhang, SA McFarlane, C
Zhao, Y Zhang, H Wang, M Wang, and X Liu, Uncertainty Quantification and
Parameter Tuning in the CAM5 Zhang-McFarlane Convection Scheme and Physical
Impact of Improved Convection on the Global Circulation and Climate, Journal of
Geophysical Research. D. (Atmospheres), 118: 395-415, 2013.
Fig. 9 Sketch of stochastic network
problem
Motivation: Dynamical network systems, such as social networks,
cyber-networks, epidemic disease networks and power grids, are critical to our
daily life. Such network systems are often subject to random noise, which
plays a critical role in changing the topology, the dynamics, and the stability
of the dynamical network systems. As the size of the network increases,
quantifying the uncertainties in complex ultra-large network systems becomes a
great challenge.
Methods: To tackle this challenge, advanced dimension reduction methods have
been developed for dynamical network systems in [1]. Rigorous uncertainty
quantification algorithms have been employed to endow ultra-large dynamical
stochastic network simulations with a composite error bar [2-10]. Guang Lin
received a 2016 NSF Faculty Early Career Development Award in recognition of
his work on uncertainty quantification and big data analysis in smart grid and
other complex stochastic network systems.
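Several of the cited works ([2], [9]) use the probabilistic collocation method. Its core step for a single Gaussian uncertain parameter can be sketched as below: the model is evaluated at Gauss-Hermite nodes, and the quadrature weights turn those evaluations into a mean and a variance (an error bar). This is a one-dimensional illustration only; the sparse-grid construction needed for many parameters is not shown, and `collocation_moments` and `toy_response` are hypothetical names with made-up values.

```python
import numpy as np


def collocation_moments(model, mean, std, n_nodes=5):
    """Mean and variance of model(p) for p ~ N(mean, std^2),
    estimated by Gauss-Hermite collocation."""
    # Nodes and weights for the weight function exp(-z^2 / 2).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)  # normalize to the N(0, 1) density
    outputs = np.array([model(mean + std * z) for z in nodes])
    out_mean = np.sum(weights * outputs)
    out_var = np.sum(weights * (outputs - out_mean) ** 2)
    return out_mean, out_var


def toy_response(reactance):
    # Hypothetical scalar response that depends on one uncertain parameter.
    return 1.0 / reactance


m, v = collocation_moments(toy_response, mean=0.5, std=0.05)
print("mean response:", m, "standard deviation:", np.sqrt(v))
```

Only `n_nodes` model runs are needed, which is the appeal of collocation when each run is a full dynamic simulation.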
Results:
The
numerical examples have demonstrated that the developed methods are able to effectively
reduce the size of the stochastic network systems and quantify the
uncertainties in the stochastic network systems. In particular, we have
demonstrated the developed methods on the next generation smart grid.
Why it Matters: Noise plays a critical role in dynamical network
systems, such as social networks, power grids and epidemic disease networks.
The developed work will have a significant and broad impact, as it lays the
foundation for developing efficient adaptive algorithms for predictive modeling
of stochastic networks in many applications of interest to NSF, DOE, AFOSR,
ONR, ARL and DARPA.
Reference:
1. S. Wang, S. Lu, N. Zhou, G. Lin, M. Elizondo, M.A. Pai, Dynamic-feature
Extraction, Attribution and Reconstruction (DEAR) Method for Power System Model
Reduction, IEEE Transactions on Power Systems, 99: 1-11, 2014.
2. G. Lin*, M. Elizondo, S. Lu, X. Wan, Uncertainty Quantification in Dynamic
Simulations of Large-scale Power System Models using the High-Order
Probabilistic Collocation Method on Sparse Grids, International Journal for
Uncertainty Quantification, 4(3): 185-204, 2014.
3. D. Meng, N. Zhou, S. Lu, G. Lin, An Expectation-Maximization Method for
Calibrating Synchronous Machine Models, 2013 IEEE PES General meeting, July
21-25, 2013, Vancouver, BC, Canada.
4. Elizondo MA, S Lu, G Lin, and S Wang, Dynamic Response of Large Wind
Power Plant Affected by Diverse Conditions at Individual Turbines, In IEEE
Power and Energy Society General meeting, July 27-31, 2014, National Harbor,
MD, USA.
5. J.B. Coble, G. Lin, B. Shumaker, P. Ramuhalli,
Approaches to Quantify Uncertainty in Online Sensor Calibration Monitoring,
2013 American Nuclear Society Winter Meeting and Technology Expo., 2013.
6. TA Ferryman, DJ Haglin, M Vlachopoulou, J Yin,
C Shen, N Zhou, G Lin, FK Tuffner, and J
Tong. Net Interchange Schedule Forecasting of Electric Power Exchange for
RTO/ISOs, 2012 IEEE PES General meeting, July 22-26, 2012, San Diego, CA.
7. D Meng, N Zhou, S Lu, and G Lin. Estimate the Electromechanical
States Using Particle Filtering and Smoothing, 2012 IEEE PES General meeting,
July 22-26, 2012, San Diego, CA.
8. S Wang, S Lu, G Lin, and N Zhou. Measurement-based Coherency
Identification and Aggregation for Power Systems, 2012 IEEE PES General
meeting, July 22-26, 2012, San Diego, CA.
9. G. Lin, N. Zhou, T. Ferryman, and F. Tuffner,
Uncertainty Quantification in State Estimation using the Probabilistic
Collocation Method, Power Systems Conference and Exposition, March 20th, 2011,
Phoenix, AZ.
10. T. Ferryman, F. Tuffner, N. Zhou, and G. Lin, Initial Study on the
Predictability of Real Power on the Grid based on PMU Data, Power Systems
Conference and Exposition, March 20th, 2011, Phoenix, AZ.
Fig. 10 Sketch of 3D red blood cell
modeling in a blood vessel
Motivation: According to a World Health Organization report,
malaria, a disease related to red blood cells, remains a global threat. Hence,
modeling red blood cells and their related diseases is of critical importance.
Methods: Lin and his collaborators have developed advanced
numerical methods for modeling red blood cell deformation and interaction in
flow. In particular, to model red blood cell (RBC) deformation and
multiple-cell interactions in flow, the lattice Boltzmann method and the
distributed Lagrange multiplier/fictitious domain method [1,2] are extended to
employ the mesoscopic network model for simulations of RBCs in flow.
In [3], a hybrid model of the cellular structure is developed. It consists of a
continuum representation of the lipid bilayer, from which the bending
force is calculated through an energetic variational approach; a discrete
cytoskeleton model that uses the worm-like chain to represent the network
filaments; and area/volume constraints. Guang Lin received the 2015
Mathematical Biosciences Institute Early Career Award in
recognition of his work on modeling complex biological flow systems.
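For readers unfamiliar with the worm-like chain, the sketch below evaluates the commonly used Marko-Siggia interpolation for the entropic force of a single filament, F = (k_B T / p) [1 / (4 (1 - x)^2) - 1/4 + x], where x is the ratio of end-to-end extension to contour length and p is the persistence length. Whether [3] uses exactly this form is an assumption; the function name `wlc_force` and the default thermal energy (about 298 K) are illustrative choices.

```python
def wlc_force(extension, contour_length, persistence_length, kbt=4.11e-21):
    """Worm-like chain force (Marko-Siggia interpolation).

    extension and contour_length share one length unit; the returned force
    is in units of kbt / persistence_length (newtons for SI inputs).
    """
    x = extension / contour_length
    if not 0.0 <= x < 1.0:
        raise ValueError("extension must satisfy 0 <= extension < contour_length")
    return (kbt / persistence_length) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)


# The force grows without bound as the filament approaches full extension.
for ratio in (0.1, 0.5, 0.9):
    print(ratio, wlc_force(ratio, 1.0, 1.0, kbt=1.0))
```

The divergence near full extension is what keeps each filament of the network from stretching past its contour length.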
Results: The numerical examples have demonstrated that the developed methods
are able to effectively model the dynamics of red blood cells in flow.
Why it Matters: Modeling red blood cells and their related diseases
is of critical importance. The
developed work can be employed to model complex biological systems in many
applications of interest to NSF, NIH and DARPA.
Reference:
1. X. Shi, G. Lin*, J. Zhou, D. Fedosov, A
Lattice Boltzmann Fictitious Domain Method for Modeling Red Blood Cell
Deformation and Multiple-Cell Hydrodynamic Interaction in Flow, International
Journal for Numerical Methods in Fluids, 72(8): 895-911, 2013.
2. X. Shi, G. Lin*, Modeling the Sedimentation of Red Blood Cells in Flow
under Strong External Magnetic Body Force using a Lattice Boltzmann Fictitious
Domain Method, Numer. Math. Theor. Meth. Appl., 7: 512-523, 2014.
3. W. Hao, Z. Xu, C. Liu, G. Lin, A Fictitious Domain Method with a Hybrid
Cell Model for Simulating Motion of Cells in Fluid Flow, Journal of
Computational Physics, 280: 345-362, 2015.