Professor Mayetri Gupta
- Professor of Statistics (Statistics)
telephone: 01413307753
email: Mayetri.Gupta@glasgow.ac.uk
318, The Mathematics and Statistics Building, University of Glasgow, University Place, Glasgow, G12 8SQ
Research interests
My research primarily involves the development of novel statistical, in particular Bayesian, methodology for scientific problems arising in the fields of computational biology and genetics. Detection of sparse signals from noisy discrete data is a significant challenge in many fields, but especially so in genomic data analysis, due to latent positional or structural constraints in such data. My interests include Bayesian statistical modelling for epigenetics, genome-wide association studies (GWAS), and single-cell transcriptomics. I have been involved in projects developing novel statistical approaches for modelling and predicting chromatin structure, deciphering regulatory networks of genes and transcription factors, and integrating different types of genomic data to make robust, efficient and meaningful biological inferences. I also work on general Bayesian modelling and Markov chain Monte Carlo methodology for clustering, classification, and model selection with complex, high-dimensional correlated data, including regression mixture models and hidden Markov models, with various applications in biology, medicine and image analysis.
Publications
2023
Nguyen, H. D. and Gupta, M. (2023) Finite sample inference for empirical Bayesian methods. Scandinavian Journal of Statistics, 50(4), pp. 1616-1640. (doi: 10.1111/sjos.12643)
Li, L. , Gupta, M. , Macaulay, V. and Mukhopadhyay, I. (2023) Bayesian GWAS with Evolutionary Monte Carlo. 18th Conference on Computational Intelligence Methods for Bioinformatics & Biostatistics (CIBB 2023), Padova, Italy, 06-08 Sep 2023. (Accepted for Publication)
2022
Al Alawi, M., Ray, S. and Gupta, M. (2022) A New Functional Data Clustering Technique Based on Spectral Clustering and Downsampling. 17th Conference of the International Federation of Classification Societies (IFCS 2022), Porto, Portugal, 19-23 July 2022. ISBN 9789899895591
Zhang, H., Swallow, B. and Gupta, M. (2022) Bayesian hierarchical mixture models for detecting non-normal clusters applied to noisy genomic and environmental datasets. Australian and New Zealand Journal of Statistics, 64(2), pp. 313-337. (doi: 10.1111/anzs.12370)
2021
Wu, J., Gupta, M., Hussein, A. I. and Gerstenfeld, L. (2021) Bayesian modeling of factorial time-course data with applications to a bone aging gene expression study. Journal of Applied Statistics, 48(10), pp. 1730-1754. (doi: 10.1080/02664763.2020.1772733)
2020
Redivo, E., Nguyen, H. and Gupta, M. (2020) Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions. Computational Statistics and Data Analysis, 152, 107040. (doi: 10.1016/j.csda.2020.107040)
2019
Al Alawi, M., Ray, S. and Gupta, M. (2019) A New Framework for Distance-based Functional Clustering. In: 34th International Workshop on Statistical Modelling, Guimarães, Portugal, 07-12 Jul 2019.
2017
Che Roos, N. A., Alsanosi, S. M., Alsieni, M. A., Gupta, M. and Padmanabhan, S. (2017) Antihypertensive Drugs and Risk of Cancer: A Systematic Review and Meta-Analysis of 391,790 Patients. The American Heart Association's Hypertension 2017 Scientific Sessions, San Francisco, CA, USA, 14-17 Sep 2017. (doi: 10.1161/hyp.70.suppl_1.p129)
2015
Moser, C. B., Gupta, M. , Archer, B. N. and White, L. F. (2015) The impact of prior information on estimates of disease transmissibility using Bayesian tools. PLoS ONE, 10(3), e0118762. (doi: 10.1371/journal.pone.0118762) (PMID:25793993) (PMCID:PMC4368801)
2014
Bis, J. C. et al. (2014) Associations of NINJ2 sequence variants with incident ischemic stroke in the Cohorts for Heart and Aging in Genomic Epidemiology (CHARGE) consortium. PLoS ONE, 9(6), e99798. (doi: 10.1371/journal.pone.0099798) (PMID:24959832) (PMCID:PMC4069013)
Lin, H. et al. (2014) Strategies to design and analyze targeted sequencing data: cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium targeted sequencing study. Circulation: Cardiovascular Genetics, 7(3), pp. 335-343. (doi: 10.1161/CIRCGENETICS.113.000350)
Gupta, M. (2014) An evolutionary Monte Carlo algorithm for Bayesian block clustering of data matrices. Computational Statistics and Data Analysis, 71, pp. 375-391. (doi: 10.1016/j.csda.2013.07.006)
Lin, H. et al. (2014) Targeted sequencing in candidate genes for atrial fibrillation: the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) targeted sequencing study. Heart Rhythm, 11(3), pp. 452-457. (doi: 10.1016/j.hrthm.2013.11.012)
2013
Chanialidis, C., Craigmile, P., Davies, V., Dean, N., Evers, L., Filippone, M., Gupta, M., Ray, S. and Rogers, S. (2013) Discussion of Hennig and Liao: How to find an appropriate clustering for mixed type variables with application to socio-economic stratification. Journal of the Royal Statistical Society: Series C. 62, 309-369. Discussion Paper. Springer. (doi: 10.1111/j.1467-9876.2012.01066.x)
Gelfond, J.A., Ibrahim, J.G., Gupta, M., Cheng, M.-H. and Cody, J.D. (2013) Differential expression analysis with global network adjustment. BMC Bioinformatics, 14(258). (doi: 10.1186/1471-2105-14-258)
2012
Gupta, M. and Ray, S. (2012) Sequence pattern discovery with applications to understanding gene regulation and vaccine design. In: Rao, C.R., Chakraborty, R. and Sen, P.K. (eds.) Handbook of Statistics. Elsevier Press.
Moser, C. and Gupta, M. (2012) A generalized hidden Markov model for determining sequence-based predictors of nucleosome positioning. Statistical Applications in Genetics and Molecular Biology, 11(2), Art. 2.
Hendricks, A.E., Dupuis, J., Gupta, M. , Logue, M.W. and Lunetta, K.L. (2012) A comparison of gene region simulation methods. PLoS ONE, 7(7), e40925. (doi: 10.1371/journal.pone.0040925) (PMID:22815869) (PMCID:PMC3399793)
2011
Gupta, M. , Cheung, C.-L., Hsu, Y.-H., Demissie, S., Cupples, L.A., Kiel, D.P. and Karasik, D. (2011) Identification of homogeneous genetic architecture of multiple genetically correlated traits by block clustering of genome-wide associations. Journal of Bone and Mineral Research, 26(6), pp. 1261-1271. (doi: 10.1002/jbmr.333)
Mitra, R. and Gupta, M. (2011) A continuous-index Bayesian hidden Markov model for prediction of nucleosome positioning in genomic DNA. Biostatistics, 12(3), pp. 462-477. (doi: 10.1093/biostatistics/kxq077)
2010
Meltzer, M., Long, K., Nie, Y., Gupta, M. , Yang, J. and Montano, M. (2010) The RNA editor gene ADAR1 is induced in myoblasts by inflammatory ligands and buffers stress response. Clinical and Translational Science, 3(3), pp. 73-80. (doi: 10.1111/j.1752-8062.2010.00199.x)
2009
Gelfond, J.A.L., Gupta, M. and Ibrahim, J.G. (2009) A Bayesian hidden Markov model for motif discovery through joint modeling of genomic sequence and ChIP-chip data. Biometrics, 65(4), pp. 1087-1095. (doi: 10.1111/j.1541-0420.2008.01180.x)
Gupta, M. (2009) Model selection and sensitivity analysis for sequence pattern models. Institute of Mathematical Statistics Collections, 1(1), pp. 390-407.
Cheng, F., Hartmann, S., Gupta, M. , Ibrahim, J.G. and Vision, T.J. (2009) A hierarchical model for incomplete alignments in phylogenetic inference. Bioinformatics, 25(5), pp. 592-598. (doi: 10.1093/bioinformatics/btp015)
Gupta, M. and Ibrahim, J.G. (2009) An information matrix prior for Bayesian analysis in generalized linear models with high dimensional data. Statistica Sinica, 19(4), pp. 1641-1663.
Zhou, Q. and Gupta, M. (2009) Regulatory motif discovery: from decoding to meta-analysis. In: Fan, J., Lin, X. and Liu, J.S. (eds.) New Developments in Biostatistics and Bioinformatics. Series: Frontiers of Statistics (1). World Scientific, pp. 179-208. ISBN 9789812837431 (doi: 10.1142/9789812837448_0008)
2008
Jeong, Y.-C., Walker, N.J., Burgin, D.E., Kissling, G., Gupta, M. , Kupper, L., Birnbaum, L.S. and Swenberg, J.A. (2008) Accumulation of M1dG DNA adducts after chronic exposure to PCBs, but not from acute exposure to polychlorinated aromatic hydrocarbons. Free Radical Biology and Medicine, 45(5), pp. 585-591. (doi: 10.1016/j.freeradbiomed.2008.04.043)
2007
Gupta, M. , Qu, P. and Ibrahim, J.G. (2007) A temporal hidden Markov regression model for the analysis of gene regulatory networks. Biostatistics, 8(4), pp. 805-820. (doi: 10.1093/biostatistics/kxm007)
Gupta, M. and Ibrahim, J.G. (2007) Variable selection in regression mixture modeling for the discovery of gene regulatory networks. Journal of the American Statistical Association, 102(479), pp. 867-880. (doi: 10.1198/016214507000000068)
Gupta, M. (2007) Generalized hierarchical Markov models for the discovery of length-constrained sequence features from genome tiling arrays. Biometrics, 63(3), pp. 797-805. (doi: 10.1111/j.1541-0420.2007.00760.x)
Maki, A., Kono, H., Gupta, M. , Asakawa, M., Suzuki, T., Matsuda, M., Fujii, H. and Rusyn, I. (2007) Predictive power of biomarkers of oxidative stress and inflammation in patients with hepatitis C virus-associated hepatocellular carcinoma. Annals of Surgical Oncology, 14(3), pp. 1182-1190. (doi: 10.1245/s10434-006-9049-1)
2006
Giresi, P.G., Gupta, M. and Lieb, J.D. (2006) Regulation of nucleosome stability as a mediator of chromatin function. Current Opinion in Genetics and Development, 16(2), pp. 171-176. (doi: 10.1016/j.gde.2006.02.003)
Gupta, M. and Liu, J.S. (2006) Bayesian modeling and inference for motif discovery. In: Do, K.-A., Müller, P. and Vannucci, M. (eds.) Bayesian Inference for Gene Expression and Proteomics. Cambridge University Press: Cambridge, UK. ISBN 9780521860925
2005
Altman, N., Banks, D., Hardwick, J., Roeder, K., Craigmile, P.F., Hardin, J. and Gupta, M. (2005) The IMS New Researchers' Survival Guide. The Institute of Mathematical Statistics.
Gupta, M. and Liu, J.S. (2005) De novo cis-regulatory module elicitation for eukaryotic genomes. Proceedings of the National Academy of Sciences of the United States of America, 102(20), pp. 7079-7084. (doi: 10.1073/pnas.0408743102)
2004
Gupta, M. and Liu, J.S. (2004) Discussions on "A Bayesian Approach to DNA Sequence Segmentation". Biometrics, 60(3), pp. 582-583. (doi: 10.1111/j.0006-341X.2004.206_3.x)
2003
Gupta, M. and Liu, J.S. (2003) Discovery of conserved sequence patterns using a stochastic dictionary model. Journal of the American Statistical Association, 98(461), pp. 55-66. (doi: 10.1198/016214503388619094)
2002
Liu, J.S., Gupta, M. , Liu, X.L., Mayerhofer, L. and Lawrence, C.L. (2002) Statistical models for biological sequence motif discovery. In: Case Studies in Bayesian Statistics. Series: Lecture Notes in Statistics, 6 (167). Springer.
Supervision
- Coats, Aaron
  Bayesian variable selection for genetic and genomic studies
- Kettlewell, Toby
  Bayesian statistical modelling of high throughput transcriptomic data
Past PhD supervision:
Tushar Ghosh (joint with V Macaulay): Hierarchical hidden Markov models with applications to bisulfite sequencing data.
Maryam Al Alawi (joint with S Ray): Spectral clustering and downsampling-based model selection for functional data.
Suzy Whoriskey (joint with V Macaulay): The effect on inferences of population size of the sampling scheme for intraspecific DNA sequences.