
Sing-Hoi Sze

Associate Professor of Computer Science and Engineering and of Biochemistry and Biophysics
HRBB / Room 328B
Undergraduate Education
B.Sc. Chinese University of Hong Kong (1990)
Graduate Education
M.S. Pennsylvania State University (1995)
Ph.D. University of Southern California (2000)
Postdoc. University of California-San Diego (2001-02)
Joined Texas A&M in 2002

Bioinformatics / Computational Biology

Our work focuses on the application of various computer science techniques to solve computational problems in molecular biology. Our current research projects cover diverse areas in bioinformatics, including motif finding algorithms and their applications, computational approaches to model transcription factor binding sites, and algorithms for EST sequence assembly and enumeration of alternatively spliced variants of a gene.

The motif finding problem can be formulated as follows: given a set of sequences, find a pattern (motif) shared by these sequences. The major biological application of this computational problem is to identify transcription factor binding sites given a set of upstream sequences of genes that are believed to be co-regulated. Existing motif finding approaches usually make simplifying assumptions in modeling these sites, and we are on a constant quest to develop better models. Recently, we have been working with a few groups of biologists to design experiments that verify our predictions.
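As a rough illustration of the problem statement only, the sketch below finds every length-k substring that occurs exactly in all input sequences. This is a deliberately naive exact-match version, not the models developed in our group; the function name `shared_kmers` and the toy sequences are hypothetical.

```python
def shared_kmers(sequences, k):
    """Return the set of length-k substrings that occur in every sequence."""
    def kmers(seq):
        # All substrings of length k in one sequence.
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    if not sequences:
        return set()
    common = kmers(sequences[0])
    for seq in sequences[1:]:
        # Keep only the k-mers that also appear in this sequence.
        common &= kmers(seq)
    return common


# Toy "upstream sequences" (made up for illustration).
upstream = ["ACGTTGACA", "TTGACAGG", "CCTTGACAT"]
motifs = shared_kmers(upstream, 6)
```

Real binding sites tolerate mismatches and variable positions, which is why exact matching like this is only a starting point for the formulation.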

Another active research project is the identification of alternatively spliced variants of a gene from EST sequences. The traditional approach to this problem is to assemble EST sequences that represent fragments of a gene into a longer linear sequence which represents the most dominant form of the gene. To better model the splicing structure, we develop an algorithm to assemble the given set of EST sequences into a non-linear graph structure, so that each alternatively spliced variant of a gene is represented as a path in the graph.
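The idea that each variant corresponds to a path can be sketched as follows, assuming the graph has already been built and is acyclic. The node labels, the dictionary representation, and the function name `enumerate_variants` are hypothetical and are not taken from our assembly algorithm.

```python
def enumerate_variants(graph, start, end):
    """Yield every path from start to end in a directed acyclic graph."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        for nxt in graph.get(node, []):
            # Extend the current path by one edge.
            stack.append((nxt, path + [nxt]))


# Toy graph: exon2 is either included or skipped.
splice_graph = {
    "exon1": ["exon2", "exon3"],
    "exon2": ["exon3"],
    "exon3": [],
}
variants = sorted(enumerate_variants(splice_graph, "exon1", "exon3"))
```

Here the two paths correspond to the variant that includes exon2 and the variant that skips it. The number of paths can grow quickly with the number of branch points, so exhaustive enumeration like this is only practical for small graphs.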

Recent Publications

  1. Qiu, C, Jin, H, Vvedenskaya, I, Llenas, JA, Zhao, T, Malik, I et al. Universal promoter scanning by Pol II during transcription initiation in Saccharomyces cerevisiae. Genome Biol. 2020;21 (1):132.
    doi: 10.1186/s13059-020-02040-0. PubMed PMID:32487207.

  2. Zhu, Z, Rehman, KU, Yu, Y, Liu, X, Wang, H, Tomberlin, JK et al. De novo transcriptome sequencing and analysis revealed the molecular basis of rapid fat accumulation by black soldier fly (Hermetia illucens, L.) for development of insectival biodiesel. Biotechnol Biofuels. 2019;12:194.
    doi: 10.1186/s13068-019-1531-7. PubMed PMID:31413730. PubMed Central PMC6688347.

  3. Pimsler, ML, Sze, SH, Saenz, S, Fu, S, Tomberlin, JK, Tarone, AM et al. Gene expression correlates of facultative predation in the blow fly Chrysomya rufifacies (Diptera: Calliphoridae). Ecol Evol. 2019;9 (15):8690-8701.
    doi: 10.1002/ece3.5413. PubMed PMID:31410272. PubMed Central PMC6686648.

  4. Fu, S, Chang, PL, Friesen, ML, Teakle, NL, Tarone, AM, Sze, SH et al. Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus. BMC Genomics. 2019;20 (Suppl 5):425.
    doi: 10.1186/s12864-019-5702-5. PubMed PMID:31167652. PubMed Central PMC6551239.

  5. Zhang, Y, Pechal, JL, Schmidt, CJ, Jordan, HR, Wang, WW, Benbow, ME et al. Machine learning performance in a microbial molecular autopsy context: A cross-sectional postmortem human population study. PLoS ONE. 2019;14 (4):e0213829.
    doi: 10.1371/journal.pone.0213829. PubMed PMID:30986212. PubMed Central PMC6464165.

  6. Zhang, M, Cui, Y, Liu, YH, Xu, W, Sze, SH, Murray, SC et al. Accurate prediction of maize grain yield using its contributing genes for gene-based breeding. Genomics. 2020;112 (1):225-236.
    doi: 10.1016/j.ygeno.2019.02.001. PubMed PMID:30826444.

  7. Song, JM, Arif, M, Zhang, M, Sze, SH, Zhang, HB. Phenotypic and molecular dissection of grain quality using the USDA rice mini-core collection. Food Chem. 2019;284:312-322.
    doi: 10.1016/j.foodchem.2019.01.009. PubMed PMID:30744863.

  8. Qiu, C, Erinne, OC, Dave, JM, Cui, P, Jin, H, Muthukrishnan, N et al. Correction: High-Resolution Phenotypic Landscape of the RNA Polymerase II Trigger Loop. PLoS Genet. 2018;14 (1):e1007158.
    doi: 10.1371/journal.pgen.1007158. PubMed PMID:29298339. PubMed Central PMC5751974.

  9. Sze, SH, Parrott, JJ, Tarone, AM. A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms. BMC Genomics. 2017;18 (Suppl 10):895.
    doi: 10.1186/s12864-017-4270-9. PubMed PMID:29244008. PubMed Central PMC5731495.

  10. Sze, SH, Pimsler, ML, Tomberlin, JK, Jones, CD, Tarone, AM. A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms. BMC Genomics. 2017;18 (Suppl 4):387.
    doi: 10.1186/s12864-017-3735-1. PubMed PMID:28589866. PubMed Central PMC5461550.
