Alperen Taciroğlu, Variant Impact Prediction in the Obscurin and Trio Protein Families Through Evolutionary Conservation and Structural Analysis

Ph.D. Candidate: Alperen Taciroğlu
Program: Health Informatics
Date: 30.05.2025 / 10:00
Place: A-212

Abstract: Obscurin and Trio protein families represent evolutionarily related proteins with crucial roles in muscle development and neuronal signalling, respectively. In this study, the evolutionary relationships between these protein families across vertebrate lineages were established through comprehensive phylogenetic analysis of four key domains – kinase, Dbl-homology (DH), CRAL/TRIO, and immunoglobulin domains. Findings support the hypothesis that ancestral Titin and Mcf2 proteins underwent homologous recombination to create the ancestral Obscurin and Trio protein families, with subsequent specialisation and lineage-specific adaptations. Sequence comparison of immunoglobulin domains revealed conserved N-terminal and C-terminal clusters across vertebrates, providing further evidence for this evolutionary model.

Building on these evolutionary insights, the focus was directed towards the N-terminal DH domain of Trio (TrioN), which harbours one-third of all reported pathogenic variants in human DH domains. Trio functions primarily in neuronal migration, axon guidance, and synapse formation, with mutations associated with neurodevelopmental disorders including intellectual disability and autism spectrum disorders.

TrioNsight was developed as a meta-predictor designed to assess mutation impacts for TrioN and 12 highly similar human DH domains. The tool uses a naïve Bayes classifier and draws on structural, evolutionary, and physicochemical features of approximately 1,500 TrioN-like DH domains from 294 species. TrioNsight outperforms existing predictors, including AlphaMissense, achieving a Matthews correlation coefficient (MCC) of 0.906. In addition, a variant impact map detailing mutation effects at each position was produced, which may be valuable for clinical assessments. This approach establishes a standardised workflow that can be adapted to create domain-specific variant predictors for other protein families, offering a template for improved variant interpretation.
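
For readers unfamiliar with the Matthews correlation coefficient cited above, the following is a minimal sketch of how it is computed from a binary confusion matrix (pathogenic vs. benign). The function name and the example counts are hypothetical illustrations; they are not taken from the thesis and do not reproduce its evaluation data.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    tp/tn: correctly predicted pathogenic/benign variants
    fp/fn: benign predicted pathogenic / pathogenic predicted benign
    Returns a value in [-1, 1]; 1 is perfect, 0 is chance level.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: return 0 when any row or column of the matrix is empty.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the thesis results).
    print(round(matthews_corrcoef(tp=45, tn=50, fp=2, fn=3), 3))
```

Unlike plain accuracy, this score stays informative when pathogenic and benign variants are unevenly represented, which is a common reason for reporting it in variant-effect benchmarks.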