🧬 Bash/Python/ETL/Regression – Transcriptome-Wide Association Studies for Finding Genes Associated with IBD

This was my capstone project for my B.S. in Data Science at UC San Diego, completed with 3 classmates. We used transcriptome-wide association studies (TWAS) to identify genes associated with inflammatory bowel disease (IBD), drawing on several kinds of genetic data, bioinformatics tools, and simple linear regression. In the end, we identified 7 genes that are highly associated with IBD. Techniques like this have implications for preventive healthcare, where they could help diagnose individuals before the onset of symptoms or disease.
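As a rough illustration of the regression step, the sketch below fits a simple linear regression of a trait on each gene's expression and ranks genes by the strength of the association. It is not the project's actual pipeline: the data are synthetic, the gene names (`GENE_0` to `GENE_4`) and the helper functions `simple_linear_regression` and `rank_genes` are hypothetical, and in a real TWAS the regressor would typically be genetically predicted expression rather than random values.

```python
import math
import random


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, t_statistic) for the slope.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


def rank_genes(expression, trait):
    """Regress the trait on each gene separately; sort by |t|, largest first."""
    results = []
    for gene, values in expression.items():
        slope, t_stat = simple_linear_regression(values, trait)
        results.append((gene, slope, t_stat))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Synthetic data (assumption): 5 made-up genes, 200 samples.
rng = random.Random(0)
n_samples = 200
expression = {
    f"GENE_{i}": [rng.gauss(0, 1) for _ in range(n_samples)] for i in range(5)
}
# The trait is simulated to depend on GENE_2 only, plus noise.
trait = [0.8 * g + rng.gauss(0, 1) for g in expression["GENE_2"]]

ranked = rank_genes(expression, trait)
for gene, slope, t_stat in ranked:
    print(f"{gene}: slope={slope:.3f}, t={t_stat:.2f}")
```

Because the trait is simulated from `GENE_2`, that gene should come out on top. With real data, the t-statistics would be converted to p-values and corrected for the number of genes tested before any gene is called associated.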

View this website for a summary of the data, methods, and results of the project; the report for full details on our work and findings; and this repository for all the code, including instructions for reproducing the project.

See below or this link to the PDF for the project poster, which gives an overview and visualizations of our work and findings.