Oversaw the development of large-scale, multi-modal causal models in multiple disease areas,
including cancer, neurodegenerative disease, and cardiac disease. This included the
development of disease-specific multi-omic data pipelines.
Led the design and development of a flagship model-building pipeline that integrates an
open-source data science tool, DVC (data version control), and proprietary causal AI
software, REFS.
Provided HPC support for platform migration from StarCluster to AWS ParallelCluster.
Data Scientist, R&D
Aitia. Somerville, MA
Feb. 2020 - Feb. 2021
Constructed Bayesian network models using proprietary causal AI technology (REFS), exploiting
the recent explosive growth of multi-omics and clinical data to create “virtual”
(in silico) patients in oncology, neurodegeneration, and immunology.
Curated and preprocessed disease-specific datasets with tens of thousands of features
for model building.
Processed and modeled novel data types such as single-cell data, which are ~10,000× larger
than standard transcriptomic data.
Research Assistant, Guasto Lab
Tufts University. Medford, MA
2016 - 2021
Studied localized stretching structures of viscoelastic fluids in porous media via DNA
visualization techniques.
Using particle tracking and Lagrangian statistical tools, discovered that dispersion is
regulated by flow geometry in viscoelastic flows.
Designed and conducted Monte Carlo Langevin simulations of swimming cells in a viscosity
gradient to support experimental discoveries.
Discovered that disordering flow geometry affects the local flow type experienced by
viscoelastic fluids and hinders a critical flow instability responsible for chaotic velocity
fluctuations.
Studied invertebrate sperm flagellum buckling in a microfluidic extensional flow using
high-speed microscopy. Developed an in-house flagella-tracking algorithm to investigate
flagellum curvature.
Data Science Consultant
Gene Network Sciences Inc. Cambridge, MA
2020
Processed and vetted curated real-world financial and weather data to build causal and
predictive models using a proprietary causal machine learning platform (REFS).
Identified quality issues with customer-provided data by developing a bespoke outlier
detection algorithm, leading to customer modification of their internal data pipelines.
Using repeated and stratified cross-validation, demonstrated the value of integrating
customer data with multi-modal financial data for building predictive models and for
supporting go/no-go decisions on modeling.
Reported findings to clients and executives via intuitive graph visualizations (iGraph)
and presentations, describing technical methods to lay audiences.