Michael Feldman

Research Software Engineer · Data Infrastructure · Open Science

I build the pipelines, databases, and architectures that make research data usable at scale. My work sits at the intersection of scientific computing, data engineering, and open-source community building.

Previously RSE at Stanford University (METER-AI / Andrew Ng, Doerr School of Sustainability, SRCC), where I architected cloud data pipelines processing 14M+ records from 200+ sources and led the open-sourcing of a methane emissions dataset now used by Planet and CarbonMapper for climate mitigation. Before that, nearly three years at UW-Madison Radiology building serverless containerized tools for neuroimaging data management (BIDS, NIfTI, DICOM, Flywheel.io) and GB-range deidentification pipelines under HIPAA/IRB compliance.

Currently exploring open neuroscience infrastructure — studying how platforms like brainlife.io and FreeSurfer are architected, and how standards like BIDS and NWB govern the flow of data from scanner to archive.

Areas of Interest

  • Neuroimaging & neurophysiology data — BIDS, NIfTI, DICOM, NWB, EEG pipelines, deidentification
  • Data versioning & reproducibility — Git internals, containerized workflows, CI/CD for research
  • Graph-based data models — Neo4j, provenance modeling (PROV), brain connectivity, cross-archive metadata linking
  • Agentic AI for science — LLM-powered development workflows, automated data validation, standards compliance
  • Cloud data engineering — GCP (certified), AWS, BigQuery, Terraform, Docker, Kubernetes

Open Source

  • MassGen1 — Framework for AI-augmented workflows
  • Contributions and explorations in neuroinformatics tooling, including study of the DataLad, DANDI, and HeuDiConv ecosystems

Certifications

NVIDIA Certified Professional: Agentic AI (2026) · Google Cloud Professional Data Engineer (2025) · Neo4j Certified Professional (2025) · Neo4j Graph Data Science (2025) · Google Cloud Professional ML Engineer (2020) · Google Cloud Professional Cloud Architect (2018, 2020)

Education

M.S. Applied Statistics, Penn State University (GPA 3.77) · B.S. Finance, Minor in Statistics, Penn State University

Publications & Presentations

  • Stanford SRCC, 2023 — METER-AI: BigQuery pipelines, serverless architecture, SAM (Segment Anything Model) integration
  • SIIM, 2019 — Healthcare interoperability, serverless functions, and big data analytics

Interested in open-source neuroscience infrastructure, reproducible science, and building tools that make research data FAIR and accessible. Always looking to connect with people working on these problems.
