• Own deployment, reliability, performance, and cloud infrastructure for FDA-approved, AI-powered medical devices in a HIPAA-regulated environment, partnering with AI and software teams to operationalize ML systems in production.
• Provisioned large-scale AI training infrastructure on GCP, including a 256-GPU NVIDIA H100 Slurm cluster with high-performance shared storage, enabling distributed training and removing checkpointing bottlenecks.
• Productionized LLM inference on AWS by packaging fine-tuned models as Docker/vLLM images, publishing 100 GB+ containers to Amazon ECR via GitHub Actions, and deploying to SageMaker.
• Led a Lambda-based inference initiative that improved scalability and reduced cost by 23% versus EC2.
• Built a production-grade GCP Dataflow pipeline to de-identify and catalog 1.67 PB of medical images in one month, reducing run costs by ~75% and enabling compliant training data preparation.
• Applied Amazon Bedrock Data Automation to extract text from image-based medical reports, improving access to unstructured clinical data for downstream AI and analytics workflows.
• Strengthened multi-cloud security for ML workloads by implementing workload identity federation, allowing EKS workloads to access GCP resources without long-lived credentials.
• Built centralized observability and alerting for critical production systems using CloudWatch dashboards, Okta-secured access, and PagerDuty integrations, improving operational visibility, incident response, and service reliability.
• Created reusable Terraform modules adopted across 20+ AWS accounts to standardize CloudWatch dashboards, GitHub Actions OIDC federation in IAM, AWS Lambda infrastructure, and AWS Verified Access endpoints.
• Led rollout of OIDC-based AWS authentication across CI/CD pipelines, eliminating static credentials and improving security for infrastructure delivery.
• Automated AWS operational workflows with Step Functions, Lambda, and Systems Manager to orchestrate tasks such as EC2 patching, improving security, standardization, and reliability in production environments.
• Extended Terraform-based CI/CD delivery across AWS and GCP, improving consistency and automation for multi-cloud infrastructure.
• Modernized the EKS-based infrastructure delivery platform, reducing cost by ~83% while improving maintainability and supportability for containerized workloads.
• Built immutable image pipelines with AWS CodeBuild and EC2 Image Builder to support reliable application releases.
• Installed and configured the PerformanceBridge radiology analytics platform and supporting systems across Linux and Windows Server environments for healthcare customers in multiple regions.
• Resolved complex customer concerns and technical issues through deep investigation, research, and reproduction.
• Automated recurring configuration tasks, created procedural documentation, and provided training to colleagues, improving operational consistency and onboarding.
• Supported product expansion into EMEA and APAC through client implementations and technical configuration for organizations including:
• Built high-performance cloud-based applications for biobanking and laboratory automation.
• Won the employee Key Strategy award for contributions to COVID-19 projects.
• Streamlined COVID-19 testing and reporting workflows, increasing efficiency and reducing errors for global biobanking processes.
• Designed a sample lineage interface for the UK NHS COVID-19 system, enabling the processing of hundreds of thousands of tests per week.
• Developed a sample normalization workflow for Curative Korva Labs, which conducted 20% of all tests in California, and integrated it with state and local public health departments for results reporting.
• Built Rails-based software for clients including Roche and the National Institutes of Health.
• Reduced sample turnaround times for multiple clients from weeks to hours.
• Designed integrations with hardware devices and external systems including:
• Designed and built analytics tools for the post-acute care industry.
• Built web scrapers to collect plan data from state health insurance exchanges.
• Rebuilt a Tableau-based product using open-source software and on-demand cloud solutions, lowering costs by 50%.
• Designed and developed a medical sample inventory application for an Android RFID reader.
• Integrated with hardware devices including: