Data Engineer

Fold Health

Software Engineering, Data Science
Maharashtra, India · India · Marul haveli, Maharashtra, India · Haveli, Punjab, India
Posted on Sep 3, 2025
  • Design, build, and maintain scalable batch and streaming pipelines using Google Cloud Dataflow, Google Cloud Datastream, Airbyte, and orchestration tools (Airflow/Prefect/Dagster).
  • Develop and optimize ETL/ELT processes across AWS Postgres, Google FHIR Store, and Google BigQuery.
  • Build and maintain unified data models that integrate multiple healthcare data sources (EHR/FHIR, claims/X12, ADT/HL7, CRM, transactional Postgres, Tuva, and third-party APIs).
  • Implement dbt/OBT transformations to create curated semantic layers for AI/LLM, BI, and predictive analytics.
  • Ensure data quality, lineage, validation, and governance while maintaining HIPAA compliance and PHI/PII security.
  • Collaborate with AI/ML engineers, BI developers, and product teams to enable data-driven features, dashboards, and predictive models.
  • Implement monitoring, anomaly detection, and pipeline optimization for performance, reliability, and cost efficiency.
  • Participate in architecture discussions, code reviews, and mentoring, setting best practices for data engineering.



    Requirements

    • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
    • 5+ years of hands-on experience in data engineering (cloud-native environments preferred).
    • Strong proficiency in SQL and Python for data engineering workflows.
    • Proven experience with:
      • Google Cloud Dataflow, Datastream, or Airbyte (streaming & ingestion pipelines)
      • AWS Postgres (transactional & analytical use cases)
      • Google FHIR Store (FHIR APIs & EHR ingestion)
      • Google BigQuery (large-scale data warehouse/lakehouse)
      • dbt/OBT for transformations and modeling
    • Experience designing and maintaining unified data models from heterogeneous healthcare data sources.
    • Familiarity with healthcare data standards: HL7, FHIR, X12 EDI (837/835), ICD, SNOMED, ADT, CCD/CCDA.
    • Experience ensuring HIPAA-compliant data handling, governance, and observability.
    • Knowledge of data modeling (star/snowflake schemas) and cloud-native architectures (AWS/GCP).

    Good to Have

    • Experience with healthcare standardization frameworks.
    • Building feature stores for AI/ML pipelines.
    • Familiarity with real-time streaming.
    • Hands-on experience with Power BI or other BI tools for analytics enablement.
    • Prior work in Value-Based Care, ACOs, MSOs, or population health environments.
    • Ability to mentor junior engineers and establish team-wide best practices.



    Benefits

    • Build impactful data pipelines powering AI, LLMs, and BI in healthcare.
    • Work on meaningful problems that directly improve patient outcomes and provider efficiency.
    • Be part of a growing tech-first healthcare company with strong domain expertise.
    • Collaborative culture with AI, data, product, and engineering teams.
    • Competitive compensation, benefits, and career growth opportunities.