Born to build what’s next. Together.

Looking to join a remote team of high performers? Eager to be truly supported by AI and guided by purpose?

DevOps Lead

Location: Remote/Hybrid (East Coast US Preferred)

Department: Engineering

Employment Type: Full-Time

Reports To: CTO

About Baryons

Baryons is building the next generation of AI-driven flourishing and conversation platforms — intelligent systems that help people grow, reflect, and thrive through meaningful dialogue. Our products combine voice, memory, and human insight to create interactions that feel natural, personal, and alive.

As a science-backed, patent-pending organization, we’re defining a new category of human-AI connection — one that’s grounded in psychology, neuroscience, and the science of human flourishing.

Role Overview

As DevOps Lead, you’ll architect, implement, and maintain the cloud infrastructure and automation that powers our AI-driven applications. You’ll be hands-on with Azure and Google Cloud (GCP) environments, focusing on Kubernetes orchestration, scalability, automation, cost optimization, and security. You’ll guide DevOps best practices, lead our CI/CD initiatives, and ensure our systems are secure, reliable, and built to scale. Experience with LiveKit and real-time agent infrastructure is a strong plus.

Responsibilities


  • Lead the design, deployment, and management of scalable, secure infrastructure in Azure and Google Cloud (GCP).

  • Architect and manage Kubernetes clusters, ensuring high availability, disaster recovery, and efficient orchestration of containerized workloads.

  • Build, configure, and maintain automation for infrastructure provisioning, application deployment, monitoring, and alerting, using tools such as Terraform and Helm.

  • Implement and refine CI/CD pipelines for all engineering teams, ensuring rapid, safe, and repeatable delivery of code and AI models.

  • Monitor and optimize cloud infrastructure for cost management and resource utilization.

  • Develop and maintain comprehensive observability and logging systems (metrics, logs, tracing) to enable real-time monitoring, alerting, and performance optimization.

  • Implement and enforce cloud security best practices, secrets management, and compliance protocols (SOC 2, HIPAA, or similar, as applicable).

  • Design and maintain disaster recovery, backup, and high-availability strategies for critical applications and data.

  • Champion DevOps best practices, automation, and a culture of ownership and operational excellence across the engineering organization.

  • Collaborate with software engineers, data scientists, and product leads to enable efficient development, deployment, and operation of AI- and voice-powered systems.

  • Ensure all infrastructure, automation, and deployment processes are well-documented and accessible.

  • Mentor and support junior DevOps and engineering team members.

  • Troubleshoot, resolve, and prevent production issues in a fast-paced environment.

Required Qualifications


  • 5+ years of experience in DevOps, Site Reliability Engineering, or related roles.

  • Deep expertise with Azure and Google Cloud (GCP) environments, including network, security, and storage services.

  • Proven experience architecting, deploying, and managing Kubernetes clusters in production environments.

  • Strong automation skills: Terraform (or equivalent IaC tools), Helm, and scripting languages (Python, Bash, etc.).

  • Demonstrated experience building and maintaining robust CI/CD pipelines (GitHub Actions, GitLab CI, or similar).

  • Solid understanding of cost optimization strategies for cloud-native applications.

  • Experience with monitoring, alerting, and observability (Prometheus, Grafana, Datadog, etc.).

  • Experience implementing security best practices and compliance protocols.

  • Experience designing and maintaining disaster recovery and high-availability solutions.

  • Excellent troubleshooting, communication, and collaboration skills.

Nice-to-Have


  • Experience with LiveKit and real-time agent infrastructure (LiveKit Agents, voice/video, WebRTC, etc.).

  • Background working with AI/ML model deployment and scaling in production.

  • Familiarity with additional clouds (AWS, OCI) or hybrid cloud architectures.

  • Prior experience in a startup or high-growth SaaS environment.

Join the Waitlist

Be among the first to practice flourishing with Baryons. Your Flourishing Partner is built to help you and your team live with clarity, lead with confidence, and grow with curiosity. Join the waitlist to gain early access and help shape the future of flourishing at work.
