[BetterPic] Senior DevOps Engineer


We're looking for a Senior DevOps Engineer (Kubernetes, GPU Optimization, AWS)

About the job

BetterGroup (the team behind BetterPic and BetterStudio) is a profitable AI startup that is scaling fast, with millions in revenue, millions of requests, and growing GPU workloads across multiple products. On BetterPic, our AI makes professional headshots accessible to anyone: fast, affordable, and 4K studio-quality. Now we’re expanding into B2B, building high-impact workflows for teams and partners. On BetterStudio, we’re solving photoshoots for fashion ecommerce.

We’re looking for a Senior DevOps Engineer to architect, scale, and optimize our infrastructure across AI workloads, containerized systems, and cloud deployments. You’ll work closely with product, AI, and backend teams to ensure our infrastructure is secure, cost-efficient, and ready to scale 10x.

We value your time, so here’s what matters…

Team Perks

Equity Opportunities

Senior hires receive stock options, aligning your success with ours.

Inclusive Culture

Join a diverse team that values every voice and perspective.

Unlimited Vacation

Choose when and for how long you take time off. We trust you to manage it responsibly.

Remote-First

Enjoy the flexibility of working from anywhere in the world.

Growth Opportunities

Clear paths for career advancement and professional development.

Innovative Work

Be at the forefront of AI technology in professional imaging. We're solving a problem that has never been solved before.

Meet your Team

All headshots generated using BetterPic ❤️

Miguel Rasero

CTO & Co-Founder
Larnaca, Cyprus

Fedor Korol

Founding AI Engineer
Podgorica, Montenegro

Goke Fadare

ComfyUI Expert
Georgia, USA

About you

Your Impact:

  • Design and manage Kubernetes clusters across AWS

  • Build pull-based task systems for containerized AI jobs

  • Scale GPU containers across providers with optimal VRAM and resource sharing

  • Set up and maintain queues (e.g., Redis, RabbitMQ, custom pipelines)

  • Architect disaster recovery plans and monitoring/alerting strategies

  • Manage IaC (Terraform/CDK), automate deploys, enforce CI/CD best practices

  • Build and maintain FinOps dashboards for spend transparency and optimization

  • Work on compliance infrastructure (SOC2, ISO27001) if/when needed


What You’ll Tackle

  • Multi-process GPU containers that don’t crash on VRAM overload

  • Real-time autoscaling based on job queue depth, GPU load, and user activity

  • Cost-efficient cloud orchestration: hybrid models and spot instance fleets

  • Long-term infrastructure planning (multi-region, backup, failover, etc.)

  • Containerized AI inference at scale: fast, lean, and observable

  • Security and compliance automation for scaling B2B and enterprise

What You Need

Must-Have:

  • 5+ years in DevOps, Infra, or SRE roles

  • Deep experience with Kubernetes, Docker, and Cloud (AWS)

  • Experience scaling systems to millions of requests or users

  • Strong expertise in databases, queues, distributed systems, multiprocessing, IPC, and asynchronous and concurrent systems, with a focus on performance and reliability

  • You’ve built disaster recovery plans and long-term infra strategy before

  • You’ve optimized GPU-heavy systems (VRAM usage, container orchestration, etc.)

Bonus Points:

  • Experience with SOC2 / ISO27001 audits or tooling

  • Background in FinOps and cost optimization at scale

  • Familiarity with AI/ML workflows (ComfyUI, Diffusers, PyTorch, etc.)

  • Experience with tools like Terraform, Helm, Prometheus, Grafana, and ArgoCD

  • Past experience working with AI infra (inference + training environments)

Our Stack

  • Kubernetes (AWS EKS), Docker, Terraform

  • Node.js/Express, PostgreSQL, Redis

  • AWS (S3, EC2, Lambda, SES), Cloudflare, Render

  • Monitoring: Datadog, CloudWatch

Application Process

1. Fill out this form

2. [Take Home] Coding Challenge (1-3 hours)

  • 2.1. Small development challenge

  • 2.2. Brief [5 minutes max] Loom video explaining functionality, implementation and design decisions

3. Interview with the CTO (30 minutes)

4. Interview with the rest of the management team (30 minutes)

5. Offer from BetterPic

Final Notes

This is not an L2 cloud engineer role. You’ll own infrastructure decisions and shape how we scale AI systems, serve millions of users, and optimize every dollar spent on GPUs.

If that excites you, let’s talk!

Ready to build what’s never been built?

Time to join a winning team

BetterPic was founded with a vision to democratize professional imagery. Our diverse, global team brings together expertise from various fields to create innovative AI-powered solutions for high-quality, personalized headshots.

From our CEO's track record in building and exiting successful startups to our CTO's experience in scaling technologies across AI, gaming, and job search platforms, we leverage our collective knowledge to push the boundaries of what's possible in AI and imaging technology.
