[BetterPic] Senior DevOps Engineer
We're looking for a Senior DevOps Engineer (Kubernetes, GPU Optimization, AWS)
About the job
BetterGroup (the team behind BetterPic and BetterStudio) is a profitable AI startup that is scaling fast: millions in revenue, millions of requests, and growing GPU workloads across multiple products. On BetterPic, our AI makes professional headshots accessible to anyone: fast, affordable, and 4K studio-quality. Now we’re expanding into B2B, building high-impact workflows for teams and partners. On BetterStudio, we’re solving photoshoots for fashion ecommerce.
We’re looking for a Senior DevOps Engineer to architect, scale, and optimize our infrastructure across AI workloads, containerized systems, and cloud deployments. You’ll work closely with the product, AI, and backend teams to ensure our infrastructure is secure, cost-efficient, and ready to scale 10x.
We value your time, so here’s what matters…
Team Perks

Equity Opportunities
Senior profiles receive stock options, aligning your success with ours.

Inclusive Culture
Join a diverse team that values every voice and perspective.

Unlimited Vacations
Choose when and how long to take time off, with trust in your responsibility.

Remote-First
Enjoy the flexibility of working from anywhere in the world.

Growth Opportunities
Clear paths for career advancement and professional development.

Innovative Work
Be at the forefront of AI technology in professional imaging. We're solving a problem that has never been solved before.
Meet your Team
All headshots generated using BetterPic ❤️

Miguel Rasero
CTO & Co-Founder
Larnaca, Cyprus

Fedor Korol
Founding AI Engineer
Podgorica, Montenegro

Goke Fadare
ComfyUI Expert
Georgia, USA
About you
Your Impact:
Design and manage Kubernetes clusters on AWS
Build pull-based task systems for containerized AI jobs
Scale GPU containers across providers with optimal VRAM and resource sharing
Set up and maintain queues (e.g., Redis, RabbitMQ, custom pipelines)
Architect disaster recovery plans and monitoring/alerting strategies
Manage IaC (Terraform/CDK), automate deploys, enforce CI/CD best practices
Build and maintain FinOps dashboards for spend transparency and optimization
Work on compliance infrastructure (SOC 2, ISO 27001) if/when needed
What You’ll Tackle
Multi-process GPU containers that don’t crash on VRAM overload
Real-time autoscaling based on job queue, GPU load, and user usage
Cost-efficient cloud orchestration: hybrid models and spot instance fleets
Long-term infrastructure planning (multi-region, backup, failover, etc.)
Containerized AI inference at scale: fast, lean, and observable
Security and compliance automation for scaling B2B and enterprise
What You Need
Must-Have:
5+ years in DevOps, Infra, or SRE roles
Deep experience with Kubernetes, Docker, and Cloud (AWS)
Experience scaling systems to millions of requests or users
Strong expertise in databases, queues, distributed systems, multiprocessing, IPC, and async and concurrent systems, with a focus on performance and reliability
You’ve built disaster recovery plans and long-term infra strategy before
You’ve optimized GPU-heavy systems (VRAM usage, container orchestration, etc.)
Bonus Points:
Experience with SOC 2 / ISO 27001 audits or tooling
Background in FinOps and cost optimization at scale
Familiarity with AI/ML workflows (ComfyUI, Diffusers, PyTorch, etc.)
Tools like Terraform, Helm, Prometheus, Grafana, ArgoCD, etc.
Past experience working with AI infra (inference + training environments)
Our Stack
Kubernetes (AWS EKS), Docker, Terraform
Node.js/Express, PostgreSQL, Redis
AWS (S3, EC2, Lambda, SES), Cloudflare, Render
Monitoring: Datadog, CloudWatch
Application Process
1. Fill out this form
2. [Take Home] Coding Challenge (1-3 hours)
2.1. Small development challenge
2.2. Brief [5 minutes max] Loom video explaining functionality, implementation and design decisions
3. Interview with the CTO (30 minutes)
4. Interview with the rest of the management team (30 minutes)
5. Offer from BetterPic
Final Notes
This is not an L2 cloud engineer role. You’ll own infrastructure decisions and shape how we scale AI systems, serve millions of users, and optimize every dollar spent on GPUs.
If that excites you, let’s talk!
Ready to build what’s never been built?