DevOps Engineer
Who We’re Hiring:
We are seeking a highly skilled and versatile DevOps Engineer to join our dynamic team and play a critical role in formalizing and building out our cloud infrastructure and observability platforms. This is an exceptional opportunity to be a foundational member of our engineering team, directly impacting our ability to deliver innovative solutions in a secure and compliant environment.
You will be instrumental in designing, implementing, and managing our AWS-centric cloud footprint, with a strong emphasis on security and compliance aligned with NIST SP 800-171 and NIST SP 800-53 controls. You will build and optimize our Datadog observability platform, ensuring proactive monitoring and insights across our infrastructure. This role is crucial for supporting our growing portfolio of Kubernetes-based services, microservices architectures, and exciting initiatives in LLM Operations (LLM Ops).
If you are passionate about building robust, secure, and scalable infrastructure, thrive in a fast-paced environment, and are excited about the cutting edge of DevOps practices including LLM Ops, we want to hear from you!
Key Responsibilities:
- Cloud Infrastructure & Security (AWS Focus):
- Architect, build, and manage our primarily AWS-based cloud infrastructure, ensuring scalability, high availability, and cost-efficiency.
- Implement and maintain robust security controls in alignment with NIST SP 800-171 and NIST SP 800-53, ensuring compliance throughout the infrastructure lifecycle.
- Design and implement secure network configurations, access controls, and data protection strategies within AWS.
- Observability Platform (Datadog):
- Lead the build-out and management of our Datadog observability platform, integrating monitoring, logging, tracing, and alerting across all systems.
- Develop and implement comprehensive dashboards and alerts to proactively identify and resolve performance and security issues.
- Optimize Datadog configurations for cost-effectiveness and maximum visibility.
- CI/CD & Automation:
- Design, implement, and optimize CI/CD pipelines using modern DevOps tools to enable rapid and secure software delivery to our Kubernetes environments.
- Drive the adoption of Infrastructure-as-Code (IaC) using Terraform to automate infrastructure provisioning and management.
- Develop automation scripts and tools to streamline operational tasks, improve efficiency, and reduce manual errors.
- Kubernetes & Microservices Support:
- Manage and optimize our Kubernetes clusters in AWS (EKS), ensuring performance, security, and scalability for our microservices and applications.
- Collaborate with development teams to ensure smooth deployment and operation of microservices within Kubernetes.
- Implement best practices for containerization and orchestration.
- LLM Operations (LLM Ops):
- Participate in the development and operationalization of infrastructure to support LLM Ops initiatives, including model deployment, monitoring, and scaling.
- Explore and implement DevOps best practices for managing and optimizing LLM infrastructure.
- Contribute to the development of robust and scalable solutions for emerging AI/ML workloads.
- Collaboration & Best Practices:
- Collaborate closely with engineering, security, and product teams to align DevOps strategies with business objectives.
- Champion DevOps best practices across the organization, promoting automation, security, and efficiency.
- Identify and resolve bottlenecks in development and deployment processes.
Basic Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
- 5+ years of hands-on experience as a DevOps Engineer or Platform Engineer, with a strong focus on AWS.
- Proven expertise in building and managing infrastructure within AWS.
- Solid understanding of, and experience implementing, security controls aligned with frameworks such as NIST SP 800-171 and NIST SP 800-53.
- Proficiency with Infrastructure-as-Code tools (Terraform / OpenTofu).
- Strong experience with containerization (Docker) and orchestration (Kubernetes).
- Hands-on experience designing, implementing, and managing CI/CD pipelines.
- Experience with observability platforms; familiarity with Datadog is highly preferred.
- U.S. citizenship required.
Preferred Qualifications:
- Experience in a high-growth startup or B2B SaaS environment.
- Deep expertise in configuring and managing Datadog for comprehensive observability.
- Specific experience implementing and maintaining systems compliant with NIST SP 800-171 and NIST SP 800-53.
- Experience with service mesh frameworks (e.g., Istio, Linkerd).
- Advanced AWS certifications such as AWS Certified DevOps Engineer – Professional or AWS Certified Security – Specialty.
- Experience with LLM Ops or AI/ML infrastructure.
- Background in governance, risk, and compliance (GRC) and compliance audits (SOC 2, ISO 27001, etc.).
Location: Remote, with a preference for candidates in Minnesota or the Central Time Zone.
Why Join Us?
This is a unique opportunity to join a company at a pivotal stage of growth and be a foundational DevOps leader. You will have the chance to build our cloud infrastructure and observability platforms from the ground up, using the latest technologies and best practices. You will be directly involved in supporting cutting-edge initiatives in LLM Ops and ensuring our systems are secure and compliant to the highest standards. If you are looking for a role where you can make a significant impact, learn and grow rapidly, and work on exciting challenges, this is the perfect opportunity for you.