We’re an award-winning global outsourcer providing contact center and back office services on behalf of our global clients. Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!
Acquire Intelligence exists to help businesses unlock smarter ways of working. We believe that by combining the best of people, process, and automation, companies can grow faster and operate with greater confidence. Our purpose is to remove complexity, improve performance, and drive intelligent transformation for organizations around the world.
As an Acquire Intelligence employee, your role is vital in achieving and exceeding individual and team targets that support company objectives, while building and maintaining stakeholder relationships. You’re also responsible for complying with and enforcing procedures aligned with our information security policies.
As a values-led organization, we expect all our team members to exemplify our four values: Curious and Clever, Entrepreneurial Energy, Fast with Intent, and Laugh and Learn.
A SNAPSHOT OF YOUR ROLE
Responsibilities of the Site Reliability Engineer include, but are not limited to:
Service Level Management & Reliability
- Define, monitor, and enforce Service Level Objectives (SLOs) and error budgets across all production systems
- Track error budget burn rates and make data-driven decisions to halt risky deployments when thresholds are exceeded
- Implement comprehensive monitoring and alerting strategies using Prometheus, Grafana, and PagerDuty
- Establish and maintain reliability standards that support business-critical uptime requirements
Infrastructure Automation & Management
- Design and implement Infrastructure as Code (IaC) solutions using Pulumi with TypeScript
- Manage and optimize AWS services including EKS (Elastic Kubernetes Service), MSK (Managed Streaming for Apache Kafka), SingleStore, MongoDB, and S3
- Automate operational processes to eliminate toil, targeting any task that consumes more than 2 engineer-days per quarter
Incident Response & Post-Mortem Leadership
- Serve as incident commander during production outages and service degradations
- Lead comprehensive post-mortem processes within 48 hours of incidents
- Drive "never-again" corrective actions to completion, ensuring systemic improvements
- Maintain and improve incident response procedures and runbooks
Security & Compliance
- Implement and enforce least-privilege IAM policies across all AWS resources
- Manage security patch pipelines and vulnerability remediation processes
- Support compliance initiatives including SOC 2 and ISO 27001 certification requirements
- Ensure security best practices are embedded in all infrastructure and operational procedures
On-Call & Operational Excellence
- Participate in follow-the-sun on-call rotation with one week primary/secondary commitment every five weeks
- Provide 24×7 support coverage across AU/NZ, EU/ZA, and MX time zones
- Maintain operational runbooks and knowledge