Role Overview
We are seeking an experienced Senior Network Site Reliability Engineer (SRE) to join our team. The Senior Network SRE will lead the work of ensuring the reliability, availability, and performance of our network infrastructure. This position calls for a strategic thinker who can collaborate with cross-functional teams to implement best practices, drive process automation, and strengthen system resilience.
Role Responsibilities
- Architect and implement scalable, reliable, and secure on-premise and cloud network infrastructure solutions.
- Oversee continuous monitoring of network performance and traffic patterns to maintain high availability and low latency.
- Leverage advanced monitoring tools and dashboards to detect, analyse, and resolve network anomalies.
- Proactively identify, assess, and mitigate risks impacting network reliability.
- Develop and refine automation scripts and tools that streamline network provisioning, configuration, and incident resolution, minimising manual intervention.
- Collaborate with cross-functional development and operations teams to ensure seamless integration and deployment of network services.
- Establish and manage comprehensive monitoring, logging, and alerting systems to proactively identify and address potential network issues.
- Continuously improve network performance, security, and scalability through regular assessments and optimizations.
- Troubleshoot complex network issues, working cross-functionally to identify root causes and provide lasting solutions.
- Analyse and forecast network growth, collaborating with the team to scale infrastructure in line with company needs.
- Maintain comprehensive documentation for network architecture, processes, and troubleshooting guides.
Experience / Competences
Essential
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- Strong understanding of network protocols (TCP/IP, BGP, OSPF) and network security practices.
- Strong knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and networking technologies (e.g., VPN, DNS, load balancing).
- Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Ansible, Terraform).
- Experience with network monitoring and logging tools (e.g., ThousandEyes, Prometheus, Grafana, ELK Stack).
- Excellent troubleshooting skills, with the ability to resolve complex network issues quickly and to work well under pressure.
- Strong collaboration skills, with the ability to communicate effectively across teams.
Desired
- Cisco, Fortinet, or F5 certifications, or equivalent.
- Cloud certifications such as AWS Certified Advanced Networking - Specialty, Microsoft Certified: Azure Network Engineer Associate, or equivalent.
- Experience with DevOps practices and CI/CD pipelines.
- Knowledge of SRE principles.
Job Band & Level
#LI-Hybrid #LI-MID