Site Reliability Engineer

Cisco Systems, Inc.
paid time off
United States, North Carolina, Cary
Apr 25, 2025
The application window is expected to close on: 04/30/2025. Job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

Meet the Team

We are a software engineering team building platforms and tools that streamline infrastructure and platform service delivery, improve reliability, and enable the automation of IT operational functions at an extensive scale.

Our customers include engineers from various engineering business units and enterprise application teams that rely on our IT infrastructure and platform services to run the business.

We operate in a DevOps model, where our developers are responsible for the complete software development lifecycle, from design through operations. While we work closely with infrastructure, solving problems through software development is at our core. This role offers a superb opportunity to work with a distributed team to transform how infrastructure and cloud platforms are developed and managed using software development, AI, and automation.

Your Impact

As a Site Reliability Engineer (SRE), you will play a crucial role in improving the reliability, performance, and efficiency of our platforms and services. You will work closely with our software engineers and infrastructure teams to identify and remediate toil, develop resiliency-focused improvements, and implement automation and software engineering solutions to meet and exceed our service level objectives (SLOs). You will be instrumental in defining service level indicators (SLIs) for new initiatives and ensuring our platforms are robust and scalable from the start.

Your responsibilities will include:

Analyze existing systems and identify areas for improvement in terms of reliability, performance, and automation.

Develop and implement automation solutions to reduce toil and improve operational efficiency.

Collaborate with software engineers to design and implement highly resilient and scalable architectures.

Define and monitor service level indicators (SLIs) and service level objectives (SLOs) using the team's observability and service assurance tooling.

Participate in operational support and respond to incidents in a timely and effective manner.

Participate in blameless postmortems and implement preventative measures.

Implement and enforce security best practices, policies, and procedures to ensure a high degree of security hygiene.

Drive adoption and education of SRE standard methodologies within our team.

Our Minimum Qualifications for this role:
  • Bachelor's degree in computer science, computer engineering, electrical engineering, or equivalent, with a minimum of 5 years of experience in an SRE, DevOps, or related role.
  • Minimum 2 years of programming experience in Go.
  • Minimum 2 years of experience using configuration management tools, such as Terraform or Ansible.
  • Proficiency in containerization technologies, demonstrated by at least 2 years of experience working with Docker and Kubernetes.
  • Strong understanding of service level indicators (SLIs) and service level objectives (SLOs), with practical experience in defining and measuring these metrics in a production environment.
Our Preferred Qualifications for this role:
  • MS degree.
  • Experience with Python.
  • Familiarity with cloud platforms such as AWS, Azure, or GCP.
  • Knowledge of virtualization platforms such as VMware, Nutanix, OpenStack, Anthos, and OpenShift.
  • Working knowledge of observability tools such as Prometheus, Grafana, Splunk, and Zabbix.
  • Practical experience with scrum agile development methodologies.
  • Experience supporting business-critical enterprise applications.
  • Experience with workflow orchestration tools (e.g., StackStorm, Argo Workflows).

#WeAreCisco where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all.

Our passion is connection: we celebrate our employees' diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.

We understand our outstanding opportunity to bring communities together and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, foster belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer (80 hours each year) allows us to give back to causes we are passionate about, and nearly 86% do!

Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reimagine their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!
