Senior Site Reliability Engineer

Worldwide Salaried Open

Role Description

This role involves joining an Identity Cloud software development team as a Senior Site Reliability Engineer (SRE). You will work closely with software engineers, infrastructure platform services, engineering managers, and other stakeholders to ensure the reliability, scalability, and performance of the team's services.

Work with development and service owners to solve performance issues and ensure system scalability.
Design, develop, and implement solutions to improve reliability, availability, performance, and scalability of systems.
Build alerts and dashboards in collaboration with technical leaders and infrastructure platform services.
Own and improve key operational metrics (SLIs, SLOs, Error Budgets, monitoring and alerting).
Drive continuous improvement through post-incident reviews and blameless postmortems of non-functional issues.
Develop and maintain comprehensive monitoring and alerting to proactively identify and resolve issues.
Create and maintain dashboards, conducting ongoing reviews to identify and close gaps.
Collaborate with technical leads, DevOps/SRE, and infra teams for capacity planning.
Identify and address production performance bottlenecks through profiling, tuning, and optimization.
Automate repetitive tasks and processes to improve efficiency.
Work closely with Software, Performance, and Test Engineers to influence system design and architecture.
Review and contribute to documentation for systems, processes, runbooks, and procedures.
Participate in a 24/7 on-call rotation to provide subject matter expertise.
Lead incident postmortem efforts, ensuring timely compilation of reports.
Use excellent diagnostic and problem-solving skills to analyze complex systems and data.

Qualifications

Bachelor’s degree in computer science, a related field, or equivalent practical experience.
5+ years of proven SRE experience.
Strong understanding of SRE principles and practices.
Experience with cloud platforms (AWS, GCP, or Azure).
Proficiency in at least one scripting language (e.g., Python, Bash, Go).
Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Honeycomb, OpenSearch).
Coding experience beyond simple scripts in a programming language such as Go, Java, or Python.
Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Understanding of network protocols and security best practices.
Familiarity with DevOps culture and practices and experience with CI/CD toolchains (e.g., Jenkins, ArgoCD).
Experience with Incident Response tools and processes.
Experience with Infrastructure as Code (Terraform, Helm).
Strong problem-solving and troubleshooting skills.
Excellent communication and collaboration skills.
Ability to work independently and as part of a team.

Preferred Qualifications

Technology experience: Kafka, relational databases, performance tuning (JVM, Go).
Experience with the Grafana k6 performance testing tool.

Onboarding Timeline

In the first 30 days you will:

Meet the team and understand its mission and vision.
Gain clarity on various roles and expectations.
Complete development environment setup.
Read guides and documentation, and complete mandatory training.
Learn company processes and benefits.

By 6 months you should:

Understand team goals and OKRs for the quarter and beyond.
Complete initial analysis and implementation of SRE team assignments.
Be comfortable with tools, systems, and processes used on a day-to-day basis.
Complete project work, both supervised and unsupervised.
