Senior Reliability Engineer

Worldwide Salaried Open

Hive is a fast-growing SaaS company offering marketing solutions to live event promoters across North America. Our Engineering Team builds and maintains the systems that enable our customers to do powerful things simply and intuitively. We operate with agility: shipping minimum viable products, deploying multiple times daily, and rapidly iterating based on customer feedback.

At Hive, we handle impressive technical scale: ingesting high-volume data in real time from 20+ integrations (including Eventbrite), storing and querying billions of customer data points, and delivering over 200 million emails and SMS messages monthly to our clients' customers. Our technology stack includes Python, React, SQL, Elasticsearch, and various AWS services.

As we continue to scale, we're seeking a Senior Reliability Engineer to join our Reliability Team, the foundation that enables our product and engineering teams to deliver exceptional experiences while maintaining system performance, security, and cost efficiency.

The Role

As a Senior Reliability Engineer at Hive, you'll be part of a team responsible for the performance, reliability, and maintainability of our systems. This role bridges infrastructure, operations, and application engineering to ensure our services are scalable, performant, secure, and cost-effective as we tackle increasingly complex technical challenges.

Tech Stack

AWS, Kubernetes, Karpenter, Terraform, Python, Django, MySQL, Elasticsearch

What You'll Do

Champion system observability improvements through implementation, maintenance, process refinement, and automation for business-critical services
Drive SLO adoption and improvement to ensure excellent customer satisfaction across key value streams
Enhance application performance at every level, from infrastructure foundations to runtime environments
Tackle and resolve complex technical challenges across the entire stack
Partner with development teams to design and implement scalable, reliable solutions
Support security and compliance initiatives as integral components of our engineering practice
Craft and refine developer tools that boost team productivity and efficiency
Implement strategies to optimize cloud infrastructure costs
Collaborate with DevOps to maintain and enhance deployment pipelines in our cloud environments
Contribute to incident management by defining meaningful metrics, executing against targets, and improving response times and overall system stability

What We're Looking For

7+ years of software engineering experience, with at least 5 years focused on reliability, infrastructure, or platform engineering
3+ years of experience with AWS and proven ability to build effective monitoring, alerting, and observability solutions
Track record of implementing, maintaining, and improving SLOs and uptime KPIs for critical services
Expert knowledge of Linux and distributed systems principles, along with their real-world applications
Solid programming skills in both application and infrastructure languages (Python, Go, etc.)
Strong grasp of best practices and a data-driven approach to enhancing stability and availability
Excellent communication skills with the ability to collaborate effectively across teams and explain complex technical concepts clearly

Bonus points if you have...

Proven experience scaling complex AWS environments and optimizing performance across the full technology stack during periods of significant growth
Experience creating developer platforms and CI/CD pipelines that enhance team productivity
Skillful approach to cloud cost optimization and resource management
Experience in establishing and improving incident management processes

What We Offer

Meaningful salary and equity. You're rewarded based on impact
Work fully remote from the comfort of your home in Canada
Opportunity to shape reliability practices at a rapidly scaling company
Collaborative team of engineers passionate about building reliable systems
Flexible work hours with minimal meetings
Health & Dental coverage
Open vacation/PTO policy so you can be happy and healthy!
Generous parental leave top-up with a flexible return-to-work plan

About Hive

Hive.co is a marketing platform for event marketers. We help brands personalize and automate their campaigns, using email and SMS, to enable them to sell out so they can focus on making their events unforgettable.

By integrating with ticketing partners and e-commerce partners like Shopify, we enable brands to access their customer data, so they can easily segment their list in thousands of ways, and send more customized, timely email campaigns that land in inboxes.

We started our company inside a University of Waterloo computer lab in early 2014, graduated from Y Combinator that summer (S14 batch), and have been growing since. Originally based in Kitchener, Hive is now 100% remote and located across Canada! We strive to provide an online work environment that allows team members to have a strong work-life balance while still feeling connected to their team and Hive’s mission.

To learn more about Hive, check out our About Us page on our website: https://www.hive.co/company

