Site Reliability Engineering

What is Site Reliability Engineering (SRE)?

SRE is what you get when you treat operations as if it’s a software problem. Our mission is to protect, provide for, and progress the software and systems behind all of Google’s public services — Google Search, Ads, Gmail, Android, YouTube, and App Engine, to name just a few — with an ever-watchful eye on their availability, latency, performance, and capacity.

Product-Focused Reliability for SRE

Learn how a product-focused reliability model can support a product's overall reliability.

Google SRE turns 20!

Hear from our engineers about what we have learned from two decades of Site Reliability Engineering.

Watch our videos

SRE Careers

Hear from some of our most senior engineers about their role at Google.

SRE Books

Read our SRE books online: Building Secure & Reliable Systems, The SRE Workbook, and the original SRE book.

What is SRE?

Since 2004, SRE has evolved to become the industry-leading practice for service reliability.

Hear from key figures about the history of SRE and what’s next for the SRE community.

SRE Resources

A curated list of Site Reliability and Production Engineering resources.