Senior Site Reliability Engineer
Come be a part of the biggest software turnaround in the last decade and help us with our incredible growth! As an Apple Maps Core Services SRE, you will be part of a diverse and inclusive team supporting location services used by 1.5B+ Apple users. This role is not limited to keeping the lights on: you will also participate in designing, architecting and implementing the next generation of infrastructure that our growing services will run on. You will work in tandem with application developers to design, architect and support highly operable and efficient services across various internal and external cloud environments.
Maps Core Services SRE is looking for a Senior Site Reliability Engineer to maintain excellent uptime and deliver fast, operable, reliable and efficient services that are used by 1B+ Apple users worldwide. The successful candidate will work closely with a development partner to ensure those services are created with strong input from SRE. You will meet with them regularly, participate in design discussions, and carry the service to implementation in production, where it will be supported by the whole team. You will find opportunities to contribute to automation efforts, create and execute major projects with high value, and produce SRE tools that benefit a wider audience. You will meet new teams within Apple, collaborating on various projects and products that relate to our services. You will triage all kinds of new and wonderful issues in production, the kind of stories that will be told around a campfire 10 years from now. In short, you will find more than a job.
- 8+ years in a Site Reliability Engineering or DevOps focused role
- BS degree in computer science or equivalent field with 8+ years of experience
- Expertise in containerization and orchestration: Docker, Kubernetes and Helm
- Proficiency in infrastructure as code tools: Terraform, Ansible, or CloudFormation
- Experience setting up and managing services running on Kubernetes
- Understanding of SRE principles including monitoring, alerting, error budgets, fault analysis, and automation
- In-depth knowledge of monitoring and observability tools: Prometheus, Grafana, OpenTelemetry, Splunk
- Knowledge of Linux operating system principles, networking fundamentals, and systems management
- Demonstrable fluency in at least one of the following languages: Java, Python, or Go
- Ability to identify and communicate technical and architectural problems, while working with partners and their teams to iteratively find solutions
- Cloud Native SRE experience (ideally 8+ years)
- Distributed systems software development
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.