Site Reliability Engineer

San Jose, US
Software Engineering
Professional

Introduction
A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.

Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

We are seeking a skilled SRE to join our Platform Engineering team for the Data and AI organization within IBM Software. As part of our team, you will be responsible for designing, building, and maintaining the underlying infrastructure and tools necessary to support and enable software development, deployment, and operations at scale.

Your Role and Responsibilities
Automation: Develop and maintain automation tools and scripts to streamline deployment, monitoring, and management of the infrastructure and applications.
Monitoring and Alerting: Set up and maintain monitoring and alerting systems to proactively identify and resolve issues before they impact customers or services.
Performance Optimization: Identify opportunities for performance optimization and work with development teams to implement improvements.
Documentation: Maintain up-to-date documentation for the infrastructure, processes, and procedures.
Collaboration: Work closely with development teams, product managers, and other stakeholders to understand requirements and ensure the reliability of the platform.
Continuous Improvement: Participate in post-incident reviews, retrospectives, and other forums to identify areas for improvement and drive continuous improvement initiatives.

Required Technical and Professional Expertise

Experience with Cloud Platforms: Strong experience with cloud platforms such as AWS, Azure, or Google Cloud Platform, including expertise in deploying and managing services in these environments, and in managing and troubleshooting containerized applications.
Automation and Scripting: Strong scripting skills (e.g., Python, Bash) and experience with configuration management tools (e.g., Ansible, Chef, Puppet) to automate deployment and management tasks.

Troubleshooting and Problem Solving: Strong troubleshooting skills and the ability to quickly identify and resolve complex issues in a production environment, including experience with incident response and post-incident analysis.

Preferred Technical and Professional Expertise

DevOps Culture: Experience working in a DevOps culture and mindset, including a strong understanding of the collaboration between development and operations teams to achieve business goals.
Container Orchestration: Proficiency in container orchestration tools such as Kubernetes and OpenShift, including experience in deploying, managing, and troubleshooting containerized applications.
Monitoring and Logging: Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack) to monitor the health and performance of infrastructure and applications.
Experience with Scalable Architectures: Experience designing and implementing scalable architectures for cloud-based applications, including knowledge of best practices for scalability, performance, and reliability.
Experience with Monitoring and Observability: Experience with advanced monitoring and observability practices, including using tools such as Prometheus, Grafana, and Kubernetes-native monitoring solutions to gain insights into system performance and behavior.

Want to know what it’s like to be an IBMer?

About Business Unit

IBM Software infuses core business operations with intelligence, from machine learning to generative AI, to help make organizations more responsive, productive, and resilient. IBM Software helps clients put AI into action now to create real value with trust, speed, and confidence across digital labor, IT automation, application modernization, security, and sustainability. Critical to this is the ability to make use of all data, because AI is only as good as the data that fuels it. In most organizations, data is spread across multiple clouds, on premises, in private datacenters, and at the edge. IBM’s AI and data platform scales and accelerates the impact of AI with trusted data, and provides leading capabilities to train, tune, and deploy AI across business. IBM’s hybrid cloud platform is one of the most comprehensive and consistent approaches to development, security, and operations across hybrid environments: a flexible foundation for leveraging data, wherever it resides, to extend AI deep into a business.

Your Life @ IBM

In a world where technology never stands still, we understand that dedication to our clients’ success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.

Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.

Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues using a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress. They embrace challenges with the resources they have to hand and a can-do attitude, always striving for an outcome-focused approach in everything that they do.

Are you ready to be an IBMer?

About IBM

IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.

Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world, we’re also one of the biggest technology and consulting employers, with many of the Fortune 50 companies relying on the IBM Cloud to run their business.

At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it’s time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.

Other Relevant Job Details

Location Statement

IBM offers a competitive and comprehensive benefits program. Eligible employees may have access to:
– Healthcare benefits including medical & prescription drug coverage, dental, vision, and mental health & well-being
– Financial programs such as 401(k), the IBM Employee Stock Purchase Plan, financial counseling, life insurance, short- & long-term disability coverage, and opportunities for performance-based salary incentive programs
– Generous paid time off including 12 holidays, minimum 56 hours sick time, 120 hours vacation, 12 weeks parental bonding leave in accordance with IBM Policy, and other Paid Care Leave programs. IBM also offers paid family leave benefits to eligible employees where required by applicable law
– Training and educational resources on our personalized, AI-driven learning platform where IBMers can grow skills and obtain industry-recognized certifications to achieve their career goals
– Diverse and inclusive employee resource groups, giving & volunteer opportunities, and discounts on retail products, services & experiences

The compensation range and benefits for this position are based on a full-time schedule for a full calendar year. The salary will vary depending on your job-related skills, experience and location. Pay increment and frequency of pay will be in accordance with employment classification and applicable laws. For part-time roles, your compensation and benefits will be adjusted to reflect your hours. Benefits may be pro-rated for those who start working during the calendar year.

This position was posted on the date cited in the key job details section and is anticipated to remain posted for 21 days from this date or less if not needed to fill the role.

We consider qualified applicants with criminal histories, consistent with applicable law.

Being You @ IBM

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

Key Job Details
