Production Engineer
-
- Software Engineering
- Professional
The Watsonx Orders team’s mission is to deliver advanced conversational technology solutions for retail and restaurant customers. We are focused on using state-of-the-art Machine Learning, AI, and related technologies along with IBM’s unparalleled scale to completely transform the customer experience!
IBM Watsonx Orders is looking for a Production Engineer to deploy and support our edge Artificial Intelligence (AI) product at scale.
Your Role and Responsibilities
We are currently looking for a skilled Software Developer – Production Engineering to ensure performance and reliability for AI- and ML-driven voice agent microservices, Edge Kubernetes clusters, network services, and storage layers.
Responsibilities:
• Work closely with other Watsonx Orders development teams in an embedded SRE model to help define & implement key metrics for uptime, reliability, and performance of these services and develop runbooks for incident management.
• Develop deep service telemetry through metric collection, distributed tracing, visualization, and reporting via OpenTelemetry, Prometheus, and related tooling.
• Implement stability and performance optimizations in Python.
• Participate in the definition and management of SLIs, SLOs and error budgets for infrastructure and production services.
• Gather baselines for proactive alerting.
• Provide incident management and on-call support via PagerDuty and ServiceNow.
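For candidates less familiar with the SLI, SLO, and error budget terms used above, the arithmetic behind them can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's actual tooling: the SloWindow structure, the function names, and the 99.9% target in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts observed over one SLO window (hypothetical structure)."""
    total_requests: int
    failed_requests: int


def availability_sli(window: SloWindow) -> float:
    """SLI: fraction of requests in the window that succeeded."""
    if window.total_requests == 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return 1 - window.failed_requests / window.total_requests


def error_budget_remaining(window: SloWindow, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    slo_target is the SLO as a fraction, e.g. 0.999 for "99.9% of requests
    succeed". The budget is the number of failures that target allows.
    """
    allowed_failures = (1 - slo_target) * window.total_requests
    if allowed_failures <= 0:
        # No traffic or a 100% target: there is no budget to spend.
        return 0.0
    return 1 - window.failed_requests / allowed_failures
```

With an assumed 99.9% target, a window of 1,000,000 requests allows roughly 1,000 failures, so calling error_budget_remaining(SloWindow(1_000_000, 250), 0.999) reports that about three quarters of the budget is still unspent. A negative result means the budget has been overspent for that window.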
Required Technical and Professional Expertise
• Experience in supporting cloud-based infrastructure
• Experience in deploying and supporting Kubernetes in cloud environments
• Experience supporting distributed systems
• Advanced experience in Python (production code debugging, metrics production coding, bot code maintenance)
• Experience configuring, supporting, and optimizing Linux systems
Preferred Technical and Professional Expertise
• Familiarity with running distributed ML workloads in cluster-orchestrated environments
• Experience building and supporting telemetry and related infrastructure (OpenTelemetry, Jaeger, Grafana, Prometheus, New Relic, Instana)
• Experience implementing infrastructure as code pipelines
• Pub/sub experience (Kafka, SQS, SNS, MQTT)
• Experience monitoring edge and microservices environments.
What we offer:
• Working for a top-5 IT company according to the Forbes 2022 best employers ranking
• International and prestigious projects
• Highly skilled teams of experts
• Wide range of IBM trainings and certificates
• Unlimited access to Udemy, Harvard Business Review, Safari O’Reilly, getAbstract, IBM AI Skills Academy
And what is more:
• Contract of employment
• Competitive compensation – salary range, depending on your skills and experience
• Private medical care and life insurance
• Employee Assistance Program
• Sport, charity & other networking groups
• Summer / winter camps for children
• Discounts with IBM employee badge
• Referral Bonus Program
• Home office option
• No dress code