Tech Site Reliability Engineer (AI Model Ops)
UBS
India
Information Technology (IT)
Group Functions
Your role
We are seeking a highly motivated and experienced Site Reliability Engineer to join our growing AI Model Operations team and help build, maintain, and monitor a portfolio of products for AI operations at UBS.
As an SRE, you will play a crucial role in ensuring the reliability, performance, and scalability of our production systems. You will find opportunities to improve our production reliability, applying software engineering principles to infrastructure and operations problems.
• Own the reliability of the central AI model and agent registry, deployment pipelines, AI SecOps products, and other products in our portfolio
• Ensure the quality, security, reliability, and compliance of solutions by applying SRE best practices
• Own incident management and root cause analysis, and implement preventative measures
• Support capacity planning, disaster recovery planning, and cost management
• Collect and analyze operational data and identify SLIs from key metrics to define achievable SLOs for the product set
• Collaborate with data scientists and other stakeholders to collect feedback and incorporate it into solutions
• Automate processes leveraging predictive monitoring, auto-scaling, or self-healing
• Apply performance analysis, log analytics, and automated testing, and communicate areas for improvement
• Work in an agile way, foster strong collaboration and communication between development and operations, and contribute to the engineering culture in the team
Job Reference #
329450BR
City
Pune
Job Type
Full Time
Your team
We are a multinational team with diverse backgrounds and we strongly value different perspectives, skills, and close collaboration which help us build the best product for our clients. If your interests and skills closely match a good part of the description below – don’t hesitate to apply!
Your expertise
• Hands-on cloud experience using both the cloud portal and CLI to deploy, monitor, troubleshoot, and enhance the services that our products rely upon (Azure preferred):
o Plotting metrics and creating dashboards to monitor service health
o Identifying signals from logs or metrics and creating appropriate alerts
o Experience working with Azure and third-party APIs to develop automation and scripts or to retrieve data about Azure services
o Experience working with Azure and proficiency in running container workloads
o Identifying cloud cost optimizations while balancing the reliability and availability of services
• Experience with managing and operating cloud workspaces (e.g. Databricks)
• Experience working with vendor support teams and steering incident response to resolution, followed by blameless post-mortems to identify root causes and implement improvements
• Proficiency in a high-level programming language (Python preferred)
• Experience with DevOps (GitLab preferred)
• Bonus points for:
o Experience with change management processes and associated software such as ServiceNow
o Experience with mentoring, coaching and supporting team members
o Experience with KQL in Log Analytics
o Experience with Databricks Unity Catalog
o Experience with MLOps practices and tooling
About us
UBS is the world’s largest and the only truly global wealth manager. We operate through four business divisions: Global Wealth Management, Personal & Corporate Banking, Asset Management and the Investment Bank. Our global reach and the breadth of our expertise set us apart from our competitors.
We have a presence in all major financial centers in more than 50 countries.
Join us
At UBS, we know that it's our people, with their diverse skills, experiences and backgrounds, who drive our ongoing success. We’re dedicated to our craft and passionate about putting our people first, with new challenges, a supportive team, opportunities to grow and flexible working options when possible. Our inclusive culture brings out the best in our employees, wherever they are on their career journey. And we use artificial intelligence (AI) to work smarter and more efficiently. We also recognize that great work is never done alone. That’s why collaboration is at the heart of everything we do. Because together, we’re more than ourselves.
We’re committed to disability inclusion and if you need reasonable accommodation/adjustments throughout our recruitment process, you can always contact us.
Contact Details
UBS Business Solutions SA
UBS Recruiting
Disclaimer / Policy statements
UBS is an Equal Opportunity Employer. We respect and seek to empower each individual and support the diverse cultures, perspectives, skills and experiences within our workforce.