
Cloud Azure Admin-SCB (NCS/Job/ 3809)
Job Skills
Job Description
- Set SLAs, SLOs and SLIs with stakeholders in relation to the four golden signals of SRE monitoring
- Effectively manage multiple stakeholder demands and expectations while maintaining quality and delivery
- Progressively adopt proactive SRE strategies like Chaos Engineering, Game Days and Synthetic Monitoring
- Partner with application developers and architects to ensure our services are built for scale and performance
- Develop monitoring solutions on top of existing observability platforms
- Maintain open communication with Engineering and Product teams around system performance and reliability
- Write, review, and execute test plans/strategies for validating product/system performance, scalability, and reliability
- Drive product reliability improvements through monitoring, alerting, and application of software development best practices
- Identify creative ways to break the products, uncover and report defects, and validate that systems and solutions operate as intended
- Engage in the refinement of the development, build and deployment processes on top of our main infrastructure
- Work with the engineering teams to architect and build our platform services to simplify real-time troubleshooting and operational response to incidents and outages
- Be the expert on how to best use Cloud technologies to build our next-generation platform
- Bridge the divide between our core application engineers and our main infrastructure teams
- Provide capacity management expertise to ensure our deployments are managed for robustness and cost
- Bring best practices to environment management and take ownership of it, ensuring all dev/test/prod environments are reproducible and highly available
- Serve as a quality and reliability ambassador as part of an Agile software development team
- Maintain and communicate testing timelines, schedules and status reports
ROLE SPECIFIC TECHNICAL COMPETENCIES
Competency | Proficiency
Advanced knowledge of application, data, and infrastructure architecture disciplines | Expert
Experience with Agile / Scrum delivery methodology and related tools | Advanced
Advanced knowledge of object-oriented programming languages and concepts (Python, Java, Golang, etc.) | Advanced
Experience with microservices, API-first, event-driven, agent-based architecture and design | Advanced
Knowledge of DevOps: CI/CD, containerization (Docker/Kubernetes), orchestration (Ansible/Salt) | Advanced
Knowledge of different aspects of service design, including messaging protocols and behaviour, caching strategies and software design practices | Advanced
Knowledge of infrastructure (networking, hypervisors, storage, security); experience working with a private cloud is a plus | Core
Experience with test automation using common test frameworks, and with performance / load testing techniques at scale | Advanced
Experience with metrics collection, time series queries, middleware such as Telegraf, and backends such as OpenTSDB or Prometheus | Advanced
Experience with data visualization tools such as Kibana and Grafana | Advanced
Experience in Trading System Management and the Capital Market Tech ecosystem | Core