Cloud Azure Admin-SCB (NCS/Job/ 3809)

For A Reputed Large Multinational Technology Company
4 - 6 Years
Full Time
Immediate
Up to 15 LPA
1 Position(s)
Navi Mumbai
Posted: Updated Today

Job Description

  • Set SLAs, SLOs and SLIs with stakeholders in relation to the four golden signals of SRE monitoring
  • Effectively manage multiple stakeholder demands and expectations while maintaining quality and delivery
  • Progressively adopt proactive SRE strategies like Chaos Engineering, Game Days and Synthetic Monitoring
  • Partner with application developers and architects to ensure our services are built for scale and performance
  • Develop monitoring solutions on top of existing observability platforms
  • Maintain open communication with Engineering and Product teams around system performance and reliability
  • Write, review, and execute test plans/strategies for validating product/system performance, scalability, and reliability
  • Drive product reliability improvements through monitoring, alerting, and application of software development best practices
  • Identify creative ways to break the products, uncover and report defects, and validate that systems/solutions are operating as intended
  • Engage in the refinement of the development, build and deployment processes on top of our main infrastructure
  • Work with the engineering teams to architect and build our platform services to simplify real-time troubleshooting and operational response to incidents and outages
  • Be the expert on how to best use Cloud technologies to build our next-generation platform
  • Bridge the divide between our core application engineers and our main infrastructure teams
  • Provide capacity management expertise to ensure our deployments are managed for robustness and cost
  • Bring best practices and own environment management, ensuring all dev/test/prod environments are reproducible with high availability
  • Serve as a quality and reliability ambassador as part of an Agile software development team
  • Maintain and communicate testing timelines, schedules and status reports

ROLE SPECIFIC TECHNICAL COMPETENCIES

  • Advanced knowledge of application, data, and infrastructure architecture disciplines (Level: Expert)
  • Experience with Agile / Scrum delivery methodology and related tools (Level: Advanced)
  • Advanced knowledge of object-oriented programming languages and concepts (Python, Java, Golang, etc.) (Level: Advanced)
  • Experience with microservices, API-first, event-driven, agent-based architecture and design (Level: Advanced)
  • Knowledge of DevOps: CI/CD, containerization (Docker/Kubernetes), orchestration (Ansible/Salt) (Level: Advanced)
  • Knowledge of different aspects of service design, including messaging protocols and behaviour, caching strategies and software design practices (Level: Advanced)
  • Knowledge of infrastructure (networking, hypervisors, storage, security) - experience working with a private cloud is a plus (Level: Core)
  • Experience with test automation using common test frameworks, and with performance / load testing techniques at scale (Level: Advanced)
  • Experience with metrics collection, time series queries, middleware such as Telegraf, and backends such as OpenTSDB or Prometheus (Level: Advanced)
  • Experience with data visualization tools such as Kibana and Grafana (Level: Advanced)
  • Experience in Trading System Management and the Capital Market Tech ecosystem (Level: Core)