DevOps

Sr. Site Reliability Engineer

Pune, Maharashtra
Work Type: Full Time

We are looking for a Cloud/SRE Engineer for a leading cloud-based business process automation provider that delivers flexible, managed software solutions.

Total Experience: 8+ years  
Employment Type: Permanent  
Notice Period: Immediate to 1 month only  
Working Model: Work from home (Only Pune-based candidates)  
Technical Skills: Docker, Kubernetes, AWS, Linux, Scripting (Python, shell)  

Shift Timings: Rotational Shift as per requirement (Work From Home)  

What you'll do: 
  • Day-to-day management of alerts, checking systems, and escalating issues as necessary  
  • Be part of a team that provides 24x7 on-call support for critical SaaS events
  • Be available in case of emergencies when team members are not available or need help.  
  • Documentation of issues and remediation steps  
  • Proactively create appropriate monitors in the EKS/K8S ecosystem  
  • Deploy to EKS/K8s cluster using Terraform and Helm  
  • Learn and maintain existing infrastructure running under Docker Swarm   
  • Improve existing infrastructure health by implementing checks and scripts to correct known issues  
  • Maintenance and development of deployment code   
  • Automating tasks that are currently executed manually   
  • Implement/integrate new technologies in our Cloud Infrastructure  
  • Collaborate with other teams and departments to provide the highest level of support and assistance  
  • Apply a real customer focus when planning deployments/updates, keeping the customer at the forefront and considering the impact on them before making changes
  • Work closely on solutions with Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers
  • Perform RCA and take necessary corrective actions to prevent the recurrence of issues  
  • Create and assign alert-related actions to the appropriate team after the investigation  
  • Handle support requests for environment-specific actions  
  • Identify and provide automation requirements to improve RCA

What you'll need:  
  • Hands-on experience as an AWS Cloud Engineer
  • Working knowledge of EKS/Terraform/Helm  
  • Working experience with Docker and Docker Swarm (optional)
  • Good understanding of AWS IAM roles and policies  
  • Experience logging and monitoring AWS resources using CloudWatch Logs
  • Experience working with Linux environment  
  • Proficient in Bash and/or Python scripting  
  • A strong understanding of web technologies such as REST APIs  
  • Working experience with monitoring solutions such as Grafana and Prometheus
  • Excellent oral and written communication skills  
  • Customer-facing communication skills to explain issues and RCAs clearly to customers
  • Experience in Product/Application Support for SaaS-based products 

    Bonus Skills: Certified AWS Solutions Architect, Working knowledge of Bitbucket Pipelines  
