We are looking for a Cloud/SRE Engineer for the leading cloud-based business process automation provider that delivers flexible, managed software solutions.
Total Experience: 7+ years
Employment Type: Permanent
Notice Period: Immediate to 1 month only
Working Model: Work from home (Pune-based candidates only)
Technical Skills: Docker, Kubernetes, AWS, Linux, Scripting (Python, shell)
Shift Timings: Rotational shifts as per requirement (work from home)
What you'll do:
Day-to-day management of alerts, checking systems, and escalating issues as necessary
Be part of a team that provides 24x7 on-call support for critical SaaS events.
Be available in case of emergencies when team members are not available or need help.
Documentation of issues and remediation steps
Proactively create appropriate monitors in the EKS/K8s ecosystem
Deploy to EKS/K8s clusters using Terraform and Helm
Learn and maintain existing infrastructure running under Docker Swarm
Improve existing infrastructure health by implementing checks and scripts to correct known issues
Maintenance and development of deployment code
Automating tasks that are currently executed manually
Implement/integrate new technologies in our Cloud Infrastructure
Collaborate with other teams and departments to provide the highest level of support and assistance
Apply a real customer focus when planning deployments/updates, keeping the customer at the forefront and considering the impact on them before making changes
Work closely on solutions with Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers
Perform RCA and take necessary corrective actions to prevent the recurrence of issues
Create and assign alert-related actions to the appropriate team after the investigation
Handle support requests for environment-specific actions
Identify and provide automation requirements to improve RCA
What you'll need:
Hands-on experience as an AWS Cloud Engineer
Working knowledge of EKS/Terraform/Helm
Working experience with Docker and Docker Swarm (optional)
Good understanding of AWS IAM roles and policies
Experience logging and monitoring AWS resources using CloudWatch Logs
Experience working in Linux environments
Proficient in Bash and/or Python scripting
A strong understanding of web technologies such as REST APIs
Working experience with monitoring solutions such as Grafana and Prometheus
Excellent oral and written communication skills
Customer-facing communication skills to effectively explain issues and RCAs to customers
Experience in Product/Application Support for SaaS-based products
Bonus Skills: Certified AWS Solutions Architect, Working knowledge of Bitbucket Pipelines