Monitoring Engineer

  • Thiruvananthapuram
  • Knitt
Job Title: AWS L2 Engineer

The AWS L2 engineer is expected to have a deeper understanding of AWS services and infrastructure and to handle more complex tasks and troubleshooting. The role handles complex incidents, leads root cause analysis, collaborates closely with the L1 team on escalations, optimizes monitoring setups, automates routine tasks, and provides technical expertise for cloud infrastructure management.

Key Responsibilities:

  • Advanced Monitoring and Incident Response: Monitor and respond to complex issues escalated from the L1 team. Conduct deeper analysis and troubleshooting of AWS services and related components such as EC2, Route 53, Cloudflare, WAF, VPC, Istio ingress gateway, microservices, ECR, Lambda, RDS, and S3. Investigate root causes of recurring incidents and recommend solutions to prevent future occurrences. Proficiency in using and customizing monitoring tools such as Elasticsearch, APM, CloudWatch, CloudTrail, Rancher, and Datadog.
  • Infrastructure Management: Design, deploy, and manage AWS infrastructure using tools like CloudFormation, Terraform, or AWS CDK. Implement and maintain auto-scaling, load balancing, and failover strategies to ensure high availability and reliability. Optimize the use of AWS resources to ensure cost-effectiveness while meeting performance requirements.
  • Automation and Scripting: Develop automation scripts using tools like AWS CLI, Python, Bash, or PowerShell to streamline routine tasks. Implement Infrastructure as Code (IaC) practices to automate the deployment and management of AWS resources. Set up and manage CI/CD pipelines using AWS services like CodePipeline and CodeBuild, or third-party tools like Jenkins.
  • Security and Compliance: Implement and manage security best practices, including IAM roles and policies, security groups, and VPC configurations. Conduct regular security audits and ensure compliance with relevant standards like SOC 2, GDPR, or HIPAA. Respond to security incidents, perform forensic analysis, and implement corrective actions.
  • Database and Storage Management: Manage and optimize AWS database services like RDS, DynamoDB, and Aurora. Perform data migration, backup, and recovery operations. Ensure data integrity, availability, and security across all AWS storage services.
  • Networking: Configure and manage complex VPC setups, including peering, VPN connections, and Direct Connect. Troubleshoot network-related issues such as latency, connectivity problems, or routing misconfigurations. Implement hybrid cloud solutions that integrate on-premises networks with AWS.
  • Collaboration and Mentorship: Work closely with the L1 team to provide guidance and training on AWS best practices. Collaborate with other teams, such as DevOps, Security, and Development, to design and implement solutions. Act as a subject matter expert (SME) on AWS technologies within the organization.
  • Documentation and Reporting: Create and maintain detailed documentation for the AWS infrastructure, including diagrams, configurations, and procedures. Generate reports on system performance, cost management, and security posture. Document lessons learned from incidents and share knowledge with the team.

Qualification & Experience: Bachelor's degree in Computer Science or equivalent. 3 to 5 years of experience in a similar role.

Skills Required:

  • In-depth AWS Knowledge: Strong understanding of a wide range of AWS services and how to architect solutions using them.
  • Scripting and Automation: Proficiency in scripting languages (e.g., Python, Bash) and experience with automation tools.
  • Networking: Solid understanding of networking concepts, especially within the context of AWS (e.g., VPC, Route 53).
  • Security: Knowledge of AWS security best practices and experience with implementing security controls.
  • Problem-solving: Ability to diagnose and resolve complex issues that require in-depth technical knowledge.
  • Communication: Strong communication skills to collaborate with various teams and explain technical concepts to non-technical stakeholders.
  • Certification (Optional but Beneficial): AWS Certified Solutions Architect – Associate or Professional, AWS Certified SysOps Administrator, or AWS Certified DevOps Engineer.

Common Tools and Technologies:

  • AWS CloudFormation / Terraform: For infrastructure as code.
  • AWS CLI: Command-line interface for managing AWS services.
  • CI/CD Tools: Jenkins, GitLab CI/CD, AWS CodePipeline.
  • Monitoring Tools: Amazon CloudWatch, AWS CloudTrail, Prometheus, Grafana, Elasticsearch, Rancher, application performance monitoring (APM).
  • Security Tools: AWS IAM, AWS Config, Amazon Inspector, AWS Shield.