Docker and Kubernetes, Microservices, DevOps tools: Git/GitHub, Jenkins, Selenium, Automation Test Engineer, Spinnaker, Google Cloud Platform (GCP)
Site Reliability Engineer
5+ years of experience as an SRE, DevOps engineer, operations engineer, or similar.
Experience with cloud technologies, including architecting, developing, or maintaining cloud solutions in a public cloud environment (Google Cloud).
Hands-on experience with Spinnaker, Istio, Kubernetes, Docker, and ZooKeeper on GCP.
Experience deploying applications in SaaS, IaaS, and PaaS cloud environments.
Experience with newer NoSQL and key-value systems such as Couchbase, Cassandra, and Neo4j.
Experience working in a microservices architecture.
Experience defining service level indicators (SLIs), objectives (SLOs), and agreements (SLAs) for microservices.
Excellent knowledge of the configuration and usage of Apache and other open source systems software.
Knowledge of application tuning, capacity concepts, benchmarking, trending, and monitoring.
Scripting experience in mainstream languages: Java, shell, and Python.
Familiar with using Git, Jenkins, and similar CI/CD tools.
Understanding of platform-level concerns such as configuration management, network request routing, and blue/green or canary deployments.
Broad knowledge of application servers, web servers, networks, firewalls, switches, and load balancers.
Be part of the SRE team and directly responsible for the uptime of Lowes.com, m.lowes.com, and the mobile applications.
SREs must be able to investigate and handle issues in a live production environment to ensure uptime, either on their own or by escalating to the team for assistance.
Manage the following in GCP:
Configure, upgrade, and resize clusters. Cluster monitoring and alerting. Manage multi-zone/multi-region availability.
Manage VPC, networking, load balancers, port management, and cluster ingress management
DB reliability, replication, and availability
Immutable deployments, Stages - pipelines
Spinnaker - Kubernetes Integration
Stateful applications and custom persistence solutions
Keep everyone informed about the health and viability of the platform by reporting known issues and status of ongoing investigations.
Define service level indicators (SLIs), objectives (SLOs), and agreements (SLAs) for microservices.
Identify and advocate for changes vital to the stability and supportability of the system.
Mentor and advise teammates to ensure new features are efficient, highly available, and fault tolerant.
Determine and develop architectural approaches and solutions for improving site reliability, availability, performance, and scalability for our GCP-based applications.
Provide continuous improvements to system automation and management systems.
Lead critical improvements to application deployment frameworks and processes.
Respond to outages and coordinate activities to restore service as quickly as possible.
Troubleshoot issues potentially involving any area of the network, systems, or applications.
Work with technology partners on evaluating and implementing new technologies.
Lowe's Companies, Inc. (NYSE: LOW) is a FORTUNE® 50 home improvement company serving more than 18 million customers a week in the United States, Canada and Mexico. With fiscal year 2017 sales of $68.6 billion, Lowe's and its related businesses operate or service more than 2,390 home improvement and hardware stores and employ over 310,000 people. Founded in 1946 and based in Mooresville, N.C., Lowe's supports the communities it serves throu