Senior Cloud & Kubernetes Platform Engineer

Location: Bangalore (KA) | Employment Type: Full Time - 40 hours per week | Job Level: T2 | Work Preference: Onsite at work location address | Job Code: 240169

Job Description:

About the Role:

We are seeking a Senior Cloud & Kubernetes Platform Engineer with 8 to 12 years of experience. The Engineering Enablement (EEng) team within PlayStation Network Services focuses on developing tools and technology frameworks to support engineering teams across the organization. Core areas of work include Kubernetes, Docker, AWS, Python/GoLang, Shell scripting, Jenkins, GitHub Actions, Terraform, observability with Datadog, Splunk, Prometheus, and Grafana, and supporting high-availability systems and environments.

Requirements:

  • Strong expertise in AWS (VPC, EC2, IAM, ALB/NLB, ASG, Route53, RDS, Lambda, EKS, S3, EBS).
  • Deep understanding of networking fundamentals (CIDR, routing, DNS, security groups, load balancers).
  • Hands-on experience with Terraform (modules, workspaces, state management, CI automation).
  • Strong experience with Docker (multi-stage builds, image optimization, security, multi-arch builds).
  • Advanced knowledge of Kubernetes internals, including control-plane components, API lifecycle, CRDs, operators, and etcd fundamentals.
  • Hands-on development experience building Kubernetes controllers/operators using controller-runtime patterns.
  • Strong programming skills in Python or Golang (at least one required) for platform tooling, automation, and controller development.
  • Experience managing production Kubernetes upgrades and EKS add-ons.
  • Strong Linux system administration and troubleshooting skills.
  • Proven experience handling production incidents in large-scale, distributed systems.

Preferred / Good to Have:

  • AWS Solutions Architect certification.
  • CKA, CKAD, or CKS certification.
  • Experience with GitOps tools (ArgoCD / FluxCD).
  • Exposure to GCP.
  • Experience with GitHub Actions.
  • Experience with Istio / Service Mesh and Envoy.
  • Experience with observability platforms (Prometheus, Grafana, ELK, Datadog, CloudWatch).

Cloud & Core Infrastructure:

  • Architect, build, and operate highly available, scalable, and secure cloud infrastructure primarily on AWS.
  • Design VPCs, IAM, compute, storage, and load balancing solutions following AWS best practices.
  • Define and implement infrastructure scalability, high availability, and disaster recovery strategies.
  • Support multi-AZ and multi-region architectures for production workloads.

Infrastructure as Code (IaC):

  • Design and maintain Terraform-based infrastructure using modules, workspaces, and remote state backends.
  • Integrate Terraform workflows into CI/CD pipelines with automated validation and provisioning.
  • Drive infrastructure standardization, governance, and reusability.

Containers & Kubernetes Platform:

  • Design, deploy, and operate Kubernetes clusters (EKS and on-premises).
  • Own Kubernetes control-plane and node lifecycle management, including safe and repeatable upgrades.
  • Manage and upgrade EKS add-ons such as VPC CNI, CoreDNS, kube-proxy, metrics-server, CSI drivers, and ALB Ingress Controller.
  • Design and operate highly available Kubernetes clusters across multiple availability zones.
  • Implement and manage multi-cluster architectures including workload placement, regional failover, and cross-cluster service discovery.

Kubernetes Controller & Platform Development:

  • Design, develop, and maintain custom Kubernetes controllers and operators using controller-runtime patterns.
  • Build CRDs, reconciliation loops, and Kubernetes API extensions for platform capabilities and automation.
  • Handle API lifecycle management, versioning, and backward compatibility for custom resources.
  • Optimize controller performance, error handling, and resiliency for large-scale clusters.
  • Work closely with platform and application teams to translate operational requirements into Kubernetes-native abstractions.

Kubernetes Security & Governance:

  • Implement Pod Security Standards (baseline/restricted).
  • Enforce policies using OPA Gatekeeper or Kyverno.
  • Design and maintain RBAC governance with least-privilege access models.
  • Secure the container supply chain using image scanning, SBOM generation, Cosign signing/verification, and registry governance.

GitOps & CI/CD:

  • Implement GitOps workflows using ArgoCD or FluxCD for declarative deployments and environment promotion.
  • Design and operate CI/CD pipelines using Jenkins or GitHub Actions with integrated security and compliance checks.
  • Enable automated rollbacks, drift detection, and environment consistency.

Development, Scripting & Automation:

  • Develop automation, tooling, and platform services using Python or Golang.
  • Write Shell scripts for infrastructure automation, cluster operations, and tooling integration.
  • Build internal APIs, CLI tools, and controllers to improve developer and operator productivity.

Linux, Operations & Incident Management:

  • Perform advanced Linux administration and troubleshooting across compute, networking, and storage layers.
  • Lead and participate in production incident management, including triage, mitigation, and root-cause analysis.
  • Diagnose complex failures across cloud infrastructure, Kubernetes, networking, and CI/CD systems.

Observability & Service Mesh (Good to Have):

  • Implement monitoring, logging, and alerting using Prometheus/Grafana, Alertmanager, Loki/ELK, Datadog, or CloudWatch.
  • Configure and operate Istio for traffic management, mTLS, retries, timeouts, and secure service-to-service communication.
  • Understand Envoy proxy behavior within Kubernetes workloads.

At HTC Global Services, our culture is an embodiment of who we are: a value-led organization committed to the success of our people and customers.

  • Hybrid and workplace flexibility
  • Work-life balance
  • Well-defined career development plan
  • Rewards & Recognition program
  • L&D focused on upskilling
  • Hands-on experience with Emerging Technologies and Digital Transformation
  • Career Mobility programs
