Platform Engineering Manager
Tango
5h ago
$160k - $200k | DevOps | United States
Job Description
*Applicants must be authorized to work in the U.S. for any employer.* We cannot sponsor employment-based visas at this time.

Let’s Tango! Where Innovation Meets Impact.

At Tango Analytics, we’re all about helping businesses make smarter decisions through powerful technology, insightful data, and a whole lot of collaboration. Whether you're a creative thinker, a strategic planner, a tech wizard, or a customer champion, there's a place for you on our team. We believe work should be meaningful and fun, so if you're ready to make a difference while enjoying the journey, come join us and let's Tango!

We are looking for a Platform Engineering Manager to join our dynamic and growing Platform Engineering team.

About the Role: We are seeking a Platform Engineering Manager to build and operate our AI-native Internal Developer Platform (IDP), the foundational layer that powers engineering velocity across the organization. You will own multi-cloud infrastructure (AWS & Azure), define golden paths, drive cloud modernization aligned to Well-Architected Frameworks, and deliver the observability, shared services, and agentic infrastructure that give every team a production-ready foundation.
A defining dimension of this role is partnering with peer engineering leaders to actively migrate teams onto the platform and positioning it as the organization's AI-first engineering foundation.

Key Responsibilities:

Platform Strategy & Architecture
- Own and execute the Platform roadmap: compute, networking, identity, observability, shared services, and AI/ML tooling across AWS and Azure
- Lead cloud modernization against the AWS and Azure Well-Architected Frameworks across all five pillars: operational excellence, security, reliability, performance efficiency, and cost optimization
- Define golden paths (standardized self-service workflows for service scaffolding, DB provisioning, environment spin-up, and AI workload deployment) with escape hatches for edge cases
- Own multi-cloud strategy; ensure consistent IAM, networking, and FinOps governance across providers

IaC & CI/CD Automation
- Drive OpenTofu/Ansible as source of truth for all infrastructure; enforce GitOps and policy-as-code for governance, auditability, and security
- Build and mature CI/CD pipelines (GitHub Actions, ArgoCD) to maximize deployment frequency, reduce lead time, and enable zero-ticket self-service provisioning

Observability
- Own org-wide observability: metrics, logs, traces, and alerting, extended to AI/LLM signals (token usage, model latency, inference cost, agent task completion rates)
- Operate a centralized observability platform (Datadog/SigNoz, OpenTelemetry, Grafana/Prometheus/Loki, or equivalent) delivered via golden paths; define SLIs/SLOs as onboarding defaults for all services
- Ensure full-stack coverage across infrastructure, Kubernetes, APM, distributed tracing, AI pipelines, and cost anomaly detection

Shared Services
- Build and operate a self-service shared services catalog: secrets management, API gateways, model registries, and LLM gateways
- Rationalize duplicative per-team infrastructure; maintain shared services to production SLA standards with clear ownership and consistent security controls

AI Platform & Agentic Infrastructure
- Own GPU/accelerated compute, model serving, vector databases, RAG pipelines, and LLM API gateway management (AWS Bedrock, Azure OpenAI, Anthropic)
- Build AI golden paths for self-service model deployment and LLM integration; design agentic infrastructure including orchestration runtimes, tool registries, memory/state services, and human-in-the-loop workflows
- Establish governance, cost controls, prompt injection guardrails, and model access policies for AI API usage and inference spend
- Partner with data science and ML engineering to translate agentic workflow requirements into reusable platform primitives

Platform Adoption & Team Migration
- Collaborate on the migration program: partner with peer managers to plan and execute structured workload migrations onto the platform with hands-on support, not just documentation
- Define onboarding playbooks covering golden paths, shared services, observability setup, CI/CD cutover, and AI capability onboarding; track and report adoption metrics to leadership
- Identify and remove migration blockers (technical gaps, missing services, or organizational friction) and feed them into the platform roadmap

Developer Experience, Leadership & Culture
- Build a self-service developer portal (Backstage, GitHub, or equivalent) with service catalogs, golden paths, and AI/agentic workflow templates; track DORA metrics and developer experience KPIs
- Hire, develop, and retain high-performing platform engineers; build AI fluency across the team and foster a platform-as-a-product culture with feedback loops, OKRs, and iterative roadmapping
- Lead architecture reviews; make pragmatic build-vs-buy decisions; partner with security and compliance on governance priorities

Security, Compliance & FinOps
- Embed secure-by-default guardrails: IaC scanning, RBAC, secrets management, container hardening, and AI-
