
Senior Software Engineer

Microsoft
United States, Washington, Redmond
Oct 31, 2025
Overview

Are you excited by the challenge of building and scaling large, Kubernetes-based service platforms? Do you thrive in a collaborative engineering environment where your work has global impact? If so, we'd love to connect with you.

We are the Containers on Substrate-Managed Intelligent Clusters (COSMIC) Topology and Namespace Management team, part of the platform that powers some of the largest-scale services in Microsoft, including Teams backend services and Copilot. Our team focuses on the critical infrastructure that enables onboarding, cluster provisioning, and establishing standards across apps through policies (e.g., observability, deployment, and security).

We're looking for a Senior Software Engineer to help build COSMIC AURA, an AI-first orchestration platform that will redefine how Microsoft optimizes runtime capacity, workload placement, and operational efficiency for COSMIC. You will be part of a high-performing, cross-functional team building multi-tenant, high-throughput services with an intentional focus on reliability and performance.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities

Design, build, and improve backend services for onboarding and Namespace provisioning.
Develop scalable technical solutions to support COSMIC's growth.
Automate provisioning pipelines, define service definitions and policies, and migrate object models.
Create automation for moving services between regions and disaster recovery strategies that ensure resiliency and performance.
Enhance reliability, observability, security, and compliance of services.