Software Development Manager, Data Center - GenAI

Amazon

United States, Bellevue
Salary: $184,900 – $250,200 Annually

DESCRIPTION:

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain - and we're looking for talented people who want to help.

You'll join a diverse team of design engineers, quality/reliability engineers, supply chain specialists, field engineers, and other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for quality and reliability while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

As the Software Development Manager for the Eva Data Center Assistant team, you will lead a team of Software Development Engineers building the agentic GenAI platform for AWS data centers, which powers AI-assisted operations across the global data center infrastructure. You will own the technical vision and strategic roadmap for the Eva platform, driving investments across agentic AI systems, full-stack serverless engineering, and search and knowledge systems capabilities. Your leadership will shape the direction of a next-generation AI/ML platform that orchestrates physical work processes, automates decision-making, and enhances operational efficiency for a 30K+ globally distributed user base. You will champion platform thinking, building reusable primitives, APIs, and extensible components that dozens of teams across the Data Center Community build upon.

In this role, you will drive the design and delivery of production-grade agentic systems including LLM orchestration, tool-calling patterns, agent frameworks, and intelligent workflow automation. You will partner closely with cross-functional stakeholders including data center operations, controls engineering, product management, and peer engineering teams to translate complex operational needs into scalable AI-powered solutions. You will establish and raise the bar on engineering practices including code reviews, CI/CD, progressive deployment, observability, and operational readiness for ML/AI systems in production. You will also own hiring strategy and talent development, building a high-performing engineering team with deep expertise in generative AI, distributed systems, and full-stack development, while communicating platform strategy, technical roadmaps, and business impact to senior leadership with clarity and conviction.

Key job responsibilities
- Lead and mentor a team of SDEs building and operating the Eva agentic GenAI platform, fostering a culture of ownership, innovation, and operational excellence
- Own the end-to-end technical roadmap for Eva, balancing investments across agentic AI capabilities, platform infrastructure, frontend experiences, and search/knowledge systems workstreams
- Drive the architecture and delivery of agentic AI systems including LLM orchestration, prompt engineering, skills, harness, tool-calling patterns, semantic search, and agent frameworks (e.g., Strands)
- Lead the development of full-stack serverless solutions leveraging AWS Lambda, API Gateway, DynamoDB, EventBridge, CDK, and related services to deliver scalable, production-grade platform capabilities
- Own the design of search and knowledge systems including vector embeddings, hybrid retrieval, document processing pipelines, and semantic chunking to power Eva's intelligent responses
- Build and evolve platform primitives and reusable components that enable dozens of teams across the Data Center Community to build AI-powered capabilities on top of Eva
- Partner with data center operations, controls engineering, product management, and peer engineering teams to identify high-impact use cases and translate them into platform features
- Establish and enforce engineering excellence including CI/CD pipeline design, progressive deployment, synthetic monitoring, observability (CloudWatch, X-Ray, OpenTelemetry), and operational readiness reviews
- Own hiring, performance management, and career development for the team, building a diverse pipeline of engineers with expertise in GenAI, distributed systems, and full-stack development
- Communicate platform strategy, project status, and business impact to senior leadership, driving alignment on priorities and resource allocation

A day in the life
- Conduct 1:1s and team stand-ups to unblock engineers, review progress, and align priorities across agentic AI, platform infrastructure, frontend, and search/knowledge workstreams
- Review technical designs, architecture proposals, and code changes - ensuring high standards for agentic system design, API contracts, prompt engineering patterns, and infrastructure-as-code
- Triage and prioritize incoming requests from cross-functional stakeholders (data center operations, controls engineering, product managers, peer platform teams) against Eva's roadmap and strategic pillars
- Monitor operational health of Eva's production systems including agent orchestration services, RAG pipelines, search infrastructure, APIs, and frontend applications - driving rapid resolution of any issues
- Collaborate with product and program managers to refine requirements, scope agentic AI features, and plan sprint/iteration deliverables
- Participate in hiring activities including resume reviews, phone screens, on-site interviews, and calibration sessions to build and maintain a strong engineering talent pipeline
- Engage in strategic planning discussions with senior leadership on Eva's platform direction, GenAI technology adoption, resource allocation, and long-term technical investments
- Coach and mentor engineers, providing career development guidance, actionable feedback, and growth opportunities in emerging areas like agentic AI and large-scale platform engineering

About the team
Why AWS
Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying.

Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud.

Inclusive Team Culture
Here at AWS, it's in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship and Career Growth
We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS:

- 3+ years of engineering team management experience
- 7+ years of experience working directly within engineering teams
- Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
- Experience partnering with product or program management teams
- 3+ years of experience developing large-scale, multi-tiered distributed software systems using distributed programming

PREFERRED QUALIFICATIONS:

- Experience delivering products against plan in a fast-paced, multi-disciplined, distributed-responsibility and often ambiguous environment
- Experience in recruiting, hiring, mentoring/coaching, and managing teams of Software Engineers to improve their skills and make them more effective product software engineers
- Knowledge of ML, NLP, Information Retrieval and Analytics
- Experience working with fast-moving, high-performance teams and driving innovative solutions tailored to unique business environments
- Experience leading teams building AI/ML or generative AI systems in production, including LLM-based applications, agentic architectures, or RAG systems

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.



USA, WA, Bellevue - 184,900.00 - 250,200.00 USD annually