Senior Manager - Site Reliability Engineering (SRE)

Reston, VA, USA | Fannie Mae

  • Industry:
    Financial Services
  • Position Type:
  • Functions:
    IT / Information Technology
  • Experience:
    7-10 years
Job Description:

Company Description

At Fannie Mae, futures are made. The inspiring work we do makes an affordable home a reality and a difference in the lives of Americans. Every day offers compelling opportunities to modernize the nation's housing finance system while being part of an inclusive team using new, emerging technologies. Here, you will help lead our industry forward, enhance your technical expertise, and make your career.

Job Description

As a valued leader on our team, you will oversee the work of a unit, or several units, whose staff are designing, producing, testing, or implementing software, technology, or processes across multiple projects, programs, or products, as well as create and maintain IT architecture, large scale data stores, and cloud-based systems.

You will apply your expertise in software and systems engineering to ensure that both our internally critical and externally visible systems meet the appropriate performance needs of our users. You will serve as a champion of service availability, efficiency, automation, monitoring, and capacity management. Specifically, you will leverage your skills and experience in Amazon Web Services, software development with Java and/or Python, customization in Splunk and/or Dynatrace, and automation in Selenium and/or Blue Prism (among others) to enable increased feature velocity and continuous improvement.


The Sr. Manager - Site Reliability Engineering (SRE) role will offer you the flexibility to make each day your own, while working alongside people who care so that you can deliver on the following responsibilities:

• Determine the needs of large customer groups across multiple projects, programs, or products.

• Plan and direct the work of the team as they design and develop software solutions to meet needs across simultaneous projects or workstreams.

• Ensure teams use a process-driven approach in designing solutions.

• Oversee the implementation of new software technology across multiple projects, programs, or products.

• Oversee the effective and efficient ongoing maintenance of existing software.



Minimum Required Experiences

• 7+ years of relevant work experience

• Certification in AWS Solutions Architect Associate or Developer Associate

Desired Experiences

• Bachelor’s Degree in Computer Science, Management Information Systems (MIS), Systems Engineering, or related field

• Certification as a Splunk Certified Developer or Sun Certified Java Developer

• Experience creating disaster recovery plans and executing failover tests

• Experience with capacity planning and performance testing / engineering tools, such as JMeter and / or LoadRunner

• Experience with Failure Mode and Effects Analysis (FMEA) and chaos testing / engineering tools, such as Gremlin, Chaos Monkey, Chaos Toolkit, and AWS Fault Injection Service (FIS)

• Experience working with code repositories such as Bitbucket and / or GitHub

• Experience with programming in Java and / or Python

• Understanding of Java performance monitors (JVM, GC, Heap Size, Message Broker)

• Experience building automation solutions using tools such as Blue Prism and / or Selenium

• Understanding of fault-tolerant / resilient architectural design patterns, such as Bulkhead, Circuit Breaker, Retry, and Timeout


• 3 years of experience leading teams in applications development, infrastructure, or operations

• 5 years of experience working in a Scaled Agile Framework (SAFe), Scrum, or Kanban environment using Jira and Confluence

• 4 years of experience supporting AWS cloud applications and technologies, including containerization, virtualization, microservices, and serverless architecture in tools

• 3 years of experience with J2EE frameworks

• 3 years of experience with application monitoring / observability, including building dashboards, establishing service level indicators / objectives / agreements (SLIs / SLOs / SLAs), and logging / tracing using tools

• 3 years of experience with CI/CD / DevOps deployment tools

• 3 years of experience with application production / operations support, including incident response, problem management, runbooks, and knowledge articles using tools

• 3 years of experience with post-mortems, root-cause analysis (RCA), and / or AWS Correction of Error (COE)

• Understanding of error budgeting and toil reduction

• Excellent problem-solving skills and proactivity in resolving issues / blockers

• Excellent verbal / written communication, presentation, and relationship management skills, and ability to collaborate with multiple stakeholders

• Excellent people management, persuasion / influencing, and conflict resolution skills


• Skilled in Spring Boot / Spring Cloud

• Skilled in JavaScript

• Skilled in REST

• Skilled in Amazon Web Services (AWS) offerings, development, and networking platforms

• Skilled in ServiceNow, Moogsoft, StatusHub, and / or Blameless

• Experience using AWS Elastic Container Service (ECS) and Fargate

• Experience using AWS CloudWatch, Splunk, Dynatrace, Catchpoint, and / or Datadog

• Skilled in Jenkins, Terraform, UrbanCode Deploy (UCD), and / or GitLab

• Understanding of IT Service Management (ITSM)
