Director, Reliability Engineering Syndication Support
Philadelphia, PA, USA | Comcast
Industry: Media / Entertainment
Job Description:
The Director of Reliability Engineering Syndication Support will be responsible for monitoring and tracking activities, analyzing issues, and supporting the resolution of issues and conflicts for our Syndication Partners, delivering an optimum level of service in line with Service Level Agreements (SLAs) and Operating Level Agreements (OLAs).
The Director will have an in-depth understanding of monitoring strategies, tools and procedures, stakeholder management and executive communications escalations. The Director will be expected to lead efforts to assure timely execution of scheduled and repeatable processes such as data backup, account refresh, data ingest, data clean-up and event log management, and to ensure that service-level issues are responded to and that normal service operations are restored as quickly as possible. The role will identify and lead the implementation of creative process and technology solutions within the team, and lead efforts to ensure that high availability of systems and the ability to identify customer-facing issues are built into the development or deployment of new products and services. The role will also identify and recommend opportunities for 'clean-slate' process improvement with regard to incident management, problem management, fault monitoring, triage procedures and issue escalation.
Lead the development of procedures for incident triage and problem management, metric and measure creation, management and administration of monitoring tools.
Identify, triage and drive resolution of incidents effectively; assist in change management and deployment support.
Support the resolution of incidents and problems within the team.
Assist with the resolution of complex incidents, ensuring the right problem-solving techniques and processes are applied.
Participate in regular meetings with stakeholders, prepare for and document meetings, and track progress.
Collect, interpret and respond to changes in production data, as appropriate.
Track the implementation of resolution tasks.
Provide regular and reliable reporting of relevant data to meet management requirements.
Provide input and contribute to Residential Syndication Support Production Management-related audits.
Support the collection, analysis and production of metrics on process data for KPIs to determine opportunities for improvement.
Partner with the Residential Reliability Change Management Team on the implementation of application configuration and break-fix changes that impact Syndication Partners.
Work with the Residential Syndication Support team members to identify areas of focus where training may improve team performance and incident resolution metrics.
Drive knowledge management across the supported applications and ensure full compliance.
Implement and maintain a fully functional 24x7 event monitoring and management platform that monitors and regularly probes the service environment for anomalies.
Drive continual service experience improvements.
Key Skills and Abilities:
Excellent leadership, organizational, communication, customer service, problem-solving and interpersonal skills.
Experience in supporting business objectives in a partnered/outsourced model (offshore vendor management) is a plus.
Strong hands-on technical experience (for example: Linux, Networking, Java, SQL).
Greenfield experience creating strong teams, tools and processes is preferred.
Ability to support off-hours escalations for major outages.
Solid understanding of all applications, equipment and software currently being used throughout the company.
Understanding of cable and IP technologies is a plus.
Ability to maintain an end-to-end view of the application and infrastructure landscape.
Bachelor's Degree or Equivalent (Engineering, Computer Science)
Generally requires 10+ years of related experience