Overview
Job Title: Platform Health Engineer
Are you seeking a challenging role in IT monitoring? We are looking for a proactive and talented individual to join our team and perform the following tasks:
Responsibilities
- Utilize monitoring tools to track performance, availability, and health of IT platforms, applications, and infrastructure.
- Monitor systems for errors, slowdowns, outages, and other signs of instability.
- Respond to alerts or alarms generated by monitoring systems in a timely manner.
- Analyze system data and performance metrics to identify trends or anomalies.
- Generate and deliver platform health reports to stakeholders.
- Identify and log incidents, capturing all relevant details.
- Escalate unresolved issues to technical teams or appropriate personnel based on severity and impact.
- Work closely with IT teams to troubleshoot and resolve incidents within predefined SLAs.
- Perform periodic checks to confirm that system configurations are correct and that logging and monitoring tools are functioning properly.
- Support preventative actions to reduce the risk of downtime or platform failures.
- Configure and maintain monitoring tools, dashboards, and alerts.
- Ensure monitoring tools are updated and aligned with current infrastructure setups.
- Collaborate with engineers, system administrators, and other IT teams to maintain a stable and secure IT environment.
- Document monitoring procedures, incident-handling protocols, and resolution patterns for future reference.