Incident Cockpit
Description
Incident management is a central part of IT service management, especially for highly available IT systems. Such systems have particularly strict requirements: when a fault occurs, the response must be fast and targeted so that downtime is limited to a few minutes. This may be required 24 hours a day, seven days a week, while the number of staff on duty is kept small for cost reasons. All faults and the measures taken to resolve them must be documented continuously during fault resolution, in a way that still lets employees concentrate on solving the fault. The supported applications can be very heterogeneous and may also include infrastructure components. The processes underlying incident management have been analysed extensively, for example within the ITIL framework.
The aim of this project is to define and implement an incident cockpit that gives employees the best possible support in carrying out incident management. Information from various monitoring and ticket systems must be brought together in a clear overview, stored and linked documentation must be quickly accessible through search functions, and special functions such as automated logging and report generation must be supported. Employees should be able to obtain a complete overview of a fault quickly and then look up its cause and possible countermeasures in a knowledge base in a targeted manner. Support for internal and customer communication is another important requirement for the Incident Cockpit.
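
To make the requirements above more concrete, the following is a minimal sketch of how events from monitoring and ticket systems might be merged into one incident timeline, with automated logging of measures and a simple generated report. It is only an illustration under stated assumptions: the names Event, Incident, ingest, log_action and report, the field layout, and the sample data are all hypothetical and not part of the project definition.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One normalized entry from a monitoring or ticket system (hypothetical schema)."""
    timestamp: datetime
    source: str   # e.g. "monitoring", "ticket" or "action"
    system: str   # affected application or infrastructure component
    message: str


@dataclass
class Incident:
    """Collects everything known about one fault, including the measures taken."""
    title: str
    events: List[Event] = field(default_factory=list)

    def ingest(self, events: List[Event]) -> None:
        # Merge events from any source and keep the timeline chronological.
        self.events = sorted(self.events + events, key=lambda e: e.timestamp)

    def log_action(self, system: str, message: str,
                   when: Optional[datetime] = None) -> None:
        # Automated logging: each measure becomes a timeline entry of its own.
        when = when or datetime.now(timezone.utc)
        self.ingest([Event(when, "action", system, message)])

    def report(self) -> str:
        # Report generation: a plain-text timeline of the incident.
        lines = [f"Incident: {self.title}"]
        for e in self.events:
            lines.append(f"{e.timestamp:%H:%M} [{e.source}] {e.system}: {e.message}")
        return "\n".join(lines)


def at(hour: int, minute: int) -> datetime:
    # Helper for the made-up sample data below.
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


# Made-up sample data; events arrive out of order from two sources.
incident = Incident("Database cluster unreachable")
incident.ingest([Event(at(2, 5), "ticket", "db-cluster", "Customer reports timeouts")])
incident.ingest([Event(at(2, 3), "monitoring", "db-cluster", "Heartbeat lost")])
incident.log_action("db-cluster", "Failover to standby node", when=at(2, 9))
print(incident.report())
```

In this sketch every source is reduced to the same Event shape before it is merged, so heterogeneous applications and infrastructure components can share one timeline. A real cockpit would additionally need adapters for the individual monitoring and ticket systems, search over linked documentation, and communication support, none of which is shown here.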