Clear up a Problem argument please?

Posted: Fri Jan 31, 2014 4:01 am
by spinningnut
A dispute going on in the office around PM - the scenario is something like this:-

1st Scenario:-
1 incident received - workaround found - incident closed.
2 incident received - same as 1st incident logged - problem ticket raised

Q: do we link both incidents to the problem ticket? do we close the 2nd incident with the workaround whilst the newly opened problem ticket is now open?

2nd Scenario:-
5 incidents of the same issue have been logged but no workaround yet found

Q: do they all stay open as incidents until a workaround is found and only then a problem is raised, or do we open a problem ticket while the workaround is still being worked on?

Hope you can help clear this up.

Posted: Fri Jan 31, 2014 4:19 am
Your problem is that you are not understanding the difference between Problem Management and Incident Management.

They are SEPARATE yet related

First, an Incident lifecycle - open, work on, close - is dependent on whether the Service impacted by the incident has had a successful resolution.

In other words, the Incident ticket would remain open until the issue (incident) has been solved.
For example: If the user says the web site is down, the incident stays open until the web site can be confirmed to be up.

Regardless of how many incidents are raised for the service being impacted, the tickets remain open in line with the above - unless there is a major incident process and you have a means to consolidate or group the tickets (master / subordinate tickets, etc.).

Problem Management

Problem Management has NOTHING to do with Incident Management. PM is concerned with why it happens and does not care if the service is UP or DOWN.

Conversely, Incident Management is concerned with the service being restored to the customer / user. THAT IS ALL. IM does NOT CARE why it happened, only that it happened.

Problem ticket lifecycles are independent of IM tickets. The PM ticket can have references to as many IM tickets as there are occurrences of the issue.

A Problem ticket is raised if the PM Policy criteria for creating a PM record have been met. The PM ticket is then worked on in accordance with the PM schedule and urgency schedule, NOT the IM urgency schedule.

The PM issue is investigated and analysed, first to determine the underlying unknown cause of the incident(s). Once this is done, there is either an official workaround - rebooting the machine being the simplest - which is given to the IM Resolution teams to use, or a permanent solution. Granted, the workaround may do nothing more than extend the life of the service while a more permanent fix is found.

If a solution is found for the issue and can be implemented, then the Change & Release Process is invoked.
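
To make the separation concrete, here is a minimal sketch of the two record types applied to the 1st scenario from the question. The class names, fields and ticket references (INC-1, PRB-1) are made up for illustration and do not come from any particular ITSM tool. It only shows that linking is a reference, and that closing incidents does not close the problem.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    CLOSED = "closed"


@dataclass
class Incident:
    ref: str
    summary: str
    status: Status = Status.OPEN

    def restore_service(self) -> None:
        # IM only cares that the service is back; the cause is not needed.
        self.status = Status.CLOSED


@dataclass
class Problem:
    ref: str
    summary: str
    status: Status = Status.OPEN
    incident_refs: list[str] = field(default_factory=list)

    def link(self, incident: Incident) -> None:
        # Linking is a reference only; it never changes the incident's status.
        if incident.ref not in self.incident_refs:
            self.incident_refs.append(incident.ref)


# Hypothetical tickets for the 1st scenario.
inc1 = Incident("INC-1", "web site down")
inc2 = Incident("INC-2", "web site down")
inc1.restore_service()  # first incident closed with the workaround

prb = Problem("PRB-1", "recurring web site outage")
prb.link(inc1)  # an already closed incident can still be linked
prb.link(inc2)
inc2.restore_service()  # closing the incident leaves the problem open
```

So in this sketch both incidents are linked, the second incident is closed as soon as the workaround restores the service, and PRB-1 stays open on its own schedule.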

Posted: Fri Jan 31, 2014 4:39 am
by spinningnut
Thank you for the most helpful and comprehensive response.

So e.g. if the Service Desk Analyst is now dealing with 3 incidents of the same issue but a workaround hasn't been found yet, are you saying that these will remain open as incidents until a workaround is found and it's not necessary to open a problem record? What I am trying to ask is: at which point would a problem record be triggered?

Thank you

Posted: Fri Jan 31, 2014 5:52 am
Your PM Team should have criteria for when a Problem Candidate is submitted to the PM Team. The Problem Candidate is reviewed to determine who would do the PM analysis and, if it is accepted, the PM Team works on the Problem record.
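
What those criteria are is up to your own PM Policy. Purely as an illustration, a policy could be written down as a small rule like the one below. The thresholds (3 incidents in 30 days, or any repeat with no known workaround) and the function name are assumptions for the sketch, not a recommendation or a standard.

```python
from datetime import datetime, timedelta


def is_problem_candidate(incident_times, now, min_incidents=3,
                         window=timedelta(days=30), workaround_known=True):
    """Return True if this hypothetical policy says to raise a Problem Candidate."""
    # Only incidents inside the look-back window count towards the rule.
    recent = [t for t in incident_times if now - t <= window]
    # Hypothetical rule 1: the same issue keeps recurring within the window.
    if len(recent) >= min_incidents:
        return True
    # Hypothetical rule 2: a repeat incident with no known workaround.
    return len(recent) >= 2 and not workaround_known
```

With these example values, 3 incidents of the same issue in a month would trigger a candidate, and so would 2 incidents where nobody knows how to restore the service.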

In addition, you need to separate restoring the service (incident mgmt) from finding the broken part (problem mgmt).

If the IM Resolution team can do something - i.e. the cause is KNOWN and a fix, while not a permanent solution, can be applied - then it stays with IM.

For example, the web service hangs because the file space for cache is gone. The IM team clears the cache (IM). IM asks for more disk space for cache (IM).

The web service hangs because the file space for cache is gone. The IM team clears the cache (IM). IM asks for more disk space (IM). The issue still continues. PM gets a ticket. Analysis finds that the current web app - the Apache version - has an undocumented bug (PM issue). The solution is requested from the vendor, tested in a test environment, taken to CAB for production deployment and installed in a maintenance window. Meanwhile, despite the cache size having been increased twice, the IM team still has to clear the cache every few days.

So the bug fix is installed.
The IM team is relieved that there are no more web hangs and the cache no longer needs clearing.
PM watches to see that the web site cache issue does not recur during a defined period (the period should reflect how frequently the incidents were recurring).

So, there will be a bunch of IM tickets for the issue. Each IM record is closed once the web site cache has been cleared and the service restarted.
The PM record (1 record) will have links or references to all of the IM records.

Now what may happen is that SD Management decides to have the web site cache cleared every 2 days to prevent the incidents from happening.
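
As a rough sketch of the watch period mentioned above: one way to size it is as a multiple of the longest gap seen between past incidents, and to keep the PM record open until that period has passed with no new linked incident. The multiplier of 3 and both function names are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def watch_period(incident_times, multiplier=3):
    """Suggest a watch period: a multiple of the longest gap between past incidents."""
    ordered = sorted(incident_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two incidents to estimate recurrence")
    return max(gaps) * multiplier


def can_close_problem(fix_installed, incident_times, period, now):
    """True if the watch period has elapsed with no linked incident since the fix."""
    recurred = any(t >= fix_installed for t in incident_times)
    return not recurred and now - fix_installed >= period
```

If the cache had to be cleared every 2 or 3 days before the fix, this would suggest watching for about 9 days before the PM record is closed.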

Posted: Fri Jan 31, 2014 9:42 am
by spinningnut
Thank you. I guess therefore it's the Service Desk handling the incidents that decides whether or not a problem ticket is created, and it's not as stringent as I was thinking (create a PR based on 1 or more incidents of a similar nature).

Many thanks for your response and it has helped.