I will compare classical loss distribution approach (LDA) modeling based on the Poisson distribution, the negative binomial distribution and a Poisson mixture distribution with an LDA model that incorporates some elements of conditional probability analysis.
Suppose that the following numbers of operational risk events related to card fraud have been recorded (to preserve confidentiality, the bank's original data have been randomized, but the structure and pattern of the data were preserved). In addition, the original period of increased event counts was extended to months 14-19 in order to make the differences between the results more pronounced.
The analyst should first investigate the causes of the variation in the number of events and try to model them as a variable related to changes in the operational risk environment. In our case the number of clients, accounts, cards, etc. is not related to the number of events, so we will disregard these explanatory variables. The only cause of the changes in the number of events is an attack, or attacks, on the bank and its clients. Let us now look at different frequency modeling approaches.
Modeling frequency of events with a Poisson distribution (1)
Under this assumption the probability distribution can be expressed as:
$$P(N = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
with parameter λ (the average monthly number of events, equal to 9.52) and k a non-negative integer.
If we assume equal losses for each operational risk event, VaR99.9% will correspond to 115 events.
However, these results are not satisfactory. The variance of the data is 840.7, far above the mean of 9.52, whereas a Poisson distribution has a variance equal to its mean. The value of the (known to be lenient) Kolmogorov-Smirnov statistic is 0.871, against critical values of 1.22/(n)^0.5=0.176 for α=10%, 1.36/(n)^0.5=0.196 for α=5% and 1.63/(n)^0.5=0.235 for α=1%. In other words, the data cannot be described by this Poisson distribution. This result could be expected: there are two different periods (one with an “increased number of events”, the other with a “normal” monthly number of events), and a distribution with a single parameter cannot correctly model both.
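The arithmetic behind this rejection can be checked with a short sketch. It assumes n = 48 monthly observations (inferred from the quoted critical values, not stated explicitly in the data description) and uses the usual large-sample coefficients 1.22, 1.36 and 1.63 for the Kolmogorov-Smirnov critical values; the variable names are illustrative only.

```python
import math

# Figures quoted in the text; n_months = 48 is an assumption inferred
# from the critical values (1.22 / sqrt(48) = 0.176).
n_months = 48
sample_mean = 9.52
sample_variance = 840.7
ks_statistic = 0.871

# Large-sample Kolmogorov-Smirnov coefficients for each significance level.
ks_coefficients = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}
ks_critical = {alpha: c / math.sqrt(n_months) for alpha, c in ks_coefficients.items()}

# The Poisson hypothesis is rejected when the statistic exceeds the critical value.
rejected = {alpha: ks_statistic > crit for alpha, crit in ks_critical.items()}

# A Poisson distribution would give a variance-to-mean ratio of 1.
dispersion_ratio = sample_variance / sample_mean

for alpha, crit in ks_critical.items():
    print(f"alpha={alpha:.2f}: critical value {crit:.3f}, reject: {rejected[alpha]}")
print(f"variance / mean = {dispersion_ratio:.1f}")
```

The variance-to-mean ratio of roughly 88 is on its own a strong sign of overdispersion, independently of the test.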
Modeling frequency of events with a negative binomial distribution
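Under this assumption the probability distribution can be expressed as:
$$P(N = k) = \binom{k + r - 1}{k} \left(\frac{1}{1+\beta}\right)^{r} \left(\frac{\beta}{1+\beta}\right)^{k}$$
with k a non-negative integer, r>0 and β>0.
Knowing the mean and the variance, it is easy to determine the parameters by the method of moments: in this parametrization the mean equals rβ and the variance equals rβ(1+β).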
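A minimal sketch of this moment matching follows. It assumes the (r, β) parametrization in which the mean is rβ and the variance is rβ(1+β), which is the common convention when the parameters are written this way but is not confirmed by the original formula; the function name `neg_binomial_pmf` is hypothetical.

```python
import math

# Sample moments quoted in the text.
mean, variance = 9.52, 840.7

# Method of moments, assuming mean = r * beta and variance = r * beta * (1 + beta).
beta = variance / mean - 1.0
r = mean / beta

def neg_binomial_pmf(k, r, beta):
    """P(N = k) in the (r, beta) parametrization, computed in log space."""
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             - r * math.log1p(beta)
             + k * (math.log(beta) - math.log1p(beta)))
    return math.exp(log_p)

# Sanity check: the probabilities should sum to (almost) one.
total_mass = sum(neg_binomial_pmf(k, r, beta) for k in range(10_000))

print(f"beta = {beta:.2f}, r = {r:.4f}, total mass = {total_mass:.6f}")
```

Under these assumptions r is well below 1, which puts a large share of the probability mass at zero events while keeping a very long right tail.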
If we assume equal losses for each operational risk event, VaR99.9% will correspond to about 675 events.
The capital charge is now about 6 times bigger than before. However, the fit of the distribution to the data is still poor. The value of the Kolmogorov-Smirnov statistic is 0.197, which is close to the test's critical values (recall 0.176 for α=10%, 0.196 for α=5% and 0.235 for α=1%). The explanation is simple: the distribution makes no distinction between periods with an “increased” and a “normal” number of events, treating the former as ordinary but rare cases of the latter.
Modeling frequency of events with a Poisson distribution (2)
Another solution is to model frequency only for the period of increased number of events, for instance with a Poisson distribution.
In this case parameter lambda equals 16.67 and VaR99.9% corresponds to 245 events. The drawbacks of this solution are:
1) we lose 85% of our observations,
2) we do not really evaluate the risk at month 48, as we assume that every month is a month of massive fraud.
Modeling frequency of events with a mixed Poisson distribution
Here we assume that the number of events follows, with probability “p”, a Poisson distribution with a low value of parameter lambda (the situation of no massive attack on the bank and/or its clients) and, with probability “1-p”, a Poisson distribution with a high value of parameter lambda (the situation of a massive attack on the bank and/or its clients).
Under this assumption the probability distribution can be expressed as[i]:
$$P(N = k) = p\,\frac{\lambda_1^{k} e^{-\lambda_1}}{k!} + (1-p)\,\frac{\lambda_2^{k} e^{-\lambda_2}}{k!}$$
where λ1 is the low and λ2 the high Poisson parameter.
I suggest estimating the parameters by maximum likelihood.
I have obtained the following results:
p=0.896; λ1=0.767 (the “normal” regime) and λ2=84.8 (the attack regime)
This time the value of the Kolmogorov-Smirnov statistic is 0.0567, which is three to four times lower than the critical values of the test.
VaR99.9% corresponds to 455 events, intermediate between the Poisson and the negative binomial results. This last result seems much more realistic than the previous ones.
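The two-component mixture and the log-likelihood that a maximum likelihood fit would maximize can be sketched as follows. The function names are hypothetical, the optimizer itself is left out, and the six counts used to exercise the log-likelihood are only the attack-period months quoted in this post, not the full 48-month sample.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k events, computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def mixed_poisson_pmf(k, p, lam_low, lam_high):
    """Weight p on the low-intensity regime, 1 - p on the high-intensity one."""
    return p * poisson_pmf(k, lam_low) + (1.0 - p) * poisson_pmf(k, lam_high)

def log_likelihood(counts, p, lam_low, lam_high):
    """Objective to maximize over (p, lam_low, lam_high) in an ML fit."""
    return sum(math.log(mixed_poisson_pmf(k, p, lam_low, lam_high)) for k in counts)

# Fitted values reported in the text.
p, lam_low, lam_high = 0.896, 0.767, 84.8

# The mixture mean should be close to the sample mean of 9.52.
mixture_mean = p * lam_low + (1.0 - p) * lam_high
total_mass = sum(mixed_poisson_pmf(k, p, lam_low, lam_high) for k in range(400))

# Illustration only: the six attack-period months, not the whole data set.
example_counts = [7, 56, 113, 95, 135, 25]
example_loglik = log_likelihood(example_counts, p, lam_low, lam_high)

print(f"mixture mean = {mixture_mean:.3f}, total mass = {total_mass:.6f}")
```

With the reported parameters the mixture mean comes out at about 9.51, close to the sample mean of 9.52, which is a quick consistency check on the fit.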
Modeling frequency of events with conditional probabilities
The final approach studied relies on conditional probabilities.
Let us compare the number of events during consecutive months:

From the above table we can see that the probability of no attack beginning during the next month, provided that we are not currently under attack, is (19+17+4)/(19+18+4)=40/41, so the probability that an attack begins is 1/41.
The probabilities of an attack beginning during each of the following 12 months are:

There is a 2.4% (=1/41) probability that an attack will begin during the first month; the probability that it will first begin during the second month is 2.38%, or 1/41*(40/41); for the third month the probability is 1/41*(40/41)^2, for the fourth month 1/41*(40/41)^3, etc.
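These month-by-month values follow a geometric pattern and can be reproduced with a few lines. The sketch assumes a constant monthly onset probability of 1/41 and a 12-month horizon; the variable names are illustrative.

```python
# Monthly probability that an attack begins, given no attack so far,
# and its complement, as derived from the transition counts in the text.
p_onset = 1 / 41
p_calm = 40 / 41

# Probability that the first attack begins in month m (m = 1..12):
# m - 1 calm months followed by one onset.
onset_probabilities = [p_onset * p_calm ** (month - 1) for month in range(1, 13)]

# Probability that an attack begins at some point within the 12 months.
cumulative = sum(onset_probabilities)

for month, prob in enumerate(onset_probabilities, start=1):
    print(f"month {month:2d}: {prob:.4f}")
print(f"attack begins within 12 months: {cumulative:.4f}")
```

The cumulative value equals 1 - (40/41)^12, the complement of twelve consecutive calm months.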
The next step is to evaluate the expected number of events during the normal period. The value is 0.619 events per month. For the period of attack, the number of events can be modeled using historical data (7 in the first month, 56 in the second, 113 in the third, 95 in the fourth, 135 in the fifth, 25 in the sixth). Of course, if additional data were available, a more advanced model could be used.
Based on these assumptions, and running a Monte Carlo simulation of the number of events for the “normal period”, VaR99.9% corresponds to 440 events. This number is close to the result obtained with the mixed Poisson distribution.
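One way such a simulation could be set up is sketched below. It is not necessarily the exact procedure behind the 440 figure: it assumes a 12-month horizon, at most one attack within that horizon, an attack that replays the historical six-month profile (cut off at the horizon), and Poisson counts with mean 0.619 in every other month. All names, the seed and the number of runs are illustrative choices.

```python
import math
import random

ATTACK_PROFILE = [7, 56, 113, 95, 135, 25]  # historical attack months from the text
P_ONSET = 1 / 41        # monthly probability that an attack begins
LAMBDA_NORMAL = 0.619   # expected events per "normal" month
HORIZON = 12            # months simulated (assumption)

def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate_horizon(rng):
    """Total events over the horizon, allowing at most one attack (assumption)."""
    attack_months, attack_events = 0, 0
    for month in range(1, HORIZON + 1):
        if rng.random() < P_ONSET:
            # The attack replays the historical profile, truncated at the horizon.
            attack_months = min(len(ATTACK_PROFILE), HORIZON - month + 1)
            attack_events = sum(ATTACK_PROFILE[:attack_months])
            break
    # The sum of independent Poisson months is Poisson with the summed mean.
    normal_events = poisson_draw(rng, LAMBDA_NORMAL * (HORIZON - attack_months))
    return attack_events + normal_events

def empirical_quantile(samples, level):
    ordered = sorted(samples)
    index = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[index]

rng = random.Random(12345)
totals = [simulate_horizon(rng) for _ in range(200_000)]
var_999 = empirical_quantile(totals, 0.999)
print(f"VaR 99.9% = {var_999} events")
```

Under these assumptions the upper tail is driven by scenarios in which the whole six-month attack (431 events) fits inside the horizon, with the simulated normal months adding a handful of events on top, so the quantile should land in the neighbourhood of the figure quoted above.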
An additional advantage of this last model is that it can easily be adapted to the case where a massive attack has begun or is ending (the number of losses is then calculated by analogy with the previous attack and with changes in the operational environment, such as the number of clients or cards).
Conclusion:
Of course, it is hard to generalize from a single example. However, the analyst should be very cautious when using a Poisson distribution to model operational risk events. If there is a period with an increased number of events, results may be biased because of the obvious limitation of the Poisson distribution, whose single parameter fixes both the mean and the variance. The negative binomial distribution seems better suited, but its fit may still be insufficient. One solution is a correctly weighted mixed model such as a Poisson mixture (one distribution for the “normal” period, the second for the period with an “increased number of events”). An attractive alternative is a conditional probability model: the “normal” period is modelled with a classical distribution (for instance a Poisson distribution) and the probability that a “period of increased number of events” begins is evaluated with a migration probability table (see table 2). Immediate applications are models of card fraud, phishing and skimming, where periods of massive attack are separated by periods of apparent calm. The latter model can also be adapted to the case of an attack that has already begun or is just ending.
PhD Robert M. Korona
[i] Tröbliger, A. (1961), “Mathematische Untersuchungen zur Beitragsrückgewähr in der Kraftfahrversicherung”, Blätter der Deutschen Gesellschaft für Versicherungsmathematik, 5, pp. 327-348.