Hello,
I am still confused about the difference between Probability and Adjusted Probability.
Can someone please help explain it using the case below?
support = 913 Prob = 0.196049528301887 ADJUSTEDPROBABILITY = 0.847722596057898
support = 1 Prob = 0.000294811320754717 ADJUSTEDPROBABILITY = 0.867073106641102
Why is the Adjusted Probability higher in the second case, where the support and probability are much lower?
Thanks
Kirthika
| | Kirthika Janani Monday, October 05, 2009 4:05 AM |
From the FAQ section of http://www.sqlserverdatamining.com:
19. What is $AdjustedProbability or PredictAdjustedProbability()?
MS Analysis Services DM has a formula to penalize popular items in prediction. Suppose the predicted probabilities of A and B being purchased are the same, say 10%, based on the customer's basket. Now suppose A is so popular that 10% of all customers buy it, while B is less popular and only 1% of them buy it (this is what we call the marginal probability: what we can tell even without looking at that particular customer's basket). In this case, you probably want to recommend B over A, because B would be more interesting to that particular customer. The $AdjustedProbability is a "lift" of the predicted probability over the marginal probability. The formula is internal and undocumented, and may change in the next release, but it looks something like the following:
AdjustedProbability = PredProb * (1 - MargProb) ^ Constant
As you can see, AdjustedProbability is not a probability per se any more. It's intended to be interpreted as "lift" of probability.
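To make the A/B example concrete, here is a minimal Python sketch of the approximate formula quoted above. The function name `adjusted_probability` and the default `constant=1.0` are assumptions made purely for illustration; the real constant and the exact formula are undocumented.

```python
def adjusted_probability(pred_prob: float, marg_prob: float, constant: float = 1.0) -> float:
    """Approximate form from the FAQ: PredProb * (1 - MargProb) ^ Constant.

    `constant` is a placeholder value; the real one is not documented.
    """
    return pred_prob * (1.0 - marg_prob) ** constant


# FAQ example: A and B both have a predicted probability of 10%,
# but A is bought by 10% of all customers and B by only 1%.
adj_a = adjusted_probability(pred_prob=0.10, marg_prob=0.10)
adj_b = adjusted_probability(pred_prob=0.10, marg_prob=0.01)

# The popular item A is penalized more, so B ranks above A.
print(f"A: {adj_a:.4f}  B: {adj_b:.4f}  recommend B first: {adj_b > adj_a}")
```

Note that for a non-negative constant this simplified form can never exceed PredProb, so it will not reproduce the values in the question, where ADJUSTEDPROBABILITY is larger than Prob. It only illustrates the ranking effect of penalizing popular items.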
Hope this helps
Shuvro
MSFT, SQL Server Data Mining
- Marked As Answer by Jin Chen MSFT, Moderator, Friday, October 16, 2009 2:12 AM
- Unmarked As Answer by Jin Chen MSFT, Moderator, Friday, October 16, 2009 2:12 AM
- Marked As Answer by Raymond-Lee MSFT, Moderator, Friday, October 16, 2009 2:06 AM
| | Shuvro Mitra Monday, October 05, 2009 9:14 PM |
I don't know the exact formula used to arrive at AdjustedProbability. However, I believe Adjusted Probability is preferable in some scenarios, because it takes into account the marginal probability, i.e. the probability at the overall population level. For instance, suppose the probability of predictable attribute A being "x" turns out to be 50% and of A being "y" 10% for a particular case (these are predicted probabilities from the algorithm you applied), while for the general population the probability of A being "x" is 80% and of A being "y" is 1%. For this particular case, A being "y" is clearly more likely than usual (10 times the average probability), whereas A being "x" is less likely than in the general population. So for this case, Adjusted Probability includes a "lift", making the Adjusted Probability for "y" more than 10% while lowering it for "x". The simple probability doesn't consider this factor. So, depending on how practical the adjusted probability turns out to be, you might want to go with AdjustedProbability or the direct probability. [This is what I understand, but I'm not very sure. Also, please double-check whether something else is causing it in your case.]
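The "10 times the average" reasoning in the x/y example can be checked with a few lines of Python. This sketch computes a plain ratio of predicted to marginal probability; the helper name `lift` is hypothetical, and this ratio is not claimed to be what Analysis Services computes internally.

```python
def lift(pred_prob: float, marg_prob: float) -> float:
    """Ratio of the predicted probability for this case to the population-level probability."""
    return pred_prob / marg_prob


# (predicted probability for this case, marginal probability in the population)
cases = {"x": (0.50, 0.80), "y": (0.10, 0.01)}
lifts = {state: lift(pred, marg) for state, (pred, marg) in cases.items()}

for state, value in lifts.items():
    direction = "above" if value > 1 else "below"
    print(f'A = "{state}": lift {value:.3f} ({direction} the population average)')
```

A ratio above 1 means the state is more likely for this case than for the population as a whole, which is the situation for "y" here; "x" comes out below 1 even though its raw probability is higher.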
..hegde
| | Mahesh_Hegde Monday, October 05, 2009 6:29 AM |