
Probability Vs. Adjusted Probability

Hello,

I am still confused about the difference between Probability and Adjusted Probability.

Can someone please explain it using the case below?

support = 913    Prob = 0.196049528301887       ADJUSTEDPROBABILITY = 0.847722596057898

support = 1      Prob = 0.000294811320754717    ADJUSTEDPROBABILITY = 0.867073106641102


Why is the Adjusted Probability higher in the second case, where the support and probability are much lower?

Thanks
Kirthika
Kirthika Janani  Monday, October 05, 2009 4:05 AM
From the FAQ section of http://www.sqlserverdatamining.com:

19. What is $AdjustedProbability or PredictAdjustedProbability()?

Microsoft Analysis Services data mining uses a formula that penalizes popular items in predictions. Suppose the predicted probabilities of A and B being purchased are the same, say 10%, based on the customer's basket. Now suppose A is so popular that 10% of all customers buy it, while B is less popular and only 1% of them buy it. (This is what we call the marginal probability: what we can tell even without looking at that particular customer's basket.) In this case, you probably want to recommend B over A, because B would be more interesting to that particular customer. The $AdjustedProbability is a "lift" of the predicted probability over the marginal probability. The formula is internal and undocumented, and it may change in the next release, but it looks something like the following:

 AdjustedProbability = PredProb * (1 - MargProb) ^ Constant

As you can see, AdjustedProbability is not a probability per se any more. It's intended to be interpreted as "lift" of probability.
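
To make the A/B example concrete, here is a minimal Python sketch of the approximate formula quoted above. It is only an illustration: the real formula is undocumented, the function name is hypothetical, and the value of Constant is not given anywhere in the FAQ, so 1.0 is used purely as a placeholder.

```python
def adjusted_probability(pred_prob: float, marg_prob: float, constant: float = 1.0) -> float:
    """Approximate form quoted in the FAQ: PredProb * (1 - MargProb) ^ Constant.

    `constant` is undocumented; 1.0 is a placeholder assumption,
    not the value Analysis Services actually uses.
    """
    return pred_prob * (1.0 - marg_prob) ** constant


# Items A and B from the FAQ example: same predicted probability (10%),
# different marginal probabilities (10% for A, 1% for B).
score_a = adjusted_probability(pred_prob=0.10, marg_prob=0.10)
score_b = adjusted_probability(pred_prob=0.10, marg_prob=0.01)

print(f"A: {score_a:.4f}")
print(f"B: {score_b:.4f}")
print("Recommend B first:", score_b > score_a)
```

With this form, the less popular item B scores above A for any positive constant, which is the ranking the FAQ describes. The absolute values should not be compared with what the server returns.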

Hope this helps
Shuvro


MSFT, SQL Server Data Mining
Shuvro Mitra  Monday, October 05, 2009 9:14 PM
I don't know the exact formula used to arrive at AdjustedProbability. However, I believe Adjusted Probability is the better choice in some scenarios,
because it takes into account the marginal probability, that is, the probability at the overall population level.
For instance, suppose
the probability of predictable attribute A being "x" turns out to be 50% and of A being "y" 10% for a particular case (these are predicted probabilities from the algorithm you applied),
and for the general population, the probability of A being "x" is 80% and of A being "y" is 1%.
Now, for this particular case, A being "y" is clearly more likely than usual (10 times the population average), whereas A being "x" is less likely than in the general population. For this case, Adjusted Probability includes a "lift" that makes the Adjusted Probability for "y" more than 10% while lowering it for "x". The simple probability doesn't consider this factor.
So, depending on how practical the adjusted probability turns out to be, you might want to go with AdjustedProbability or the direct probability.
[This is what I understand, but I am not very sure. Also, please double-check whether something else is causing it in your case.]
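
The "10 times the average" comparison in the example above is just the ratio of the predicted probability to the population probability. A small Python sketch of that arithmetic follows; the ratio is only an illustration of the idea and is not the formula Analysis Services uses, and the variable names are made up for this example.

```python
# Predicted (for this particular case) and marginal (general population)
# probabilities, taken from the example above.
predicted = {"x": 0.50, "y": 0.10}
marginal = {"x": 0.80, "y": 0.01}

# A ratio above 1 means the state is more likely for this case than on average;
# a ratio below 1 means it is less likely than on average.
ratio = {state: predicted[state] / marginal[state] for state in predicted}

for state, value in ratio.items():
    print(f"{state}: {value:.3f} times the population average")
```

Here "y" comes out at about 10 times its population average, while "x" comes out below 1.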

..hegde
Mahesh_Hegde  Monday, October 05, 2009 6:29 AM
