5.4.2 The Prosecutor Fallacy Revisited P124

Jun Wang
I might be wrong, but I would say
P(E) = 1/1000
P(E|H)=9/9999=1/1111

This gives P(H|E)=0.9
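
A quick check of these figures, as a minimal Python sketch. It assumes (an assumption of the sketch, not something the book states) that exactly 1 in 1000 of the 10,000 possible suspects match, so 10 people match and 9 of them are innocent. H is taken to mean "Fred is innocent", and the variable names are made up for illustration.

```python
from fractions import Fraction

# Assumption: the 1-in-1000 rate holds exactly within the town of 10,000.
population = 10_000
match_rate = Fraction(1, 1000)

matching = population * match_rate          # people in town with the blood type
innocent_matching = matching - 1            # everyone matching except the culprit
innocent_total = population - 1

p_e_given_h = innocent_matching / innocent_total   # P(E|H), H = "innocent"
p_h_given_e = innocent_matching / matching         # P(H|E) by direct counting

print(p_e_given_h)   # 1/1111
print(p_h_given_e)   # 9/10
```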

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

norman@eecs.qmul.ac.uk
Administrator
I will look at this on 28 May

Norman Fenton

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

norman@eecs.qmul.ac.uk
Administrator
In reply to this post by Jun Wang
The example says that 'about 1 in a 1000 people have this blood type'. The assumption here is that this proportion applies universally, not just to the 10,000 adults mentioned later in the example as possible suspects. So it is fair to assert that the probability of a randomly selected innocent person having the matching blood type is about 1 in 1000, i.e. P(E | H) = 1/1000.
In fact P(E) is precisely what is calculated in the denominator of the Bayes equation, i.e. it is equal to
P(E|H)*P(H) + P(E|not H)*P(not H) = 10,999/10,000,000, which is slightly more than 1/1000.
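
The denominator above can be reproduced with exact fractions. This is only a sketch: it assumes H means "Fred is innocent", a prior of 1/10,000 for guilt, and P(E|not H) = 1 (the culprit certainly matches), which is what the 10,999 figure implies. The variable names are illustrative.

```python
from fractions import Fraction

# H = "Fred is innocent"; not_h = "Fred is guilty".
p_not_h = Fraction(1, 10_000)        # assumed prior: one culprit among 10,000
p_h = 1 - p_not_h
p_e_given_h = Fraction(1, 1000)      # an innocent person matches by chance
p_e_given_not_h = Fraction(1)        # assumed: the culprit matches for certain

# Law of total probability: the denominator of Bayes' theorem.
p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h
p_h_given_e = p_e_given_h * p_h / p_e

print(p_e)                  # 10999/10000000
print(float(p_h_given_e))   # posterior probability of innocence
```

Under these assumptions the posterior comes out as 9999/10999, a little above the 0.9 obtained by fixing P(E) at exactly 1/1000.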

Norman

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

Vladimir
First, thanks for the book - quite interesting and new (for me at least).

But your reasoning seems wrong:
H = hypothesis "Fred is innocent" and h = "not H" (Fred is guilty).

The statement "about 1 in a 1000 people have this blood type" is an UNCONDITIONAL probability (confidence): P(E) = 1/1000.

Then Bayes' theorem says (for hypothesis h): P(h|E) = P(E|h)*P(h)/P(E).

If the population is 10,000 and there is no other evidence against Fred,
then "prior" P(h) = 1/10,000.

Apparently P(E|h) = 1 ("Fred WAS present at the crime scene") and then we have:
P(h|E) = 1*(1/10,000)/(1/1000) = 0.1 and P(H|E) = 0.9

To calculate P(E|H) we note P(E) = P(E|h)*P(h) + P(E|H)*P(H).
Prior P(H) = 9,999/10,000 (anybody at the scene, except Fred) and after some math:

P(E|H) = (P(E) - P(E|h)*P(h))/P(H) = (1/1000 - 1*1/10,000)/(9,999/10,000), i.e.

P(E|H) = 1/1,111.
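
The two steps above can be checked with exact arithmetic. A minimal sketch, taking P(E) = 1/1000 as a fixed input exactly as this post does (whether that is the right reading of the example is the point under discussion); the variable names are illustrative.

```python
from fractions import Fraction

p_e = Fraction(1, 1000)          # treated here as a fixed, unconditional rate
p_guilty = Fraction(1, 10_000)   # P(h)
p_innocent = 1 - p_guilty        # P(H)
p_e_given_guilty = Fraction(1)   # P(E|h)

# Bayes' theorem for the guilt hypothesis h.
p_guilty_given_e = p_e_given_guilty * p_guilty / p_e

# Total probability, P(E) = P(E|h)*P(h) + P(E|H)*P(H), rearranged for P(E|H).
p_e_given_innocent = (p_e - p_e_given_guilty * p_guilty) / p_innocent

print(p_guilty_given_e)     # 1/10
print(p_e_given_innocent)   # 1/1111
```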

The reasoning: if we know for sure that Fred (with his blood type) WAS NOT at the crime scene, the value of P(E|H) must be LESS than P(E), as he has to be excluded both from the number of matching suspects (now 9 instead of 10) and from the population (9,999 instead of 10,000).

Anyway, thanks again for the book
Vladimir

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

norman@eecs.qmul.ac.uk
Administrator
Vladimir

The confusion which you are picking up on in this example concerns the 'population', together with the background knowledge.
Neither of these is made explicit in the example (we will fix this in a future edition).
There are actually two populations being referred to: when we refer to the 1 in a 1000 having the matching blood type we are really talking about the 'world population' rather than the 10,000 adult males who were in the town (and are the only ones we assume are possible suspects).
Now we know that one of the 10,000 (the guilty person) has the matching blood type.
The question is how many of the other 9,999 innocent people will also have that blood type.
Now if you assume that the 1 in 1000 match is somehow 'exact' for all samples of 10,000 people then we know that 10 people from the 10,000 will match. So that would mean exactly 9 innocent people of the 10,000 match. On that basis you could conclude - as you do - that P(H|E) = 0.9.
HOWEVER, because the 1 in 1000 blood type match refers to the entire population, and not uniformly to each sample of 10,000, we have to assume that the probability of any one of the 9,999 innocent people having the same blood type is still 1/1000, i.e. P(E|H) = 1/1000.
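
To see why the rate need not hold exactly within one town, here is a small sketch. It assumes, purely for illustration, that each of the 9,999 innocent people matches independently with probability 1/1000, so the number of innocent matches follows a binomial distribution; `prob_matches` is a hypothetical helper name.

```python
from math import comb

n_innocent = 9_999
p_match = 1 / 1000   # assumed per-person match probability

def prob_matches(k: int) -> float:
    """P(exactly k innocent people match) under Binomial(n_innocent, p_match)."""
    return comb(n_innocent, k) * p_match**k * (1 - p_match) ** (n_innocent - k)

# The mean number of innocent matches is close to 10, but the count varies.
print(n_innocent * p_match)
for k in (5, 9, 10, 15):
    print(k, round(prob_matches(k), 4))
```

Under this assumption, exactly 9 innocent matches is just one of many possible outcomes, which is why a fixed count of 9 is not guaranteed for any particular sample of 10,000.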

Norman Fenton
When we are talking about P(E | H) we are actually using the 10,000 population rather than the world population, so technically taking P(E | H) = 1/1000 is an approximation. We should actually have used
P(E | H) = 9/9999

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

Vladimir
Norman, thanks for the prompt reply.

I'm sorry to say, but I used that 10,000 figure just to follow your example. You can easily generalize the case to the entire Earth (or Universe :)) population, let's say N:

Provided there is no other evidence against Fred, he is guilty (hypothesis "h") with prior probability P(h) = 1/N, i.e. P(H) = (N-1)/N. If N -> inf. then Fred is practically innocent (if there is NO other evidence, i.e. no prior knowledge).

As before:
- P(E|h)=1, of course (he was at the scene with his blood type);
- P(E) = 1/1000 (or whatever, it can be ANY figure!) - UNCONDITIONAL probability. It goes without saying that N must be > 1000!

Now P(h|E) = (1*(1/N))/P(E) = 1/(N*P(E)).

Then P(H|E) = 1-P(h|E) = (N*P(E)-1)/(N*P(E)).

Now we go to P(E|H) = (P(E) - P(E|h)*P(h))/P(H):

P(E|H) = (P(E) - 1*(1/N)) / ((N-1)/N) = (N*P(E)-1)/(N-1) = P(E)*[(N-1/P(E))/(N-1)].

Clearly the factor in the brackets, (N-1/P(E))/(N-1), is < 1 and therefore P(E|H) < P(E).

Therefore the example in your book implies N -> infinity, where P(E|H) = P(E). In the case of a "limited" population, which CAN have access to the site, e.g.
- if P(E) =1/1000 then P(E|H) = 1/1000*(N-1000)/(N-1).
- if N=10,000 then 1/1000*9,000/9,999=1/1,111 as before
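
The general-N formulas above can be tabulated as follows. This is a sketch under the same assumptions as the post (P(E|h) = 1, P(h) = 1/N, and P(E) held fixed); the function names are hypothetical.

```python
from fractions import Fraction

def p_e_given_innocent(n: int, p_e: Fraction) -> Fraction:
    """P(E|H) = (N*P(E) - 1) / (N - 1), assuming P(E|h) = 1 and P(h) = 1/N."""
    return (n * p_e - 1) / (n - 1)

def p_innocent_given_e(n: int, p_e: Fraction) -> Fraction:
    """P(H|E) = (N*P(E) - 1) / (N*P(E)), under the same assumptions."""
    return (n * p_e - 1) / (n * p_e)

p_e = Fraction(1, 1000)
for n in (10_000, 1_000_000, 1_000_000_000):
    print(n, float(p_e_given_innocent(n, p_e)), float(p_innocent_given_e(n, p_e)))
```

As N grows, P(E|H) approaches P(E) from below and P(H|E) approaches 1, in line with the limiting argument in the post.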

Hope it clarifies.

cheers
Vladimir



Re: 5.4.2 The Prosecutor Fallacy Revisited P124

norman@eecs.qmul.ac.uk
Administrator
Vladimir

Just as the prior P(h) (=1/10,000) is conditioned on the background knowledge of there being only 10,000 possible suspects, so must P(E) be conditioned on the same background knowledge.
Hence it is WRONG to assume P(E)=1/1000.
In fact P(E) = P(E|H)*P(H) + P(E|h)*P(h), and this is exactly the denominator used on page 124; it works out as 10,999/10,000,000, approximately 0.0011,
i.e. P(E) is greater than 1/1000.

Norman Fenton

Re: 5.4.2 The Prosecutor Fallacy Revisited P124

Vladimir
Norman,

Once again - I do NOT make any suggestions on the exact figures:
P(E) = probability of the statement "about 1 in a XXX people have this blood type" shall NOT depend on whether Fred is guilty or innocent, i.e. it can be any figure: e.g. it can be 1/4 on average (as if the 4 blood types known years ago were equally distributed among the entire Earth population).

In your reasoning we have exactly this:
- if Fred is innocent, then "about 1 in a 1000 people have this blood type"
- and what if Fred is guilty? Does it change overall [worldwide/citywide/villagewide] statistics?

If I follow your logic, I have to revise the scientific (or assumed to be scientific/statistical/objective - choose the name you prefer) PRIOR knowledge that "about 1 in a 1000 people have this blood type" and make it dependent on the innocence of Fred.

I.e. I have to assume P(E | exactly 10,000 men in this particular city are tested AND Fred is innocent) = 1/1000 - then you are right. But I'm doubtful that this was the meaning in your example.

cheers
Vladimir