How to find the probabilities of a list of lists using Python and without any libraries

Find the probabilities

Given a list of lists where each sublist has length 2, i.e. [[x,y],[p,q],[l,m]…[r,s]], consider it as a matrix of n rows and two columns.

  1. The first column F will contain only 5 unique values (F1, F2, F3, F4, F5)
  2. The second column S will contain only 3 unique values (S1, S2, S3)

How do I find:
a. P(F=F1|S=S1), P(F=F1|S=S2), P(F=F1|S=S3)
b. P(F=F2|S=S1), P(F=F2|S=S2), P(F=F2|S=S3)
c. P(F=F3|S=S1), P(F=F3|S=S2), P(F=F3|S=S3)
d. P(F=F4|S=S1), P(F=F4|S=S2), P(F=F4|S=S3)
e. P(F=F5|S=S1), P(F=F5|S=S2), P(F=F5|S=S3)

I could not find any solution for this. Could any of you shed some light on conditional probability with Python?

Do you understand the mathematics of conditional probability?


To my understanding, conditional probability is the probability of an event given that some condition is taken to be true. Mathematically
P(A|B) = P(A intersection B)/P(B)

You have, indeed, read the first line of the Wikipedia page.

So keep going. Take your first problem:

Define A, B, P(B), and P(A n B).

Here, I’ll give you this as a starting point: a tester for your code.

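import random

def condprob(i, j, k):
    # Your code goes here: return P(F == i | S == j) for the list of [F, S] pairs k.
    raise NotImplementedError

# Build 100 random [F, S] pairs: F in 1..5, S in 1..3.
K = []
for x in range(100):
    K.append([random.randint(1, 5), random.randint(1, 3)])

print(K)
print("===")
for j in range(3):
    total = 0
    for i in range(5):
        c = condprob(i + 1, j + 1, K)
        total += c
        print(str(i + 1) + "," + str(j + 1) + "," + str(c))
    # For each S, the five conditional probabilities should sum to 1 (up to rounding).
    print("Total " + str(total))
    print("===")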

All you have to do at this point is define the function. Technically I used the random library, but that doesn’t mean you need it to generate the answers.
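For readers who want to check their own attempt, here is a minimal sketch of one way the stub could be filled in by plain counting. It is not the responder's solution (the thread never shows one), the `sample` list is made up, and returning 0.0 when the condition never occurs is an assumption, since the probability is undefined in that case.

```python
def condprob(i, j, k):
    """Return P(F == i | S == j) estimated from the [F, S] pairs in k."""
    # Keep only the rows where the condition S == j holds.
    rows_with_s = [row for row in k if row[1] == j]
    if not rows_with_s:
        # Assumption: report 0.0 when S == j never occurs (strictly, it is undefined).
        return 0.0
    # Among those rows, count how many also have F == i.
    matches = sum(1 for row in rows_with_s if row[0] == i)
    return matches / len(rows_with_s)

# Small made-up sample, not data from the thread.
sample = [[1, 1], [2, 1], [1, 2], [1, 1]]
p_f1_given_s1 = condprob(1, 1, sample)   # 2 of the 3 rows with S == 1 have F == 1
total = sum(condprob(i, 1, sample) for i in range(1, 6))
print(p_f1_given_s1, total)
```

The tester's "Total" line is a useful sanity check: for a fixed S that occurs at least once, the conditional probabilities over all F values should add up to 1.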


I am just trying to analyse it.
Given n combinations, let’s say n=10 combinations of F and S:
P(F=F1|S=S1) = P(F1 intersection S1) / P(S1)
If the knowledge that event A occurs does not change the probability that event B occurs, then A and B are independent events, and thus
P(B|A) = P(B)
As per Bayes’ theorem, P(A intersection B) = P(B|A)P(A)/P(B) = P(B)P(A) / P(B)P(B) = P(A)/P(B)

Therefore P(F1|S1) = P(S1|F1)P(F1)/P(S1)P(F1) = P(F1)/P(S1)
P(F1) = 1/5 and P(S1) = 1/3
-> (1/5)/(1/3) = 0.6, but the answer given is 1/4. How is this possible?
I am confused by the math here.

Thank you very much

“P(F1) = 1/5 and P(S1) = 1/3”

This is only true for the probabilistic case, where every value is assumed equally likely.
For a GIVEN list of terms, P(F1) may not be 1/5.
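To make that point concrete, here is a small sketch that measures the marginal probabilities directly from a list instead of assuming them. The helper name `marginal` and the four-row list are made up for illustration.

```python
def marginal(rows, column, value):
    """Fraction of rows whose entry in `column` equals `value`."""
    return sum(1 for row in rows if row[column] == value) / len(rows)

# Made-up list: F1 appears in 3 of the 4 rows, so P(F1) is far from 1/5 here.
rows = [["F1", "S1"], ["F1", "S2"], ["F1", "S3"], ["F2", "S1"]]
p_f1 = marginal(rows, 0, "F1")   # 3 of 4 rows
p_s1 = marginal(rows, 1, "S1")   # 2 of 4 rows
print(p_f1, p_s1)
```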

Also, you’ve made a very large assertion here:

“If the knowledge that event A occurs does not change the probability that event B occurs, then A and B are independent events”

But it does. If A and B are independent, then P(B|A) = P(B), and P(A|B) = P(A), and then you don’t need to do any math at all, and P(A|B) is P(A), so your answer would be 1/5.

The assumption is that A and B are NOT independent, otherwise the probability calculation is irrelevant.

So here’s a concrete example for your brain to chew on.
[F,S]:
[[1,3],[1,2],[2,4],[2,2],[3,2]]

What is P(F1|S2)?

def compute_conditional_probabilities(A):
    # Group the rows by F value, then by (F, S) pair.
    # This only collects the rows; no probabilities are computed yet.
    f_values = ['F1', 'F2', 'F3', 'F4', 'F5']
    s_values = ['S1', 'S2', 'S3']
    rows_by_f = {f: [] for f in f_values}
    rows_by_fs = {(f, s): [] for f in f_values for s in s_values}
    for f in f_values:
        # Collect every row whose first column is this F value.
        for row in A:
            if row[0] == f:
                rows_by_f[f].append(row)
                print(rows_by_f[f])
        # Split those rows by their second column.
        for row in rows_by_f[f]:
            for s in s_values:
                if row[1] == s:
                    rows_by_fs[(f, s)].append(row)
                    print(rows_by_fs[(f, s)])

I have written the above code, but now my problem is calculating the probabilities. How can we achieve that?

You tell me.
What do you need to calculate to determine the conditional probability for a given set of values?
Based on what you’ve written, what is the formula for P(S1)? What is the formula for P(F1 n S1)?
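As a rough illustration of those two quantities, the sketch below computes P(S1) and P(F1 n S1) as fractions of the whole list and then divides them, following the definition quoted earlier in the thread. The four rows are made up, and the variable names are only for illustration.

```python
# Made-up [F, S] rows, not data from the thread.
rows = [["F1", "S1"], ["F2", "S1"], ["F1", "S2"], ["F3", "S1"]]
n = len(rows)

# P(S1): fraction of all rows where S is S1.
p_s1 = sum(1 for f, s in rows if s == "S1") / n

# P(F1 n S1): fraction of all rows where F is F1 AND S is S1.
p_f1_and_s1 = sum(1 for f, s in rows if f == "F1" and s == "S1") / n

# P(F1|S1) = P(F1 n S1) / P(S1)
p_f1_given_s1 = p_f1_and_s1 / p_s1
print(p_s1, p_f1_and_s1, p_f1_given_s1)
```

Because both fractions share the denominator n, it cancels: the result is simply the number of rows matching both conditions divided by the number of rows matching S1.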

P(A intersection B) = P(A|B) * P(B) = (P(A)*P(B|A)*P(B))/P(B) = P(A)*P(B|A)
I am still trying to figure it out.

Okay, ignore the probabilities for a moment.

If I tell you that A = S2, and B = F1, then what does A n B look like? What element qualifies to be part of A n B?

Since both are independent, there is no intersection between S2 and F1;
A n B = S2 n F1 = 0. Am I making sense?

They’re not independent, so no.

Please help me understand.

Okay. Let’s… go more basic. Forget the numbers altogether.

You have balls in a bag.

The balls are various colors (red,green,blue,yellow,and purple)

Let’s say in our basic and hyperbolic example, I’ve got 4 balls in the bag:
A green one
A red one
A green one
and a purple one.

Before I do anything, what is my probability of pulling a green ball out of the bag? (P(A))
What is my probability of pulling a purple ball out of the bag? (P(B))
Let’s say I pulled a green ball out of the bag. So A happened.
What NOW is the probability of me pulling a purple ball out of the bag? (P(B|A))

P(A)=1/4
P(B)=1/4
P(B|A)=1/3
Am I correct?

If there are two green balls in the bag, how is P(A) 1/4?

My mistake, sorry.
Pulling a green ball = P(A) = 2/4 = 1/2

Right.

So you see that the probability of pulling a purple ball changed from P(B) = 1/4 to P(B|A) = 1/3 when I told you that A happened.

That means that A and B are not independent: the probability of one changed because the other event occurred.
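The bag example can be checked with a few lines of counting. This is only a sketch: the helper `prob` is a made-up name, and it assumes the green ball that was pulled is not put back.

```python
def prob(bag, color):
    """Probability of pulling `color` from `bag`, each ball equally likely."""
    return bag.count(color) / len(bag)

bag = ["green", "red", "green", "purple"]
p_a = prob(bag, "green")    # 2 of 4 balls are green
p_b = prob(bag, "purple")   # 1 of 4 balls is purple

# A happened: one green ball has been pulled out and is not returned.
remaining = list(bag)
remaining.remove("green")
p_b_given_a = prob(remaining, "purple")   # 1 of the 3 remaining balls

print(p_a, p_b, p_b_given_a)
```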

Contrast: If I flip a (fair) coin and it lands on heads, what are the odds of it being tails when I flip it again?

A: Well, assuming a fair coin, P(A) [flipping the coin and landing on heads] = 1/2, and P(B) [flipping the coin and landing on tails] = 1/2. If I flip the coin and it lands on heads, my odds of flipping it again and landing on tails are… still 1/2. I haven’t changed anything about the coin. So P(B|A) = P(B). The two coin flips are independent events. (This is the reasoning that disproves the Gambler’s Fallacy: “It flipped heads 4 times in a row, so it’s gotta come up tails next!”)
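The coin case can be verified by listing the four equally likely outcomes of two fair flips. In this sketch, A is "first flip is heads" and B is "second flip is tails"; the variable names are made up for illustration.

```python
# All equally likely outcomes of two fair flips: HH, HT, TH, TT.
outcomes = [(first, second) for first in "HT" for second in "HT"]

# P(B): second flip is tails, over all outcomes.
p_b = sum(1 for o in outcomes if o[1] == "T") / len(outcomes)

# P(B|A): second flip is tails, looking only at outcomes where the first was heads.
first_heads = [o for o in outcomes if o[0] == "H"]
p_b_given_a = sum(1 for o in first_heads if o[1] == "T") / len(first_heads)

print(p_b, p_b_given_a)
```

Both values come out the same, which is exactly what independence means; in the bag example they differ.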

Now let’s go back a step towards your problem, with the example I gave you earlier.

[F,S]:
[[1,3],[1,2],[2,4],[2,2],[3,2]]

What’s the probability of S=2? (P(S2))
What’s the probability of F=1? (P(F1))
What’s the probability of F=1 if I tell you that S IS 2? (P(F1|S2))
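One way to check an answer to these questions, and to the original a-e list, is to count every (F, S) pair in a single pass and divide by the count of the matching S. This is a sketch only; the dictionary names are made up, and the rows are the example list above.

```python
rows = [[1, 3], [1, 2], [2, 4], [2, 2], [3, 2]]

s_counts = {}      # how many rows have each S value
pair_counts = {}   # how many rows have each (F, S) combination
for f, s in rows:
    s_counts[s] = s_counts.get(s, 0) + 1
    pair_counts[(f, s)] = pair_counts.get((f, s), 0) + 1

# P(F=f | S=s) = count of (f, s) rows / count of rows with that s.
table = {pair: count / s_counts[pair[1]] for pair, count in pair_counts.items()}

for (f, s), p in sorted(table.items()):
    print("P(F=" + str(f) + "|S=" + str(s) + ") = " + str(p))
```

Pairs that never occur are simply absent from `table`; their conditional probability is 0 whenever that S value appears at all.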