Statistics Question

AZPaul:

--- Quote ---B, C, D and E are all user defined, ranging basically from 100% to 0%. Because they are independent, B + C + D + E do not have to add up to 100%, or 0%, or 300%, or any other value.
--- End quote ---

So, OK, something is amiss because right now I have:

All values are probabilities between 0-100%

Your original relationship is:

Prob A = (prob B ) or (prob C) or (prob D) or (prob E)

A, then, is the value of the highest prob in the list.

Which means that if B=20, C=10, D=50 and E=5 then A=50%, the value of D, yes?

Now you want A = 10%? Yet maintain the ratio balance between BCDE?

So the highest prob (D) must come down to 10%. The ratios with the others are just basic math: reduce A (and thus D) to 1/5 of its value and reduce all the other values to 1/5 as well, i.e. divide everything by 5. A=10%, thus B=4%, C=2%, D=10%, E=1%.

I'll take the shvarz route here and say that something is missing and I do not understand the question.

Numsgil:
Remember that Pr(A or B) = Pr(A) + Pr(B) - Pr(A) * Pr(B) when the two events are independent.

If you just pick the largest, you're assuming that all the other probabilities are inside each other.

That is, you're assuming that if Pr(B) > Pr(C), then C implies (->) B, which just isn't true.
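To make the difference concrete, here is a minimal sketch comparing "pick the largest" with the OR rule for independent events, using the B=20% and D=50% figures from the earlier post. The function name or_independent is made up for illustration.

```python
def or_independent(p, q):
    """Probability that at least one of two independent events happens."""
    return p + q - p * q

b, d = 0.20, 0.50

# "Pick the largest" is only right if the smaller event is contained in the larger one.
print("largest:", max(b, d))

# For independent events the result is larger than either input (about 0.6 here).
print("independent OR:", or_independent(b, d))
```

The independent OR is never smaller than the largest input, and it is strictly larger whenever both probabilities are strictly between 0 and 1.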

AZPaul:

--- Quote ---Probability of A = (B or C or D or E)
--- End quote ---

The original relationship is not as above?

What then is the formula?

If indeed the above is correct then 'A' MUST equal the highest value in the list.

Numsgil:
A is, by definition, B or C or D or E.

A
= B or C or D or E
= (B or C) or (D or E)
= (B + C - B*C) or (D + E - D*E)
= (B + C - B*C) + (D + E - D*E) - (D + E - D*E)*(B + C - B*C)
= B + C - BC + D + E - DE - BD - CD + BCD - BE - CE + BCE + BDE + CDE - BCDE
= B + C + D + E - BC - BD - BE - CD - CE - DE + BCD + BCE + BDE + CDE - BCDE
(check over my math)

that's when my mind explodes.
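One way to sidestep the long expansion, assuming all the events really are independent: "at least one happens" is the complement of "none happen", so A = 1 - (1-B)(1-C)(1-D)(1-E), and the same shape works for any N. The sketch below checks that shortcut against a brute-force inclusion-exclusion sum (add the single terms, subtract the pairs, add the triples, and so on). The function names are made up for illustration, and the sample values are the B, C, D, E figures from earlier in the thread.

```python
from itertools import combinations
from math import prod

def or_by_complement(probs):
    """1 minus the probability that none of the independent events happen."""
    return 1.0 - prod(1.0 - p for p in probs)

def or_by_inclusion_exclusion(probs):
    """Alternating sum over all products of k probabilities, k = 1..N."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * sum(prod(combo) for combo in combinations(probs, k))
    return total

probs = [0.20, 0.10, 0.50, 0.05]  # B, C, D, E
print(or_by_complement(probs))           # about 0.658
print(or_by_inclusion_exclusion(probs))  # same value, up to rounding
```

The complement form needs only N multiplications, while the expanded form has 2^N - 1 terms, so the shortcut is the one worth using once N grows.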

Also, I'd like the general case of the problem, not just this specific case with 4 variables. That is, I want A = OR (i = 1 to N) Xi, where N is variable and OR is the or of the whole sequence.

I know that the number of terms of each size kind of follows the binomial coefficients (for 4 variables: 1 term with no letters, 4 letter, 6 letter*letter, 4 letter*letter*letter, 1 letter*letter*letter*letter), which could be useful.
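A quick sanity check of that term count, as a sketch: the number of products made of k distinct letters out of N is "N choose k", which can be confirmed by listing the combinations. The function name term_counts is made up for illustration.

```python
from itertools import combinations
from math import comb

def term_counts(n):
    """Number of distinct k-letter products for k = 0..n, found by enumeration."""
    letters = range(n)
    return [sum(1 for _ in combinations(letters, k)) for k in range(n + 1)]

print(term_counts(4))                     # [1, 4, 6, 4, 1]
print([comb(4, k) for k in range(5)])     # same list from the binomial coefficients
```

The k = 0 entry is the empty product, which does not show up as a term in the expansion of the OR itself.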

I know in the end you'll get N equations in N unknowns, and will probably have to solve it with matrices.
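A possible alternative to matrices, offered only as a sketch and assuming the problem is the one restated earlier in the thread: the events are independent, the ratios between the Xi stay fixed, and a single scale factor is adjusted until the combined OR hits a target value of A. Since the combined OR only goes up as the scale factor goes up, a plain bisection search finds it. The names or_independent and scale_for_target are made up for illustration.

```python
from math import prod

def or_independent(probs):
    """Probability that at least one of several independent events happens."""
    return 1.0 - prod(1.0 - p for p in probs)

def scale_for_target(weights, target, iterations=100):
    """Find k so that the OR of (k * w for each weight) equals target.

    Assumes positive weights and 0 <= target <= 1. The upper bound
    1 / max(weights) is where the largest scaled value reaches 100%.
    """
    lo, hi = 0.0, 1.0 / max(weights)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if or_independent([mid * w for w in weights]) < target:
            lo = mid  # combined OR too small: scale up
        else:
            hi = mid  # combined OR too large (or equal): scale down
    return (lo + hi) / 2.0

weights = [0.20, 0.10, 0.50, 0.05]  # B, C, D, E from the earlier example
k = scale_for_target(weights, 0.10)
scaled = [k * w for w in weights]
print(scaled)                  # same ratios as the inputs, smaller values
print(or_independent(scaled))  # close to the 0.10 target
```

This works for any N, because only the one unknown k is being solved for; the ratios between the values are carried along unchanged.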

shvarz:
Is this the mutation frequency calculation I was asking about? It actually would help if you described exactly the problem you are trying to solve.
