12 Jul '15 10:47 (4 edits)

I have been trying to understand one of the equations in the Wikipedia article on the German tank problem, linked below, but I am having trouble with it:

https://en.wikipedia.org/wiki/German_tank_problem

Near the very top of that page, just under “Examples”, it says:

“Suppose an intelligence officer has spotted k = 4 tanks with serial numbers, 2, 6, 7, and 14, with the maximum observed serial number, m = 14. The unknown total number of tanks is called N.

...

...the Bayesian analysis below yields (primarily) a probability mass function for the number of tanks ...”

It then gives the equation for that probability, starting with:

“ Pr(N = n) = ...”

Now, for the condition “if n < m”, the result makes perfect sense to me: the probability is 0, because the actual total number of tanks cannot be less than the largest serial number observed!

But the condition “if n >= m” is where I get confused. The equation it gives appears to me to be the product of two fractions, the first being “(k – 1)/k”. The problem is that k is supposed to be the number of observed tanks, so if only one tank was observed, i.e. if k = 1, the “(k – 1)” part equals 0 and we have:

(k – 1)/k = 0/1 = 0

That would mean that, whatever the value of the second fraction, the product of the two fractions is always 0 (because anything multiplied by 0 is 0). So, if I read the equation right, the probability for every n (when k = 1) is always 0, which is total nonsense! The probabilities of an exhaustive list of mutually exclusive possibilities cannot add up to zero!

So I assume I have misread that equation? If so, how so? If not, how can it make any sense?

OK, let's ignore that problem for a moment, consider only k values greater than 1, and look at the second fraction, which I read as the value of one binomial coefficient divided by the value of another. Have I read that right?

If so, each of the two binomial coefficients is written as a single vertical column, which I cannot reproduce here, so let's write the one in the numerator (top half of the fraction) as:

C(m – 1, k – 1)

Then, according to the equation shown on https://en.wikipedia.org/wiki/Binomial_coefficient , and if we let g = m – 1 and h = k – 1, that numerator is equal to:

C(m – 1, k – 1) = C(g, h) = g! / ( h! (g – h)! )

and the denominator of that second fraction is:

C(n, k) = n! / ( k! (n – k)! )

so the whole equation for Pr(N = n) when n >= m is:

Pr(N = n) = ( (k – 1)/k ) × ( ( g! / ( h! (g – h)! ) ) / ( n! / ( k! (n – k)! ) ) )

(if n >= m, with g = m – 1 and h = k – 1)
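To make my reading concrete, here is a minimal Java sketch of the formula as I understand it: the first fraction multiplied by the ratio of the two binomial coefficients. The class and method names (TankFormulaSketch, binomial, pr) are my own inventions for illustration, and the whole thing assumes my reading of the equation is the intended one.

```java
import java.math.BigInteger;

public class TankFormulaSketch {
    // C(n, k) = n! / (k! (n - k)!), built up multiplicatively so that
    // no large factorial is ever computed.
    static BigInteger binomial(int n, int k) {
        if (k < 0 || k > n) return BigInteger.ZERO;
        k = Math.min(k, n - k); // C(n, k) = C(n, n - k)
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    // Pr(N = n) as I read it (an assumption, not a confirmed reading):
    // 0 if n < m, otherwise ((k - 1) / k) * C(m - 1, k - 1) / C(n, k).
    static double pr(int n, int m, int k) {
        if (n < m) return 0.0;
        double ratio = binomial(m - 1, k - 1).doubleValue()
                     / binomial(n, k).doubleValue();
        return ((k - 1) / (double) k) * ratio;
    }

    public static void main(String[] args) {
        System.out.println(binomial(13, 3)); // 286, the numerator for m = 14, k = 4
        System.out.println(binomial(14, 4)); // 1001, the denominator for n = 14, k = 4
        System.out.println(pr(14, 14, 4));   // approximately 0.2143, i.e. 3/14
        System.out.println(pr(20, 14, 1));   // 0.0, the k = 1 case that puzzles me
    }
}
```

With the article's example values m = 14 and k = 4, this gives (3/4) × (286/1001) = 3/14 for n = 14, and it reproduces the zero I described for k = 1.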

Have I got that right? I ask because I wrote a Java program for this, and when I iterate to sum the probabilities over all possible n values for some k > 1, the sum appears to converge not to exactly 1 but to a value very close to, and just under, 1, which doesn't quite make sense. That said, because the formula doesn't appear to make sense for k = 1 anyway (it always gives a sum of probabilities of 0), I assume I have gone wrong somewhere before that, right?
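For reference, the summation check I am describing can be sketched roughly as follows. This is not my original program, only a minimal illustration; the names TankSumSketch and partialSum and the cutoff nMax are hypothetical. It assumes the product reading of the formula, under which Pr(N = m) simplifies to (k – 1)/m and each step from n to n + 1 multiplies the probability by (n + 1 – k)/(n + 1), so no factorials are needed.

```java
public class TankSumSketch {
    // Sum of Pr(N = n) for n = m .. nMax, assuming
    // Pr(N = n) = ((k - 1) / k) * C(m - 1, k - 1) / C(n, k) for n >= m.
    // nMax is an arbitrary cutoff, since the sum over all n cannot be run to infinity.
    static double partialSum(int m, int k, int nMax) {
        double p = (k - 1) / (double) m; // Pr(N = m): the formula evaluated at n = m
        double sum = 0.0;
        for (int n = m; n <= nMax; n++) {
            sum += p;
            // C(n + 1, k) = C(n, k) * (n + 1) / (n + 1 - k), so Pr shrinks by this factor.
            p *= (n + 1 - k) / (double) (n + 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(partialSum(14, 4, 1000)); // k = 4
        System.out.println(partialSum(14, 2, 1000)); // k = 2
        System.out.println(partialSum(14, 1, 1000)); // k = 1: every term is 0
    }
}
```

Any finite cutoff leaves out terms that are all positive when k > 1, so a truncated sum has to come out slightly below whatever the full sum is. With m = 14 and a cutoff of 1000, the shortfall from 1 is about 0.013 for k = 2 and far smaller for k = 4, which may be all that I am seeing.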
