- 25 Nov '16 15:56 / 7 editsI am trying to derive a normalizing constant for a probability distribution but completely stuck on solving the required very difficult integration for it.

I tried using wolf-alpha but couldn't get that to work for me.

I indicate subscript n with " {n} ".

h{n} ∈ {h{1}, h{2}, h{3}, ... h{m} }

where each h{n} is a randomly sampled real number quantity from a probability distribution thus all these h values are known but there is no particular mathematical pattern to this data set. m is just the number of sampled h values ans m isn't allowed to be zero.

K can be treated here as just an arbitrary constant and is one of the probability distribution's parameters.

K, x, h{n} ∈ ℝ

K, x, h{n} ∈ (–∞, ∞ )

m, n ∈ ℕ

m, n ∈ [1, ∞ )

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

This one has got me totally stumped. I fail to even see which integration rules I can use here. But, in this case, I don't mind if I don't have the derivation method as long as I have the solution.

Any help will be vary greatly appreciated. - 25 Nov '16 21:42

I want to check a few things before I try to solve this (it may be a standard integral). The integral you typed in was:*Originally posted by humy***I am trying to derive a normalizing constant for a probability distribution but completely stuck on solving the required very difficult integration for it.**

I tried using wolf-alpha but couldn't get that to work for me.

I indicate subscript n with " {n} ".

h{n} ∈ {h{1}, h{2}, h{3}, ... h{m} }

where each h{n} is a randomly sampled real number quantity fro ...[text shortened]... derivation method as long as I have the solution.

Any help will be vary greatly appreciated.

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

= ∑[n=1, m] I(m, n)

I(m, n) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

The integrand is:

exp(-1/(2K^2))*(x - k)^2

where k = h(n). My first observation is that the exponent is a constant which means the integral is divergent. Should it in fact be:

e^-((( x/K)^2)/2)

which would make your problem a standard integral, just integrate by parts a couple of times and it'll turn into a Gaussian integral. Can you clarify what the integral is and I'm sure I can do it. I'm fairly sure you can, you need to take a look at Wikipedia's pages on standard integrals and just get a handle on what functions are likely to have a known result. The most important method in applied mathematics is working out how to transform your problem into one someone else has solved in the past. I am certain that this is a known integral and you can get to the correct answer with a little research. - 25 Nov '16 23:11 / 5 edits

I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.*Originally posted by DeepThought***I want to check a few things before I try to solve this (it may be a standard integral). The integral you typed in was:**

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

= ∑[n=1, m] I(m, n)

I(m, n) = ∫[ ...[text shortened]... tain that this is a known integral and you can get to the correct answer with a little research.

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doing something very seriously wrong.

I checked for convergence with three concrete arbitrary examples of sets of inputs thus and, to my relief, it converged in each case and hope these examples will help clarify to you what the integral means;

for inputs (for the simplest possible example);

m=1

h{1} = 1

K=1

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/1^2 ) * ∑[n=1, 1] ( x – h{n} )^2 ) ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/1^2 ) * ( x – 1 )^2 ) dx

= 2.50663

for inputs;

m=2

h{1} = 1.1

h{2} = 2.2

K=7.7

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/7.1^2 ) * ∑[n=1, 2] ( x – h{n} )^2 ) ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/7.1^2 ) * ( ( x – 1.1 )^2 + ( x – 2.2 )^2 ) ) dx

= 12.5091

for inputs;

m=3

h{1} = 3.1

h{2} = 2.1

h{3} = -2.1

K=3

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, 3] ( x – h{n} )^2 ) ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx

= 1.86324 - 26 Nov '16 04:05 / 1 edit

Doesn't he just mean moving the summation the Z I can't do on my keyboard to the left outside the integral sign? Pardon me if I am pointing out the obvious.*Originally posted by humy***I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.**

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx

= 1.86324 - 26 Nov '16 08:39 / 1 edit

So you are saying that;*Originally posted by DeepThought***The integral you typed in was:**

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

?

But I wolf-alpha these inputs below for the LHS of that;

m=3

h{1} = 3.1

h{2} = 2.1

h{3} = -2.1

K=3

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, 3] ( x – h{n} )^2 ) ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx

= 1.86324

but with these same inputs but for the RHS, according to wolf-alpha;

∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 ) dx

+ ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 2.1 )^2 ) dx

+ ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – -2.1 )^2 ) dx

= 7.51988

+ 7.51988

+ 7.51988

= 22.5594

thus, if I got that right, it appears that the LHS is not equivalent to the RHS.

But I think somewhere I may have gone wrong because shouldn't the h{n} value make a difference to the output? According to wolf-alpha, each of those three integrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input. - 26 Nov '16 11:50 / 2 edits

I now discovered that must be somehow wrong because, using a computer program and tried and tested trustworthy numerical methods, I have just tested the unnormlized formula for probability density of;*Originally posted by humy*

But I think somewhere I may have gone wrong because shouldn't the h{n} value make a difference to the output? According to wolf-alpha, each of those three integrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input.

1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 )

just as if it was normalized and, as I expected, it gives the correct probability densities for m=1 (because already normalized for m=1) and, just as I expected, the wrong probability densities for m>1 (because not normalized for m>1).

What I found was that the numerical normalizing factor, which I still don't have a formula for, for a fixed m value and for a fixed sum of h{n}s, is still*dependent*on the individual h{n} values.

Therefore, somehow, what I THINK wolf-alpha is telling me MUST be wrong and h{n} value MUST make a difference to the output!

for example, the normalizing constant is different for m=2 and h{1} = 3 and h{2} = 2 than m=2 and h{1} = 4 and h{2} = 1.

I still haven't work out where I have gone wrong here but at least that is a big clue. - 26 Nov '16 12:52

I'll comment on the convergence of your integral in another post. That one can reorder sums and integrals, and for that matter double integrals, is a consequence of the fact that if a function is Riemann integrable it can be written as the limit of a sum. The order of summations can be changed: (a + b) + (c + d) = (a + c) + (b + d). The rest of this post just shows this in more detail for a double integral. The result for a sum and an integral can be obtained similarly.*Originally posted by humy***So you are saying that;**

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

?

But I wolf-alpha these inputs below for the LHS of that;

m=3

h{1} = 3.1

h{2} = 2.1

h{3} = -2.1

K=3

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, ...[text shortened]... ntegrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input.

It's a fundamental result in Calculus that it doesn't matter which order one does the integral in. First let's look at a 1 dimensional integral, using a simple version of Riemann's definition, which is good enough for our purposes then the definite integral of some function f(x) which is the first derivative of F(x) can be written as the limit of a sum:

F(b) - F(a) = ∫[a, b] dx f(x) = limit [N -> ∞] ∑[n=1, N] f(x_n) δx

here we're approximating the integral by a sum over n intervals of equal width and sampling the integrand at any point inside the integral. So we can choose x_n = a + (b - a)*n / N, which is just the right hand point of the interval, the width of each interval is δx = (b - a)/N. Just as a check if we choose f(x) = 1, the integral is just the length of the interval (b - a) and the sum is Nδx so we have the obvious result that the area of the whole is the sum of the areas of the parts.

A surface integral is defined similarly, suppose the region which is to be integrated over is R:

I = ∫_{R} dA f(x, y) = limit [p -> ∞] ∑[r=1, p] f(x_r, y_r) δA

Here we've approximated the region with small tiles of area δA, it does not matter what order we do the sum in. We can write this surface integral as a double integral:

I = ∫[a, b] dx ∫[c(x), e(x)] dy f(x, y).

Where a and b are the x coordinates of the left- and rightmost points in the region R and c(x) and e(x) are the top and bottom points of R for a given x coordinate. To keep things simple I'll restrict this to the case where R is a rectangle with pairs of sides parallel to the axes. This can be written as a double sum:

I = limit [M -> ∞] limit [N -> ∞] ∑[m = 1, M] ∑[n = 1, N] f(x_m, y_n) δxδy

Note that we've labeled the coordinates of the tiles differently in this sum. In the surface sum we have them labeled (x_1, y_1), (x_2, y_2), ..., (x_p, y_p). In the double sum we have them labeled (x_1, y_1), (x_1, y_2), (x_2, y_1), ..., (x_M, y_N). For a rectangular region of integration we have p = MN, so there are the same number of terms.

Here we have:

I = (f(x_1, y_1) δy + f(x_2, y_2) δy + ... + f(x_1, y_N) δy) δx + (f(x_2, y_1) δy + f(x_2, y_2) δy + ... + f(x_2, y_N) δy) δx + ... + (f(x_n, y_1) δy + f(x_n, y_2) δy + ... + f(x_n, y_N) δy) δx

I = f(x_1, y_1) δx δy + f(x_2, y_1) δx δy + f(x_1, y_2) δx δy + ...

But this can be written:

I = (f(x_1, y_1) δx + f(x_2, y_1) δx + ... + f(x_n, y_1) δx) δy + (f(x_1, y_2) δx + f(x_2, y_2) δx + ... + f(x_n, y_2) δx) δy + ... + (f(x_1, y_N) δx + f(x_2, y_N) δx + ... + f(x_n, y_N) δx) δy

I = limit [M -> ∞] limit [N -> ∞] ∑[n = 1, N] ∑[m = 1, M] f(x_m, y_n) δxδy

So we can reorder the sums. We can take the limits to get our eventual result that integrals can be reordered. We've also shown that integrals and sums can be reordered. So we have the general result that:

∫dx ∫dy f(x, y) = ∫dy ∫dx f(x, y)

∑[n = 1, N] ∫dx f_n(x) = ∫dx ∑[n = 1, N] f_n(x)

and

∑[m = 1, M] ∑[n = 1, N] f(m, n) = ∑[n = 1, N] ∑[m = 1, M] f(m, n)

I've skipped over some details regarding the boundary and this only applies if the sums and integrals are convergent. - 26 Nov '16 13:09

Right I see it now. The way you have your integrand written I mistook the nesting of the brackets, it is not of the form:*Originally posted by humy***I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.**

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx

= 1.86324

f(x) = exp(-1/2K^2) ∑[n=1, m] x – h{n} )^2

it is of the form:

f(x) = exp( -(∑[n=1, m] x – h{n} )^2)/2K^2 )

This is not a sum it is a product, it can be rewritten as, using P to indicate a product:

f(x) = P[n = 1, m] exp( - (x - h(n))^2/(2K^2) )

multiplication and addition do not commute, (a + b) * (c + d) != ac + bd. So the reordering won't work. The integral looks like it will converge (at least for finite m). I've got an idea as to how to do this, but it's non-trivial and so I've got to work it out on paper first. - 26 Nov '16 14:44Ok, here is how to do the integral. You want to find:

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

This integral can be written as:

I = ∫[–∞, ∞] dx exp( - ∑[n=1, m] ( x – h{n} )^2 /2K^2)

First let's do the sum:

∑[n=1, m] ( x – h{n} )^2 = ∑[n=1, m] x^2 – 2h{n}x + h{n}^2 = ∑x^2 - 2( ∑ h{n}) x + ∑ h{n}^2 = m(x^2 - 2Hx + H2) = m((x - H)^2 + H2 - H^2) = m((x - H)^2 + σ^2

Where H is the mean and H2 the mean square of the set h{n} and σ the standard deviation. The integral is now:

I = ∫[–∞, ∞] dx exp( -m[( x – H )^2 + σ^2]/2K^2) = exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dx exp( -m( x – H )^2 /2K^2)

Make a change of variable x -> x + H.

I = exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dx exp( -m x^2 /2K^2).

We now change variables again, y = sqrt(m/2)x/K so that dx = dy K/sqrt(m/2), and obtain:

I = K sqrt(2/m) exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dy exp(-y^2)

The integral is now in standard form and ∫[–∞, ∞] dy exp(-y^2) = sqrt[π], so your integral is:

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

As a quick check you have m = 1, h(1) = 1, K = 1 giving 2.50663. Since there is only one data point the variance is zero so I = sqrt[2π] which is the same as your result. Your second example has: m = 2, K = 7.7 h(1) = 1.1 and h(2) = 2.2.

The mean, H is 1.65 and the mean square = (1.1^2 + 2.2^2)/2 = 6.05/2 = 3.025 = σ^2, we then have:

I = 7.7 sqrt[π] exp( - 3.025 / 7.7^2) = 7.7 * 1.7725 * 0.9503 = 12.969.

You obtained 12.5091. Assuming you were using some numerical method the error is 3.67% - if this is within the accuracy of your method it seems to confirm my result. - 26 Nov '16 17:02The second example here is wrong. I got the variance wrong.

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

m = 2, K = 7.7 and h(1) = 1.1, h(2) = 2.2. The mean is 1.65 and the mean squared is 2.7225, the mean square H2 = (1.1^2 + 2.2^2)/2 = 3.025. The variance is then σ^2 = H2 - H^2 = 0.3025 and we have:

I = 7.7 * sqrt[π] * exp( - 2*0.3025/2*(7.7)^2) = 7.7 * sqrt[π] * exp( - 0.0051) = 13.578

Which is 8.5% different from your result of 12.5091. What do you expect the accuracy of your numerical method to be? - 27 Nov '16 07:21 / 12 edits

I have written in java the numerical method and tested it many hundreds of times over the last few months and, although its accuracy varies depending on how many iteration cycles I explicitly input (the higher that number, the greater the accuracy but at the cost of having to wait patiently longer for the result) it is always reliable and trustworthy and ALWAYS gives an accuracy greater than 1% and usually more like within 0.01% and often more like within ~0.000001% or even better.*Originally posted by DeepThought***The second example here is wrong. I got the variance wrong.**

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

m = 2, K = 7.7 and h(1) = 1.1, h(2) = 2.2. The mean is 1.65 and the mean squared is 2.7225, the mean square H2 = (1.1^2 + 2.2^2)/2 = 3.025. The variance is then σ^2 = H2 - H^2 = 0.3025 and we have:

I = 7.7 * sqrt[π] * exp( - 2*0.3025/2*(7.7)^2) = ...[text shortened]... nt from your result of 12.5091. What do you expect the accuracy of your numerical method to be?

Thus I must be doing something wrong here!

I will come back to you when I figured out what I am doing wrong but, so far, using what I always call the 'brute force method' (meaning using blind trial and error and stupid wild extrapolations with numerical methods to slowly and inefficiently work out the correct formula) which I only use out of desperation, I have worked out that two of the conditional normalizing factors are;

m=1 ⇒ normalizing factor = 1 (so it is already normalized)

and

m=2 ⇒ normalizing factor = |K| * 2 * √pi * e^( ((h{1} - h{2})^2) /(4 * K^2) )

-so far, I don't get it.

Now I must tediously find it via 'brute force method' for m=3 etc and extrapolate and test the pattern until I got the right one then I will come back to you. This could take me a long time.... - 27 Nov '16 14:04

I've double checked and my formula is definitely correct. How does the java program work? The integrand is more or less zero over most of its domain so does your algorithm concentrate on the region where the function is large? I'm hoping you'll say something along the lines of "It's a Monte Carlo integration method". If it tries to split the domain up into equally sized regions (within some finite range) then you can get numerical stability problems. I am absolutely certain that my formula is correct. I'm also perfectly capable of fouling up substituting into my formula so I'll do it again using a spreadsheet for the various cases you quote above.*Originally posted by humy***I have written in java the numerical method and tested it many hundreds of times over the last few months and, although its accuracy varies depending on how many iteration cycles I explicitly input (the higher that number, the greater the accuracy but at the cost of having to wait patiently longer for the result) it is always reliable and trustworthy and ALWAYS ...[text shortened]... ttern until I got the right one then I will come back to you. This could take me a long time....**

For the same set of h(n)'s try putting in a very large value of K (say about 1000) and a very small value (say 0.001). My formula is:

I(K, {h(n) | n = 1, ..., m) = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

If K is very large then the exponent is close to 1. So you should get:

I(1000, {h(n) | n = 1, ..., m}) = 1000 sqrt(2π/m) exp( -mσ^2 / 2E6) ~ 1000 sqrt(2π/m)

For the small value of K the exponent is more-or-less zero and you should get something like:

I(0.001, {h(n) | n = 1, ..., m}) = 0.001 sqrt(2π/m) exp( -mσ^2 / 2E-6) ~ 0.001 sqrt(2π/m) exp( -2000000*mσ^2) ~ 0.0

If there is a numerical stability problem with the method then it will tend to get the first of these right and the second wrong. - 27 Nov '16 15:07 / 1 edit

Ok, I've plugged some numbers into a spreadsheet and I get different answers to you for m = 2.*Originally posted by humy***I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.**

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx

= 1.86324

For the case with 1 element in the set h

h = {1}

K = 1

Your Result = 2.50663

My Result = 2.50663

Case with m = 2

h = {1.1, 2.2}

K = 7.7

Your Result = 12.5091

My Result = 13.5784

For the case with m = 3

h = {3.1, 2.1, -2.1}

K=3

Your result = 1.86324

My result = 1.86324

So m = 2 is the only case we disagree on. K is largish for the case m = 2 which tends to spread the distribution so plausibly your program is trying to sample too small a subset of the domain. - 27 Nov '16 15:14 / 1 edit

My java program doesn't use the Monte Carlo method but rather algorithms by spliting the domain up into equally sized regions. How many small regions is specified in one of the arguments of the method and I always progressively increase that number in each program run to see what the output tends to converge to.*Originally posted by DeepThought***I've double checked and my formula is definitely correct. How does the java program work? The integrand is more or less zero over most of its domain so does your algorithm concentrate on the region where the function is large? I'm hoping you'll say something along the lines of "It's a Monte Carlo integration method". If it tries to split the domain u ...[text shortened]... problem with the method then it will tend to get the first of these right and the second wrong.**

It has always given the correct estimate of the integral in the past and for hundreds of different types of integrals without any problem but I suppose one cannot rule out the possibility from that that somehow, just for this integration, it is giving me numerical stability problems. The arguments of the java method specify the range of the domain in which algorithm concentrate on and I always try different ranges to check I get a consistent result.

However, I think you are probably right and I am probably doing something wrong here for I never had this kind of difficulty before. I am very frustrated here because I cannot see what I might be doing wrong but I will keep investigating this.

Just had an idea; if it IS a numerical stability problem, I could double check it using wolframAlpha, which I assume can handle numbers accurate to many more decimal places than java, with specific examples of input values and if wolframAlpha gives noticeably different output from my java program, that surely means it is my java program that is numerically getting it wrong. I try that next.