# Maths problem; what is the solution to this integration?

humy
Science 25 Nov '16 15:56
1. 25 Nov '16 15:567 edits
I am trying to derive a normalizing constant for a probability distribution but completely stuck on solving the required very difficult integration for it.
I tried using wolf-alpha but couldn't get that to work for me.
I indicate subscript n with " {n} ".

h{n} ∈ {h{1}, h{2}, h{3}, ... h{m} }
where each h{n} is a randomly sampled real number quantity from a probability distribution thus all these h values are known but there is no particular mathematical pattern to this data set. m is just the number of sampled h values ans m isn't allowed to be zero.

K can be treated here as just an arbitrary constant and is one of the probability distribution's parameters.

K, x, h{n} ∈ ℝ
K, x, h{n} ∈ (–∞, ∞ )
m, n ∈ ℕ
m, n ∈ [1, ∞ )

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

This one has got me totally stumped. I fail to even see which integration rules I can use here. But, in this case, I don't mind if I don't have the derivation method as long as I have the solution.

Any help will be vary greatly appreciated.
2. DeepThought
25 Nov '16 21:42
Originally posted by humy
I am trying to derive a normalizing constant for a probability distribution but completely stuck on solving the required very difficult integration for it.
I tried using wolf-alpha but couldn't get that to work for me.
I indicate subscript n with " {n} ".

h{n} ∈ {h{1}, h{2}, h{3}, ... h{m} }
where each h{n} is a randomly sampled real number quantity fro ...[text shortened]... derivation method as long as I have the solution.

Any help will be vary greatly appreciated.
I want to check a few things before I try to solve this (it may be a standard integral). The integral you typed in was:

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx
= ∑[n=1, m] I(m, n)

I(m, n) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

The integrand is:

exp(-1/(2K^2))*(x - k)^2

where k = h(n). My first observation is that the exponent is a constant which means the integral is divergent. Should it in fact be:

e^-((( x/K)^2)/2)

which would make your problem a standard integral, just integrate by parts a couple of times and it'll turn into a Gaussian integral. Can you clarify what the integral is and I'm sure I can do it. I'm fairly sure you can, you need to take a look at Wikipedia's pages on standard integrals and just get a handle on what functions are likely to have a known result. The most important method in applied mathematics is working out how to transform your problem into one someone else has solved in the past. I am certain that this is a known integral and you can get to the correct answer with a little research.
3. 25 Nov '16 23:115 edits
Originally posted by DeepThought
I want to check a few things before I try to solve this (it may be a standard integral). The integral you typed in was:

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx
= ∑[n=1, m] I(m, n)

I(m, n) = ∫[ ...[text shortened]... tain that this is a known integral and you can get to the correct answer with a little research.
I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doing something very seriously wrong.
I checked for convergence with three concrete arbitrary examples of sets of inputs thus and, to my relief, it converged in each case and hope these examples will help clarify to you what the integral means;

for inputs (for the simplest possible example);
m=1
h{1} = 1
K=1
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/1^2 ) * ∑[n=1, 1] ( x – h{n} )^2 ) ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/1^2 ) * ( x – 1 )^2 ) dx
= 2.50663

for inputs;
m=2
h{1} = 1.1
h{2} = 2.2
K=7.7
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/7.1^2 ) * ∑[n=1, 2] ( x – h{n} )^2 ) ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/7.1^2 ) * ( ( x – 1.1 )^2 + ( x – 2.2 )^2 ) ) dx
= 12.5091

for inputs;
m=3
h{1} = 3.1
h{2} = 2.1
h{3} = -2.1
K=3
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, 3] ( x – h{n} )^2 ) ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx
= 1.86324
4. sonhouse
Fast and Curious
26 Nov '16 04:051 edit
Originally posted by humy
I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx
= 1.86324
Doesn't he just mean moving the summation the Z I can't do on my keyboard🙂 to the left outside the integral sign? Pardon me if I am pointing out the obvious.
5. 26 Nov '16 07:39
Originally posted by sonhouse
Doesn't he just mean moving the summation the Z I can't do on my keyboard🙂 to the left outside the integral sign? .
Yes, but don't yet see why that would mathematically work.
6. 26 Nov '16 08:391 edit
Originally posted by DeepThought
The integral you typed in was:

I(m) = ∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

Moving the sum outside of the integral we have:

I(m) = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx
So you are saying that;
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

?

But I wolf-alpha these inputs below for the LHS of that;
m=3
h{1} = 3.1
h{2} = 2.1
h{3} = -2.1
K=3
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, 3] ( x – h{n} )^2 ) ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx
= 1.86324

but with these same inputs but for the RHS, according to wolf-alpha;

∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 ) dx
+ ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 2.1 )^2 ) dx
+ ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – -2.1 )^2 ) dx
= 7.51988
+ 7.51988
+ 7.51988
= 22.5594
thus, if I got that right, it appears that the LHS is not equivalent to the RHS.

But I think somewhere I may have gone wrong because shouldn't the h{n} value make a difference to the output? According to wolf-alpha, each of those three integrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input.
7. 26 Nov '16 11:502 edits
Originally posted by humy

But I think somewhere I may have gone wrong because shouldn't the h{n} value make a difference to the output? According to wolf-alpha, each of those three integrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input.
I now discovered that must be somehow wrong because, using a computer program and tried and tested trustworthy numerical methods, I have just tested the unnormlized formula for probability density of;
1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 )
just as if it was normalized and, as I expected, it gives the correct probability densities for m=1 (because already normalized for m=1) and, just as I expected, the wrong probability densities for m>1 (because not normalized for m>1).
What I found was that the numerical normalizing factor, which I still don't have a formula for, for a fixed m value and for a fixed sum of h{n}s, is still dependent on the individual h{n} values.
Therefore, somehow, what I THINK wolf-alpha is telling me MUST be wrong and h{n} value MUST make a difference to the output!
for example, the normalizing constant is different for m=2 and h{1} = 3 and h{2} = 2 than m=2 and h{1} = 4 and h{2} = 1.
I still haven't work out where I have gone wrong here but at least that is a big clue.
8. DeepThought
26 Nov '16 12:52
Originally posted by humy
So you are saying that;
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx = ∑[n=1, m] ∫[–∞, ∞] 1 / e^(( 0.5/K^2 ) * ( x – h{n} )^2 ) dx

?

But I wolf-alpha these inputs below for the LHS of that;
m=3
h{1} = 3.1
h{2} = 2.1
h{3} = -2.1
K=3
∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx
= ∫[–∞, ∞] 1 / e^( ( 0.5/3^2 ) * ∑[n=1, ...[text shortened]... ntegrals seems to output the same 7.51988 figure for m=3 and K=3 regardless of their h{n} input.
I'll comment on the convergence of your integral in another post. That one can reorder sums and integrals, and for that matter double integrals, is a consequence of the fact that if a function is Riemann integrable it can be written as the limit of a sum. The order of summations can be changed: (a + b) + (c + d) = (a + c) + (b + d). The rest of this post just shows this in more detail for a double integral. The result for a sum and an integral can be obtained similarly.

It's a fundamental result in Calculus that it doesn't matter which order one does the integral in. First let's look at a 1 dimensional integral, using a simple version of Riemann's definition, which is good enough for our purposes then the definite integral of some function f(x) which is the first derivative of F(x) can be written as the limit of a sum:

F(b) - F(a) = ∫[a, b] dx f(x) = limit [N -> ∞] ∑[n=1, N] f(x_n) δx

here we're approximating the integral by a sum over n intervals of equal width and sampling the integrand at any point inside the integral. So we can choose x_n = a + (b - a)*n / N, which is just the right hand point of the interval, the width of each interval is δx = (b - a)/N. Just as a check if we choose f(x) = 1, the integral is just the length of the interval (b - a) and the sum is Nδx so we have the obvious result that the area of the whole is the sum of the areas of the parts.

A surface integral is defined similarly, suppose the region which is to be integrated over is R:

I = ∫_{R} dA f(x, y) = limit [p -> ∞] ∑[r=1, p] f(x_r, y_r) δA

Here we've approximated the region with small tiles of area δA, it does not matter what order we do the sum in. We can write this surface integral as a double integral:

I = ∫[a, b] dx ∫[c(x), e(x)] dy f(x, y).

Where a and b are the x coordinates of the left- and rightmost points in the region R and c(x) and e(x) are the top and bottom points of R for a given x coordinate. To keep things simple I'll restrict this to the case where R is a rectangle with pairs of sides parallel to the axes. This can be written as a double sum:

I = limit [M -> ∞] limit [N -> ∞] ∑[m = 1, M] ∑[n = 1, N] f(x_m, y_n) δxδy

Note that we've labeled the coordinates of the tiles differently in this sum. In the surface sum we have them labeled (x_1, y_1), (x_2, y_2), ..., (x_p, y_p). In the double sum we have them labeled (x_1, y_1), (x_1, y_2), (x_2, y_1), ..., (x_M, y_N). For a rectangular region of integration we have p = MN, so there are the same number of terms.

Here we have:

I = (f(x_1, y_1) δy + f(x_2, y_2) δy + ... + f(x_1, y_N) δy) δx + (f(x_2, y_1) δy + f(x_2, y_2) δy + ... + f(x_2, y_N) δy) δx + ... + (f(x_n, y_1) δy + f(x_n, y_2) δy + ... + f(x_n, y_N) δy) δx

I = f(x_1, y_1) δx δy + f(x_2, y_1) δx δy + f(x_1, y_2) δx δy + ...

But this can be written:

I = (f(x_1, y_1) δx + f(x_2, y_1) δx + ... + f(x_n, y_1) δx) δy + (f(x_1, y_2) δx + f(x_2, y_2) δx + ... + f(x_n, y_2) δx) δy + ... + (f(x_1, y_N) δx + f(x_2, y_N) δx + ... + f(x_n, y_N) δx) δy

I = limit [M -> ∞] limit [N -> ∞] ∑[n = 1, N] ∑[m = 1, M] f(x_m, y_n) δxδy

So we can reorder the sums. We can take the limits to get our eventual result that integrals can be reordered. We've also shown that integrals and sums can be reordered. So we have the general result that:

∫dx ∫dy f(x, y) = ∫dy ∫dx f(x, y)
∑[n = 1, N] ∫dx f_n(x) = ∫dx ∑[n = 1, N] f_n(x)
and
∑[m = 1, M] ∑[n = 1, N] f(m, n) = ∑[n = 1, N] ∑[m = 1, M] f(m, n)

I've skipped over some details regarding the boundary and this only applies if the sums and integrals are convergent.
9. DeepThought
26 Nov '16 13:09
Originally posted by humy
I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx
= 1.86324
Right I see it now. The way you have your integrand written I mistook the nesting of the brackets, it is not of the form:

f(x) = exp(-1/2K^2) ∑[n=1, m] x – h{n} )^2

it is of the form:

f(x) = exp( -(∑[n=1, m] x – h{n} )^2)/2K^2 )

This is not a sum it is a product, it can be rewritten as, using P to indicate a product:

f(x) = P[n = 1, m] exp( - (x - h(n))^2/(2K^2) )

multiplication and addition do not commute, (a + b) * (c + d) != ac + bd. So the reordering won't work. The integral looks like it will converge (at least for finite m). I've got an idea as to how to do this, but it's non-trivial and so I've got to work it out on paper first.
10. DeepThought
26 Nov '16 14:44
Ok, here is how to do the integral. You want to find:

∫[–∞, ∞] 1 / e^( ( 0.5/K^2 ) * ∑[n=1, m] ( x – h{n} )^2 ) dx

This integral can be written as:

I = ∫[–∞, ∞] dx exp( - ∑[n=1, m] ( x – h{n} )^2 /2K^2)

First let's do the sum:

∑[n=1, m] ( x – h{n} )^2 = ∑[n=1, m] x^2 – 2h{n}x + h{n}^2 = ∑x^2 - 2( ∑ h{n}) x + ∑ h{n}^2 = m(x^2 - 2Hx + H2) = m((x - H)^2 + H2 - H^2) = m((x - H)^2 + σ^2

Where H is the mean and H2 the mean square of the set h{n} and σ the standard deviation. The integral is now:

I = ∫[–∞, ∞] dx exp( -m[( x – H )^2 + σ^2]/2K^2) = exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dx exp( -m( x – H )^2 /2K^2)

Make a change of variable x -> x + H.

I = exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dx exp( -m x^2 /2K^2).

We now change variables again, y = sqrt(m/2)x/K so that dx = dy K/sqrt(m/2), and obtain:

I = K sqrt(2/m) exp( -mσ^2 / 2K^2) ∫[–∞, ∞] dy exp(-y^2)

The integral is now in standard form and ∫[–∞, ∞] dy exp(-y^2) = sqrt[π], so your integral is:

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

As a quick check you have m = 1, h(1) = 1, K = 1 giving 2.50663. Since there is only one data point the variance is zero so I = sqrt[2π] which is the same as your result. Your second example has: m = 2, K = 7.7 h(1) = 1.1 and h(2) = 2.2.

The mean, H is 1.65 and the mean square = (1.1^2 + 2.2^2)/2 = 6.05/2 = 3.025 = σ^2, we then have:

I = 7.7 sqrt[π] exp( - 3.025 / 7.7^2) = 7.7 * 1.7725 * 0.9503 = 12.969.

You obtained 12.5091. Assuming you were using some numerical method the error is 3.67% - if this is within the accuracy of your method it seems to confirm my result.
11. DeepThought
26 Nov '16 17:02
The second example here is wrong. I got the variance wrong.

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

m = 2, K = 7.7 and h(1) = 1.1, h(2) = 2.2. The mean is 1.65 and the mean squared is 2.7225, the mean square H2 = (1.1^2 + 2.2^2)/2 = 3.025. The variance is then σ^2 = H2 - H^2 = 0.3025 and we have:

I = 7.7 * sqrt[π] * exp( - 2*0.3025/2*(7.7)^2) = 7.7 * sqrt[π] * exp( - 0.0051) = 13.578

Which is 8.5% different from your result of 12.5091. What do you expect the accuracy of your numerical method to be?
12. 27 Nov '16 07:2112 edits
Originally posted by DeepThought
The second example here is wrong. I got the variance wrong.

I = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

m = 2, K = 7.7 and h(1) = 1.1, h(2) = 2.2. The mean is 1.65 and the mean squared is 2.7225, the mean square H2 = (1.1^2 + 2.2^2)/2 = 3.025. The variance is then σ^2 = H2 - H^2 = 0.3025 and we have:

I = 7.7 * sqrt[π] * exp( - 2*0.3025/2*(7.7)^2) = ...[text shortened]... nt from your result of 12.5091. What do you expect the accuracy of your numerical method to be?
I have written in java the numerical method and tested it many hundreds of times over the last few months and, although its accuracy varies depending on how many iteration cycles I explicitly input (the higher that number, the greater the accuracy but at the cost of having to wait patiently longer for the result) it is always reliable and trustworthy and ALWAYS gives an accuracy greater than 1% and usually more like within 0.01% and often more like within ~0.000001% or even better.
Thus I must be doing something wrong here!

I will come back to you when I figured out what I am doing wrong but, so far, using what I always call the 'brute force method' (meaning using blind trial and error and stupid wild extrapolations with numerical methods to slowly and inefficiently work out the correct formula) which I only use out of desperation, I have worked out that two of the conditional normalizing factors are;

m=1 ⇒ normalizing factor = 1 (so it is already normalized)
and
m=2 ⇒ normalizing factor = |K| * 2 * √pi * e^( ((h{1} - h{2})^2) /(4 * K^2) )

-so far, I don't get it.

Now I must tediously find it via 'brute force method' for m=3 etc and extrapolate and test the pattern until I got the right one then I will come back to you. This could take me a long time....
13. DeepThought
27 Nov '16 14:04
Originally posted by humy
I have written in java the numerical method and tested it many hundreds of times over the last few months and, although its accuracy varies depending on how many iteration cycles I explicitly input (the higher that number, the greater the accuracy but at the cost of having to wait patiently longer for the result) it is always reliable and trustworthy and ALWAYS ...[text shortened]... ttern until I got the right one then I will come back to you. This could take me a long time....
I've double checked and my formula is definitely correct. How does the java program work? The integrand is more or less zero over most of its domain so does your algorithm concentrate on the region where the function is large? I'm hoping you'll say something along the lines of "It's a Monte Carlo integration method". If it tries to split the domain up into equally sized regions (within some finite range) then you can get numerical stability problems. I am absolutely certain that my formula is correct. I'm also perfectly capable of fouling up substituting into my formula so I'll do it again using a spreadsheet for the various cases you quote above.

For the same set of h(n)'s try putting in a very large value of K (say about 1000) and a very small value (say 0.001). My formula is:

I(K, {h(n) | n = 1, ..., m) = K sqrt(2π/m) exp( -mσ^2 / 2K^2)

If K is very large then the exponent is close to 1. So you should get:

I(1000, {h(n) | n = 1, ..., m}) = 1000 sqrt(2π/m) exp( -mσ^2 / 2E6) ~ 1000 sqrt(2π/m)

For the small value of K the exponent is more-or-less zero and you should get something like:

I(0.001, {h(n) | n = 1, ..., m}) = 0.001 sqrt(2π/m) exp( -mσ^2 / 2E-6) ~ 0.001 sqrt(2π/m) exp( -2000000*mσ^2) ~ 0.0

If there is a numerical stability problem with the method then it will tend to get the first of these right and the second wrong.
14. DeepThought
27 Nov '16 15:071 edit
Originally posted by humy
I am afraid your understanding is beyond mine and I got lost at the "Moving the sum outside of the integral" point; -I will have to spend some time studying that to understand how that works.

You had me seriously worried there for a minute; the integral as a whole must NOT be divergent else it cannot be a normalizing factor and that would mean I must be doi ...[text shortened]... ∞] 1 / e^( ( 0.5/3^2 ) * ( ( x – 3.1 )^2 + ( x – 2.1 )^2 + ( x – -2.1 )^2 ) ) dx
= 1.86324
Ok, I've plugged some numbers into a spreadsheet and I get different answers to you for m = 2.

For the case with 1 element in the set h
h = {1}
K = 1
My Result = 2.50663

Case with m = 2
h = {1.1, 2.2}
K = 7.7
My Result = 13.5784

For the case with m = 3
h = {3.1, 2.1, -2.1}
K=3