Discussion:
mvnpdf > 1 ?
carl
2010-02-12 14:18:13 UTC
In my MATLAB code I use the mvnpdf function to compute the probability for a
set of samples based on the following covariance matrix (Sigma) and mean
(mu):

mu =

0.9556 0.2994 0.2569

Sigma =

0.0082 0.0052 0.0067
0.0052 0.0171 0.0199
0.0067 0.0199 0.0241



prob = mvnpdf(X , mu, Sigma);

where X is a 1000*3 matrix containing various samples.

If I do:

max(prob)

I get:


197.3913

And if I do:

sum(prob)

I get:

1.2987e+006


I would expect that max(prob) would be at most 1 (since probabilities lie in
the range [0;1]) and that sum(prob) would also be 1 since the integral should
sum to 1.

I assume that my problem is that I mix up the term:

probability density function
http://en.wikipedia.org/wiki/Probability_density_function

with the term:

probability distribution
http://en.wikipedia.org/wiki/Discrete_probability_distribution

But maybe someone could give me a hint to better understand the difference?
Brian Borchers
2010-02-12 15:32:37 UTC
A probability density function (pdf) has to be integrated over some
range of x values to obtain a probability. If the pdf is nonzero over
a narrow range it's quite easy for the maximum of the pdf to be larger
than one, even though the integral of the pdf from x=-infinity to x=
+infinity is 1.

For example, consider a random variable X that is uniformly
distributed on the interval [0,1/2]. The pdf is

f(x)=2 for 0<=x<=1/2,
f(x)=0 for x<0 or x>1/2.

The probability that x is between 0 and 0.1 is

P(0<=x<=0.1)=int(f,x=0..0.1)=0.2.
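
As a minimal sketch of the example above (the names f, xs, xall, p and total are only illustrative and not from the original post), the density can be written as an anonymous function and integrated numerically with trapz. The density equals 2 everywhere on [0, 1/2], yet the probabilities obtained by integrating it stay at or below 1:

```matlab
% Uniform density on [0, 1/2]: equal to 2 inside the interval, 0 outside.
f = @(x) 2 * (x >= 0 & x <= 0.5);

% P(0 <= X <= 0.1): integrate the density over [0, 0.1].
xs = linspace(0, 0.1, 101);
p = trapz(xs, f(xs));

% Total probability: integrate over the whole support [0, 1/2].
xall = linspace(0, 0.5, 501);
total = trapz(xall, f(xall));

fprintf('max density = %g, P(0<=X<=0.1) = %g, total = %g\n', ...
        max(f(xall)), p, total);
```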
carl
2010-02-12 16:04:08 UTC
Ok I thought that mvnpdf corresponded to this expression:

http://upload.wikimedia.org/math/a/d/4/ad4c63257208b495d1084a74a15e0113.png

which in the literature is variously referred to as the multivariate gaussian
distribution, multivariate probability density and probability mass
function.

And that the plot would look like this:

http://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg

depending on the parameters sigma and the mean. But clearly this is not
mvnpdf. Maybe it's best to implement things from scratch to understand how
they work.
John D'Errico
2010-02-12 16:21:06 UTC
It does correspond to that expression. But you need
to appreciate that that expression can easily be larger
than 1.

Only the integral must be 1.

John
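
To see how large that expression gets for the Sigma in the original question, one can evaluate the normalizing constant of the multivariate normal density, 1/sqrt((2*pi)^k * det(Sigma)), which is the value of the density at x = mu. This is only a sketch using the rounded values displayed in the first post; the variable names k and peak are illustrative.

```matlab
% Mean and covariance as displayed (rounded) in the original question.
mu    = [0.9556 0.2994 0.2569];
Sigma = [0.0082 0.0052 0.0067;
         0.0052 0.0171 0.0199;
         0.0067 0.0199 0.0241];

k = numel(mu);                               % dimension, here 3

% Density at x = mu: the exponential term is exp(0) = 1, so only the
% normalizing constant remains.
peak = 1 / sqrt((2*pi)^k * det(Sigma));

fprintf('det(Sigma) = %g, density at the mean = %g\n', det(Sigma), peak);
% With the Statistics Toolbox, mvnpdf(mu, mu, Sigma) should give the same value.
```

Because det(Sigma) is tiny (the distribution is concentrated in a very small volume), the peak density comes out well above 1, which is in line with the max(prob) value reported in the question.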
carl
2010-02-12 16:54:22 UTC
Ok but when I do:

sum(prob)

I get:

1.2987e+006

which is not 1! So the sum/integral over the density returned by mvnpdf can
also be larger than 1.
John D'Errico
2010-02-12 17:46:21 UTC
No. I think you misunderstand what an integral is,
or at least have forgotten.

A sum is not an integral. They are different things.

John
Steven Lord
2010-02-12 17:52:34 UTC
"carl" <***@.com> wrote in message news:4b7587c5$0$272$***@news.sunsite.dk...

*snip*
Post by carl
sum(prob)
1.2987e+006
which is not 1!

That is a correct statement and the correct behavior for SUM.

Post by carl
So the sum/integral over the density returned by mvnpdf can also be larger
than 1.

Only half of that statement is correct.

The sum of the density _at the points at which you evaluated it_ can be
greater than 1.
The integral of the density cannot be greater than 1 (modulo roundoff
error.)

Let's take a simple function whose integral we know to be 1.

x = [-1 0 0 0.5 1 1 2];
y = [0 0 1 1 1 0 0];
plot(x, y, '-o')
axis equal

Assuming that the function is zero outside the range [0, 1], the area under
this function is a square with side 1, and the integral of the function is
the area of the square. Let's double-check that its integral is 1:

trapz(x, y)

Now how about the sum of the y values?

sum(y)

In fact, you can make the sum of the y values arbitrarily large without
changing the integral of the function. Change delta in the code below and
see how the integral and the sum change.

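delta = 0.1;                      % grid spacing; try smaller values
t = 0:delta:1;                    % sample points inside [0, 1]
x = [-1 0 t 1 2];
y = [0 0 ones(size(t)) 0 0];      % function is 1 on [0, 1], 0 elsewhere
plot(x, y, '-o')
axis equal
theintegral = trapz(x, y)         % renamed so it does not shadow the built-in integral
thesum = sum(y)                   % grows as delta shrinks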
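
The same point can be tied back to a density. In this sketch (a one-dimensional normal density written out by hand so that no toolbox is needed; sigma = 0.1 and the grid are arbitrary choices, not values from the thread), the plain sum of the density values is large, but multiplying by the grid spacing dx turns it into a Riemann-sum approximation of the integral:

```matlab
mu = 0;  sigma = 0.1;
x  = linspace(-1, 1, 1001);          % evenly spaced grid covering +/- 10 sigma
dx = x(2) - x(1);                    % grid spacing

% Normal density evaluated on the grid.
p = exp(-(x - mu).^2 / (2*sigma^2)) / (sigma * sqrt(2*pi));

rawsum  = sum(p)                     % depends on how many points are used
riemann = sum(p) * dx                % approximates the integral, close to 1
```

This only works because the points lie on a regular grid. If the rows of X in the original question are scattered samples rather than grid points, sum(prob) has no volume element attached to it and is not an approximation of the integral.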
--
Steve Lord
***@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
Peter Perkins
2010-02-12 15:33:24 UTC
Post by carl
In my matlab code I use the mvnpdf function to compute the probability
for a set of samples based on the following covariance matrix (Sigma)
I would expect that max(prob) would be at most 1 (since probabilities lie in
the range [0;1]) and that sum(prob) would also be 1 since the integral
should sum to 1.

The integral should _integrate_ to 1. For a discrete distribution, the individual probabilities must be less than or equal to one, because they must sum to one. But for a continuous distribution, it's an integral. Consider the uniform distribution on the interval [0,a]. It's uniform, so the density is a constant, call it c, and the integral of the density over the interval [0,a], (a-0)*c == a*c, has to equal one. Thus the density has to be 1/a. When a is .5, say, the density is larger than one.

More towards your specific question: try plotting normpdf(linspace(-1,1,1001),0,.1) and normcdf(linspace(-1,1,1001),0,.1) and compare the two.

Probabilities have to be at most 1. Densities can take any nonnegative value, even infinite (at individual points). Hope this helps.
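
A sketch of the comparison suggested above, written with base MATLAB functions so that it runs without normpdf and normcdf (the formulas are the standard normal pdf and cdf; pdfvals and cdfvals are illustrative names):

```matlab
x = linspace(-1, 1, 1001);
mu = 0;  sigma = 0.1;

% Normal pdf and cdf, equivalent to normpdf(x,0,.1) and normcdf(x,0,.1).
pdfvals = exp(-(x - mu).^2 / (2*sigma^2)) / (sigma * sqrt(2*pi));
cdfvals = 0.5 * erfc(-(x - mu) / (sigma * sqrt(2)));

plot(x, pdfvals, x, cdfvals)
legend('pdf (density)', 'cdf (probability)')

max(pdfvals)    % about 3.99, larger than 1
max(cdfvals)    % never exceeds 1
```

The cdf values are probabilities P(X <= x) and stay within [0, 1], while the pdf is a density whose height depends on how narrow the distribution is.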