mvnpdf > 1 ?


carl

2010-02-12 14:18:13 UTC

In my MATLAB code I use the mvnpdf function to compute the probability for a
set of samples based on the following covariance matrix (Sigma) and mean
(mu):

mu =

0.9556 0.2994 0.2569

Sigma =

0.0082 0.0052 0.0067
0.0052 0.0171 0.0199
0.0067 0.0199 0.0241

prob = mvnpdf(X , mu, Sigma);

where X is a 1000-by-3 matrix containing various samples.

If I do:

max(prob)

I get:

197.3913

And if I do:

sum(prob)

I get:

1.2987e+006

I would expect that max(prob) would be 1 (since probabilities lie in the
range [0;1]) and that sum(prob) would also be 1 since the integral should
sum to 1.

I assume that my problem is that I mix up the term:

probability density function
http://en.wikipedia.org/wiki/Probability_density_function

with the term:

probability distribution
http://en.wikipedia.org/wiki/Discrete_probability_distribution

But maybe someone could give me a hint to better understand the difference?

Brian Borchers

2010-02-12 15:32:37 UTC


A probability density function (pdf) has to be integrated over some
range of x values to obtain a probability. If the pdf is nonzero over
a narrow range it's quite easy for the maximum of the pdf to be larger
than one, even though the integral of the pdf from x=-infinity to
x=+infinity is 1.

For example, consider a random variable X that is uniformly
distributed on the interval [0,1/2]. The pdf is

f(x) = 2   for 0 <= x <= 1/2,
f(x) = 0   for x < 0 or x > 1/2.

The probability that X is between 0 and 0.1 is

P(0<=x<=0.1)=int(f,x=0..0.1)=0.2.
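
The uniform example can be checked numerically. The following is only a minimal sketch: the names f, xs, xall, p and total are made up for illustration, and trapz is used simply as a convenient numerical integrator.

```matlab
% Density of a uniform random variable on [0, 1/2]: 2 inside, 0 outside.
f = @(x) 2 * (x >= 0 & x <= 0.5);

% The density itself exceeds 1 everywhere on its support.
peak = f(0.25);

% Probability of the interval [0, 0.1]: integrate f over that interval.
xs = linspace(0, 0.1, 101);
p = trapz(xs, f(xs));

% Total probability: integrate over a range that covers the whole support.
xall = linspace(-1, 1, 20001);
total = trapz(xall, f(xall));

fprintf('peak = %g, P(0<=X<=0.1) = %g, total = %g\n', peak, p, total);
```

The density value is 2, yet the probability of [0, 0.1] is 0.2 and the total stays close to 1 (up to the small discretization error at the two jumps).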

carl

2010-02-12 16:04:08 UTC


Post by Brian Borchers
A probability density function (pdf) has to be integrated over some
range of x values to obtain a probability. If the pdf is nonzero over
a narrow range it's quite easy for the maximum of the pdf to be larger
than one, even though the integral of the pdf from x=-infinity to
x=+infinity is 1.
For example, consider a random variable X that is uniformly
distributed on the interval [0,1/2]. The pdf is
f(x) = 2   for 0 <= x <= 1/2,
f(x) = 0   for x < 0 or x > 1/2.
The probability that X is between 0 and 0.1 is
P(0<=x<=0.1)=int(f,x=0..0.1)=0.2.

Ok I thought that mvnpdf corresponded to this expression:

http://upload.wikimedia.org/math/a/d/4/ad4c63257208b495d1084a74a15e0113.png

which in the literature is referred to as the multivariate Gaussian
distribution, multivariate probability density and probability mass
function.

And that the plot would look like this:

http://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg

depending on the parameters sigma and the mean. But clearly this is not
mvnpdf. Maybe it's best to implement things from scratch to understand how
they work.

John D'Errico

2010-02-12 16:21:06 UTC


Post by carl

http://upload.wikimedia.org/math/a/d/4/ad4c63257208b495d1084a74a15e0113.png
which in the literature is referred to as the multivariate Gaussian
distribution, multivariate probability density and probability mass
function.
http://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg
depending on the parameters sigma and the mean. But clearly this is not
mvnpdf. Maybe it's best to implement things from scratch to understand how
they work.

It does correspond to that expression. But you need
to appreciate that that expression can easily be larger
than 1.

Only the integral must be 1.

John
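
To see that a peak of this size is what the formula itself predicts for carl's parameters, here is a small sketch that evaluates the multivariate normal density at its mean by hand. It uses the rounded mu and Sigma printed in the first post, so the result only approximates what the unrounded values would give; the names d, mvn and peak are illustrative.

```matlab
% Rounded parameters as printed in the original post.
mu    = [0.9556 0.2994 0.2569];
Sigma = [0.0082 0.0052 0.0067;
         0.0052 0.0171 0.0199;
         0.0067 0.0199 0.0241];

d = numel(mu);   % dimension, 3 here

% Multivariate normal density written out directly:
% f(x) = exp(-0.5*(x-mu)*inv(Sigma)*(x-mu)') / sqrt((2*pi)^d * det(Sigma))
mvn = @(x) exp(-0.5 * ((x - mu) / Sigma) * (x - mu)') ...
           / sqrt((2*pi)^d * det(Sigma));

% The density is largest at x = mu, where the exponential term equals 1.
peak = mvn(mu)

% With the Statistics Toolbox, mvnpdf(mu, mu, Sigma) should agree with peak.
```

With these rounded values the peak comes out at roughly 200, in line with the reported max(prob) of 197.3913. The covariance entries are small, so the density is concentrated in a tiny volume and has to be tall for its integral to equal 1.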

carl

2010-02-12 16:54:22 UTC


Post by John D'Errico

Post by carl

http://upload.wikimedia.org/math/a/d/4/ad4c63257208b495d1084a74a15e0113.png
which in the literature is referred to as the multivariate Gaussian
distribution, multivariate probability density and probability mass
function.
http://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg
depending on the parameters sigma and the mean. But clearly this is
not mvnpdf. Maybe it's best to implement things from scratch to understand
how they work.

It does correspond to that expression. But you need
to appreciate that that expression can easily be larger
than 1.
Only the integral must be 1.

Ok but when I do:

sum(prob)

I get:

1.2987e+006

which is not 1! So the sum/integral over the density returned by mvnpdf can
also be larger than 1.

John D'Errico

2010-02-12 17:46:21 UTC


Post by carl

Post by John D'Errico

Post by carl

http://upload.wikimedia.org/math/a/d/4/ad4c63257208b495d1084a74a15e0113.png
which in the literature is referred to as the multivariate Gaussian
distribution, multivariate probability density and probability mass
function.
http://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg
depending on the parameters sigma and the mean. But clearly this is
not mvnpdf. Maybe it's best to implement things from scratch to understand
how they work.

It does correspond to that expression. But you need
to appreciate that that expression can easily be larger
than 1.
Only the integral must be 1.

sum(prob)
1.2987e+006
which is not 1! So the sum/integral over the density returned by mvnpdf can
also be larger than 1.

No. I think you misunderstand what an integral is,
or at least have forgotten.

A sum is not an integral. They are different things.

John

Steven Lord

2010-02-12 17:52:34 UTC


"carl" <***@.com> wrote in message news:4b7587c5$0$272$***@news.sunsite.dk...

*snip*

Post by carl
sum(prob)
1.2987e+006
which is not 1!

That is a correct statement and the correct behavior for SUM.

Post by carl
So the sum/integral over the density returned by mvnpdf can also be larger
than 1.

Only half of that statement is correct.

The sum of the density _at the points at which you evaluated it_ can be
greater than 1.
The integral of the density cannot be greater than 1 (modulo roundoff
error.)

Let's take a simple function whose integral we know to be 1.

x = [-1 0 0 0.5 1 1 2];
y = [0 0 1 1 1 0 0];
plot(x, y, '-o')
axis equal

Assuming that the function is zero outside the range [0, 1], the area under
this function is a square with side 1, and the integral of the function is
the area of the square. Let's double-check that its integral is 1:

trapz(x, y)

Now how about the sum of the y values?

sum(y)

In fact, you can make the sum of the y values arbitrarily large without
changing the integral of the function. Change delta in the code below and
see how the integral and the sum change.

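delta = 0.1;                     % grid spacing; try 0.01 or 0.001
t = 0:delta:1;                   % sample points on [0, 1]
x = [-1 0 t 1 2];
y = [0 0 ones(size(t)) 0 0];     % 1 on [0, 1], 0 elsewhere
plot(x, y, '-o')
axis equal
theintegral = trapz(x, y)        % stays at 1 for any delta
thesum = sum(y)                  % grows roughly like 1/delta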
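
The link between the two quantities is the spacing of the sample points: on a regular grid, the sum only approximates the integral after it is multiplied by the grid spacing. Here is a minimal sketch with a one-dimensional normal density, written out by hand so it needs no toolbox (sigma, dx and the variable names are arbitrary choices for illustration):

```matlab
sigma = 0.1;       % small standard deviation gives a tall, narrow density
dx    = 0.001;     % grid spacing
x     = -1:dx:1;   % covers +/- 10 standard deviations around the mean 0

% Normal density with mean 0 and standard deviation sigma.
p = exp(-0.5 * (x / sigma).^2) / (sigma * sqrt(2*pi));

rawsum   = sum(p)        % grows as dx shrinks; not a probability
riemann  = sum(p) * dx   % Riemann sum; close to 1
viatrapz = trapz(x, p)   % trapezoidal rule; also close to 1
```

Evaluating a density at 1000 random samples, as in the original question, does not even produce a regular grid, so sum(prob) there has no interpretation as an integral.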

--
Steve Lord
***@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ

Peter Perkins

2010-02-12 15:33:24 UTC


Post by carl
In my matlab code I use the mvnpdf function to compute the probability
for a set of samples based on the following covariance matrix (Sigma)
I would expect that max(prob) would be 1 (since probabilities lie in the
range [0;1]) and that sum(prob) would also be 1 since the integral
should sum to 1.

The integral should _integrate_ to 1. For a discrete distribution, the individual probabilities must be less than or equal to one, because they must sum to one. But for a continuous distribution, it's an integral. Consider the uniform distribution on the interval [0,a]. It's uniform, so the density is a constant, call it c, and the integral of the density over the interval [0,a], (a-0)*c == a*c, has to equal one. Thus the density has to be 1/a. When a is .5, say, the density is 2, which is larger than one.

More towards your specific question: try plotting normpdf(linspace(-1,1,1001),0,.1) and normcdf(linspace(-1,1,1001),0,.1) and compare the two.

Probabilities cannot exceed 1. Densities can take any nonnegative value, even infinite (at individual points). Hope this helps.
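
The suggested comparison can be sketched as follows. The closed-form expressions are used here so that the code runs without the Statistics Toolbox; with the toolbox, normpdf(x,0,sigma) and normcdf(x,0,sigma) should give the same curves. The variable names are illustrative.

```matlab
sigma = 0.1;
x = linspace(-1, 1, 1001);

% Closed forms for the density and cumulative distribution of a normal
% random variable with mean 0 and standard deviation sigma.
pdfvals = exp(-0.5 * (x / sigma).^2) / (sigma * sqrt(2*pi));
cdfvals = 0.5 * erfc(-x / (sigma * sqrt(2)));

plot(x, pdfvals, x, cdfvals)
legend('density (pdf)', 'cumulative probability (cdf)')

maxpdf = max(pdfvals)   % 1/(sigma*sqrt(2*pi)), about 3.99
maxcdf = max(cdfvals)   % a probability, so it never exceeds 1
```

The cdf values are probabilities and stay within [0,1], while the pdf peaks near 4 because all of the probability is packed into a narrow region around 0.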