Prob density function is best introduced in 1-dimension. In a 2-dimensional (or higher) context like throwing a dart on a 2D surface, we have “superstructures” like marginal probability and conditional probability … but they are hard to understand fully without an intuitive feel for the density. Density is the foundation of everything.
Here’s my best explanation of pdf: to be useful, a bivariate density function has to be integrated via a double-integral, and produce a probability *mass*. In a small region where the density is assumed approximately constant, the product of the density and delta-x times delta-y (the 2 “dimensions”) would give a small amount of probability mass. (I will skip the illustrations…)
Note there are 3 factors in this product. If delta-x is zero, i.e. the random variable is held constant at a value like 3.3, then the product becomes zero i.e. zero probability mass.
My 2nd explanation of pdf — always a differential. In the 1D context, it’s dM/dx. dM represents a small amount of probability mass. In the 2D context, density is d(dM/dx)/dy. As the tiny rectangle “dx by dy” shrinks, the mass over it would vanish, but not the differential.
In the context of marginal and conditional probability, which requires “fixing” X = 7.02, it’s always useful to think of a small region around 7.02. Otherwise, the paradox with the zero-width is that the integral would evaluate to 0. This is an uncomfortable situation for many students.