<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.sarg.dev/index.php?action=history&amp;feed=atom&amp;title=Probability_mass_function</id>
	<title>Probability mass function - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.sarg.dev/index.php?action=history&amp;feed=atom&amp;title=Probability_mass_function"/>
	<link rel="alternate" type="text/html" href="https://wiki.sarg.dev/index.php?title=Probability_mass_function&amp;action=history"/>
	<updated>2026-04-14T12:56:11Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.44.2</generator>
	<entry>
		<id>https://wiki.sarg.dev/index.php?title=Probability_mass_function&amp;diff=112798&amp;oldid=prev</id>
		<title>imported&gt;Hellacioussatyr: /* Formal definition */</title>
		<link rel="alternate" type="text/html" href="https://wiki.sarg.dev/index.php?title=Probability_mass_function&amp;diff=112798&amp;oldid=prev"/>
		<updated>2025-10-20T23:16:28Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Formal definition&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{Short description|Discrete-variable probability distribution}}&lt;br /&gt;
[[Image:Discrete probability distrib.svg|right|thumb|The graph of a probability mass function. All the values of this function must be non-negative and sum up to 1.]]&lt;br /&gt;
In [[probability theory|probability]] and [[statistics]], a &amp;#039;&amp;#039;&amp;#039;probability mass function&amp;#039;&amp;#039;&amp;#039; (sometimes called &amp;#039;&amp;#039;probability function&amp;#039;&amp;#039; or &amp;#039;&amp;#039;frequency function&amp;#039;&amp;#039;&amp;lt;ref&amp;gt;[https://online.stat.psu.edu/stat414/lesson/7/7.2 7.2 - Probability Mass Functions | STAT 414 - PennState - Eberly College of Science]&amp;lt;/ref&amp;gt;) is a function that gives the probability that a [[discrete random variable]] is exactly equal to some value.&amp;lt;ref&amp;gt;{{cite book|author=Stewart, William J.| title=Probability, Markov Chains, Queues, and Simulation: The Mathematical Basis of Performance Modeling|publisher=Princeton University Press|year=2011|isbn=978-1-4008-3281-1|page=105|url=https://books.google.com/books?id=ZfRyBS1WbAQC&amp;amp;pg=PT105}}&amp;lt;/ref&amp;gt;  Sometimes it is also known as the &amp;#039;&amp;#039;&amp;#039;discrete probability density function&amp;#039;&amp;#039;&amp;#039;. The probability mass function is often the primary means of defining a [[discrete probability distribution]], and such functions exist for either [[Scalar variable|scalar]] or [[multivariate random variable]]s whose [[Domain of a function|domain]] is discrete.&lt;br /&gt;
&lt;br /&gt;
A probability mass function differs from a [[probability density function|continuous probability density function]] (PDF) in that the latter is associated with continuous rather than discrete random variables. A continuous PDF must be [[integration (mathematics)|integrated]] over an interval to yield a probability.&amp;lt;ref name=&amp;quot;:0&amp;quot;&amp;gt;{{Cite book|title=A modern introduction to probability and statistics : understanding why and how|date=2005|publisher=Springer|others=Dekking, Michel, 1946-|isbn=978-1-85233-896-1|location=London|oclc=262680588}}&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The value of the random variable having the largest probability mass is called the [[mode (statistics)|mode]].&lt;br /&gt;
&lt;br /&gt;
==Formal definition==&lt;br /&gt;
A probability mass function is the probability distribution of a [[discrete random variable]], giving its possible values and their associated probabilities. It is the function &amp;lt;math&amp;gt;p: \R \to [0,1]&amp;lt;/math&amp;gt; defined by&lt;br /&gt;
{{Equation box 1&lt;br /&gt;
|indent = :&lt;br /&gt;
|title=&lt;br /&gt;
|equation = &amp;lt;math&amp;gt;p_X(x) = P(X = x)&amp;lt;/math&amp;gt;&lt;br /&gt;
|cellpadding= 6&lt;br /&gt;
|border&lt;br /&gt;
|border colour = #0073CF&lt;br /&gt;
|background colour=#F5FFFA}}&lt;br /&gt;
&lt;br /&gt;
for &amp;lt;math&amp;gt;-\infty &amp;lt; x &amp;lt; \infty&amp;lt;/math&amp;gt;,&amp;lt;ref name=&amp;quot;:0&amp;quot; /&amp;gt; where &amp;lt;math&amp;gt;P&amp;lt;/math&amp;gt; is a [[probability measure]]. &amp;lt;math&amp;gt;p_X(x)&amp;lt;/math&amp;gt; can also be simplified as &amp;lt;math&amp;gt;p(x)&amp;lt;/math&amp;gt;.&amp;lt;ref&amp;gt;{{Cite book|title=Engineering optimization : theory and practice| last=Rao | first = Singiresu S.|date=1996|publisher=Wiley|isbn=0-471-55034-5|edition=3rd|location=New York|oclc=62080932}}&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The probabilities associated with all (hypothetical) values must be non-negative and sum up to 1,&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;\sum_x p_X(x) = 1 &amp;lt;/math&amp;gt; and &amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt; p_X(x)\geq 0.&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Thinking of probability as mass helps to avoid mistakes, since physical mass is [[Conservation of mass|conserved]] just as the total probability over all hypothetical outcomes &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; is conserved.&lt;br /&gt;
&lt;br /&gt;
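The two requirements above, non-negativity and unit total mass, can be checked numerically for a finite distribution; a minimal sketch in Python, assuming the PMF is given as a dictionary mapping outcomes to probabilities (the function name is illustrative):

```python
import math

def is_valid_pmf(pmf):
    """Check that a finite PMF (dict: outcome -> probability) is
    non-negative and sums to 1, up to floating-point tolerance."""
    nonneg = all(p >= 0 for p in pmf.values())
    total = math.fsum(pmf.values())
    return nonneg and math.isclose(total, 1.0)

# A fair six-sided die: each face carries probability mass 1/6.
die = {face: 1/6 for face in range(1, 7)}
print(is_valid_pmf(die))  # True
```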
==Measure theoretic formulation==&lt;br /&gt;
A probability mass function of a discrete random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; can be seen as a special case of two more general measure theoretic constructions: &lt;br /&gt;
the [[probability distribution|distribution]] of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and the [[probability density function]] of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with respect to the [[counting measure]].  We make this more precise below.&lt;br /&gt;
&lt;br /&gt;
Suppose that &amp;lt;math&amp;gt;(A, \mathcal A, P)&amp;lt;/math&amp;gt; is a [[probability space]]&lt;br /&gt;
and that &amp;lt;math&amp;gt;(B, \mathcal B)&amp;lt;/math&amp;gt; is a measurable space whose underlying [[sigma algebra|σ-algebra]] is discrete, so in particular contains singleton sets of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;. In this setting, a random variable &amp;lt;math&amp;gt; X \colon A \to B&amp;lt;/math&amp;gt; is discrete provided its image is countable.&lt;br /&gt;
The [[pushforward measure]] &amp;lt;math&amp;gt;X_{*}(P)&amp;lt;/math&amp;gt;—called the distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in this context—is a probability measure on &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; whose restriction to singleton sets induces the probability mass function (as mentioned in the previous section) &amp;lt;math&amp;gt;f_X \colon B \to \mathbb R&amp;lt;/math&amp;gt; since &amp;lt;math&amp;gt;f_X(b)=P(X^{-1}(\{ b \}))=P(X=b)&amp;lt;/math&amp;gt; for each &amp;lt;math&amp;gt;b \in B&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Now suppose that &amp;lt;math&amp;gt;(B, \mathcal B, \mu)&amp;lt;/math&amp;gt; is a [[measure space]] equipped with the counting measure &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt;.  The probability density function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with respect to the counting measure, if it exists, is the [[Radon–Nikodym derivative]] of the pushforward measure of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; (with respect to the counting measure), so &amp;lt;math&amp;gt; f = d X_*P / d \mu&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a function from &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; to the non-negative reals.  As a consequence, for any &amp;lt;math&amp;gt;b \in B&amp;lt;/math&amp;gt; we have&lt;br /&gt;
&amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;P(X=b)=P(X^{-1}(\{ b \})) = X_*(P)(\{ b \}) = \int_{\{ b \}} f \, d \mu = f(b),&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
demonstrating that &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is in fact a probability mass function.&lt;br /&gt;
&lt;br /&gt;
When there is a natural order among the potential outcomes &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, it may be convenient to assign numerical values to them (or &amp;#039;&amp;#039;n&amp;#039;&amp;#039;-tuples in case of a discrete [[multivariate random variable]]) and to consider also values not in the [[Image (mathematics)|image]] of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. That is, &amp;lt;math&amp;gt;f_X&amp;lt;/math&amp;gt; may be defined for all [[real number]]s and &amp;lt;math&amp;gt;f_X(x)=0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;x \notin X(S)&amp;lt;/math&amp;gt; as shown in the figure.&lt;br /&gt;
&lt;br /&gt;
The image of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has a [[countable]] subset on which the probability mass function &amp;lt;math&amp;gt;f_X(x)&amp;lt;/math&amp;gt; sums to one. Consequently, the probability mass function is zero for all but a countable number of values of &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The discontinuity of probability mass functions is related to the fact that the [[cumulative distribution function]] of a discrete random variable is also discontinuous. If &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable, then &amp;lt;math&amp;gt; P(X = x) = 1&amp;lt;/math&amp;gt; means that the random event &amp;lt;math&amp;gt;(X = x)&amp;lt;/math&amp;gt; is certain (it occurs in 100% of the occurrences); conversely, &amp;lt;math&amp;gt;P(X = x) = 0&amp;lt;/math&amp;gt; means that the random event &amp;lt;math&amp;gt;(X = x)&amp;lt;/math&amp;gt; is impossible. This statement isn&amp;#039;t true for a [[continuous random variable]] &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, for which &amp;lt;math&amp;gt;P(X = x) = 0&amp;lt;/math&amp;gt; for any possible &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;. [[Discretization of continuous features|Discretization]] is the process of converting a continuous random variable into a discrete one.&lt;br /&gt;
&lt;br /&gt;
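The discretization step mentioned above can be illustrated by flooring a continuous variable to an integer; a hypothetical sketch (the uniform range and sample count are arbitrary choices):

```python
import random

random.seed(0)
# Discretize a continuous Uniform(0, 4) variable by taking the integer
# part, turning it into a discrete variable on {0, 1, 2, 3}. Each of the
# four outcomes then has probability mass 1/4.
samples = [int(random.uniform(0, 4)) for _ in range(100000)]
pmf = {k: samples.count(k) / len(samples) for k in range(4)}
print(pmf)  # each estimated mass is close to 0.25
```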
==Examples==&lt;br /&gt;
{{Main|Bernoulli distribution|Binomial distribution|Geometric distribution}}&lt;br /&gt;
&lt;br /&gt;
===Finite===&lt;br /&gt;
Three major distributions are associated: the [[Bernoulli distribution]], the [[binomial distribution]] and the [[geometric distribution]].&lt;br /&gt;
&lt;br /&gt;
*Bernoulli distribution: &amp;#039;&amp;#039;&amp;#039;Ber(p)&amp;#039;&amp;#039;&amp;#039; is used to model an experiment with only two possible outcomes, often encoded as 1 and 0. &amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;p_X(x) = \begin{cases}&lt;br /&gt;
p, &amp;amp; \text{if }x\text{ is 1} \\&lt;br /&gt;
1-p, &amp;amp; \text{if }x\text{ is 0}&lt;br /&gt;
\end{cases}&amp;lt;/math&amp;gt; An example of the Bernoulli distribution is tossing a coin. Suppose that &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is the sample space of all outcomes of a single toss of a [[fair coin]], and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is the random variable defined on &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; assigning 0 to the category &amp;quot;tails&amp;quot; and 1 to the category &amp;quot;heads&amp;quot;.  Since the coin is fair, the probability mass function is &amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;p_X(x) = \begin{cases}&lt;br /&gt;
\frac{1}{2}, &amp;amp;x = 0,\\&lt;br /&gt;
\frac{1}{2}, &amp;amp;x = 1,\\&lt;br /&gt;
0, &amp;amp;x \notin \{0, 1\}.&lt;br /&gt;
\end{cases}&amp;lt;/math&amp;gt;&lt;br /&gt;
* The binomial distribution models the number of successes in &amp;#039;&amp;#039;n&amp;#039;&amp;#039; independent draws with replacement, where each draw has two possible outcomes and success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;. The associated probability mass function is &amp;lt;math display=&amp;quot;inline&amp;quot;&amp;gt;\binom{n}{k} p^k (1-p)^{n-k}&amp;lt;/math&amp;gt;. [[Image:Fair dice probability distribution.svg|right|thumb|The probability mass function of a [[Dice|fair die]]. All the numbers on the die have an equal chance of appearing on top when the die stops rolling.]]{{pb}}An example of the binomial distribution is the probability of getting exactly one 6 when someone rolls a fair die three times.&lt;br /&gt;
* The geometric distribution describes the number of trials needed to obtain the first success. Its probability mass function is &amp;lt;math display=&amp;quot;inline&amp;quot;&amp;gt;p_X(k) = (1-p)^{k-1} p&amp;lt;/math&amp;gt;.{{pb}}An example is tossing a coin until the first &amp;quot;heads&amp;quot; appears; &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; denotes the probability of &amp;quot;heads&amp;quot; on a single toss, and &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; denotes the number of tosses required. {{pb}}Other distributions that can be modeled using a probability mass function are the [[categorical distribution]] (also known as the generalized Bernoulli distribution) and the [[multinomial distribution]].&lt;br /&gt;
* If a discrete distribution has two or more categories, exactly one of which occurs in a single trial (draw), it is a categorical distribution, whether or not the categories have a natural ordering.&lt;br /&gt;
* An example of a [[Joint probability distribution|multivariate discrete distribution]], and of its probability mass function, is provided by the [[multinomial distribution]]. Here the multiple random variables are the numbers of successes in each of the categories after a given number of trials, and each non-zero probability mass gives the probability of a certain combination of numbers of successes in the various categories.&lt;br /&gt;
{{clear}}&lt;br /&gt;
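The three probability mass functions above can be evaluated directly from their formulas; a short sketch in Python using only the standard library (function names are illustrative):

```python
from math import comb

def bernoulli_pmf(x, p):
    """P(X = x) for a Bernoulli(p) variable, x in {0, 1}."""
    return p if x == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success prob. p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p)**(k - 1) * p

# Probability of exactly one 6 in three rolls of a fair die:
print(binomial_pmf(1, 3, 1/6))  # 3 * (1/6) * (5/6)**2, about 0.3472
```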
&lt;br /&gt;
===Infinite===&lt;br /&gt;
The following exponentially declining distribution is an example of a distribution with an infinite number of possible outcomes—all the positive integers: &amp;lt;math display=&amp;quot;block&amp;quot;&amp;gt;\text{Pr}(X=i)= \frac{1}{2^i}\qquad \text{for } i=1, 2, 3, \dots &amp;lt;/math&amp;gt; Despite the infinite number of possible outcomes, the total probability mass is 1/2 + 1/4 + 1/8 + ⋯ = 1, satisfying the unit total probability requirement for a probability distribution.&lt;br /&gt;
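The convergence of this infinite total mass to 1 can be seen by computing partial sums, which approach 1 geometrically; a small illustrative sketch:

```python
# Partial sums of Pr(X = i) = 1/2**i over i = 1..n approach 1,
# since the remaining tail mass after n terms is exactly 1/2**n.
def partial_sum(n):
    return sum(1 / 2**i for i in range(1, n + 1))

print(partial_sum(5))   # 0.96875
print(partial_sum(20))  # 0.9999990463256836
```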
&lt;br /&gt;
==Multivariate case==&lt;br /&gt;
{{Main|Joint probability distribution}}&lt;br /&gt;
&lt;br /&gt;
Two or more discrete random variables have a joint probability mass function, which gives the probability of each possible combination of realizations for the random variables.&lt;br /&gt;
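A joint probability mass function can be tabulated as a mapping from tuples of realizations to probabilities; a minimal sketch for two independent fair coin tosses (the encoding 0 = tails, 1 = heads is a choice made here):

```python
from itertools import product

# Joint PMF of two independent fair coin tosses (0 = tails, 1 = heads).
# Independence means each joint mass is the product of the marginals.
joint = {(x, y): 0.5 * 0.5 for x, y in product([0, 1], repeat=2)}

print(joint[(1, 1)])        # 0.25
print(sum(joint.values()))  # 1.0, as required of any PMF
```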
&lt;br /&gt;
==References==&lt;br /&gt;
{{reflist}}&lt;br /&gt;
&lt;br /&gt;
==Further reading==&lt;br /&gt;
*{{cite book |last=Johnson |first=N. L. |last2=Kotz |first2=S. |last3=Kemp |first3=A. |year=1993 |title=Univariate Discrete Distributions |url=https://archive.org/details/univariatediscre00john_205 |url-access=limited |edition=2nd |publisher=Wiley |isbn=0-471-54897-9 |page=[https://archive.org/details/univariatediscre00john_205/page/n28 36] }}&lt;br /&gt;
&lt;br /&gt;
{{Theory of probability distributions}}&lt;br /&gt;
{{Authority control}}&lt;br /&gt;
&lt;br /&gt;
[[Category:Types of probability distributions]]&lt;/div&gt;</summary>
		<author><name>imported&gt;Hellacioussatyr</name></author>
	</entry>
</feed>