Continuous Distributions

Continuous distributions

In this section we discuss the set of continuous distributions available in AIMMS.

The three distributions with both lower and upper bound are

  • the Uniform distribution,

  • the Triangular distribution, and

  • the Beta distribution.

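The five distributions with only a lower bound are

  • the LogNormal distribution,

  • the Exponential distribution,

  • the Gamma distribution,

  • the Weibull distribution, and

  • the Pareto distribution.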

The three unbounded distributions are

  • the Normal distribution,

  • the Logistic distribution, and

  • the Extreme Value distribution.

Parameters of continuous distributions

Every parameter of a continuous distribution can be characterized as either a shape parameter \(\beta\), a location parameter \(l\), or a scale parameter \(s\). While the presence and meaning of a shape parameter is usually distribution-dependent, location and scale parameters find their origin in the common transformation

\[x \mapsto \frac{x-l}{s}\]

to shift and stretch a given distribution. By choosing \(l=0\) and \(s=1\) the standard form of a distribution is obtained. If a certain distribution has \(n\) shape parameters (\(n \geq 0\)), these shape parameters will be passed as the first \(n\) parameters to AIMMS. The shape parameters are then followed by two optional parameters, with default values 0 and 1 respectively. For double-bounded distributions these two optional parameters can be interpreted as a lower and upper bound (the value of the location parameter \(l\) for these distributions is equal to the lower bound and the value of the scale parameter \(s\) is equal to the difference between the upper and lower bound). For single-bounded distributions the bound value is often used as the location parameter \(l\). In this section, whenever the location parameter can be interpreted as a mean value or whenever the scale parameter can be interpreted as the deviation of a distribution, these more meaningful names are used to refer to the parameters. Note that the LogNormal, Gamma and Exponential distributions are distributions that will mostly be used with location parameter equal to 0.
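The relation between the standard form and a shifted and stretched distribution can be sketched in a few lines of Python. This is only an illustration of the transformation above, not AIMMS code; the helper names `from_standard` and `bounds_to_location_scale` are hypothetical.

```python
import random

def from_standard(z, location=0.0, scale=1.0):
    """Map a draw z from the standard form (l=0, s=1) to X(l, s) = l + s*z.

    This inverts the transformation x -> (x - l) / s.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return location + scale * z

def bounds_to_location_scale(lower, upper):
    """For double-bounded distributions: l = lower bound, s = upper - lower."""
    if not lower < upper:
        raise ValueError("lower bound must be smaller than upper bound")
    return lower, upper - lower

rng = random.Random(0)
l, s = bounds_to_location_scale(10.0, 14.0)
# rng.random() plays the role of a standard Uniform(0,1) draw.
samples = [from_standard(rng.random(), l, s) for _ in range(1000)]
```

All values in `samples` lie between the chosen bounds 10 and 14, because the standard draw lies in the unit interval.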

Transformation to standard form

When transforming a distribution to standard form, distribution operators change. The section Scaling of Statistical Operators gives the relationships between distribution operators working on random variables \(X(l,s)\) and \(X(0,1)\).

Units of measurement

When a random variable representing some real-life quantity with a given unit of measurement (see also Units of Measurement) is distributed according to a particular distribution, some parameters of that distribution are also naturally expressed in terms of this same unit while other parameters are expected to be unitless. In particular, the location and scale parameters of a distribution are measured in the same unit of measurement as the corresponding random variable, while shape parameters (within AIMMS) are implemented as unitless parameters.

Unit notation in this appendix

When you use a distribution function, AIMMS will perform a unit consistency check on its parameters and result, whenever your model contains one or more QUANTITY declarations. In the description of the continuous distributions below, the expected units of the distribution parameters are denoted in square brackets. Throughout the sequel, [\(x\)] denotes that the parameter should have the same unit of measurement as the random variable \(X\) and [-] denotes that a parameter should be unitless.

A commonly used distribution

In practice, the Normal distribution is used quite frequently. Such widespread use is due to a number of pleasant properties:

  • the Normal distribution has no shape parameters and is symmetrical,

  • random values are more likely as they are closer to the mean value,

  • it can be directly evaluated for any given mean and standard deviation because it is fully specified through the mean and standard deviation parameter,

  • it can be used as a good approximation for distributions on a finite interval, because its probability density is declining fast enough (when moving away from the mean),

  • the mean and sum of any number of independent Normal distributions are themselves Normally distributed, and thus have the same shape, and

  • the mean and sum of a large number of independent distributions with finite variance are approximately Normally distributed (the central limit theorem).
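The last property can be checked with a small simulation. The sketch below averages draws from a standard Uniform distribution (an arbitrary choice for illustration) and compares the result with what a Normal distribution would predict; the sample sizes and the seed are assumptions made for this example.

```python
import random
import statistics

rng = random.Random(42)

def mean_of_uniforms(n):
    """Mean of n independent standard Uniform(0,1) draws."""
    return sum(rng.random() for _ in range(n)) / n

n = 30
means = [mean_of_uniforms(n) for _ in range(20000)]

center = statistics.fmean(means)
spread = statistics.stdev(means)
# The standard Uniform has variance 1/12, so the mean of n draws
# has standard deviation sqrt(1 / (12 n)).
expected_spread = (1.0 / 12.0 / n) ** 0.5
# For a Normal distribution roughly 68.27% of the values lie within
# one standard deviation of the mean.
within_one_sd = sum(abs(m - 0.5) <= expected_spread for m in means) / len(means)
```

The fraction `within_one_sd` should come out close to 0.68, even though the individual draws are Uniform rather than Normal.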

Distributions for double bounded variables

For random variables that have a known lower and upper bound, AIMMS provides three continuous distributions on a finite interval: the Uniform, Triangular and Beta distribution. The Uniform (no shape parameters) and Triangular (one shape parameter) distributions should be sufficient for most experiments. For all remaining experiments, the user might consider the highly configurable Beta (two shape parameters) distribution.

Distributions for single bounded variables

When your random variable only has a single bound, you should first check whether the Gamma distribution can be used or whether the Normal distribution is accurate enough. The LogNormal distribution should be considered if the most likely value is near but not at the bound. The Weibull or Gamma distribution (\(\beta>1\)), or even the ExtremeValue distribution are alternatives, while the Weibull or Gamma distribution (\(\beta \leq 1\)) or Pareto distribution should be considered if the bound is the most likely value.

The Gamma distribution

The Gamma (and as a special case thereof the Exponential) distribution is widely used for its special meaning. It answers the question: how long does it take for a success to occur, when you only know the average number of occurrences (like in the Poisson distribution). The Exponential distribution gives the time to the first occurrence, and its generalization, the Gamma(\(\beta\)) distribution gives the time to the \(\beta\)-th occurrence. Note that the sum of a Gamma(\(\beta_1,l_1,s\)) and Gamma(\(\beta_2,l_2,s\)) distribution has a Gamma(\(\beta_1+\beta_2,l_1+l_2,s\)) distribution.
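The interpretation of the Gamma distribution as the waiting time until the \(\beta\)-th occurrence can be illustrated by summing Exponential waiting times. The following Python sketch uses only the standard library; the values of `k` and `s` and the helper name are assumptions chosen for the example.

```python
import random
import statistics

rng = random.Random(1)

def time_to_kth_occurrence(k, s):
    """Sum of k independent Exponential(0, s) waiting times."""
    # random.expovariate takes the rate, i.e. 1 / (mean waiting time).
    return sum(rng.expovariate(1.0 / s) for _ in range(k))

k, s = 3, 2.0
waits = [time_to_kth_occurrence(k, s) for _ in range(50000)]

sample_mean = statistics.fmean(waits)
sample_var = statistics.variance(waits)
# A Gamma(k, 0, s) distribution has mean k*s and variance k*s^2.
theory_mean = k * s
theory_var = k * s ** 2
```

The sample moments should agree with the Gamma(3, 0, 2) values up to sampling noise.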

The LogNormal distribution

If you assume the logarithm of a variable to be Normally distributed, the variable itself is LogNormal-distributed. As a result, it can be shown that the chance of an outcome in the interval \([x \!\cdot\! c_1,x \!\cdot\! c_2]\) is equal to the chance of an outcome in the interval \([x/c_2,x/c_1]\), where \(x\) is the median of the distribution. This might be a reasonable assumption in price developments, for example.
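This multiplicative symmetry holds around the median \(e^{\mu}\) of the variable, where \(\mu\) is the mean of its logarithm. A minimal numerical check, with arbitrarily chosen values for `mu`, `sigma`, `c1` and `c2`:

```python
import math

def lognormal_cdf(x, mu, sigma):
    """CDF of a variable whose logarithm is Normal(mu, sigma)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 0.3, 0.8
median = math.exp(mu)          # the point x around which the symmetry holds
c1, c2 = 1.2, 2.5

# Probability of ending up a factor c1..c2 above the median ...
p_up = lognormal_cdf(median * c2, mu, sigma) - lognormal_cdf(median * c1, mu, sigma)
# ... equals the probability of ending up a factor c1..c2 below it.
p_down = lognormal_cdf(median / c1, mu, sigma) - lognormal_cdf(median / c2, mu, sigma)
```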

The Uniform distribution

../../_images/continuous-distributions-pspic1.svg

The Uniform(min,max) distribution:

Input parameters

min [\(x\)], max [\(x\)]

Input check

\({min} < {max}\)

Permitted values

\(\{ x \; | \; {min} \leq x \leq {max} \}\)

Standard density

\(f_{(0,1)}(x) = 1\)

Mean

\(1/2\)

Variance

\(1/12\)

In the Uniform distribution all values of the random variable occur between a fixed minimum and a fixed maximum with equal likelihood. It is quite common to use the Uniform distribution when you have little knowledge about an uncertain parameter in your model except that its value has to lie anywhere within fixed bounds. For instance, after talking to a few appraisers you might conclude that their single appraisals of your property vary anywhere between a fixed pessimistic and a fixed optimistic value.
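The mean and variance listed above refer to the standard form on the unit interval. For general bounds they follow from the location and scale rules, as in this small sketch; the appraisal values are hypothetical.

```python
def uniform_mean_variance(lower, upper):
    """Mean and variance of Uniform(lower, upper).

    The standard form has mean 1/2 and variance 1/12; shifting by
    l = lower and stretching by s = upper - lower gives the result.
    """
    if not lower < upper:
        raise ValueError("lower bound must be smaller than upper bound")
    scale = upper - lower
    return lower + scale * 0.5, scale ** 2 / 12.0

# Hypothetical pessimistic and optimistic appraisals (in thousands).
mean, variance = uniform_mean_variance(200.0, 260.0)
```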

The Triangular distribution

../../_images/continuous-distributions-pspic2.svg

The Triangular(\(\beta\),min,max) distribution:

Input parameters

shape \(\beta\) [-], min [\(x\)], max [\(x\)]

Input check

\({min} < {max}, \; 0 < \beta < 1\)

Permitted values

\(\{ x \; | \; {min} \leq x \leq {max} \}\)

Standard density

\(f_{(\beta,0,1)}(x) = \begin{cases} 2 x / \beta & \text{for $0 \leq x \leq \beta$} \\ 2 (1-x)/(1-\beta) & \text{for $\beta<x \leq1$} \end{cases}\)

Mean

\((\beta+1)/3\)

Variance

\((1-\beta+\beta^2)/18\)

Remarks

The shape parameter \(\beta\) indicates the position of the peak in relation to the range, i.e. \(\beta = \frac{{peak}-{min}}{{max}-{min}}\)

In the Triangular distribution all values of the random variable occur between a fixed minimum and a fixed maximum, but not with equal likelihood as in the Uniform distribution. Instead, there is a most likely value, and its position is not necessarily in the middle of the interval. It is quite common to use the Triangular distribution when you have little knowledge about an uncertain parameter in your model except that its value has to lie anywhere within fixed bounds and that there is a most likely value. For instance, assume that a few appraisers each quote an optimistic as well as a pessimistic value of your property. Summarizing their input you might conclude that their quotes provide not only a well-defined interval but also an indication of the most likely value of your property.
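To make the role of the shape parameter concrete, the sketch below derives \(\beta\) from a most likely value and compares the resulting mean with a simulation. It uses Python's `random.triangular(low, high, mode)` rather than AIMMS, and the appraisal values are hypothetical.

```python
import random
import statistics

def triangular_shape(lower, peak, upper):
    """Shape parameter beta = (peak - min) / (max - min)."""
    return (peak - lower) / (upper - lower)

def triangular_mean(beta, lower, upper):
    """Mean of Triangular(beta, lower, upper): standard mean (beta+1)/3, rescaled."""
    return lower + (upper - lower) * (beta + 1.0) / 3.0

lower, peak, upper = 200.0, 215.0, 260.0   # hypothetical appraisals
beta = triangular_shape(lower, peak, upper)
mean = triangular_mean(beta, lower, upper)

rng = random.Random(7)
draws = [rng.triangular(lower, upper, peak) for _ in range(50000)]
sample_mean = statistics.fmean(draws)
```

Here the peak lies a quarter of the way into the interval, so the mean is pulled below the midpoint of 230.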

The Beta distribution

../../_images/continuous-distributions-pspic3.svg
../../_images/continuous-distributions-pspic4.svg

The Beta(\(\alpha\),\(\beta\),min,max) distribution:

Input parameters

shape \(\alpha\) [-], shape \(\beta\) [-], min [\(x\)], max [\(x\)]

Input check

\(\alpha > 0, \beta > 0, {min} < {max}\)

Permitted values

\(\{x \; | \; {min} < x < {max} \}\)

Standard density

\(f_{(\alpha,\beta,0,1)}(x) = \frac{1}{B(\alpha,\beta)} x^{\alpha - 1} (1-x)^{\beta - 1}, \; \text{where $B(\alpha,\beta)$ is the Beta function}\)

Mean

\(\alpha/(\alpha+\beta)\)

Variance

\(\alpha\beta(\alpha+\beta)^{-2}(\alpha+\beta+1)^{-1}\)

Remarks

\(\texttt{Beta}(1,1,{min},{max}) = \texttt{Uniform}({min},{max})\)

The Beta distribution is a very flexible distribution whose two shape parameters allow for a good approximation of almost any distribution on a finite interval. The distribution can be made symmetrical, positively skewed, negatively skewed, etc. It has been used to describe empirical data and predict the random behavior of percentages and fractions. Note that for \(\alpha<1\) a singularity occurs at \(x=\text{{min}}\) and for \(\beta<1\) at \(x=\text{{max}}\).
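As an illustration of how the bounds rescale the standard Beta distribution, the following sketch draws from a Beta distribution on an arbitrary interval using Python's `random.betavariate` and compares the sample mean with the rescaled formula. The parameter values and helper names are assumptions for this example.

```python
import random
import statistics

def beta_draw(rng, alpha, beta, lower, upper):
    """Draw from Beta(alpha, beta, lower, upper) by rescaling a standard draw."""
    return lower + (upper - lower) * rng.betavariate(alpha, beta)

def beta_mean_variance(alpha, beta, lower, upper):
    """Mean and variance obtained from the standard form via location and scale."""
    scale = upper - lower
    mean = lower + scale * alpha / (alpha + beta)
    variance = scale ** 2 * alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    return mean, variance

rng = random.Random(3)
# alpha < beta gives a positively skewed distribution on [0, 100].
draws = [beta_draw(rng, 2.0, 5.0, 0.0, 100.0) for _ in range(50000)]
sample_mean = statistics.fmean(draws)
mean, variance = beta_mean_variance(2.0, 5.0, 0.0, 100.0)

# With alpha = beta = 1 the moments reduce to those of the Uniform distribution.
uniform_like = beta_mean_variance(1.0, 1.0, 0.0, 1.0)
```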

The LogNormal distribution

../../_images/continuous-distributions-pspic5.svg

The LogNormal(\(\beta\),min,s) distribution:

Input parameters

shape \(\beta\) [-], lowerbound min [\(x\)] and scale \(s\) [\(x\)]

Input check

\(\beta > 0 \; \mbox{and} \; s > 0\)

Permitted values

\(\{ x \; | \; {min} < x < \infty \}\)

Standard density

\(f_{(\beta,0,1)}(x) = \frac{1}{\sqrt{2 \pi \ln(\beta^2+1)} \; x} \, e^{-\frac{\ln^2(x^2(\beta^2+1))}{8 \ln(\beta^2+1)}}\)

Mean

\(1\)

Variance

\(\beta^2\)

If you assume the logarithm of the variable to be Normal(\(\mu,\sigma\))-distributed, then the variable itself is LogNormal(\(\sqrt{e^{\sigma^2}\!\! - \!\! 1},0,e^{\mu + \sigma^2/2}\))-distributed, because such a variable has mean \(e^{\mu+\sigma^2/2}\) and the standard form has mean 1. This parameterization is used for its simple expressions for mean and variance. A typical example is formed by real estate prices and stock prices. Neither can drop below zero, but both can grow to be very high. However, most values tend to stay within a particular range. You can usually form some expected value of a real estate price or a stock price, and estimate the standard deviation of the prices on the basis of historical data.
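Because the standard form has mean 1 and variance \(\beta^2\), a LogNormal(\(\beta\),min,\(s\)) variable has mean min \(+\, s\) and standard deviation \(s\beta\). The sketch below generates such a variable from a standard Normal draw and checks both moments by simulation. It is a Python illustration under these assumptions, not the AIMMS implementation; the parameter values are hypothetical.

```python
import math
import random
import statistics

def lognormal_draw(rng, beta, lower, scale):
    """Draw from LogNormal(beta, lower, scale) with standard mean 1, variance beta^2."""
    sigma2 = math.log(beta ** 2 + 1.0)     # variance of the underlying Normal
    sigma = math.sqrt(sigma2)
    # exp(sigma*Z - sigma2/2) has mean 1 and variance exp(sigma2) - 1 = beta^2.
    standard = math.exp(sigma * rng.gauss(0.0, 1.0) - sigma2 / 2.0)
    return lower + scale * standard

rng = random.Random(11)
beta, lower, scale = 0.5, 0.0, 300.0       # hypothetical price model
draws = [lognormal_draw(rng, beta, lower, scale) for _ in range(100000)]

sample_mean = statistics.fmean(draws)
sample_sd = statistics.stdev(draws)
```

With these values the expected price is 300 and the standard deviation is 150, that is, \(\beta\) acts as the ratio of standard deviation to mean when the lower bound is 0.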

The Exponential distribution

../../_images/continuous-distributions-pspic6.svg

The Exponential(min,\(s\)) distribution:

Input parameters

lowerbound min [\(x\)] and scale \(s\) [\(x\)]

Input check

\(s > 0\)

Permitted values

\(\{ x \; | \; {min} \leq x < \infty \}\)

Standard density

\(f_{(0,1)}(x) = e^{-x}\)

Mean

\(1\)

Variance

\(1\)

Remarks

Exponential (min, \(s\)) = Gamma (1, min, \(s\)), Exponential (min, \(s\)) = Weibull (1, min, \(s\))

Assume that you are observing a sequence of independent events with a constant chance of occurring in time (in accordance with the Poisson distribution), with \(s\) being the average time between occurrences. The Exponential(\(0,s\)) distribution answers the question: how long do you need to wait until you observe the first occurrence of an event? Typical examples are time between failures of equipment, and time between arrivals of customers at a service desk (bank, hospital, etc.).
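Waiting-time questions of this kind can be answered directly from the cumulative distribution function \(1 - e^{-(x-{min})/s}\), which follows from the standard density above. A minimal Python sketch, with a hypothetical average of 8 hours between equipment failures:

```python
import math

def exponential_cdf(x, lower, scale):
    """P(X <= x) for Exponential(lower, scale)."""
    if x <= lower:
        return 0.0
    return 1.0 - math.exp(-(x - lower) / scale)

def exponential_quantile(p, lower, scale):
    """Waiting time that is not exceeded with probability p (0 <= p < 1)."""
    return lower - scale * math.log(1.0 - p)

s = 8.0                                    # hypothetical mean hours between failures
p_within_4h = exponential_cdf(4.0, 0.0, s)      # chance of a failure within 4 hours
median_wait = exponential_quantile(0.5, 0.0, s)  # waiting time exceeded half the time
```

Note that the median waiting time, \(s \ln 2\), is shorter than the mean \(s\), which reflects the skewness of the distribution.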

The Gamma distribution

../../_images/continuous-distributions-pspic7.svg

The Gamma(\(\beta\),min,\(s\)) distribution:

Input parameters

shape \(\beta\) [-], lowerbound min [\(x\)] and scale \(s\) [\(x\)]

Input check

\(s > 0 \; \mbox{and} \; \beta > 0\)

Permitted values

\(\{x \; | \; {min} < x < \infty\}\)

Standard density

\(f_{(\beta,0,1)}(x) = x^{\beta - 1} e^{-x} / \Gamma(\beta), \; \text{where $\Gamma(\beta)$ is the Gamma function}\)

Mean

\(\beta\)

Variance

\(\beta\)

The Gamma distribution answers the question: how long do you need to wait until you observe the \(\beta\)-th occurrence of an event (instead of the first occurrence as in the Exponential distribution)? Note that it is possible to use non-integer values for \(\beta\) and a location parameter. In these cases there is no natural interpretation of the distribution, and for \(\beta<1\) a singularity exists at \(x={min}\), so one should be very careful in using the Gamma distribution this way.
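The density of the shifted and stretched distribution is obtained from the standard density by evaluating it at \((x-{min})/s\) and dividing by \(s\). The Python sketch below does this for the Gamma distribution and shows how the density grows without bound near the lower bound when \(\beta<1\). The function name is hypothetical.

```python
import math

def gamma_density(x, beta, lower=0.0, scale=1.0):
    """Density of Gamma(beta, lower, scale) built from the standard density."""
    z = (x - lower) / scale
    if z <= 0.0:
        return 0.0
    standard = z ** (beta - 1.0) * math.exp(-z) / math.gamma(beta)
    return standard / scale

# beta = 1 reduces to the Exponential density e^{-x}.
exponential_case = gamma_density(1.0, 1.0)

# beta < 1: the density increases sharply as x approaches the lower bound.
near_bound = gamma_density(1e-6, 0.5)
further_away = gamma_density(1e-3, 0.5)
```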

The Weibull distribution

../../_images/continuous-distributions-pspic8.svg

The Weibull(\(\beta\),min,\(s\)) distribution:

Input parameters

shape \(\beta\) [-], lowerbound min [\(x\)] and scale \(s\) [\(x\)]

Input check

\(\beta > 0 \; \mbox{and} \; s > 0\)

Permitted values

\(\{x \; | \; {min} \leq x < \infty\}\)

Standard density

\(f_{(\beta,0,1)}(x) = \beta x^{\beta - 1} e^{-x^\beta}\)

Mean

\(\Gamma(1+1/\beta)\)

Variance

\(\Gamma(1+2/\beta)-\Gamma^2(1+1/\beta)\)

The Weibull distribution is another generalization of the Exponential distribution. It has been successfully used to describe failure time in reliability studies, and the breaking strengths of items in quality control testing. By using a value of the shape parameter that is less than 1, the Weibull distribution becomes steeply declining and could be of interest to a manufacturer testing failures of items during their initial period of use. Note that in that case there is a singularity at \(x={min}\).
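The mean and variance expressions above are easy to evaluate with a Gamma function routine. The following Python sketch applies them for general location and scale, and confirms that \(\beta=1\) gives the Exponential values; the function name is an assumption made for this example.

```python
import math

def weibull_mean_variance(beta, lower=0.0, scale=1.0):
    """Mean and variance of Weibull(beta, lower, scale)."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    # Standard moments are rescaled with the location and scale parameters.
    return lower + scale * g1, scale ** 2 * (g2 - g1 ** 2)

# beta = 1 reproduces the standard Exponential distribution (mean 1, variance 1).
exp_mean, exp_var = weibull_mean_variance(1.0)

# beta = 2: the mean of the standard form is Gamma(1.5) = sqrt(pi) / 2.
mean_2, var_2 = weibull_mean_variance(2.0)
```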

The Pareto distribution

../../_images/continuous-distributions-pspic9.svg

The Pareto(\(\beta\),\(l\),\(s\)) distribution:

Input parameters

shape \(\beta\) [-], location \(l\) [\(x\)] and scale \(s\) [\(x\)]

Input check

\(s > 0 \; \mbox{and} \; \beta > 0\)

Permitted values

\(\{ x \; | \; l+s < x < \infty \}\)

Standard density

\(f_{(\beta,0,1)}(x) = \beta / x^{\beta + 1}\)

Mean

\(\mbox{for } \beta>1:\; \beta/(\beta-1), \infty \text{ otherwise}\)

Variance

\(\mbox{for } \beta>2:\; \beta(\beta-1)^{-2}(\beta-2)^{-1}, \infty \text{ otherwise}\)

The Pareto distribution has been used to describe the sizes of such phenomena as human population, companies, incomes, stock fluctuations, etc.
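The standard density \(\beta/x^{\beta+1}\) on \(x>1\) corresponds to the cumulative distribution \(1-x^{-\beta}\), which can be inverted in closed form. The Python sketch below uses this to compute quantiles for general location and scale, together with the mean; the function names are hypothetical.

```python
import math

def pareto_quantile(p, beta, location=0.0, scale=1.0):
    """Value not exceeded with probability p (0 <= p < 1) for Pareto(beta, l, s)."""
    # Invert the standard CDF 1 - x^(-beta), then shift and stretch.
    return location + scale * (1.0 - p) ** (-1.0 / beta)

def pareto_mean(beta, location=0.0, scale=1.0):
    """Mean of Pareto(beta, l, s); infinite when beta <= 1."""
    if beta <= 1.0:
        return math.inf
    return location + scale * beta / (beta - 1.0)

smallest_value = pareto_quantile(0.0, 2.0, 3.0, 2.0)   # equals l + s
standard_median = pareto_quantile(0.5, 1.0)
heavy_tail_mean = pareto_mean(1.0)
finite_mean = pareto_mean(3.0)
```

The quantile at probability 0 recovers the lower end \(l+s\) of the permitted values.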

The Normal distribution

../../_images/continuous-distributions-pspic10.svg

The Normal(\(\mu\),\(\sigma\)) distribution:

Input parameters

Mean \(\mu\) [\(x\)] and standard deviation \(\sigma\) [\(x\)]

Input check

\(\sigma > 0\)

Permitted values

\(\{ x \; | \; -\infty < x < \infty \}\)

Standard density

\(f_{(0,1)}(x) = e^{-x^2/2}/\sqrt{2 \pi}\)

Mean

\(0\)

Variance

\(1\)

Remarks

Location \(\mu\), scale \(\sigma\)

The Normal distribution is frequently used in practical applications as it describes many phenomena observed in real life. Typical examples are attributes such as length, IQ, etc. Note that while the values in these examples are naturally bounded, a close fit between such data values and normally distributed values is quite common in practice, because the likelihood of extreme values away from the mean is essentially zero in the Normal distribution.
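How quickly the likelihood of extreme values vanishes can be quantified with the complementary error function. The sketch below computes the probability that a Normal variable lies more than \(k\) standard deviations from its mean, for a few illustrative values of \(k\).

```python
import math

def normal_two_sided_tail(k):
    """P(|X - mu| > k * sigma) for a Normal(mu, sigma) variable."""
    return math.erfc(k / math.sqrt(2.0))

tails = {k: normal_two_sided_tail(k) for k in (1, 2, 3, 4, 5)}
```

About 4.6% of the values lie more than two standard deviations from the mean, and fewer than one in ten thousand lie more than four standard deviations away.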

The Logistic distribution

../../_images/continuous-distributions-pspic11.svg

The Logistic(\(\mu\),\(s\)) distribution:

Input parameters

mean \(\mu\) [\(x\)] and scale \(s\) [\(x\)]

Input check

\(s > 0\)

Permitted values

\(\{x \; | \; -\infty < x < \infty \}\)

Standard density

\(f_{(0,1)}(x) = ( e^x + e^{-x} + 2 )^{-1}\)

Mean

\(0\)

Variance

\(\pi^2/3\)

The Logistic distribution has been used to describe growth of a population over time, chemical reactions, and similar processes. Extreme values are more common than in the somewhat similar Normal distribution.
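The heavier tails can be made visible by comparing both distributions at the same number of standard deviations from the mean. The Python sketch below uses the standard Logistic cumulative distribution \(1/(1+e^{-x})\), which corresponds to the density above, and the standard deviation \(\pi/\sqrt{3}\) that follows from the listed variance.

```python
import math

def logistic_two_sided_tail(k):
    """P(|X - mu| > k standard deviations) for a Logistic variable."""
    x = k * math.pi / math.sqrt(3.0)       # k standard deviations in standard form
    return 2.0 / (1.0 + math.exp(x))

def normal_two_sided_tail(k):
    """Same probability for a Normal variable."""
    return math.erfc(k / math.sqrt(2.0))

comparison = {k: (logistic_two_sided_tail(k), normal_two_sided_tail(k))
              for k in (2, 3, 4)}
```

At four standard deviations the Logistic tail probability is more than ten times the Normal one.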

The Extreme Value distribution

../../_images/continuous-distributions-pspic12.svg

The Extreme Value(\(l\),\(s\)) distribution:

Input parameters

Location \(l\) [\(x\)] and scale \(s\) [\(x\)]

Input check

\(s > 0\)

Permitted values

\(\{ x \; | \; -\infty < x < \infty \}\)

Standard density

\(f_{(0,1)}(x) = e^{-x} e^{-e^{-x}}\)

Mean

\(\gamma=0.5772\dots\mbox{ (Euler's constant)}\)

Variance

\(\pi^2/6\)

Remarks

Extreme Value distributions have been used to describe the largest values of phenomena observed over time: water levels, rainfall, etc. Other applications include material strength, construction design or any other application in which extreme values are of interest. In the literature, the Extreme Value distribution that is provided by AIMMS is known as a type 1 Gumbel distribution.
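The following Python sketch draws from a type 1 Gumbel distribution for largest values by inverting its cumulative distribution. It assumes the standard form \(F(x) = e^{-e^{-x}}\), which is the form consistent with the mean \(\gamma\) and variance \(\pi^2/6\) listed above; whether AIMMS samples in this way is not stated here, and the parameter values are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def extreme_value_draw(rng, location=0.0, scale=1.0):
    """Draw from the Gumbel (maximum) distribution by inverse transform sampling."""
    u = rng.random()
    while u == 0.0:                        # avoid log(0)
        u = rng.random()
    # Assumed standard CDF: F(x) = exp(-exp(-x)), so x = -ln(-ln(u)).
    return location - scale * math.log(-math.log(u))

rng = random.Random(5)
location, scale = 10.0, 2.0
draws = [extreme_value_draw(rng, location, scale) for _ in range(100000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
theory_mean = location + scale * EULER_GAMMA
theory_var = scale ** 2 * math.pi ** 2 / 6.0
```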