R-Package
"distr"
What is "distr" meant for?
The aim of package "distr" is to provide a conceptual treatment of random variables (r.v.'s) by means of S4 classes. A mother class "Distribution"
is introduced with slots for a parameter and, most importantly, for the
four constitutive methods "r", "d", "p", and "q" for simulation and for
evaluation of the density, the c.d.f., and the quantile function
of the corresponding distribution, respectively. All distributions of the "stats" package for which
corresponding r<distr.name>, d<distr.name>, p<distr.name>, and q<distr.name>
functions exist (like normal, Poisson, etc.) are implemented as
subclasses of either "AbscontDistribution" or "DiscreteDistribution",
which themselves are again subclasses of "Distribution".
This approach seems very appealing to us from a conceptual viewpoint:
just pass an object of some derived
distribution class to a generic function as argument and let the
dispatching mechanism decide what to do at run time.
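As a brief illustration of the four slots and the class hierarchy described above, the following sketch applies the accessors r(), d(), p(), and q() to a normal distribution object. It assumes that package "distr" is installed; the parameter values and variable names are chosen for illustration only.

```r
# Sketch only: runs the example if package "distr" is installed.
if (requireNamespace("distr", quietly = TRUE)) {
  suppressPackageStartupMessages(library("distr"))

  N <- Norm(mean = 1, sd = 2)   # an "AbscontDistribution"
  P <- Pois(lambda = 3)         # a "DiscreteDistribution"

  set.seed(123)
  x  <- r(N)(5)     # simulate 5 observations
  dN <- d(N)(1)     # density at 1
  pN <- p(N)(1)     # c.d.f. at the mean, i.e. 0.5
  qN <- q(N)(0.5)   # median, i.e. 1

  # Both objects inherit from the mother class "Distribution".
  is(N, "AbscontDistribution")
  is(P, "DiscreteDistribution")
  is(N, "Distribution")
}
```

Each accessor returns a function, so r(N)(5) reads as "take the r-slot of N and call it with n = 5".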
As an example, we may automatically generate new objects of these
classes with corresponding "r",
"d", "p", and "q"-slots for the laws of
r.v.'s under standard mathematical univariate transformations and under
convolution of independent r.v.'s. For "Distribution" objects X and Y expressions like 3*X+sin(exp(-Y/4+3)) have
their natural interpretation as corresponding image distributions.
Note: Arithmetic on
distribution objects is understood as operations on the corresponding
r.v.'s and not
on distribution functions or densities.
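To make this note concrete: for independent standard normal X and Y, the object X + Y is the law of the sum of the two r.v.'s, that is N(0, sd = sqrt(2)), and not a sum of two densities. The sketch below assumes that package "distr" is installed; variable names are illustrative.

```r
# Sketch only: runs the example if package "distr" is installed.
if (requireNamespace("distr", quietly = TRUE)) {
  suppressPackageStartupMessages(library("distr"))

  X <- Norm(mean = 0, sd = 1)
  Y <- Norm(mean = 0, sd = 1)

  S <- X + Y   # law of the sum of independent r.v.'s: N(0, sd = sqrt(2))
  D <- 2 * X   # law of the rescaled r.v.: N(0, sd = 2)

  pS <- p(S)(1)
  pD <- p(D)(1)
  c(pS, pnorm(1, sd = sqrt(2)))   # the two entries should agree
  c(pD, pnorm(1, sd = 2))         # the two entries should agree

  # The expression from the text is a distribution object as well;
  # its slots are filled numerically, so values are approximate.
  Z <- 3 * X + sin(exp(-Y / 4 + 3))
  zsim <- r(Z)(3)
}
```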
You may set global options with distroptions();
see ?distroptions.
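A minimal sketch of reading and setting a global option follows. It assumes that package "distr" is installed and that an option named "DefaultNrSim" is available; the option name is an assumption here, so please check ?distroptions for the options of your installed version.

```r
# Sketch only: runs the example if package "distr" is installed.
if (requireNamespace("distr", quietly = TRUE)) {
  suppressPackageStartupMessages(library("distr"))

  all_opts <- distroptions()   # all current settings

  # "DefaultNrSim" is an assumed option name; see ?distroptions.
  old_val <- getdistrOption("DefaultNrSim")
  distroptions(DefaultNrSim = 1e4)          # change the setting
  new_val <- getdistrOption("DefaultNrSim")
  distroptions(DefaultNrSim = old_val)      # restore the previous value
}
```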
Up to version 1.5, additionally, we also provided classes for a
standardized treatment of simulations (also under contaminations) and
evaluations of statistical procedures on such simulations. These are
now delegated to packages distrSim
and distrTEst (see
below).
Attention:
This package has been reorganized in version 1.6; if you cannot find a
class/method/function previously in the package, also search the new
packages
Further packages built on top of package "distr":
Manual
For versions prior to 1.8, a somewhat more detailed manual to this package in
pdf-format is available here;
from version 1.8 on, we have converted this manual into a common
vignette for the packages distr, distrEx, distrSim, distrTEst, distrMod, and distrTeach,
which is available in the documentation-only package
distrDoc. To use it you may type
require("distrDoc"); V<-vignette("distr"); print(V); edit(V)
License
Download
Windows
- to be installed by
- to be removed by
remove.packages("distr")
- to be used by
library("distr")
or
require("distr")
Linux
Sources
- included into the .tar.gz.file
- as zipped source (for versions < 1.8.0)
- proceed as follows:
- unzip the zip file
- consult the README file in the zip archive and follow the
instructions therein
- (this is the only possibility for versions 1.7.0 and 1.7.1)
Demos
also see demo(package="distr")
--- after installation of "distr"
Version history:
Changes from 1.0 to 1.1 (03-12-04)
Changes from 1.1 to 1.2, 1.3
- changes in the Help-File to pass Rcmd check
Changes from 1.3 to 1.4
Changes from 1.4 to 1.5
- package is now using lazy loading
- minor changes in the help pages
- minor enhancements in plot for distributions (Gamma, discrete distributions)
- package now includes a demo - folder; try demo("distr")
- class Gamma has been renamed Gammad to avoid name collisions
- we have a CITATION file now; consider citation("distr")
- enhanced demos:
- convolution of uniform variables now includes exact expressions
- min / max of two variables now available for discrete distributions
- Rd files now have a keyword entry for distribution and thus may
be found by the search engine
- exact formula for "Unif" o "numeric" where o \in { +, -, *, / }
Changes from 1.5 to 1.6
- Our package is reorganized:
- distr from now on only comprises distribution classes and methods
- simulation classes and methods have been moved to the new
package distrSim
- evaluation classes and methods have been moved to the new
package distrTEst
- a new package distrEx has been added by Matthias Kohl,
providing additional features like distances between distributions,
expectation operators etc.
- a new package RandVar has been added by Matthias Kohl,
providing conceptual treatment of random variables as measurable
mappings
Changes from 1.6 to 1.7
- taking up a suggestion by Martin Mächler, we now issue warnings
as to the interpretation of arithmetics applied to distributions, as
well as to the accuracy of slots p, d, q filled by means of simulations; these warnings
are issued at two places:
- (1) on attaching the package
- (2) at every show/print of a distribution
- (2) can be cancelled by switching off a corresponding global
option in distroptions(); see ?distroptions
- distroptions() / getdistrOption() now behave
exactly like options() / getOption(); also compare the mail
"Re: [Rd] How to implement package-specific options?" by Brian Ripley
on r-devel, Fri 09 Dec 2005 - 11:52:46, see http://tolstoy.newcastle.edu.au/R/devel/05/12/3408.html
- all specific distributions (those realized as [r|d|p|q]<name>,
like rnorm in package stats) now have valid prototypes
- fixed arguments xlim and ylim for plot(signature("AbscontDistribution" or "DiscreteDistribution"));
thus plot(Cauchy(), xlim = c(-4, 4)) gives a reasonable result (and plot(Cauchy()) does not)
- Internationalization: use of gettext, gettextf for output
- explicitly implemented is()
relations: R "knows" that
- an Exponential(lambda) distribution also is a Weibull(shape =
1, scale = 1/lambda) distribution, as well as a Gamma(shape = 1, scale
= 1/lambda) distribution
- a Geometric(p) distribution also is a Negative Binomial(size =
1, p) distribution
- a Uniform(0,1) distribution also is a Beta(1,1) distribution
- a Cauchy(0,1) distribution also is a T(df=1, ncp=0)
distribution
- a Chisq(df=n, ncp=0) distribution also is a Gamma(shape=n/2,
scale=2) distribution
- noncentrality parameter included for Beta, T, F distribution
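The is() relations listed above rest on well-known identities between the distribution families. As a plausibility check that needs only base R (package stats), the following sketch compares the corresponding c.d.f.s numerically; the parameter values and evaluation points are arbitrary choices.

```r
# Numerical check of the identities behind the is() relations,
# using only the p<name> functions from package stats.
same <- function(a, b) isTRUE(all.equal(a, b))

lambda <- 2
x <- c(0.1, 0.5, 1, 3)   # evaluation points on the positive half-line
k <- 0:5                 # evaluation points for the discrete case
u <- c(0.2, 0.5, 0.9)    # evaluation points in (0, 1)
n <- 5                   # degrees of freedom

checks <- c(
  exp_weibull  = same(pexp(x, rate = lambda),
                      pweibull(x, shape = 1, scale = 1 / lambda)),
  exp_gamma    = same(pexp(x, rate = lambda),
                      pgamma(x, shape = 1, scale = 1 / lambda)),
  geom_nbinom  = same(pgeom(k, prob = 0.3),
                      pnbinom(k, size = 1, prob = 0.3)),
  unif_beta    = same(punif(u), pbeta(u, 1, 1)),
  cauchy_t     = same(pcauchy(x), pt(x, df = 1)),
  chisq_gamma  = same(pchisq(x, df = n),
                      pgamma(x, shape = n / 2, scale = 2))
)
print(checks)
```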
Changes from 1.7 to 1.8
- a class "DExp" for Laplace/Double Exponential distributions
- a method dim which for distributions returns the dimension of the support
- show for distributions now acts as print
Changes from 1.8 to 1.9
- in demos, made calls to uniroot(), integrate(), optim(ize)() compliant to https://stat.ethz.ch/pipermail/r-devel/2007-May/045791.html
- new methods shape() and scale() for class "Chisq" with ncp==0
- derivation of a class LatticeDistribution from DiscreteDistribution to be
able to easily apply FFT
- new class 'Lattice' to formalize an affine linearly generated grid of
(support) points pivot + (0:(Length-1)) * width
- usual accessor / replacement functions to handle slots
- new class 'LatticeDistribution' as intermediate class between 'DiscreteDistribution'
and all specific discrete distributions from 'stats' package
- with a particular convolution method using FFT (also for 'convpow')
- usual accessor function 'lattice' for slot 'lattice'
- moved some parts from package 'distrEx' to package 'distr'
- generating function 'DiscreteDistribution'
- univariate methods of 'liesInSupport()'
- classes 'DistrList' and 'UnivariateDistrList'
- generating functions EuclideanSpace() ,Reals(), Naturals()
- cleaning up of source files:
- checked all source files to adhere to the 80-characters-per-line rule
- added S4-method 'convpow' for convolutional powers from the examples
of package 'distr' with methods for
- 'LatticeDistribution' and 'AbscontDistribution'
- and particular methods for
- Norm, Cauchy, Pois, Nbinom, Binom, Dirac,
and ExpOrGammaOrChisq (if summand 'is' of class Gammad)
- new exact arithmetic formulae:
+ 'Cauchy' + 'Cauchy': gives 'Cauchy'
+ 'Weibull' * 'numeric': gives 'Weibull' resp. 'Dirac' resp. 'AbscontDistribution', acc. to 'numeric' >, =, < 0
+ 'Logis' * 'numeric': gives 'Logis' resp. 'Dirac' resp. 'AbscontDistribution', acc. to 'numeric' >, =, < 0
+ 'Logis' + 'numeric': gives 'Logis'
+ 'Lnorm' * 'numeric': gives 'Lnorm' resp. 'Dirac' resp. 'AbscontDistribution', acc. to 'numeric' >, =, < 0
+ 'numeric' / 'Dirac': gives 'Dirac' resp. error, acc. to 'location(Dirac)' !=, == 0
+ 'DiscreteDistribution' * 1 returns the original distribution
+ 'AbscontDistribution' * 1 returns the original distribution
+ 'DiscreteDistribution' + 0 returns the original distribution
+ 'AbscontDistribution' + 0 returns the original distribution
- new file MASKING and corresponding command 'distrMASK()' to
describe the intended maskings
- mentioned in package-help: startup messages may now also be suppressed by
suppressPackageStartupMessages() (from package 'base')
- revised generating functions / initialize methods according to
http://tolstoy.newcastle.edu.au/R/e2/devel/07/01/1976.html;
in particular all Parameter(-sub-)classes gain a valid prototype
- formals for slots p, q, d as in package stats to enhance accuracy
- p(X)(q, lower.tail = TRUE, log.p = FALSE)
- q(X)(p, lower.tail = TRUE, log.p = FALSE)
- d(X)(x, log = FALSE)
- used wherever possible; but for backwards compatibility it is always checked
whether lower.tail / log / log.p are formals
- unified form for automatically generated r, d, p, q-slots:
- using (internal) standardized generators
- .makeDNew, .makePNew, .makeQNew
- .makeD, .makeP, .makeQ
- revised "*", "+" ("Discrete/AbscontDistribution", "numeric") methods (using .makeD, .makeP, .makeQ)
- revised RtoDPQ[.d] (using .makeDNew, .makePNew, .makeQNew)
- revised convolution methods (using .makeDNew, .makePNew, .makeQNew)
- revised convpow() methods (using .makeDNew, .makePNew, .makeQNew)
- cleaning up of environment of r, d, p, q-slot: removed
no longer needed objects
- left-continuous c.d.f. method (p.l) and
right-continuous quantile function (q.r) for DiscreteDistributions
- methods getLow, getUp for lower and upper
endpoint of support of DiscreteDistribution or AbscontDistribution
(truncated to lower/upper TruncQuantile if infinite)
- analytically exact slots d, p (and higher accuracy for q) for
distribution objects generated by functions
abs, exp, log for classes AbscontDistribution and DiscreteDistribution
- new (internally used) classes AffLinAbscontDistribution,
AffLinDiscreteDistribution and AffLinLatticeDistribution to
capture the results of transformations Y <- a * X0 + b
for a, b numeric and X0 Abscont/Discrete/LatticeDistribution,
and a class union AffLinDistribution
of AffLinAbscontDistribution and AffLinLatticeDistribution,
to use this for more exact evaluations of functionals in package 'distrEx'
- Version-management for changed class definitions to
- AbscontDistribution (gains slot gaps)
- subclasses of LatticeDistribution (Geom, Binom, Nbinom, Dirac, Pois, Hyper)
(changed by inheriting from LatticeDistribution, gaining slot lattice!)
realized by
- moved generics isOldVersion(), conv2NewVersion() from 'distrSim' to 'distr'
- moved (slightly generalized version of) isOldVersion()
(now for signature ANY) from 'distrSim' to 'distr'
- new methods for conv2NewVersion for signature
- ANY: fills missing slots with corresponding entries from prototype
- LatticeDistribution: generates a new instance (with slot lattice(!))
by new(class(object), <list of parameters>)
- enhanced plot() methods (see ?"plot-methods")
- for both AbscontDistributions and DiscreteDistributions
- optional width and height argument for the display (default 16in : 9in)
- opens a new window for each plot
- does not work with Sweave; workaround: argument withSweave = TRUE
in .Rnw-file: use width and height argument like in
<<plotex1,eval=TRUE,fig=TRUE, width=8,height=4.5>>=
....
@
- optional main, inner titles and subtitles with main / sub / inner
- preset strings substituted in both expression and character
vectors (x: argument with which plot() was called)
- %A deparsed argument x
- %C class of argument x
- %P comma-separated list of parameter values of slot param of argument x
- %Q comma-separated list of parameter values of slot param of argument x in (), unless this list is empty, then ""
- %N comma-separated <name>=<value> list of parameter values of slot param of argument x
- %D time/date at which plot is/was generated
- title sizes with cex.main / cex.inner / cex.sub
- bottom / top margin with bmar, tmar
- setting of colors with col / col.main / col.inner / col.sub
- can cope with log-arguments
- setting of plot symbols with pch / pch.a / pch.u
- different symbols for unattained [pch.u] / attained [pch.a] one-sided limits
- do.points argument as in plot.stepfun()
- verticals argument as in plot.stepfun()
- setting of colors with col / col.points / col.vert / col.hor
- setting of symbol size with cex / cex.points
- for AbscontDistributions
- (panel "q"): takes care of finite left/right endpoints of support
- (panel "q"): optionally takes care of constancy region (via do.points / verticals)
- ngrid argument to set the number of grid points
- for DiscreteDistributions:
- DEPRECATED:
- class 'GeomParameter': no longer needed,
as this is the parameter of a 'Nbinom' with size 1
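The 1.9 changes above introduce a lattice of support points pivot + (0:(Length-1)) * width and an FFT-based convolution for lattice distributions. The following base-R sketch illustrates the idea for two distributions on lattices of equal width; the helper names lattice_points and conv_lattice_fft are hypothetical and are not the functions used inside package "distr".

```r
# Hypothetical helper (not part of "distr"): support points of a lattice.
lattice_points <- function(pivot, width, Length) {
  pivot + (0:(Length - 1)) * width
}

# Hypothetical helper (not part of "distr"): convolve two probability
# vectors given on lattices of equal width by zero-padding and FFT.
conv_lattice_fft <- function(prob1, prob2) {
  n <- length(prob1) + length(prob2) - 1
  a <- c(prob1, rep(0, n - length(prob1)))   # zero-pad to common length
  b <- c(prob2, rep(0, n - length(prob2)))
  out <- Re(fft(fft(a) * fft(b), inverse = TRUE)) / n
  pmax(out, 0)                               # clip tiny negative round-off
}

# Example: Binom(3, 0.5) + Binom(2, 0.5) should have the law Binom(5, 0.5).
prob1 <- dbinom(0:3, size = 3, prob = 0.5)
prob2 <- dbinom(0:2, size = 2, prob = 0.5)
prob_sum <- conv_lattice_fft(prob1, prob2)

# The pivot of the sum is the sum of the pivots (here 0 + 0); width stays 1.
support_sum <- lattice_points(pivot = 0, width = 1, Length = length(prob_sum))

all.equal(prob_sum, dbinom(0:5, size = 5, prob = 0.5))
```

The zero-padding to length(prob1) + length(prob2) - 1 makes the circular convolution computed by the FFT coincide with the ordinary (linear) convolution.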
Changes from 1.9 to 2.0
- made calls to 'uniroot()', 'integrate()', 'optim(ize)()' compliant to
https://stat.ethz.ch/pipermail/r-devel/2007-May/045791.html
- new generating function 'AbscontDistribution'
- new class 'UnivarMixingDistribution' for mixing distributions with methods / functions:
- 'UnivarMixingDistribution' (generating function)
- flat.mix to make out of it a distribution of class 'UnivarLebDecDistribution'
- new class 'AffLinUnivarLebDecDistribution' for affine linear transformations
of 'UnivarLebDecDistribution' (in particular for use with E())
- new class union 'AcDcLcDistribution' as common mother class
for 'UnivarLebDecDistribution', 'AbscontDistribution',
'DiscreteDistribution'; corresponding methods / functions:
- enhanced arithmetic: (for 'AcDcLcDistribution')
- convolution for 'UnivarLebDecDistribution'
- affine linear trafos for 'UnivarLebDecDistribution'
- 'numeric' / 'AcDcLcDistribution'
- 'numeric' ^ 'AcDcLcDistribution'
- 'AcDcLcDistribution' ^ 'numeric'
- binary operations for independent distributions:
- 'AcDcLcDistribution' * 'AcDcLcDistribution'
- 'AcDcLcDistribution' / 'AcDcLcDistribution'
- 'AcDcLcDistribution' ^ 'AcDcLcDistribution'
- (better) exact transformations for exp() and log()
- Minimum, Maximum, Truncation, Huberization
- convpow for 'UnivarLebDecDistribution'
- new generating function 'AbscontDistribution'
- 'decomposePM' decomposes distributions in positive / negative part
(and in Dirac(0) if discrete)
- 'simplifyD' tries to cast to simpler classes (e.g. if a weight is 0)
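As a sketch of the mixing classes introduced in 2.0, the following builds a mixture of a normal and a Poisson distribution and flattens it with flat.mix. It assumes that package "distr" is installed and that the generating function accepts the arguments mixCoeff and withSimplify as used below; please check ?UnivarMixingDistribution for your installed version.

```r
# Sketch only: runs the example if package "distr" is installed.
if (requireNamespace("distr", quietly = TRUE)) {
  suppressPackageStartupMessages(library("distr"))

  # Mixture: with probability 0.3 draw from N(0, 1), else from Pois(3).
  # Argument names are assumptions to be checked against the help page.
  M <- UnivarMixingDistribution(Norm(), Pois(lambda = 3),
                                mixCoeff = c(0.3, 0.7),
                                withSimplify = FALSE)

  # The c.d.f. of a mixture is the weighted sum of the component c.d.f.s.
  pM <- p(M)(1)
  expected <- 0.3 * pnorm(1) + 0.7 * ppois(1, lambda = 3)
  c(pM, expected)

  # Flatten into a distribution with one absolutely continuous and
  # one discrete part.
  L <- flat.mix(M)
  is(L, "UnivarLebDecDistribution")
}
```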
Changes from 2.0 to 2.0.3
Our plans for the next version:
- overloading binary operators of group Math2 for independent
distributions [done partially as of version 2.0]
- defining a subgroup of Math2 of invertible binary operators
- application of analytic Fourier Transforms or FFT to any univariate
distributions, perhaps also to be controlled by a parameter/option
- use the q-slot applied to runif in simplifyr for continuous
distributions
- further exact formulae for binary arithmetic operations like "*"
[done partially as of version 2.0]
- redo the initialize- and the math-method for discrete
distributions when only slot r is given
- special group generic for invertible operators for the exact
determination of image distributions
- liesInSupport: allow for logical operations for slot img of distributions
Things we invite other people to do
- multivariate distributions
- conditional distributions
- copulas