| R2010b Documentation → Statistics Toolbox |
For many distributions supported by Statistics Toolbox software, objects are available for statistical analysis. This section gives a general overview of the uses of distribution objects, including sample workflows. For information on objects available for specific distributions, see Object-Supported Distributions.
Probability distribution objects allow you to easily fit, access, and store distribution information for a given data set. The following operations are easier to perform using distribution objects:
Grouping a single dataset in a number of different ways using group names, and then fitting a distribution to each group. For an example of how to fit distributions to grouped data, see Example: Fitting Distributions to Grouped Data Within a Single Dataset.
Fitting different distributions to the same set of data. For an example of how objects make fitting multiple distribution types easier, see Example: Fitting Multiple Distribution Types to a Single Dataset.
Sharing fitted distributions across workspaces. For an example of sharing information using probability distribution objects, see Example: Saving and Sharing Distribution Fit Data.
If you know the type of distribution you would like to use, objects provide a simpler interface than the distribution functions and a more efficient workflow than the dfittool GUI.
If you are a novice statistician who would like to explore how various distributions look without having to manipulate data, see Working with Distributions Through GUIs.
If you have no data to fit, but want to calculate a pdf, cdf, and so on for various parameters, see Statistics Toolbox Distribution Functions.
Objects are, in short, a convenient way of storing data. They allow you to set rules for the types of data to store, while maintaining some flexibility for the actual values of the data. For example, in statistics, groups of distributions have some general things in common:
All distributions have a name (for example, Normal).
Parametric distributions have parameters.
Nonparametric distributions have kernel-smoothing functions.
Objects store all this information within properties. Classes of related objects (for example, all univariate parametric distributions) have the same properties with values and types relevant to a specified distribution. In addition to storing information within objects, you can perform certain actions (called methods) on objects.
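As a minimal sketch of the difference between properties and methods, the following code creates a distribution object, reads a few of its properties with dot notation, and then calls two methods on it. It assumes the R2010b-era ProbDistUnivParam class described on this page; the variable names are illustrative only.

```matlab
% Create a normal distribution object with known parameters
% (mean 100, standard deviation 10).
pd = ProbDistUnivParam('normal',[100 10]);

% Properties: read stored information with dot notation.
name   = pd.DistName;    % name of the distribution
pnames = pd.ParamNames;  % names of the parameters
pvals  = pd.Params;      % values of the parameters, here [100 10]

% Methods: perform actions on the object.
x = [90 100 110];        % illustrative evaluation points
density = pdf(pd,x);     % probability density at each point in x
prob    = cdf(pd,x);     % cumulative probability at each point in x
```

Properties hold what the object knows about the distribution, while methods compute something from that stored information.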
Subclasses (for example, ProbDistParametric is a subclass of ProbDist) contain the same properties and methods as the original class, in addition to other properties and methods relevant to that subclass. This concept is called inheritance. For example, parametric distributions, which are a subset (subclass) of probability distributions, have input data and a distribution name because all probability distributions do. The following diagram illustrates this point:

The left side of this diagram shows the inheritance line from all probability distributions down to univariate parametric probability distributions. The right side shows the lineage down to univariate kernel distributions. Here is how to interpret univariate parametric distribution lineage:
ProbDist is a class of objects that includes all probability distributions. All probability distribution objects have at least these properties:
DistName — the name of the distribution (for example Normal or Weibull)
InputData — the data fit to the distribution
In addition, you can perform actions on these objects using the cdf, pdf, and random methods.
ProbDistParametric is a class of objects that includes all parametric probability distributions. All parametric probability distribution objects have the properties and methods of a ProbDist object, in addition to at least the following properties:
NLogL — Negative log likelihood for input data
NumParams — Number of parameters for that distribution
ParamCov — Covariance matrix of parameter estimates
ParamDescription — Descriptions of parameters
ParamNames — Names of parameters
Params — Values of parameters
No additional unique methods apply to ProbDistParametric objects.
ProbDistUnivParam is a class of objects that includes only univariate parametric probability distributions. In addition to the properties and methods of ProbDist and ProbDistParametric objects, these objects also have at least the following methods:
icdf — Return the inverse cumulative distribution function for a specified distribution based on a given set of data.
iqr — Return the interquartile range for a specified distribution based on a given set of data.
mean — Return the mean for a specified distribution based on a given set of data.
median — Return the median for a specified distribution based on a given set of data.
paramci — Return the parameter confidence intervals for a specified distribution based on a given set of data.
std — Return the standard deviation for a specified distribution based on a given set of data.
var — Return the variance for a specified distribution based on a given set of data.
No additional unique properties apply to ProbDistUnivParam objects.
The univariate nonparametric lineage reads in a similar manner, with different properties and methods. For more information on nonparametric objects and their methods and properties, see ProbDistKernel and ProbDistUnivKernel.
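The parametric lineage described above can be checked directly on an object. This is a sketch only, assuming the R2010b-era class names ProbDist, ProbDistParametric, and ProbDistUnivParam; it uses isa to test class membership and then calls several of the methods that ProbDistUnivParam adds.

```matlab
% A univariate parametric distribution with known parameters.
pd = ProbDistUnivParam('normal',[100 10]);

% Inheritance: the object belongs to its own class and to each superclass.
isUnivParam  = isa(pd,'ProbDistUnivParam');
isParametric = isa(pd,'ProbDistParametric');
isProbDist   = isa(pd,'ProbDist');

% A property introduced at the ProbDistParametric level.
nParams = pd.NumParams;

% Methods introduced at the ProbDistUnivParam level.
m   = mean(pd);              % mean of the distribution
s   = std(pd);               % standard deviation
v   = var(pd);               % variance
med = median(pd);            % median
r   = iqr(pd);               % interquartile range
q   = icdf(pd,[0.25 0.75]);  % first and third quartiles
```

The interquartile range r should equal the difference between the two quartiles in q, which gives a quick consistency check on the methods.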
For more detailed information on object-oriented programming in MATLAB, see Object-Oriented Programming.
There are two ways to create distribution objects:
Use the fitdist function. See Creating Distribution Objects Using fitdist.
Use the object constructor. See Creating Distribution Objects Using Constructors.
Using the fitdist function is the simplest way of creating distribution objects. Like the *fit functions, fitdist fits your data to a specified distribution and returns relevant distribution information. fitdist creates an object relevant to the type of distribution you specify: if you specify a parametric distribution, it returns a ProbDistUnivParam object; if you specify 'kernel', it returns a ProbDistUnivKernel object. For examples of how to use fitdist to fit your data, see Performing Calculations Using Distribution Objects.
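To illustrate how the returned object depends on the distribution name, the following sketch fits the same data twice and inspects the class of each result. It assumes the carsmall sample data that ships with the toolbox; removing missing values first is a precaution added here, not a documented requirement.

```matlab
% Load sample data and drop missing values (a precaution for this sketch).
load carsmall
x = MPG(~isnan(MPG));

% A parametric name and the 'kernel' name give different kinds of object.
pdParam  = fitdist(x,'normal');
pdKernel = fitdist(x,'kernel');

classParam  = class(pdParam);    % class of the parametric fit
classKernel = class(pdKernel);   % class of the nonparametric fit

% The parametric fit carries the estimated parameters and their
% confidence intervals.
estimates = pdParam.Params;
ci = paramci(pdParam);
```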
If you know the distribution you would like to use and would like to create a univariate parametric distribution with known parameters, you can use the ProbDistUnivParam constructor. For example, create a normal distribution with mean 100 and standard deviation 10:
pd = ProbDistUnivParam('normal',[100 10])
For nonparametric distributions, you must have a dataset. Using fitdist is a simpler way to fit nonparametric data, but you can use the ProbDistUnivKernel constructor as well. For example, create a nonparametric distribution of the MPG data from carsmall.mat:
load carsmall
pd = ProbDistUnivKernel(MPG)
Object-oriented programming in the Statistics Toolbox supports the following distributions.
Use the following distributions to create ProbDistUnivParam objects using fitdist. For more information on the cumulative distribution function (cdf) and probability density function (pdf) methods, as well as other available methods, see the ProbDistUnivParam class reference page.
| Supported Distribution | Input to fitdist |
|---|---|
| Beta Distribution | 'beta' |
| Binomial Distribution | 'binomial' |
| Birnbaum-Saunders Distribution | 'birnbaumsaunders' |
| Exponential Distribution | 'exponential' |
| Extreme Value Distribution | 'extreme value' or 'ev' |
| Gamma Distribution | 'gamma' |
| Generalized Extreme Value Distribution | 'generalized extreme value' or 'gev' |
| Generalized Pareto Distribution | 'generalized pareto' or 'gp' |
| Inverse Gaussian Distribution | 'inversegaussian' |
| Logistic Distribution | 'logistic' |
| Loglogistic Distribution | 'loglogistic' |
| Lognormal Distribution | 'lognormal' |
| Nakagami Distribution | 'nakagami' |
| Negative Binomial Distribution | 'negative binomial' or 'nbin' |
| Normal Distribution | 'normal' |
| Poisson Distribution | 'poisson' |
| Rayleigh Distribution | 'rayleigh' |
| Rician Distribution | 'rician' |
| t Location-Scale Distribution | 'tlocationscale' |
| Weibull Distribution | 'weibull' or 'wbl' |
Use the following distributions to create ProbDistUnivKernel objects. For more information on the cumulative distribution function (cdf) and probability density function (pdf) methods, as well as other available methods, see the ProbDistUnivKernel class reference page.
| Supported Distribution | Input to fitdist |
|---|---|
| Nonparametric Distributions | 'kernel' |
Distribution objects make it easier for you to perform calculations on complex datasets. The following sample workflows show some of the functionality of these objects.
Example: Fitting Distributions to Grouped Data Within a Single Dataset
Example: Fitting Multiple Distribution Types to a Single Dataset
Fit a single Normal distribution to a dataset using fitdist:
load carsmall
NormDist = fitdist(MPG,'normal')
NormDist =
normal distribution
mu = 23.7181
sigma = 8.03573
The output MATLAB returns is a ProbDistUnivParam object with a DistName property of 'normal distribution'. The ParamNames property contains the strings mu and sigma, while the Params property contains the parameter values.
Often, datasets are collections of data you can group in different ways. Using fitdist and the data from carsmall.mat, group the MPG data by country of origin, then fit a Weibull distribution to each group:
load carsmall
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin)
Warning: Error while fitting group 'Italy':
Not enough data in X to fit this distribution.
> In fitdist at 171
WeiByOrig =
Columns 1 through 4
[1x1 ProbDistUnivParam] [1x1 ProbDistUnivParam] ...
[1x1 ProbDistUnivParam] [1x1 ProbDistUnivParam]
Columns 5 through 6
[1x1 ProbDistUnivParam] []
Country =
'USA'
'France'
'Japan'
'Germany'
'Sweden'
'Italy'
A warning appears informing you that, because the data includes only one Italian car, fitdist cannot fit a Weibull distribution to that group. Each of the five other groups now has a distribution object associated with it, stored in the cell array WeiByOrig. Each object contains properties that hold information about the data, the distribution, and the parameters. For more information on what properties exist and what information they contain, see ProbDistUnivParam or ProbDistUnivKernel.
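Because a failed group leaves an empty cell in the output, it can help to filter out empty entries before working with the fits. The following is a sketch of one way to do that; the variable names hasFit, fittedCountries, and fittedDists are hypothetical.

```matlab
% Fit a Weibull distribution to each country's MPG data.
load carsmall
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin);

% Flag the groups for which fitdist returned a distribution object.
hasFit = ~cellfun(@isempty,WeiByOrig);

% Keep only the successful fits and their group names.
fittedCountries = Country(hasFit);
fittedDists     = WeiByOrig(hasFit);

% Print the estimated Weibull parameters for each fitted group.
for k = 1:numel(fittedDists)
    fprintf('%-8s',fittedCountries{k});
    fprintf(' %8.3f',fittedDists{k}.Params);
    fprintf('\n');
end
```

Filtering first means later code can loop over fittedDists without checking each cell for emptiness.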
Now access two of the objects and their properties:
% Get USA fit
distusa = WeiByOrig{1};
% Use the InputData property of ProbDistUnivParam objects to see
% the actual data used to fit the distribution:
dusa = distusa.InputData.data;
% Get Japan fit and data
distjapan = WeiByOrig{3};
djapan = distjapan.InputData.data;
Now you can easily compare PDFs using the pdf method of the ProbDistUnivParam class:
time = linspace(0,45);
pdfjapan = pdf(distjapan,time);
pdfusa = pdf(distusa,time);
hold on
plot(time,[pdfjapan;pdfusa])
l = legend('Japan','USA')
set(l,'Location','Best')
xlabel('MPG')
ylabel('Probability Density')
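Beyond plotting densities, the cdf and median methods can put numbers on the same comparison. The sketch below selects the fits by group name rather than by position and estimates the probability that MPG exceeds a threshold under each fitted model. The threshold of 30 is an arbitrary value chosen for illustration.

```matlab
% Fit a Weibull distribution to each country's MPG data.
load carsmall
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin);

% Select fits by group name instead of relying on cell positions.
distusa   = WeiByOrig{strcmp(Country,'USA')};
distjapan = WeiByOrig{strcmp(Country,'Japan')};

% Probability of exceeding a threshold under each fitted distribution.
threshold = 30;                        % illustrative value only
pUSA   = 1 - cdf(distusa,threshold);
pJapan = 1 - cdf(distjapan,threshold);

% Medians of the fitted distributions.
medUSA   = median(distusa);
medJapan = median(distjapan);
```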

You could then further group the data and compare, for example, MPG by year for American cars:
load carsmall
[WeiByYearOrig, Names] = fitdist(MPG,'weibull','by',...
{Origin Model_Year});
USA70 = WeiByYearOrig{1};
USA76 = WeiByYearOrig{2};
USA82 = WeiByYearOrig{3};
time = linspace(0,45);
pdf70 = pdf(USA70,time);
pdf76 = pdf(USA76,time);
pdf82 = pdf(USA82,time);
figure
plot(time,[pdf70;pdf76;pdf82])
l = legend('1970','1976','1982')
set(l,'Location','Best')
title('USA Car MPG by Year')
xlabel('MPG')
ylabel('Probability Density')

Distribution objects make it easy to fit multiple distributions to the same dataset, while minimizing workspace clutter. For example, use fitdist to group the MPG data by country of origin, then fit Weibull, Normal, Logistic, and nonparametric distributions for each group:
load carsmall;
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin);
[NormByOrig, Country] = fitdist(MPG,'normal','by',Origin);
[LogByOrig, Country] = fitdist(MPG,'logistic','by',Origin);
[KerByOrig, Country] = fitdist(MPG,'kernel','by',Origin);
Extract the fits for American cars and compare the fits visually against a histogram of the original data:
WeiUSA = WeiByOrig{1};
NormUSA = NormByOrig{1};
LogUSA = LogByOrig{1};
KerUSA = KerByOrig{1};
% Since all four distributions use the same set of data,
% you can extract the data from any of them:
data = WeiUSA.InputData.data;
% Create a histogram of the data:
[n,y] = hist(data,10);
b = bar(y,n,'hist');
set(b,'FaceColor',[1,0.8,0])
% Scale the density by the histogram area, for easier display:
area = sum(n) * (y(2)-y(1));
time = linspace(0,45);
pdfWei = pdf(WeiUSA,time);
pdfNorm = pdf(NormUSA,time);
pdfLog = pdf(LogUSA,time);
pdfKer = pdf(KerUSA,time);
allpdf = [pdfWei;pdfNorm;pdfLog;pdfKer];
hold on
plot(time,area * allpdf)
hold off
l = legend('Data','Weibull','Normal','Logistic','Kernel')
set(l,'Location','Best')
title('USA Car')
xlabel('MPG')

You can see that only the nonparametric kernel distribution, KerUSA, comes close to revealing the two modes in the data.
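The visual comparison can be complemented numerically for the parametric fits by reading the NLogL property, where a smaller negative log likelihood indicates a closer fit to the data. This is a sketch under the assumption that comparing NLogL directly is reasonable here because all three candidate distributions have two parameters; the variable names are hypothetical.

```matlab
% Extract the MPG values for American cars, dropping missing values.
load carsmall
x = MPG(strcmp(cellstr(Origin),'USA'));
x = x(~isnan(x));

% Fit each parametric candidate and record its negative log likelihood.
names = {'weibull','normal','logistic'};
nll = zeros(size(names));
for k = 1:numel(names)
    pd = fitdist(x,names{k});
    nll(k) = pd.NLogL;
end

% The candidate with the smallest negative log likelihood.
[~,best] = min(nll);
bestName = names{best};
```

The kernel fit is left out of this comparison because it has no parameters to estimate in the same sense.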
Distribution objects allow you to share both your dataset and your analysis results simply by saving the information to a .mat file.
Using the premise from the previous set of examples, group the MPG data in carsmall.mat by country of origin and fit four different distributions to each of the six sets of data:
load carsmall;
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin);
[NormByOrig, Country] = fitdist(MPG,'normal','by',Origin);
[LogByOrig, Country] = fitdist(MPG,'logistic','by',Origin);
[KerByOrig, Country] = fitdist(MPG,'kernel','by',Origin);
Combine all four fits and the country labels into a single cell array, including "headers" to indicate which distributions correspond to which objects. Then, save the array to a .mat file:
AllFits = [{'Country'} Country'; {'Weibull'} WeiByOrig;...
{'Normal'} NormByOrig; {'Logistic'} LogByOrig;...
{'Kernel'} KerByOrig];
save('CarSmallFits.mat','AllFits');
To show that the data is both safely saved and easily restored, clear your workspace of relevant variables. This command clears only those variables associated with this example:
clear('Weight','Acceleration','AllFits','Country',...
'Cylinders','Displacement','Horsepower','KerByOrig',...
'LogByOrig','MPG','Model','Model_Year','NormByOrig',...
'Origin','WeiByOrig')
Now, load the data:
load CarSmallFits AllFits
You can now access the distribution objects as in the previous examples.
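As a sketch of how a restored fit might be retrieved, the following code saves a small labeled cell array, reloads it, and looks up one object by its row and column headers. To stay self-contained it builds a reduced array with only the Weibull fits and writes to a hypothetical file name, CarSmallFitsDemo.mat, so that it does not overwrite the file from the example.

```matlab
% Build and save a small labeled array of fits.
load carsmall
[WeiByOrig, Country] = fitdist(MPG,'weibull','by',Origin);
DemoFits = [{'Country'} Country'; {'Weibull'} WeiByOrig];
save('CarSmallFitsDemo.mat','DemoFits');   % hypothetical file name
clear DemoFits

% Restore the array into a structure and look up a fit by its headers.
S = load('CarSmallFitsDemo.mat','DemoFits');
row = strcmp(S.DemoFits(:,1),'Weibull');               % row header match
col = [false, strcmp(S.DemoFits(1,2:end),'USA')];      % column header match
weiUSA = S.DemoFits{row,col};

% The restored object behaves like the original.
restoredParams = weiUSA.Params;
restoredMedian = median(weiUSA);

% Remove the demonstration file.
delete('CarSmallFitsDemo.mat');
```

Looking up entries by header rather than by position keeps the code valid if the order of the groups changes.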
© 1984-2010 The MathWorks, Inc.