10.5 Continuous Distributions

 

• Histograms are useful for grouped data, but when the data are continuous we describe them with a probability distribution.

 

• In general

- the total area under the graph is exactly 1

- the tails of the graph often stretch (asymptotically) to infinity

 

• Specifically, some of the distribution properties are,

 

 

• In addition, the centre of the distribution (i.e. the average or mean) can vary

 

• More on distributions later

 

 

 

10.5.1 Describing Distribution Centers With Numbers

 

• The best-known measure is the average (mean): the sum of the values divided by the number of values. This gives the centre of a distribution.

 

 

 

 

 

 

• Another good measure is the median, the middle value of the data once it is sorted

 

- if there is an odd number of samples, it is the middle number

- if there is an even number of samples, it is the average of the two middle numbers

 

 

• If the numbers are grouped, the median is interpolated within the cell that contains the middle observation
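
• A common interpolation formula for the grouped median is sketched here. It is an assumption, since these notes do not show the equation: Md = Lm + ((n/2 - cfm) / fm) × i, where Lm is the lower boundary of the cell containing the median, cfm is the cumulative frequency of all cells below it, fm is the frequency of that cell, and i is the cell width. The function name and the frequency table are hypothetical.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Interpolate the median from equal-width cells.

    lower_boundaries: lower boundary of each cell, in ascending order
    frequencies:      number of observations in each cell
    width:            cell interval (assumed equal for all cells)
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cumulative = 0  # observations in the cells below the current one
    for lower, freq in zip(lower_boundaries, frequencies):
        if freq > 0 and cumulative + freq >= half:
            # The median lies in this cell; interpolate within it.
            return lower + (half - cumulative) * width / freq
        cumulative += freq
    raise ValueError("median cell not found")


# Hypothetical frequency table: cells 10-20, 20-30, 30-40, 40-50
print(grouped_median([10, 20, 30, 40], [4, 10, 5, 1], 10))  # 26.0
```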

 

 

• The mode can be useful for identifying repeated patterns

- the mode is the repeated value that occurs most often. Multiple modes are possible.
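
• As a minimal sketch, the three measures of centre described above can be computed as follows. The function names and the sample data are hypothetical. Following the wording of these notes, a value only counts as a mode if it is repeated.

```python
from collections import Counter


def mean(values):
    """Arithmetic average: sum of the values divided by their count."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """All repeated values that occur most often; empty if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest < 2:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


sample = [3, 5, 5, 6, 7, 7, 9, 10]  # hypothetical measurements
print(mean(sample))    # 6.5
print(median(sample))  # 6.5
print(modes(sample))   # [5, 7]
```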

 

 

 

 

10.5.2 Dispersion As A Measure of Distribution

 

• The range of values covered by a distribution is important

 

- Range is the difference between the highest and lowest numbers.

 

 

• Standard deviation is a classical measure of grouping (spread). It can be calculated for any distribution, but the percentages quoted for it assume a normal distribution.

 

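- the equation is σ = √( Σ(Xi − μ)² / N ) for a population of N values with mean μ, or s = √( Σ(Xi − X̄)² / (n − 1) ) for a sample of n values with mean X̄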
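
• As a minimal sketch, the range and the population and sample forms of the standard deviation can be computed directly from their definitions. The divisor n - 1 for samples is the usual convention and is assumed here. The function names and data are hypothetical.

```python
import math


def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


def population_std(values):
    """sigma = sqrt( sum((x - mu)^2) / N ), for a complete population."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """s = sqrt( sum((x - xbar)^2) / (n - 1) ), for a sample of n values."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(value_range(data))     # 7
print(population_std(data))  # 2.0
print(sample_std(data))      # slightly larger than the population value
```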

 

 

 

• Once the standard deviation is known, we can estimate how the samples are distributed about the mean.

 

 

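• As the range about the mean is widened by whole standard deviations, the percentage of samples included (for a normal distribution) is about 68.27% within ±1σ, 95.45% within ±2σ, and 99.73% within ±3σ.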
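
• The coverage percentages for a normal distribution can be checked numerically. This sketch relies on the identity P(|Z| ≤ k) = erf(k / √2) for a standard normal variable Z. The function name is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a normal population lying within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"+/-{k} standard deviations: {100 * fraction_within(k):.2f}%")
# +/-1 standard deviations: 68.27%
# +/-2 standard deviations: 95.45%
# +/-3 standard deviations: 99.73%
```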

 

 

• Other formulas for standard deviation are,
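
• One widely used alternative is the computational ("shortcut") form, s = √( (n ΣX² − (ΣX)²) / (n(n − 1)) ), which needs only the running sums of X and X². Whether this is the form these notes intended is an assumption. The sketch below computes it; the function name and data are hypothetical.

```python
import math


def sample_std_shortcut(values):
    """Sample standard deviation from the sums of x and x^2 only."""
    n = len(values)
    total = sum(values)                   # sum of X
    total_sq = sum(x * x for x in values)  # sum of X^2
    return math.sqrt((n * total_sq - total ** 2) / (n * (n - 1)))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(sample_std_shortcut(data))  # equals sqrt(32 / 7) for this data
```

- this form is algebraically identical to the definition, but it can lose precision when the values are large compared with their spread.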

 

 

 

 

 

10.5.3 The Shape of the Distribution

 

• Skewed distributions are not symmetric about their centre; one tail is longer than the other

 

 

• This lack of symmetry tends to indicate a bias in the data (and hence in the real world)

 

• A skew factor can be calculated
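
• One common definition of the skew factor is a3 = ( Σ(Xi − X̄)³ / n ) / s³, with s the sample standard deviation. The notes do not show their equation, so this form, and the name a3 (by analogy with the a4 used for kurtosis), are assumptions. A minimal sketch with hypothetical data:

```python
import math


def skew_factor(values):
    """a3: average cubed deviation from the mean, divided by s^3."""
    n = len(values)
    xbar = sum(values) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))
    third_moment = sum((x - xbar) ** 3 for x in values) / n
    return third_moment / s ** 3


print(skew_factor([1, 2, 3, 4, 5]))        # 0.0 (symmetric data)
print(skew_factor([1, 1, 1, 2, 10]) > 0)   # True (long tail to the right)
print(skew_factor([10, 10, 10, 9, 1]) < 0)  # True (long tail to the left)
```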

 

 

10.5.4 Kurtosis

 

• Kurtosis describes how peaked the data are

 

 

• The kurtosis value a4 is best used for comparison with other values, i.e. you can watch the trends in the values of a4.
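
• A common definition of the kurtosis factor is a4 = ( Σ(Xi − X̄)⁴ / n ) / s⁴, with s the sample standard deviation. The notes do not show their equation, so this form is an assumption. The sketch below compares two hypothetical data sets, since a4 is mainly useful for comparison.

```python
import math


def kurtosis_factor(values):
    """a4: average fourth-power deviation from the mean, divided by s^4."""
    n = len(values)
    xbar = sum(values) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))
    fourth_moment = sum((x - xbar) ** 4 for x in values) / n
    return fourth_moment / s ** 4


flat = [1, 2, 3, 4, 5]                      # evenly spread values
peaked = [3, 3, 3, 3, 3, 3, 3, 3, 0, 6]     # most values at the centre
print(kurtosis_factor(flat))
print(kurtosis_factor(peaked))
print(kurtosis_factor(peaked) > kurtosis_factor(flat))  # True
```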

 

 

10.5.5 Generalizing From a Few to Many

 

 

 

10.5.6 The Normal Curve

 

• The normal curve (also called the Gaussian curve) tends to represent the distributions of things found in nature

 

• This distribution can be fitted for populations (μ, σ), or for samples (X̄, s)

 

 

• The area under the curve is 1, and therefore will enclose 100% of the population.

 

• The parameters change the shape of the distribution: the mean shifts the centre, and the standard deviation sets the spread
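
• To illustrate the effect of the parameters, this sketch evaluates the normal density f(x) = exp(−(x − μ)² / (2σ²)) / (σ√(2π)) at the centre for several spreads. The function name and parameter values are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Same centre, different spreads: print the peak height for each sigma
for sigma in (0.5, 1.0, 2.0):
    print(sigma, round(normal_pdf(0.0, 0.0, sigma), 4))
```

- a smaller σ gives a taller, narrower peak, because the total area must remain 1.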

 

 

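• The area under the curve between two values is the probability of a result falling between them; the area to the left of a value is the cumulative probability up to that value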
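
• As a minimal sketch, areas under the normal curve can be computed from the cumulative distribution function, Φ(z) = ½(1 + erf(z / √2)). The function names and the parameter values (μ = 50, σ = 2) are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def probability_between(a, b, mu, sigma):
    """Area under the normal curve between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


mu, sigma = 50.0, 2.0  # hypothetical process mean and standard deviation
print(round(normal_cdf(50.0, mu, sigma), 4))                 # 0.5
print(round(probability_between(48.0, 52.0, mu, sigma), 4))  # 0.6827
```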

 

 

• When applied to quality, ±3σ is used to define a typical “process variability” for the product. These bounds are also known as the upper and lower natural limits (UNL and LNL).
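
• A minimal sketch of the natural limits, assuming the process mean and standard deviation are estimated from sample data as X̄ and s, so that LNL = X̄ − 3s and UNL = X̄ + 3s. The function name and measurements are hypothetical.

```python
import statistics


def natural_limits(values):
    """Return (LNL, UNL), estimated as the sample mean minus/plus 3 sample standard deviations."""
    xbar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return xbar - 3 * s, xbar + 3 * s


measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]  # hypothetical
lnl, unl = natural_limits(measurements)
print(f"LNL = {lnl:.3f}, UNL = {unl:.3f}")
```

- for a normally distributed process, about 99.73% of the product falls between these limits.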

 


 

 

10.5.7 Probability plots

 

• Procedure