Measures of Central Tendency
Central tendency refers to the central or typical value around which data tends to cluster. It helps summarize data into a single representative value.
Arithmetic Mean: The sum of all observations divided by the number of observations; the simple average.
Weighted Arithmetic Mean: A mean in which each value is multiplied by an assigned weight; the sum of these products is divided by the sum of the weights.
Median: The middle value when data is arranged in ascending or descending order, so that half the observations lie on each side; with an even number of observations, it is the average of the two middle values.
Mode: The most frequently occurring value in a dataset.
Geometric Mean: The nth root of the product of n observations; defined for positive values and used for averaging growth rates and ratios.
Harmonic Mean: The reciprocal of the arithmetic mean of the reciprocals of the observations; useful for averaging rates such as speed.
Partition Values: Values dividing ordered data into equal parts:
1) Quartiles: Three values (Q1, Q2, Q3) that divide data into four equal parts; Q2 is the median.
2) Deciles: Nine values that divide data into ten equal parts.
3) Percentiles: Ninety-nine values that divide data into 100 equal parts.
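The measures above can be sketched with Python's standard statistics module. The sample data and weights are hypothetical, chosen only so the arithmetic is easy to check by hand. Textbooks use several conventions for locating quartiles; the "inclusive" method used here is one of them, and other conventions can give slightly different values.

```python
import statistics as st

data = [2, 4, 4, 5, 7, 8]        # hypothetical sample
weights = [1, 1, 2, 2, 3, 1]     # hypothetical weights, one per observation

mean = st.mean(data)                                   # 30 / 6 = 5
weighted_mean = (
    sum(w * x for w, x in zip(weights, data)) / sum(weights)
)                                                      # 53 / 10 = 5.3
median = st.median(data)                               # (4 + 5) / 2 = 4.5
mode = st.mode(data)                                   # 4 occurs twice
geometric = st.geometric_mean(data)                    # 6th root of the product
harmonic = st.harmonic_mean(data)                      # 6 / sum of reciprocals

# Partition values: n - 1 cut points split the ordered data into n parts.
quartiles = st.quantiles(data, n=4, method="inclusive")      # [Q1, Q2, Q3]
deciles = st.quantiles(data, n=10, method="inclusive")       # 9 cut points
percentiles = st.quantiles(data, n=100, method="inclusive")  # 99 cut points

print(mean, weighted_mean, median, mode)
print(round(geometric, 3), round(harmonic, 3))
print(quartiles)   # [4.0, 4.5, 6.5]
```

For positive data that are not all equal, the harmonic mean is smaller than the geometric mean, which is smaller than the arithmetic mean; the printed values follow that order.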
Measures of Dispersion
Dispersion quantifies the variability or spread of data. It highlights how much observations differ from the central value.
Range: The difference between the maximum and minimum values.
Quartile Deviation (Semi-Interquartile Range): Half the difference between the upper quartile (Q3) and lower quartile (Q1).
Mean Deviation: The average of absolute deviations of values from the mean or median.
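Standard Deviation: The square root of the variance, where the variance is the average of squared deviations from the mean; widely used because it takes every observation into account and is expressed in the same units as the data.
Coefficient of Variation: Expresses standard deviation as a percentage of the mean; being unit-free, it enables comparison between datasets with different units or scales.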
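A minimal sketch of these dispersion measures, again on a hypothetical sample. It assumes the data is a whole population, so the variance divides by n; for a sample, statistics.variance and statistics.stdev divide by n - 1 instead. Mean deviation is taken about the mean here, though the median could be used.

```python
import statistics as st

data = [2, 4, 4, 5, 7, 8]   # hypothetical sample, treated as a population

data_range = max(data) - min(data)                        # 8 - 2 = 6

q1, q2, q3 = st.quantiles(data, n=4, method="inclusive")  # 4.0, 4.5, 6.5
quartile_deviation = (q3 - q1) / 2                        # 2.5 / 2 = 1.25

mean = st.mean(data)                                      # 5
mean_deviation = sum(abs(x - mean) for x in data) / len(data)  # 10 / 6

variance = st.pvariance(data)                             # 24 / 6 = 4
std_dev = st.pstdev(data)                                 # sqrt(4) = 2.0

coefficient_of_variation = 100 * std_dev / mean           # 40.0 percent

print(data_range, quartile_deviation, round(mean_deviation, 3))
print(variance, std_dev, coefficient_of_variation)
```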
Skewness
Skewness measures the asymmetry of a frequency distribution.
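Meaning: A symmetrical distribution (mean = median = mode) is unskewed. An asymmetrical distribution is either positively skewed (right-skewed, with a longer right tail and typically mean > median > mode) or negatively skewed (left-skewed, with a longer left tail and typically mean < median < mode).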
Difference Between Dispersion and Skewness: Dispersion measures spread without considering shape, while skewness focuses on the shape and direction of data asymmetry.
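Karl Pearson’s Coefficient: The difference between mean and mode divided by the standard deviation: Sk = (Mean - Mode) / Standard Deviation.
Bowley’s Coefficient: Based on quartiles, comparing the distances of Q3 and Q1 from the median (Q2): Sk = (Q3 + Q1 - 2 Q2) / (Q3 - Q1). It lies between -1 and +1.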
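Both coefficients can be sketched in a few lines. The function names and sample data are hypothetical, the standard deviation is the population version, and quartiles use the "inclusive" convention; other conventions may shift the values slightly.

```python
import statistics as st

def pearson_skewness(values):
    """Karl Pearson's coefficient: (mean - mode) / standard deviation."""
    return (st.mean(values) - st.mode(values)) / st.pstdev(values)

def bowley_skewness(values):
    """Bowley's coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = st.quantiles(values, n=4, method="inclusive")
    return (q3 + q1 - 2 * q2) / (q3 - q1)

data = [2, 4, 4, 5, 7, 8]   # hypothetical sample

print(pearson_skewness(data))   # (5 - 4) / 2 = 0.5
print(bowley_skewness(data))    # (6.5 + 4 - 9) / 2.5 = 0.6
```

Both values are positive, which points to a right-skewed sample. Pearson's coefficient relies on a single well-defined mode, so it is less reliable when the data has several modes or none.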
Kurtosis
Kurtosis describes the peakedness and tail weight of a distribution relative to a normal distribution.
Concept of Kurtosis: Focuses on the tails of a distribution, that is, on how much of the data lies far from the mean.
Types:
Leptokurtic: Tall and sharp peak, with heavy tails.
Mesokurtic: Peak and tails similar to those of a normal distribution.
Platykurtic: Flat and broad peak, with light tails.
Importance: Helps identify whether data contains extreme outliers or is concentrated near the mean.
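One common way to put a number on this is the moment-based coefficient: the fourth central moment divided by the square of the second, minus 3, so that a normal distribution scores zero. The sketch below assumes that definition; the function names, the tolerance, and the two samples are hypothetical, and other textbook formulas (including small-sample corrections) exist.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

def classify(values, tol=0.1):
    """Label a dataset; tol is an arbitrary band treated as 'about normal'."""
    k = excess_kurtosis(values)
    if k > tol:
        return "leptokurtic"
    if k < -tol:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]                         # evenly spread, light tails
heavy = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]      # mostly central, two outliers

print(classify(flat))    # platykurtic (excess kurtosis about -1.3)
print(classify(heavy))   # leptokurtic (excess kurtosis 2.0)
```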
Classification and Tabulation of Data
This involves organizing raw data into meaningful categories and tables for clarity:
Classification: Grouping data based on characteristics (e.g., geographical, chronological, qualitative, quantitative).
Tabulation: Presenting data in rows and columns for easy interpretation.
Frequency Distribution, Diagrams, and Graphs
These tools help visualize and summarize data:
Frequency Distribution: A table showing how often (the frequency or count) each value, or each class interval of values, occurs in the data.
Diagrams & Graphs: Pictorial representations, e.g., bar graphs, histograms, pie charts.
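A grouped frequency distribution can be sketched as follows. The marks and the class width of 10 are hypothetical, and each class is assumed to include its lower limit and exclude its upper limit. The row of # symbols is a rough text stand-in for a histogram bar.

```python
from collections import Counter

marks = [12, 25, 37, 41, 18, 29, 33, 45, 22, 38]   # hypothetical raw data
width = 10                                          # assumed class width

# Map each mark to the lower limit of its class, then count per class.
counts = Counter((m // width) * width for m in marks)

print("Class | Frequency")
for lower in sorted(counts):
    upper = lower + width
    bar = "#" * counts[lower]
    print(f"{lower:>2}-{upper:<2} | {counts[lower]} {bar}")
```

The frequencies add up to the number of observations, which is a quick check that no value was dropped or counted twice.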