+ learner first aid

First aid: read the overview, copy one worked example by hand, then try explaining the key rule without looking.

+ Math syllabus context

The current Mathematics path is the active Basic Mathematics syllabus. The 2023 Mathematics syllabus is a transition path expected to take effect from January 2027; this wiki will update the lead path in late 2026.

Statistics

Core Concepts

Statistics involves the collection, organization, presentation, analysis, and interpretation of numerical data. At this level, the focus is on summarizing grouped data both graphically and numerically.

1. Data Presentation Charts

  • Pictograms: Represent data using proportional images or symbols. A key must be provided to indicate what quantity each symbol represents.
  • Bar Charts: Use rectangular bars with lengths proportional to the values they represent. The bars are separated by equal gaps, making them ideal for discrete or categorical data.
  • Line Graphs: Display data points connected by straight line segments, typically used to illustrate trends over time.
  • Pie Charts: Circular charts divided into sectors. The angle of each sector is proportional to the frequency of the category it represents. The sector angle is calculated as:
    $$ \text{Angle} = \left( \frac{\text{Frequency}}{\text{Total Frequency}} \right) \times 360^\circ $$

2. Grouped Frequency Distributions

When analyzing large sets of continuous data, we group values into class intervals (e.g., $40\text{--}49$, $50\text{--}59$).

  • Class Limits: The smallest and largest values that can belong to a class as it is written (e.g., for $40\text{--}49$, 40 is the lower limit and 49 is the upper limit).
  • Class Boundaries: The true boundaries that close the gaps between consecutive intervals. For integers, subtract $0.5$ from the lower limit and add $0.5$ to the upper limit (e.g., $39.5\text{--}49.5$).
  • Class Mark (Mid-point, $x$): The center of a class interval, found by averaging the lower and upper limits.
  • Class Size ($c$): The difference between the upper and lower class boundaries.
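
The four definitions above can be checked with a short calculation. The following is a minimal Python sketch, not part of the syllabus; the function name `class_summary` and the `gap` parameter are illustrative assumptions, with `gap = 1` matching whole-number data such as $40\text{--}49$, $50\text{--}59$.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary, class mark, class size).

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (1 for whole-number data; an assumption here).
    """
    half_gap = gap / 2
    lower_boundary = lower_limit - half_gap
    upper_boundary = upper_limit + half_gap
    class_mark = (lower_limit + upper_limit) / 2
    class_size = upper_boundary - lower_boundary
    return lower_boundary, upper_boundary, class_mark, class_size


print(class_summary(40, 49))  # (39.5, 49.5, 44.5, 10.0)
```

Note that the class size is 10, not 9: it is measured between boundaries, not between limits.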

3. Graphical Presentation of Grouped Data

  • Histograms: A type of bar chart for continuous grouped data where rectangles are drawn without gaps. The horizontal axis represents class boundaries, and the vertical axis represents frequency (when all classes have the same size, the height and the area of each rectangle are both proportional to frequency). Histograms are used to visually estimate the Mode.
  • Frequency Polygons: A line graph formed by connecting the mid-points of the tops of the histogram rectangles.
  • Cumulative Frequency Curves (Ogive): A smooth curve plotted by graphing cumulative frequencies against upper class boundaries. It is used to visually estimate the Median, quartiles, and percentiles.
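
As a rough illustration of which points go on an ogive, the sketch below pairs each upper class boundary with its cumulative frequency. The function name `ogive_points` and the three-class table are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def ogive_points(first_lower_boundary, upper_boundaries, frequencies):
    """Return the (boundary, cumulative frequency) points to plot for an ogive."""
    cumulative = list(accumulate(frequencies))
    # The curve starts at the lower boundary of the first class with CF = 0.
    return [(first_lower_boundary, 0)] + list(zip(upper_boundaries, cumulative))


# Hypothetical table: classes 10-19, 20-29, 30-39 with frequencies 3, 5, 2.
points = ogive_points(9.5, [19.5, 29.5, 39.5], [3, 5, 2])
print(points)  # [(9.5, 0), (19.5, 3), (29.5, 8), (39.5, 10)]
```

The last cumulative frequency always equals the total frequency, which is a quick check before plotting.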

4. Measures of Central Tendency for Grouped Data

  • Mean ($\bar{x}$): The average value. It can be computed directly or using an assumed mean ($A$):
    $$ \bar{x} = \frac{\sum fx}{\sum f} \quad \text{or} \quad \bar{x} = A + \frac{\sum fd}{\sum f} $$
    where $x$ is the class mark, $f$ is the frequency, and $d = x - A$ is the deviation from the assumed mean.

  • Median: The middle value. It is found in the median class (where the cumulative frequency reaches $\frac{N}{2}$) using the formula:
    $$ \text{Median} = L + \left( \frac{\frac{N}{2} - n_b}{n_w} \right) c $$
    where $L$ is the lower boundary of the median class, $N = \sum f$, $n_b$ is the cumulative frequency before the median class, $n_w$ is the frequency of the median class, and $c$ is the class size.

  • Mode: The most frequent value, located in the modal class (the class with the highest frequency):
    $$ \text{Mode} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c $$
    where $L$ is the lower boundary of the modal class, $\Delta_1$ is the difference in frequency between the modal class and the preceding class, $\Delta_2$ is the difference in frequency between the modal class and the succeeding class, and $c$ is the class size.
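
The two mean formulas always give the same answer. Since $d = x - A$, every class mark can be written as $x = A + d$, and substituting this into the direct formula gives:

$$ \bar{x} = \frac{\sum fx}{\sum f} = \frac{\sum f(A + d)}{\sum f} = \frac{A \sum f + \sum fd}{\sum f} = A + \frac{\sum fd}{\sum f} $$

The choice of $A$ therefore affects only how easy the arithmetic is, not the result.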

Worked Examples

Example 1: Analyzing Grouped Data

The following frequency distribution table shows the scores of 30 students in a Mathematics test:

| Class Interval | Frequency |
|----------------|-----------|
| $40\text{--}49$ | 2 |
| $50\text{--}59$ | 4 |
| $60\text{--}69$ | 7 |
| $70\text{--}79$ | 9 |
| $80\text{--}89$ | 5 |
| $90\text{--}99$ | 3 |

Calculate: (a) The mean score using an assumed mean of $74.5$. (b) The median score. (c) The mode score.

Solution: First, expand the table to include class boundaries, class marks ($x$), deviations ($d = x - A$), $fd$, and cumulative frequencies ($CF$). The assumed mean is $A = 74.5$.

| Class Interval | Boundaries | Class Mark ($x$) | Freq ($f$) | Deviation ($d = x - 74.5$) | $fd$ | Cumul. Freq ($CF$) |
|---|---|---|---|---|---|---|
| $40\text{--}49$ | $39.5\text{--}49.5$ | 44.5 | 2 | -30 | -60 | 2 |
| $50\text{--}59$ | $49.5\text{--}59.5$ | 54.5 | 4 | -20 | -80 | 6 |
| $60\text{--}69$ | $59.5\text{--}69.5$ | 64.5 | 7 | -10 | -70 | 13 |
| $70\text{--}79$ | $69.5\text{--}79.5$ | 74.5 | 9 | 0 | 0 | 22 |
| $80\text{--}89$ | $79.5\text{--}89.5$ | 84.5 | 5 | 10 | 50 | 27 |
| $90\text{--}99$ | $89.5\text{--}99.5$ | 94.5 | 3 | 20 | 60 | 30 |
| Total | | | $\sum f = 30$ | | $\sum fd = -100$ | |

(a) Mean Score

Using the assumed mean formula:

$$ \bar{x} = A + \frac{\sum fd}{\sum f} $$

$$ \bar{x} = 74.5 + \frac{-100}{30} = 74.5 - 3.333\ldots \approx 71.17 $$

The mean score is $71.17$.
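
As a cross-check on part (a), the sketch below recomputes the mean from the class marks and frequencies in the table, once directly and once with the assumed mean. The variable names are illustrative only.

```python
# Class marks and frequencies taken from the Example 1 table.
class_marks = [44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [2, 4, 7, 9, 5, 3]
assumed_mean = 74.5

total_f = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
sum_fd = sum(f * (x - assumed_mean) for f, x in zip(frequencies, class_marks))

direct_mean = sum_fx / total_f
assumed_mean_result = assumed_mean + sum_fd / total_f

print(sum_fd)                         # -100.0
print(round(direct_mean, 2))          # 71.17
print(round(assumed_mean_result, 2))  # 71.17
```

Both routes agree, which is a useful way to catch an arithmetic slip in the $fd$ column.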

(b) Median Score

Position of the median = $\frac{N}{2} = \frac{30}{2} = 15^\text{th}$ position. Looking at the cumulative frequencies, the $15^\text{th}$ value falls in the class $70\text{--}79$. Median class = $70\text{--}79$.

  • $L = 69.5$ (Lower boundary of the median class)
  • $N = 30$
  • $n_b = 13$ (Cumulative frequency before median class)
  • $n_w = 9$ (Frequency of the median class)
  • $c = 10$ (Class size: $79.5 - 69.5$)

$$ \text{Median} = L + \left( \frac{\frac{N}{2} - n_b}{n_w} \right) c $$

$$ \text{Median} = 69.5 + \left( \frac{15 - 13}{9} \right) 10 $$

$$ \text{Median} = 69.5 + \left( \frac{2}{9} \right) 10 = 69.5 + 2.222\ldots \approx 71.72 $$

The median score is $71.72$.
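
The median calculation in part (b) can be sketched as a small function that finds the median class and then interpolates within it. The name `grouped_median` is hypothetical, and the sketch assumes all classes have the same size.

```python
def grouped_median(lower_boundaries, frequencies, class_size):
    """Estimate the median of grouped data by interpolating in the median class."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half_n = total / 2
    cf_before = 0  # n_b: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cf_before + f >= half_n:
            # This is the median class: apply L + ((N/2 - n_b) / n_w) * c.
            return lower + (half_n - cf_before) / f * class_size
        cf_before += f
    raise ValueError("lower_boundaries and frequencies must have equal length")


# Lower boundaries and frequencies from the Example 1 table.
median = grouped_median([39.5, 49.5, 59.5, 69.5, 79.5, 89.5], [2, 4, 7, 9, 5, 3], 10)
print(round(median, 2))  # 71.72
```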

(c) Mode Score

The modal class is the one with the highest frequency, which is $70\text{--}79$ (frequency = 9).

  • $L = 69.5$ (Lower boundary of the modal class)
  • $\Delta_1 = 9 - 7 = 2$ (Difference between modal frequency and preceding frequency)
  • $\Delta_2 = 9 - 5 = 4$ (Difference between modal frequency and succeeding frequency)
  • $c = 10$ (Class size)

$$ \text{Mode} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c $$

$$ \text{Mode} = 69.5 + \left( \frac{2}{2 + 4} \right) 10 $$

$$ \text{Mode} = 69.5 + \left( \frac{2}{6} \right) 10 = 69.5 + 3.333\ldots \approx 72.83 $$

The mode score is $72.83$.
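
The mode formula in part (c) can be sketched in the same way. The name `grouped_mode` is hypothetical. The sketch assumes a single modal class with equal class sizes, and treating a missing neighbouring class as frequency 0 is an assumption of this sketch rather than something stated in the syllabus.

```python
def grouped_mode(lower_boundaries, frequencies, class_size):
    """Estimate the mode of grouped data from the modal class and its neighbours."""
    i = frequencies.index(max(frequencies))  # first class with the highest frequency
    # Assumption: a missing neighbour (first or last class) counts as frequency 0.
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    delta_1 = frequencies[i] - before
    delta_2 = frequencies[i] - after
    return lower_boundaries[i] + delta_1 / (delta_1 + delta_2) * class_size


# Lower boundaries and frequencies from the Example 1 table.
mode = grouped_mode([39.5, 49.5, 59.5, 69.5, 79.5, 89.5], [2, 4, 7, 9, 5, 3], 10)
print(round(mode, 2))  # 72.83
```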

Example 2: Constructing a Pie Chart

The frequency of transportation types used by 40 students is: Bus (18), Walk (12), Bicycle (10). Calculate the sector angles required to construct a pie chart.

Solution: Calculate the sector angle for each category using the formula: $\text{Angle} = \left( \frac{\text{Frequency}}{\text{Total Frequency}} \right) \times 360^\circ$

  • Bus: $\frac{18}{40} \times 360^\circ = 18 \times 9^\circ = 162^\circ$
  • Walk: $\frac{12}{40} \times 360^\circ = 12 \times 9^\circ = 108^\circ$
  • Bicycle: $\frac{10}{40} \times 360^\circ = 10 \times 9^\circ = 90^\circ$
  • (To draw the pie chart, draw a circle, mark a starting radius, and use a protractor to measure and draw sectors of $162^\circ$, $108^\circ$, and $90^\circ$ respectively. Label each sector with its category.)
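
The same sector angles can be reproduced with a few lines of Python. This is only a sketch; the function name `sector_angles` is an illustrative assumption.

```python
def sector_angles(frequencies):
    """Map each category to its pie-chart sector angle in degrees."""
    total = sum(frequencies.values())
    # Multiply before dividing so whole-number answers stay exact.
    return {category: f * 360 / total for category, f in frequencies.items()}


angles = sector_angles({"Bus": 18, "Walk": 12, "Bicycle": 10})
print(angles)                # {'Bus': 162.0, 'Walk': 108.0, 'Bicycle': 90.0}
print(sum(angles.values()))  # 360.0
```

Checking that the angles add up to $360^\circ$ is a quick way to confirm the table before drawing.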

NECTA Exam Focus

Based on past papers (2018–2024), the Statistics topic is a highly predictable, high-yielding section of the CSEE Basic Mathematics exam. It is typically assessed in Section B of Paper 1 as a comprehensive 10-mark question.

Recurring Themes:

  1. Constructing Frequency Distribution Tables: Most questions begin by providing raw, ungrouped data (often 30 to 50 data points) and require you to build a grouped frequency distribution table using a specified class size or starting interval. Occasionally, a cumulative frequency table is given, and you must "unpack" it into standard frequencies.
  2. Calculating the Mean: You are frequently asked to find the mean using an assumed mean method. NECTA tests your ability to correctly identify the class marks ($x$) and use the deviation column ($d$).
  3. Graphical Representations: Drawing a Histogram to estimate the mode or plotting an Ogive (Cumulative Frequency Curve) to estimate the median are extremely common. When asked to estimate via graph, NECTA expects to see the construction lines on your graph paper.
  4. Mathematical Formulas: Analytical calculation of the median and mode using their respective formulas is frequently tested alongside graphical estimations.
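
The two table-building tasks in theme 1 can be sketched as below: tallying raw whole-number scores into equal classes, and unpacking a cumulative frequency column into ordinary frequencies. The function names and the ten raw marks are hypothetical, chosen only to illustrate the procedure.

```python
def tally(raw_scores, first_lower_limit, class_width, class_count):
    """Count how many whole-number scores fall into each equal-width class.

    `class_width` is the number of whole values per class (10 for 40-49).
    """
    frequencies = [0] * class_count
    for score in raw_scores:
        index = (score - first_lower_limit) // class_width
        if not 0 <= index < class_count:
            raise ValueError(f"score {score} is outside the table")
        frequencies[index] += 1
    return frequencies


def unpack_cumulative(cumulative):
    """Recover class frequencies from a cumulative frequency column."""
    return [cf - prev for prev, cf in zip([0] + cumulative[:-1], cumulative)]


# Hypothetical raw marks, grouped as 40-49, 50-59, 60-69.
marks = [42, 55, 61, 47, 58, 66, 53, 69, 50, 44]
print(tally(marks, 40, 10, 3))        # [3, 4, 3]
print(unpack_cumulative([3, 7, 10]))  # [3, 4, 3]
```

In both cases the frequencies should add up to the number of data points given in the question.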

Common Pitfalls:

  • Confusing Class Limits with Boundaries: When drawing histograms or ogives, students often mistakenly plot class limits (e.g., $40, 49, 50, 59$) instead of continuous class boundaries ($39.5, 49.5, 59.5$). This leads to disconnected histograms, incorrect ogive plots, and wrong estimations.
  • Tallying Errors: Skipping a number or double-counting when extracting frequencies from raw data arrays. Cross out numbers as you tally them to avoid this.
  • Ogive Placements: Plotting the cumulative frequency against the class mark or lower boundary instead of the upper class boundary. An ogive must always start from the lower boundary of the first class on the horizontal axis with a cumulative frequency of zero.
  • Rounding Accuracy: Failing to adhere to NECTA's instructions on significant figures or decimal places (e.g., "correct to 2 decimal places" or "4 significant figures"). Always compute to at least one extra digit before rounding the final answer.

Practice Problems

1. [2018 Paper 1] The scores of 45 pupils in a Civics test were recorded as follows:

30 65 50 62 40 35 64 32 28 59 60 82 24 35 63 68 46 48 73 92 54 46 63 75 58 43 71 72 27 28 61 71 36 64 80 61 64 76 64 35 76 73 70 64 46

(a) Construct a frequency distribution table of the given data, taking equal class intervals $21\text{--}40$, $41\text{--}60$, ... (b) Draw the cumulative frequency curve and use it to estimate the median.

2. [2020 Paper 1] The following data represent the marks scored by 36 students of a certain school in Geography examination:

72 76 90 89 74 82 63 74 70 73 58 71 55 62 65 74 71 64 71 85 70 61 64 75 51 83 50 61 83 68 70 80 50 60 66 68

(a) Prepare a frequency distribution table representing the given data by using the class intervals: $50\text{--}54$, $55\text{--}59$, $60\text{--}64$, and so on. (b) Use the frequency distribution table obtained in part (a) to: (i) draw a histogram. (ii) calculate the median. Write the answer correct to 2 decimal places.

3. [2024 Paper 1] In the terminal examination of a certain school, the scores of students in Geography subject were grouped as shown in the following table:

| Scores | Cumulative Frequency |
|---|---|
| $65\text{--}69$ | 10 |
| $70\text{--}74$ | 22 |
| $75\text{--}79$ | 43 |
| $80\text{--}84$ | 49 |
| $85\text{--}89$ | 58 |
| $90\text{--}94$ | 62 |
| $95\text{--}99$ | 66 |

(a) Calculate the mode of the scores correct to 3 decimal places. (b) Find the mean score correct to 2 decimal places, given an assumed mean of 77. (c) Draw the cumulative frequency curve (ogive) of the scores.

Subtopics

  • Mean
  • Mode
  • Median
  • Frequency distributions
  • Cumulative frequency curves

Crosswalk Notes

Cross-version relationships are drafted in data/curricula/crosswalks/csee-basic-mathematics-2005-to-mathematics-2023.json. Partial and 2005-only mappings remain reviewable.
