## Averages

The phrase "on average..." is used a lot in the media to describe things that are the norm. Hopefully, this lesson will allow you to look at statements such as this in a new, more informed way. There are three different averages that you can calculate: the mode, the median and the mean.

The mode is the easiest of the averages to find and the one that is used the least! To find the mode you need to look for the **most common** value or item.

- The mode is special because it is the only average that exists when your data is given in words. For example, if you wanted to find the most popular sweet for a group of friends you could ask each of them their favourite and pick the sweet which is listed the most to be the mode.
- Sometimes you will have more than one mode (if two values tie for the most common) or no mode (if none of the values are repeated).

*Examples:*

*Find the mode for: 2,2,4,3,5,7,8,2,6,6,4,2,6,7*

Mode = 2 as it appears the most in the list (four times)

*Find the mode for: 2,3,7,9,12,5,8,6,4,150*

Mode = there is no mode.

NOTE: Be careful here not to say that the mode is zero, since 0 is not a number in the list and so cannot be the mode!
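
If you'd like to check mode answers on a computer, here is a minimal Python sketch of the idea above. The function name `find_modes` and the list of sweets are my own assumptions for illustration, not part of the lesson. It returns a list so that it can cope with one mode, several modes or no mode at all.

```python
from collections import Counter


def find_modes(values):
    """Return a list of the most common values (empty if nothing repeats)."""
    if not values:
        return []
    counts = Counter(values)          # how many times each value appears
    highest = max(counts.values())    # the largest count in the data
    if highest == 1:
        # Nothing is repeated, so there is no mode (this is NOT a mode of 0).
        return []
    # Keep every value that ties for the highest count.
    return [value for value, count in counts.items() if count == highest]


print(find_modes([2, 2, 4, 3, 5, 7, 8, 2, 6, 6, 4, 2, 6, 7]))  # [2]
print(find_modes([2, 3, 7, 9, 12, 5, 8, 6, 4, 150]))           # []
# Hypothetical word data: the mode still works when the data is not numbers.
print(find_modes(["mint", "toffee", "mint", "fudge"]))         # ['mint']
```

An empty list stands for "no mode" here, which keeps it clearly different from a mode of zero.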

The median is the middle value and so is used when you want to have an idea of what the **middle** person is experiencing (I'm trying to avoid saying the average person here but really that's what it is).

- Before you can find the middle value you must put the numbers in order. I usually do this smallest to largest but it doesn't really matter. If you don't put the numbers in order first then the calculation you do will be meaningless so please don't forget this bit!
- Most people find it easiest to count into the middle value, crossing off one from the start of the list and then one from the end of the list before repeating the process. This can take a while if you have lots of values but is fine if you have a relatively small list.
- To get to the position of the median quickly you can use the following formula: (n+1)/2 where n is the number of values you have in the full list. If, for example, you have five values, then the calculation is (5+1)/2 = 6/2 = 3 so you should look for the third value in the list for your median. If you test this using the crossing out method it is easy to see that the third value out of 5 is in the middle.
- If you start with an even number of values in your list and you use the crossing out method, then you will end up with two values in the middle. You cannot have two medians so you need to look for the number which is half way between your two leftovers. Sometimes this is easy to do but if you are struggling you can add the numbers together and divide by 2.
- If you are using the formula to locate the position of the median and you get a decimal, such as the 3.5th position, you need to look between two values for your median (so in this case that is between value three and value four).

*Examples:*

*Find the median of: 1,6,7,8,2,4,9*

List in order: 1,2,4,6,7,8,9

Median position = (7+1)/2 = 8/2 = 4th

Median = 6

*Find the median of: 20,12,16,19,17,26,10,10,20,16*

List in order: 10,10,12,16,16,17,19,20,20,26

Median position = (10+1)/2 = 11/2 = 5.5th position

5th position = 16

6th position = 17

Median = 16.5

One benefit of the median is that, because it is only concerned with the middle value, it is not affected by very large or very small data values. This could be useful if you are looking for the average wage in a company, as the median will be the wage of one of the regular workers and the salary of the bosses will not affect the result.
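
The median steps (put the list in order, find the (n+1)/2 position, and go half way between two values if that position is a decimal) can be sketched in Python as follows. The function name `find_median` is my own choice for illustration. Python counts positions from 0 rather than 1, which is why the code subtracts 1 from the position.

```python
def find_median(values):
    """Find the median using the (n+1)/2 position formula."""
    if not values:
        raise ValueError("cannot find the median of an empty list")
    ordered = sorted(values)     # always put the numbers in order first
    n = len(ordered)
    position = (n + 1) / 2       # position counted from 1, e.g. 4.0 or 5.5
    whole = int(position)
    if position == whole:
        # Odd number of values: the median is exactly one of the values.
        return ordered[whole - 1]
    # Even number of values: go half way between the two middle values.
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return (lower + upper) / 2


print(find_median([1, 6, 7, 8, 2, 4, 9]))                        # 6
print(find_median([20, 12, 16, 19, 17, 26, 10, 10, 20, 16]))     # 16.5
```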

The mean is the average which is used the most often by the media. To calculate the mean you need to **add up all the values** and **divide by how many values you have**.

- If you are using a calculator to work out the mean make sure you press equals to get the total of the values before you try to divide. If you don't then your mean will be incorrect.
- Your mean will be somewhere within your original data but doesn't necessarily have to be one of your data values.
- You may need to round your answer to something sensible (either one or two decimal places) if you get a long decimal, rather than writing down the full answer.

*Examples:*

*Find the mean of: 5,6,7,5,2*

Total = 5+6+7+5+2 = 25

Mean = 25/5 = 5

*Find the mean of: 1,2,6,7,2,9,12,8,322,5,12,6,3,3*

Total = 1+2+6+7+2+9+12+8+322+5+12+6+3+3 = 398

Mean = 398/14 = 28.4 (1 d.p.)

The second of these examples shows one disadvantage of the mean. As it uses all the data values, it is affected by large or small values which are a long way away from the rest of the data. The answer of 28.4 is above all the other numbers except the 322 so isn't really a good way to represent the data. Unfortunately it is exactly this method that various media establishments use to make the data support their own views. I don't want to go off on a rant here but I just wanted you to be aware! Obviously if all your data is relatively close together then the mean is the perfect average to use, exactly because it uses all the data. In this case it will give a fair representation of the values you have started with, which is why it is usually thought of by people as the true average.
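
To see the effect of that one large value for yourself, here is a small Python sketch. The function name `find_mean` is my own, and the list uses 322 as in the working for the total. It compares the mean with the median, using `statistics.median` from Python's standard library.

```python
import statistics


def find_mean(values):
    """Add up all the values and divide by how many values there are."""
    total = sum(values)          # get the total first, then divide
    return total / len(values)


data = [1, 2, 6, 7, 2, 9, 12, 8, 322, 5, 12, 6, 3, 3]

print(find_mean([5, 6, 7, 5, 2]))        # 5.0
print(round(find_mean(data), 1))         # 28.4, pulled upwards by the 322
print(statistics.median(data))           # 6.0, not affected by the 322
```

The mean of 28.4 is far above the median of 6, which is a quick sign that an extreme value is pulling the mean away from the rest of the data.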

## Spread of data

Although the range is not an average it is usually calculated alongside the other three and used when comparing two sets of data. To find the range you need to subtract the smallest value from the largest value.

- The range tells you how spread out a set of data is. The smaller the range, the closer together the data values are. If you're looking at performance over time then a smaller range is desirable as it means that the thing being measured has been more consistent than if the range was large.

*Examples:*

*Find the range of: 15,16,4,9,12,1,16,20,5,4,7,9*

Largest = 20

Smallest = 1

Range = 20 - 1 = 19

*Find the range of: 96,90,102,97,109,101,100,99,97*

Largest = 109

Smallest = 90

Range = 109 - 90 = 19

As you can see from these examples, two sets of data can be very different from each other and have the same range. Because of this the range isn't very useful on its own but can be vital when trying to compare two sets of data.
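
The range calculation is short enough to sketch in a couple of lines of Python. The function name `find_range` is my own choice. It uses the built-in `max` and `min` functions to pick out the largest and smallest values.

```python
def find_range(values):
    """Subtract the smallest value from the largest value."""
    return max(values) - min(values)


first = [15, 16, 4, 9, 12, 1, 16, 20, 5, 4, 7, 9]
second = [96, 90, 102, 97, 109, 101, 100, 99, 97]

print(find_range(first))    # 19
print(find_range(second))   # 19, the same range from very different data
```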

## Comparing data sets

If you have two sets of data that you want to compare you need to calculate at least one of the averages and the range. It is usual to calculate the mean or the median as the average, depending on the data set you have started with. You should write an answer using full sentences, describing which is 'better' on average and which is more consistent. Usually there is no one right or wrong answer, as long as you back up your conclusion with evidence.

*Example:*

A company designs two different lightbulbs and decides to test a set of each to decide how long they last. They are only planning on selling one type of bulb so want to decide which one should go into mass production. They test the number of hours it takes before the bulbs fail and record the data. Decide which bulb they should choose to make.

SET A: 900, 992, 950, 925, 1003, 980

SET B: 956, 945, 959, 955, 962, 961

I have decided to calculate the mean rather than the median since the mean uses all the data and there is no data value which looks far from the rest.

SET A Mean = 958.3

SET A Range = 103

SET B Mean = 956.3

SET B Range = 17

__Conclusion:__ On average, SET A is better as the mean is higher, meaning that the bulbs lasted longer. However, SET B is more consistent, as is shown by the range which is much smaller at 17 hours compared to the 103 hours for the other set. Because the difference in the mean is only small I would recommend the company makes the design used by SET B.
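
If you want to check the lightbulb figures, the following Python sketch works out the mean and range for each set. The helper name `summarise` is an assumption of mine for illustration. The code only produces the numbers: the written conclusion still has to come from you.

```python
def summarise(values):
    """Return the mean (to 1 d.p.) and the range of a list of values."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    return round(mean, 1), spread


set_a = [900, 992, 950, 925, 1003, 980]
set_b = [956, 945, 959, 955, 962, 961]

mean_a, range_a = summarise(set_a)
mean_b, range_b = summarise(set_b)

print("SET A: mean =", mean_a, "range =", range_a)   # mean = 958.3 range = 103
print("SET B: mean =", mean_b, "range =", range_b)   # mean = 956.3 range = 17
```

A higher mean suggests the bulbs last longer on average, while a smaller range suggests they are more consistent.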