There are "words people" and "numbers people," and sometimes it seems that never the twain shall meet. But some mathematical concepts are essential for freelance journalists to understand. One of them is the average. The three basic techniques for calculating it are called the "mean," the "median," and the "mode."
These three techniques have different purposes and uses. Freelancers often cover several different topics, each of which may have its own protocol for evaluating and presenting information. Or they may write for different markets and editors, each with a different requirement.
When you read in a housing report that the "average housing price in North America is $300,000" just what does that tell you? Does it mean that most of the houses are $300,000? Does it mean that you will be able to find a house at that price in your neighborhood? If you use that statistic in an article you are writing, do you understand how it was calculated and what information it conveys?
It is essential not only to use the right tool, but also to understand which tools were used in the compilation of facts and figures you are using to bolster the points in your article.
The Mean: The Sum Divided by its Parts
The mean is probably the most widely understood method of calculating averages. If we have three incomes to average, we add them all up, then simply divide the total by three. The problem with this method is that extreme values pull the result toward themselves. With incomes, the mean tends to skew high, which means that a few super-high earners can make the numbers unrealistic.
For (an extreme) example, suppose you work for Madame Huge Celebrity, who makes $100 million a year. And let's suppose that there are 10 of you in the office, and you each make $50,000 a year. A census taker would not be incorrect to add up all your earnings (the 10 of you plus the celebrity boss; that comes to $100,500,000) and divide by 11. In that case, she could report that "the average income of a person at 1 Celebrity Street is more than $9 million a year!" That would come as a surprise to the workers, don't you think?
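For readers who like to check the arithmetic, here is a minimal Python sketch of the calculation above. The salary figures come from the example; the variable names are only illustrative.

```python
from statistics import mean

# Ten office workers at $50,000 each, plus Madame Huge Celebrity at $100 million.
salaries = [50_000] * 10 + [100_000_000]

total = sum(salaries)          # add everything up
average = mean(salaries)       # divide the total by the number of people (11)

print(f"Total earnings: ${total:,}")
print(f"Mean income:    ${average:,.2f}")
```

The mean lands a little above $9.1 million, a figure that describes nobody who actually works in the office.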
So the first lesson is that choosing the right method can be a matter of context. When the mean is used with small populations that have a lot of variance, the results may be so unrealistically skewed that they are useless.
Note that statisticians have a few tools to deal with the kind of outlandish variation in the example above. Sometimes they truncate the unreasonably high figures, which means that they cut them out of the data set entirely. Or they cut out both the top and bottom values. Another technique, sometimes called winsorizing, is to count the high-earning person but to replace her income with a reduced value, equal to the next highest income in the group.
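The three adjustments described above can be sketched in a few lines of Python. This is only an illustration using the office example; how many values to cut or cap is an assumption here (one from each end), and real studies set those rules in advance.

```python
from statistics import mean

# The office salaries, sorted from lowest to highest.
salaries = sorted([50_000] * 10 + [100_000_000])

# Option 1: cut the single highest figure out of the data entirely.
without_top = salaries[:-1]

# Option 2: cut one value from the top and one from the bottom.
without_top_and_bottom = salaries[1:-1]

# Option 3: keep the top earner, but count her at the next highest income.
capped = salaries[:-1] + [salaries[-2]]

print(f"Raw mean:                 ${mean(salaries):,.0f}")
print(f"Top value removed:        ${mean(without_top):,.0f}")
print(f"Top and bottom removed:   ${mean(without_top_and_bottom):,.0f}")
print(f"Top value capped:         ${mean(capped):,.0f}")
```

In this example all three adjustments bring the mean back to $50,000. Whichever one is used, the article should say so.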
Finally, journalists, when writing about a situation such as the one above, can always say "the mean salary of a worker at 1 Celebrity Street, not including Madame Huge Celebrity herself, is $50,000." The important thing is to communicate the information you intend to without inadvertent distortions.
The Median: Right Down the Middle
Choosing to use the median gets rid of some of the outlier problems associated with using the mean. To find the median, the statistician (or reporter) simply makes a list of the numbers being averaged, from lowest to highest, and figures out the mid-point. The definition of the median is "half the values in the study are lower and half are higher." With the 11 salaries at 1 Celebrity Street, the mid-point is the sixth value on the list, which is $50,000. In the case of Madame Huge Celebrity, this method removes the skewing effect of her absurd salary. She becomes simply one of the people who make more than the midpoint.
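As a rough sketch, here is the same office example worked as a median in Python. The second list of four salaries is hypothetical, added only to show what happens when there is no single middle value.

```python
from statistics import median

salaries = [50_000] * 10 + [100_000_000]

# Sort from lowest to highest and pick the value in the middle.
ordered = sorted(salaries)
middle_value = ordered[len(ordered) // 2]   # the 6th of 11 values

print(f"Middle value by hand: ${middle_value:,}")
print(f"statistics.median:    ${median(salaries):,}")

# With an even number of values, the median is the mean of the two middle ones.
# (Hypothetical salaries, for illustration only.)
four_salaries = [40_000, 50_000, 60_000, 70_000]
print(f"Median of four:       ${median(four_salaries):,.0f}")
```

The celebrity's $100 million is still on the list, but it counts only as one value above the middle, no matter how large it is.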
The median is often used to describe housing market data and salaries, precisely because it blunts the effect of the uber-wealthy exceptions and instead provides a more accurate and realistic picture of what is going on with the majority. Journalists should therefore be aware of whether one particular method is typically used when discussing certain types of data.
The Mode: Majority Rules
The mode is simply the most frequent number (or range of numbers). This is the least common method of averaging, but it can be useful in painting a picture of what is actually going on. For example, if almost everyone in a factory earns the same wage, the mode would be that particular wage. It could also be presented as a range: if 75% of the factory workers make between $15 and $16 an hour, then the mode could be stated as $15 - $16 an hour. (Rounding off the pennies and giving ranges can eliminate irrelevancies that could skew the mode.) Giving the mode in cases like this presents the most accurate information about what a new worker can expect to make.
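Here is a small Python sketch of why rounding into ranges helps. The eight hourly wages are made up for illustration; six of them (75%) fall between $15 and $16, matching the factory example.

```python
import math
from collections import Counter

# Hypothetical hourly wages for eight factory workers.
wages = [15.10, 15.25, 15.40, 15.55, 15.75, 15.90, 12.00, 22.50]

# Taken to the penny, every wage is different, so no single value stands out.
exact_counts = Counter(wages)
highest_exact_count = max(exact_counts.values())

# Dropping the pennies groups the wages into whole-dollar ranges.
range_counts = Counter(math.floor(w) for w in wages)
range_start, workers_in_range = range_counts.most_common(1)[0]

print(f"Most frequent exact wage appears {highest_exact_count} time(s)")
print(f"Modal range: ${range_start} - ${range_start + 1} an hour "
      f"({workers_in_range} of {len(wages)} workers)")
```

The width of the range is a judgment call. A reporter who groups the data this way should state the range used.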
The same would be true for a housing market where most of the houses were very similar to each other (cookie cutter condos, for example, or a street filled with bungalows that all look pretty much the same). If 90% of the available bungalows sell in the $300,000 - $320,000 range, and if the other 10% of the housing stock is made up of homes that have been severely neglected or lavishly renovated, using the mode would give the bungalow buyer a more accurate picture of the cost of a typical home in a given neighborhood.
The mode can also be used when writing about difficult-to-quantify situations, such as the dominant fashion color on the ski slopes this year. Let's say that in a head count, 50 people wore vermilion, 30 wore chartreuse, 10 wore mango, and 10 wore black. A journalist might report the fashion trend toward vermilion based on the mode.
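Because the mode only counts how often each value appears, it works on categories as well as numbers. A minimal Python sketch of the ski-slope head count, using the figures above:

```python
from statistics import mode

# Head count from the slopes: color -> number of people wearing it.
head_count = {"vermilion": 50, "chartreuse": 30, "mango": 10, "black": 10}

# Expand the tallies into one entry per person, as a reporter's notes might look.
sightings = [color for color, n in head_count.items() for _ in range(n)]

dominant_color = mode(sightings)
share = head_count[dominant_color] / len(sightings)

print(f"Dominant color: {dominant_color} ({share:.0%} of {len(sightings)} people)")
```

Note that a mean or median of these colors would be meaningless. The mode is the only one of the three averages that applies here.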
Mark Twain famously popularized the saying, "There are three kinds of lies: lies, damned lies, and statistics," which he attributed to Benjamin Disraeli. By mastering the basics of calculating averages, correctly using the mean, median, and mode, and communicating the methodology to the reader, journalists will be able to present accurate facts and figures. Not to mention, they'll avoid being tarred with the famous author's brush!