How to find the average. Entertaining math

The values of a characteristic differ from one unit of a statistical population to another: workers of the same profession at one enterprise earn different wages over the same period, the same product sells at different prices on the market, crop yields vary between the farms of a region, and so on. To obtain a value that characterizes the whole population under study, average values are calculated.
The average value is a generalizing characteristic of a set of individual values of some quantitative characteristic.

A population studied with respect to a quantitative characteristic consists of individual values, which are influenced both by common causes and by individual conditions. In the average, the deviations peculiar to individual values cancel out. Being a function of the whole set of individual values, the average represents the entire set by a single value and reflects what is common to all of its units.

An average calculated for a population of qualitatively homogeneous units is called a typical average. For example, one can calculate the average monthly salary of an employee of a particular professional group (miner, doctor, librarian). Of course, the monthly wages of individual miners differ from each other and from the average wage because of differences in qualifications, length of service, time worked per month and many other factors. However, the average reflects the main factors that determine the level of wages, while the differences that arise from the individual characteristics of each employee offset one another. The average wage reflects the typical wage level for a given type of worker. Before a typical average is calculated, the population should be analyzed to establish how qualitatively homogeneous it is. If the population consists of distinct parts, it should be divided into typical groups; otherwise the result is as meaningless as the proverbial "average temperature in the hospital".

Averages used to characterize heterogeneous populations are called system averages. Examples are the average gross domestic product (GDP) per capita, the average consumption of various groups of goods per person, and other similar values that serve as generalizing characteristics of the state as a single economic system.

The average should be calculated for populations containing a sufficiently large number of units. Compliance with this condition is necessary in order for the law of large numbers to come into force, as a result of which random deviations of individual values ​​from the general trend are mutually canceled out.

Types of averages and how to calculate them

The choice of the type of average is determined by the economic content of the indicator and by the initial data. In every case the average must be calculated so that, when it replaces each individual value of the averaged characteristic, the final generalizing indicator linked to the average (usually called the defining indicator) does not change. For example, when the actual speeds on separate sections of a route are replaced by their average speed, the total distance covered by the vehicle in the same time must not change; when the actual wages of individual employees of an enterprise are replaced by the average wage, the wage fund must not change. Consequently, in each specific case, depending on the nature of the available data, there is only one true average value of the indicator that is adequate to the properties and essence of the socio-economic phenomenon under study.
The most commonly used averages are the arithmetic mean, the harmonic mean, the geometric mean, the root mean square and the cubic mean.
These averages belong to the class of power means and share the general formula:
$$\bar{x} = \left( \frac{\sum_{i=1}^{n} x_i^{m}}{n} \right)^{1/m},$$
where $\bar{x}$ is the average value of the characteristic under study;
m is the exponent (order) of the mean;
$x_i$ is the current value (variant) of the averaged characteristic;
n is the number of values.
Depending on the value of the exponent m, the following types of power means are distinguished:
for m = -1, the harmonic mean;
for m = 0, the geometric mean (obtained as the limit m → 0);
for m = 1, the arithmetic mean;
for m = 2, the root mean square;
for m = 3, the cubic mean.
For the same initial data, the larger the exponent m in the formula above, the larger the mean:
$$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{sq} \le \bar{x}_{cub}.$$
This property of power means, to increase with the exponent of the defining function, is called the rule of majorization of means.
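The rule of majorization can be checked numerically. The following is a minimal sketch, assuming positive input values; the function name `power_mean` and the sample data are illustrative and do not come from the text.

```python
import math

def power_mean(values, m):
    """Power mean of order m for positive values; m = 0 gives the geometric mean."""
    n = len(values)
    if m == 0:
        # Limit of the power mean as m -> 0: the geometric mean.
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** m for x in values) / n) ** (1 / m)

data = [2.0, 4.0, 8.0]  # hypothetical positive values
means = [power_mean(data, m) for m in (-1, 0, 1, 2, 3)]

# Majorization: each mean is no smaller than the one with the lower exponent.
assert all(a <= b for a, b in zip(means, means[1:]))
```

The means are computed in the order harmonic, geometric, arithmetic, root mean square, cubic, so the final check compares each with its neighbor.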
Each of these averages can take two forms: simple and weighted.
The simple form is used when the average is calculated from primary (ungrouped) data; the weighted form is used when it is calculated from secondary (grouped) data.

Arithmetic mean

The arithmetic mean is used when the volume of the population is the sum of all individual values of the varying characteristic. If the type of average is not specified, the arithmetic mean is implied. Its logical formula is:
$$\text{arithmetic mean} = \frac{\text{total volume of the characteristic}}{\text{number of units in the population}}.$$
The simple arithmetic mean is calculated from ungrouped data by the formula:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_N}{N} \quad \text{or} \quad \bar{x} = \frac{\sum_{j=1}^{N} x_j}{N},$$
where $x_j$ are the individual values of the characteristic;
j is the ordinal number of the observation unit characterized by the value $x_j$;
N is the number of observation units (population size).
Example. The lecture "Summary and grouping of statistical data" considered the results of observing the work experience of a team of 10 people: 5, 3, 5, 4, 3, 4, 5, 4, 2, 4 years. The average work experience of the team is:
$$\bar{x} = \frac{5+3+5+4+3+4+5+4+2+4}{10} = \frac{39}{10} = 3.9 \text{ years}.$$

The simple arithmetic mean formula is also used to calculate chronological averages when the time intervals for which the values are given are equal.
Example. Sales volume was 47 monetary units in the first quarter, 54 in the second, 65 in the third and 58 in the fourth. The average quarterly turnover is (47 + 54 + 65 + 58) / 4 = 56 monetary units.
If the chronological series contains moment indicators (values recorded at particular points in time), then for calculating the average they are replaced by the half-sums of the values at the beginning and end of each period.
If there are more than two moments and the intervals between them are equal, the average is calculated by the chronological average formula:
$$\bar{x} = \frac{\frac{1}{2}x_1 + x_2 + \dots + x_{n-1} + \frac{1}{2}x_n}{n-1},$$
where n is the number of moments in time.
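When the data are grouped by the values of the characteristic (i.e. a discrete variation series has been constructed), the weighted arithmetic mean is calculated using either the frequencies $f_i$ or the relative frequencies $w_i$ with which particular values of the characteristic were observed; the number of distinct values (k) is considerably smaller than the number of observations (N):
$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i},$$
$$\bar{x} = \frac{\sum_{i=1}^{k} x_i w_i}{\sum_{i=1}^{k} w_i},$$
where k is the number of groups of the variation series,
i is the number of the group of the variation series.
Since $\sum_{i=1}^{k} f_i = N$ and $\sum_{i=1}^{k} w_i = 1$, we obtain the formulas used in practical calculations:
$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{N} \quad \text{and} \quad \bar{x} = \sum_{i=1}^{k} x_i w_i.$$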
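Example. Let us calculate the average work experience of the team from the grouped series (in the ten values listed above, 2 years occurs once, 3 years twice, 4 years four times and 5 years three times).
a) using frequencies:
$$\bar{x} = \frac{2\cdot1 + 3\cdot2 + 4\cdot4 + 5\cdot3}{10} = \frac{39}{10} = 3.9 \text{ years};$$
b) using relative frequencies:
$$\bar{x} = 2\cdot0.1 + 3\cdot0.2 + 4\cdot0.4 + 5\cdot0.3 = 3.9 \text{ years}.$$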
When the data are grouped by intervals, i.e. presented as an interval distribution series, the midpoint of each interval is taken as the value of the characteristic when calculating the arithmetic mean, on the assumption that the population units are distributed uniformly within the interval. The calculation uses the formulas:
$$\bar{x} = \frac{\sum_{i=1}^{k} x'_i f_i}{N} \quad \text{and} \quad \bar{x} = \sum_{i=1}^{k} x'_i w_i,$$
where $x'_i$ is the midpoint of the interval: $x'_i = \frac{x_i^{\text{low}} + x_i^{\text{up}}}{2}$,
where $x_i^{\text{low}}$ and $x_i^{\text{up}}$ are the lower and upper boundaries of the interval (provided that the upper boundary of an interval coincides with the lower boundary of the next one).

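Example. Let us calculate the arithmetic mean of the interval variation series built from the results of a study of the annual wages of 30 workers (see the lecture "Summary and grouping of statistical data").
Table 1 - Interval variation series of distribution.

| Intervals, UAH | Frequency $f_i$, people | Relative frequency $w_i$ | Midpoint of the interval $x'_i$ | $x'_i f_i$ | $x'_i w_i$ |
|---|---|---|---|---|---|
| 600-700 | 3 | 0.10 | (600+700):2=650 | 1950 | 65 |
| 700-800 | 6 | 0.20 | (700+800):2=750 | 4500 | 150 |
| 800-900 | 8 | 0.267 | 850 | 6800 | 226.95 |
| 900-1000 | 9 | 0.30 | 950 | 8550 | 285 |
| 1000-1100 | 3 | 0.10 | 1050 | 3150 | 105 |
| 1100-1200 | 1 | 0.033 | 1150 | 1150 | 37.95 |
| Total | 30 | 1.00 | | 26100 | 869.9 |

$\bar{x} = \frac{26100}{30} = 870$ UAH, or $\bar{x} = 869.9$ UAH (the small difference comes from rounding the relative frequencies).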
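The calculation for Table 1 can be sketched in a few lines. This is an illustrative check, not part of the original lecture; the variable names are assumptions, and the relative frequencies are computed exactly as f / N rather than taken in rounded form from the table.

```python
# Interval series from Table 1: (lower bound, upper bound, frequency)
intervals = [
    (600, 700, 3), (700, 800, 6), (800, 900, 8),
    (900, 1000, 9), (1000, 1100, 3), (1100, 1200, 1),
]

midpoints = [(lo + hi) / 2 for lo, hi, _ in intervals]
freqs = [f for _, _, f in intervals]
n = sum(freqs)

# a) weighted by absolute frequencies
mean_f = sum(x * f for x, f in zip(midpoints, freqs)) / n

# b) weighted by exact relative frequencies f / n
mean_w = sum(x * (f / n) for x, f in zip(midpoints, freqs))
```

With exact relative frequencies both variants agree at 870 UAH; the table's figure of 869.9 UAH arises only because its relative frequencies are rounded to three decimals.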
The arithmetic means calculated from the initial data and from the interval variation series may not coincide, because the values are not distributed uniformly within the intervals. For a more accurate weighted arithmetic mean, one should then use not the interval midpoints but the simple arithmetic means calculated for each group (group averages). The average calculated from the group averages by the weighted formula is called the general average.
The arithmetic mean has a number of properties.
1. The sum of the deviations of the variants from the mean is equal to zero:
$$\sum (x_i - \bar{x}) f_i = 0.$$
2. If all the variants are increased or decreased by the amount A, the average also increases or decreases by the same amount A:
$$\frac{\sum (x_i \pm A) f_i}{\sum f_i} = \bar{x} \pm A.$$
3. If each variant is increased or decreased B times, the average also increases or decreases by the same factor:
$$\frac{\sum (x_i B) f_i}{\sum f_i} = B\bar{x} \quad \text{or} \quad \frac{\sum \frac{x_i}{B} f_i}{\sum f_i} = \frac{\bar{x}}{B}.$$
4. The sum of the products of the variants and their frequencies is equal to the product of the average and the sum of the frequencies:
$$\sum x_i f_i = \bar{x} \sum f_i.$$
5. If all frequencies are divided or multiplied by the same number c, the arithmetic mean does not change:
$$\frac{\sum x_i \frac{f_i}{c}}{\sum \frac{f_i}{c}} = \bar{x}.$$
6. If the frequencies in all intervals are equal to each other, the weighted arithmetic mean is equal to the simple arithmetic mean:
$$\bar{x} = \frac{\sum x_i f}{k f} = \frac{\sum x_i}{k},$$
where k is the number of groups of the variation series.

Using the properties of the mean makes it easier to calculate.
Suppose that all variants (x) are first reduced by the same number A and then divided by B. The greatest simplification is achieved when A is the midpoint of the interval with the highest frequency and B is the width of the interval (for series with equal intervals). The quantity A is called the origin, so this way of calculating the average is called the method of counting from a conditional zero, or the method of moments.
After this transformation we obtain a new variation series whose variants are $u_i = \frac{x_i - A}{B}$. Their arithmetic mean, called the first-order moment, is given by $m_1 = \frac{\sum u_i f_i}{\sum f_i}$ and, by the second and third properties of the arithmetic mean, equals the mean of the original variants reduced first by A and then B times, i.e. $m_1 = \frac{\bar{x} - A}{B}$.
To obtain the actual mean (the mean of the original series), multiply the first-order moment by B and add A:
$$\bar{x} = m_1 B + A.$$
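The calculation of the arithmetic mean by the method of moments is illustrated by the data in Table 2.
Table 2 - Distribution of workers of the enterprise shop by length of service

| Work experience, years | Number of workers $f_i$ | Midpoint of the interval $x_i$ | $x_i - A$ | $\frac{x_i - A}{B}$ | $\frac{x_i - A}{B} f_i$ |
|---|---|---|---|---|---|
| 0-5 | 12 | 2.5 | -15 | -3 | -36 |
| 5-10 | 16 | 7.5 | -10 | -2 | -32 |
| 10-15 | 23 | 12.5 | -5 | -1 | -23 |
| 15-20 | 28 | 17.5 | 0 | 0 | 0 |
| 20-25 | 17 | 22.5 | 5 | 1 | 17 |
| 25-30 | 14 | 27.5 | 10 | 2 | 28 |
| Total | 110 | | | | -46 |

We find the first-order moment: $m_1 = \frac{-46}{110} \approx -0.418$. Then, knowing that A = 17.5 and B = 5, we calculate the average length of service of the shop workers:
$$\bar{x} = -0.418 \cdot 5 + 17.5 \approx 15.41 \text{ years}.$$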
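The method of moments for Table 2 can be sketched as follows. This is an illustrative check with assumed variable names; the midpoint of the 10-15 interval is taken as 12.5, and the result is compared with the direct weighted mean.

```python
# Table 2: interval midpoints (years) and numbers of workers
midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5]
freqs = [12, 16, 23, 28, 17, 14]

A = 17.5  # origin: midpoint of the interval with the highest frequency
B = 5.0   # common interval width

# Transformed variants (x - A) / B are small integers: -3, -2, ..., 2
u = [(x - A) / B for x in midpoints]
m1 = sum(ui * f for ui, f in zip(u, freqs)) / sum(freqs)  # first-order moment
mean_moments = m1 * B + A

# Direct weighted mean for comparison
mean_direct = sum(x * f for x, f in zip(midpoints, freqs)) / sum(freqs)
```

Both routes give about 15.41 years; the transformation only makes the hand calculation easier and does not change the result.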

Harmonic mean
As shown above, the arithmetic mean is used to calculate the average value of a characteristic when its variants x and their frequencies f are known.
If the statistical information does not contain the frequencies f for the individual variants x but gives their products xf, the weighted harmonic mean formula is applied. To calculate the average, denote $W_i = x_i f_i$, whence $f_i = \frac{W_i}{x_i}$. Substituting these expressions into the weighted arithmetic mean formula, we obtain the weighted harmonic mean formula:
$$\bar{x} = \frac{\sum_{i=1}^{k} W_i}{\sum_{i=1}^{k} \frac{W_i}{x_i}},$$
where $W_i$ is the volume (weight) of the values of the characteristic in the interval with number i (i = 1, 2, ..., k).

Thus the harmonic mean is used when it is not the variants themselves that are summed but their reciprocals: $\frac{1}{x_i}$.
When the weight of each variant equals one, i.e. each individual value of the reciprocal characteristic occurs once, the simple harmonic mean is applied:
$$\bar{x} = \frac{N}{\sum_{i=1}^{N} \frac{1}{x_i}},$$
where $\frac{1}{x_i}$ are the individual values of the reciprocal characteristic, each occurring once;
N is the number of variants.
If the harmonic means $\bar{x}_1$ and $\bar{x}_2$ are known for two parts of a population of sizes $N_1$ and $N_2$, the overall mean for the whole population is calculated by the formula:
$$\bar{x} = \frac{N_1 + N_2}{\frac{N_1}{\bar{x}_1} + \frac{N_2}{\bar{x}_2}}$$
and is called the weighted harmonic mean of the group means.

Example. In the course of trading on the currency exchange, three transactions were concluded in the first hour of work. Data on the amount of the hryvnia sale and the hryvnia exchange rate in relation to the US dollar are given in table. 3 (columns 2 and 3). Determine the average exchange rate of the hryvnia against the US dollar for the first hour of trading.
Table 3 - Data on the course of trading on the currency exchange

The average dollar rate is determined by the ratio of the amount of hryvnia sold during all transactions to the amount of dollars acquired as a result of the same transactions. The total amount of the hryvnia sale is known from column 2 of the table, and the number of dollars bought in each transaction is determined by dividing the amount of the hryvnia sale by its rate (column 4). In total, in the course of three transactions, 22 million dollars were purchased. This means that the average hryvnia exchange rate for one dollar was
.
The resulting value is valid because substituting it for the actual exchange rates of the individual transactions does not change the total amount of hryvnia sold, which acts as the defining indicator.
If the arithmetic mean of the rates had been used instead, buying $22 million at that rate would have required 110.66 million hryvnias, which does not correspond to reality.
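The logic of this example can be sketched with made-up numbers. The transactions below are hypothetical and are not the values of Table 3; the sketch only shows that the weighted harmonic mean preserves the defining indicator (the total hryvnia amount).

```python
# Hypothetical transactions: (hryvnia amount sold, UAH per USD rate).
# These figures are illustrative and are NOT the values of Table 3.
transactions = [(50.0, 5.00), (30.0, 5.10), (20.0, 4.90)]

total_uah = sum(w for w, _ in transactions)
total_usd = sum(w / x for w, x in transactions)  # dollars bought in each deal

# Weighted harmonic mean: total hryvnia divided by total dollars
avg_rate = total_uah / total_usd

# Defining-indicator check: the average rate reproduces the total hryvnia amount
assert abs(avg_rate * total_usd - total_uah) < 1e-9
```

Because the weights here are the products xf (hryvnia amounts), dividing each by its rate recovers the unknown frequencies (dollars), which is exactly the substitution f = W / x used in the derivation.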

Geometric mean
The geometric mean is used to analyze the dynamics of phenomena and makes it possible to determine the average growth rate. When the geometric mean is calculated, the individual values of the characteristic are relative indicators of dynamics constructed as chain ratios, i.e. the ratio of each level to the previous one.
The simple geometric mean is calculated by the formula:
$$\bar{x} = \sqrt[N]{\prod_{i=1}^{N} x_i},$$
where $\prod$ is the product sign,
N is the number of averaged values.
Example. The number of registered crimes increased 1.57 times over 4 years: 1.08 times in the 1st year, 1.1 times in the 2nd, 1.18 times in the 3rd and 1.12 times in the 4th. The average annual growth rate of the number of crimes is then $\bar{x} = \sqrt[4]{1.08 \cdot 1.1 \cdot 1.18 \cdot 1.12} = \sqrt[4]{1.57} \approx 1.12$, i.e. the number of registered crimes grew by an average of 12% annually.

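Root mean square

| Deviation of length from the norm $x_i$ | Frequency $f_i$ | $x_i^2$ | $x_i^2 f_i$ |
|---|---|---|---|
| 1.8 | 1 | 3.24 | 3.24 |
| -0.8 | 3 | 0.64 | 1.92 |
| 0.2 | 4 | 0.04 | 0.16 |
| 1.0 | 1 | 1 | 1 |
| 1.4 | 1 | 1.96 | 1.96 |
| Total | 10 | | 8.28 |

To calculate the weighted root mean square, we determine $x_i^2$ and $x_i^2 f_i$ and enter them into the table. Then the average deviation of the length of the products from the given norm is:
$$\bar{x}_{sq} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}} = \sqrt{\frac{8.28}{10}} \approx 0.91.$$
The arithmetic mean would be unsuitable here, because positive and negative deviations cancel each other out.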
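The weighted root mean square from the table can be reproduced with a short sketch; the variable names are illustrative assumptions.

```python
import math

# Deviations from the norm and their frequencies, as in the table above
deviations = [1.8, -0.8, 0.2, 1.0, 1.4]
freqs = [1, 3, 4, 1, 1]

n = sum(freqs)

# Weighted root mean square: squaring removes the sign of each deviation
rms = math.sqrt(sum(x * x * f for x, f in zip(deviations, freqs)) / n)

# Signed weighted arithmetic mean, where opposite signs partly offset each other
signed_mean = sum(x * f for x, f in zip(deviations, freqs)) / n
```

The root mean square comes out at about 0.91, while the signed mean is noticeably smaller in magnitude because deviations of opposite sign offset one another.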
The application of the root mean square will be discussed later in terms of variation.

The most common form of statistical indicator used in socio-economic research is the average value, a generalized quantitative characteristic of an attribute of a statistical population. Average values act as "representatives" of the entire series of observations. In many cases the average can be determined through the initial ratio of the average (ISC), or its logical formula: ISC = total value of the averaged attribute / number of units in the population. For example, to calculate the average wage of the employees of an enterprise, the total wage fund is divided by the number of employees. The numerator of the initial ratio of the average is its defining indicator; for the average wage, this defining indicator is the wage fund. For each indicator used in socio-economic analysis, only one true initial ratio can be compiled for calculating the average. It should also be added that, to estimate the standard deviation more accurately for small samples (fewer than 30 elements), the denominator of the expression under the root should be n - 1 rather than n.

The concept and types of average values

An average value is a generalizing indicator of a statistical population that smooths out individual differences in the values of statistical quantities and makes it possible to compare different populations with each other. There are two classes of averages: power and structural. Structural averages include the mode and the median, but power averages of various types are used most often.

Power averages

Power means can be simple and weighted.

A simple average is calculated when there are two or more ungrouped statistical values, arranged in no particular order, using the following general power mean formula (for different values of the exponent m):

$$\bar{x} = \left( \frac{\sum_{i=1}^{n} x_i^{m}}{n} \right)^{1/m}$$

A weighted average is calculated from grouped statistical values using the following general formula:

$$\bar{x} = \left( \frac{\sum_{i} x_i^{m} f_i}{\sum_{i} f_i} \right)^{1/m}$$

where $\bar{x}$ is the average value of the phenomenon under study; $x_i$ is the i-th variant of the averaged characteristic (an individual statistical value or the midpoint of a grouping interval);

$f_i$ is the weight of the i-th variant;

m is the exponent, whose value determines the type of power mean:
for m = -1, the harmonic mean;
for m = 0, the geometric mean;
for m = 1, the arithmetic mean;
for m = 2, the root mean square;
for m = 3, the cubic mean.

Using the general formulas of simple and weighted averages for different exponents of m, we obtain particular formulas of each type, which will be further considered in detail.

Arithmetic mean

The arithmetic mean is the initial moment of the first order, the mathematical expectation of the values of a random variable over a large number of trials.

The arithmetic mean is the most commonly used average; it is obtained by substituting m = 1 into the general formula. The simple arithmetic mean looks like this:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_N}{N}$$

or

$$\bar{X} = \frac{\sum X_i}{N}$$

where X are the values for which the average is to be calculated; N is the total number of X values (the number of units in the population under study).

For example, a student passed 4 exams and received the grades 3, 4, 4 and 5. The average grade by the simple arithmetic mean formula is (3 + 4 + 4 + 5) / 4 = 16 / 4 = 4.

The weighted arithmetic mean looks like this:

$$\bar{X} = \frac{\sum X_i f_i}{\sum f_i}$$

where f is the number of values with the same X (the frequency).

For the same student with grades 3, 4, 4 and 5, the weighted arithmetic mean formula gives (3 * 1 + 4 * 2 + 5 * 1) / 4 = 16 / 4 = 4.

If the X values are given as intervals, the calculation uses the midpoints of the intervals, defined as the half-sum of the upper and lower boundaries. If an interval has no lower or upper boundary (an open interval), the width (the difference between the upper and lower boundaries) of the neighboring interval is used to find the missing boundary.

For example, a company has 10 employees with up to 3 years of experience, 20 with 3 to 5 years and 5 with more than 5 years. Taking the interval midpoints (2, 4 and 6 years) as X, the weighted arithmetic mean formula gives the average length of service: (2 * 10 + 4 * 20 + 6 * 5) / (10 + 20 + 5) = 3.71 years.
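The rule for closing open intervals can be sketched in code. The helper `close_open_intervals` is a hypothetical name introduced for illustration; it assumes that only the first and last intervals may be open and that an open bound is marked with `None`.

```python
def close_open_intervals(intervals):
    """Fill missing (None) outer bounds using the width of the neighboring interval."""
    closed = [list(iv) for iv in intervals]
    if closed[0][0] is None:
        width = closed[1][1] - closed[1][0]
        closed[0][0] = closed[0][1] - width
    if closed[-1][1] is None:
        width = closed[-2][1] - closed[-2][0]
        closed[-1][1] = closed[-1][0] + width
    return [tuple(iv) for iv in closed]

# Experience groups from the example: up to 3, 3 to 5, more than 5 years
intervals = [(None, 3), (3, 5), (5, None)]
counts = [10, 20, 5]

closed = close_open_intervals(intervals)
midpoints = [(lo + hi) / 2 for lo, hi in closed]
mean_experience = sum(x * f for x, f in zip(midpoints, counts)) / sum(counts)
```

The closed intervals become 1-3, 3-5 and 5-7 years, giving the midpoints 2, 4 and 6 used in the text and an average of about 3.71 years.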

AVERAGE function

This function calculates the arithmetic mean of its arguments.

AVERAGE (number1, number2, ...)

Number1, number2, ... are 1 to 30 arguments for which the average is calculated.

Arguments must be numbers or names, arrays or references containing numbers. If an argument that is an array or a reference contains text, logical values or empty cells, those values are ignored; however, cells that contain zero values are counted.

AVERAGEA function

Calculates the arithmetic mean of the values specified in the argument list. In addition to numbers, text and logical values such as TRUE and FALSE can take part in the calculation.

AVERAGEA (value1, value2, ...)

Value1, value2, ... are 1 to 30 cells, cell ranges, or values for which the average is calculated.

Arguments must be numbers, names, arrays, or references. Arrays and references containing text are interpreted as 0 (zero). Empty text ("") is interpreted as 0 (zero). Arguments containing the value TRUE are interpreted as 1; arguments containing the value FALSE are interpreted as 0 (zero).

The arithmetic mean is used most often, but there are cases when it is necessary to use other types of averages. We will consider such cases further.

Harmonic mean

The harmonic mean is used to determine the average through the sum of reciprocals.

The harmonic mean is used when the original data do not contain the frequencies f for the individual values of X but give their products Xf. Denoting Xf = w, we express f = w / X; substituting this into the weighted arithmetic mean formula, we obtain the weighted harmonic mean formula:

$$\bar{X} = \frac{\sum w_i}{\sum \frac{w_i}{X_i}}$$

Thus the weighted harmonic mean is used when the frequencies f are unknown but w = Xf is known. When all w = 1, i.e. each individual value of X occurs once, the simple harmonic mean formula is applied:

$$\bar{X} = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_N}} \quad \text{or} \quad \bar{X} = \frac{N}{\sum \frac{1}{X_i}}$$

For example, a car drove from point A to point B at 90 km/h and back at 110 km/h. To determine the average speed we apply the simple harmonic mean formula, because the example fixes the distance w1 = w2 (the distance from A to B is the same as from B to A), which equals the product of speed (X) and time (f). Average speed = (1 + 1) / (1/90 + 1/110) = 99 km/h.
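The speed example can be checked with Python's standard library. This is a minimal sketch: `statistics.harmonic_mean` is a real standard-library function, while the one-way distance used for the cross-check is an arbitrary illustrative value.

```python
from statistics import harmonic_mean

speeds = [90, 110]  # km/h, there and back over the same distance

avg_speed = harmonic_mean(speeds)

# Cross-check from first principles with a hypothetical one-way distance
distance = 198.0  # km, illustrative value only
total_time = sum(distance / v for v in speeds)
avg_speed_check = 2 * distance / total_time
```

Both calculations give 99 km/h, and the cross-check does not depend on the chosen distance, since it cancels out of the ratio.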

HARMEAN function (СРГАРМ in the Russian-language interface)

Returns the harmonic mean of a data set. The harmonic mean is the reciprocal of the arithmetic mean of reciprocals.

HARMEAN (number1; number2; ...)

Number1, number2, ... are 1 to 30 arguments for which the average is calculated. An array or a reference to an array can be used instead of arguments separated by semicolons.

The harmonic mean is never greater than the geometric mean, which in turn is never greater than the arithmetic mean; the three coincide only when all the values are equal.

Geometric mean

The geometric mean is used to assess the average growth rate of quantities and to find the value of a characteristic that is equidistant from the minimum and maximum values.

The geometric mean is used to determine average relative changes. It gives the most accurate averaging result when the task is to find a value of X that is equidistant (in terms of ratios) from both the maximum and the minimum values of X.

For example, the inflation index in Russia was 1.109 in 2005, 1.090 in 2006, 1.119 in 2007 and 1.133 in 2008. Since the inflation index is a relative change (an index of dynamics), the average must be calculated as the geometric mean: (1.109 * 1.090 * 1.119 * 1.133) ^ (1/4) = 1.1126, i.e. from 2005 to 2008 prices grew by an average of 11.26% per year. Using the arithmetic mean instead would give an incorrect 11.28%.
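The inflation example can be reproduced with the standard library. This is a minimal sketch: `statistics.geometric_mean` requires Python 3.8 or later, and the variable names are illustrative.

```python
from statistics import geometric_mean, mean

# Annual inflation indices for 2005-2008 quoted in the text
indices = [1.109, 1.090, 1.119, 1.133]

avg_growth = geometric_mean(indices)  # average annual index
naive = mean(indices)                 # arithmetic mean, for comparison

# Compounding the average index four times reproduces the cumulative index
cumulative = 1.0
for k in indices:
    cumulative *= k
assert abs(avg_growth ** 4 - cumulative) < 1e-9
```

The final check is the reason the geometric mean is the right choice here: it is the only constant annual index that compounds to the same cumulative price growth, whereas the arithmetic mean slightly overstates it.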

GEOMEAN function (СРГЕОМ in the Russian-language interface)

Returns the geometric mean of an array or range of positive numbers. For example, the GEOMEAN function can be used to calculate the average growth rate given compound interest with variable rates.

GEOMEAN (number1; number2; ...)

Number1, number2, ... are between 1 and 30 arguments for which the geometric mean is calculated. You can use an array or an array reference instead of arguments separated by semicolons.

Root mean square

The root mean square is the square root of the initial moment of the second order.

The root mean square is used when the initial values of X can be both positive and negative, for example when calculating average deviations. Its main application is measuring the variation in X values.

Cubic mean

The cubic mean is the cube root of the initial moment of the third order.

The cubic mean is used extremely rarely, for example in calculating the human poverty indices for developing countries (HPI-1) and for developed countries (HPI-2), proposed and calculated by the UN.

When people start talking about averages, they usually recall finishing school and entering an educational institution. Back then the average grade was calculated from the certificate: all the grades (both good and not so good) were added up and the sum was divided by their number. This is how the simplest kind of average, the simple arithmetic mean, is calculated. In practice, statistics uses various kinds of averages: arithmetic, harmonic, geometric, quadratic and structural. Which one is used depends on the nature of the data and the objectives of the study.

The average value is the most common statistical indicator; it gives a generalizing characteristic of a set of phenomena of the same type with respect to one varying characteristic. It shows the level of the characteristic per unit of the population. Averages are used to compare different populations with respect to varying characteristics and to study the patterns of development of phenomena and processes of social life.

In statistics, two classes of means are used: power (analytical) and structural. The latter are used to characterize the structure of the variation series and will be discussed further in Ch. 8.

The group of power averages includes the arithmetic mean, the harmonic mean, the geometric mean and the root mean square. The individual formulas for calculating them can be reduced to a form common to all power averages, namely

$$\bar{x} = \left( \frac{\sum x_i^{m} f_i}{\sum f_i} \right)^{1/m}$$

where m is the exponent of the power mean: m = 1 gives the formula for the arithmetic mean, m = 0 the geometric mean, m = -1 the harmonic mean and m = 2 the root mean square;

$x_i$ are the variants (values that the characteristic takes);

$f_i$ are the frequencies.

The main condition under which power averages can be used in statistical analysis is the homogeneity of the population, which should not contain initial data that differ sharply in their quantitative value (in the literature, they are called anomalous observations).

Let us demonstrate the importance of this condition with the following example.

Example 6.1. Let's calculate the average salary of employees of a small enterprise.

Table 6.1. Employee wages

| No. | Salary, rub. | No. | Salary, rub. |
|---|---|---|---|
| 1 | 5 950 | 11 | 7 000 |
| 2 | 6 790 | 12 | 5 950 |
| 3 | 6 790 | 13 | 6 790 |
| 4 | 5 950 | 14 | 5 950 |
| 5 | 7 000 | 15 | 6 790 |
| 6 | 6 790 | 16 | 7 000 |
| 7 | 5 950 | 17 | 6 790 |
| 8 | 7 000 | 18 | 7 000 |
| 9 | 6 790 | 19 | 7 000 |
| 10 | 6 790 | 20 | 5 950 |

To calculate the average wage, sum the wages accrued to all employees of the enterprise (i.e. find the wage fund) and divide by the number of employees:

$$\bar{x} = \frac{6 \cdot 5\,950 + 8 \cdot 6\,790 + 6 \cdot 7\,000}{20} = \frac{132\,020}{20} = 6\,601 \text{ rub.}$$

Now let us add just one person to the population (the director of the enterprise), with a salary of 50,000 rubles. The calculated average is then completely different:

$$\bar{x} = \frac{132\,020 + 50\,000}{21} \approx 8\,668 \text{ rub.}$$

As you can see, it exceeds 7,000 rubles, i.e. it is greater than every value of the characteristic except for one single observation.

To prevent such cases in practice and keep the average meaningful (in Example 6.1 it no longer serves as a generalizing characteristic of the population, as it should), anomalous, sharply deviating observations should either be excluded from the analysis, thereby making the population homogeneous, or the population should be divided into homogeneous groups, with the average calculated for each group and the group averages analyzed instead of the overall average.
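The effect of a single anomalous observation can be illustrated with a short sketch. It uses the salaries listed in Table 6.1, written in grouped form only for brevity; the variable names are illustrative.

```python
from statistics import mean

# Salaries from Table 6.1 (rub.): six of 5 950, eight of 6 790, six of 7 000
salaries = [5950] * 6 + [6790] * 8 + [7000] * 6

avg_staff = mean(salaries)

# One anomalous observation: the director's salary
avg_with_director = mean(salaries + [50000])

# The new average exceeds every ordinary salary in the list
assert avg_with_director > max(salaries)
```

Adding one value out of 21 shifts the average above every ordinary salary, which is why such observations are either excluded or analyzed in a separate group.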

6.1. Arithmetic mean and its properties

The arithmetic mean is calculated either as a simple or as a weighted value.

When calculating the average wage from the table in Example 6.1, we added up all the values of the characteristic and divided by their number. The course of this calculation can be written as the formula for the simple arithmetic mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where $x_i$ are the variants (individual values of the characteristic);

n is the number of units in the population.

Example 6.2. Now let us group the data from the table in Example 6.1, i.e. construct a discrete variation series of the distribution of workers by wage level. The grouping results are presented in the table.

Let us write the expression for calculating the average wage level in a more compact form:

$$\bar{x} = \frac{5\,950 \cdot 6 + 6\,790 \cdot 8 + 7\,000 \cdot 6}{6 + 8 + 6}$$

Example 6.2 applied the formula for the weighted arithmetic mean

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$

where $f_i$ are the frequencies, showing how many times the value $x_i$ of the characteristic occurs among the units of the population.

It is convenient to calculate the arithmetic weighted average in the table, as shown below (Table 6.3):

Table 6.3. Calculation of the arithmetic mean in a discrete series

| Salary, rub. $x_i$ (initial data) | Number of employees, people $f_i$ (initial data) | Wage fund, rub. $x_i f_i$ (calculated indicator) |
|---|---|---|
| 5 950 | 6 | 35 700 |
| 6 790 | 8 | 54 320 |
| 7 000 | 6 | 42 000 |
| Total | 20 | 132 020 |

It should be noted that the simple arithmetic mean is used when the data are ungrouped, or grouped with all frequencies equal.

Observation results are often presented as an interval distribution series (see Table 6.4 in Example 6.3). In that case the midpoints of the intervals are taken as $x_i$ when calculating the average. If the first and last intervals are open (lack one of their boundaries), they are conventionally "closed" by assigning them the width of the adjacent interval, i.e. the first interval is closed using the width of the second, and the last using the width of the penultimate one.

Example 6.3. Based on the results of a sample survey of one of the population groups, we will calculate the size of the average per capita money income.

In the table, the midpoint of the first interval is 500. Indeed, the width of the second interval is 1000 (2000 - 1000); the lower boundary of the first interval is then 0 (1000 - 1000), and its midpoint is 500. We do the same with the last interval and take 25,000 as its midpoint: the width of the penultimate interval is 10,000 (20,000 - 10,000), so the upper boundary of the last interval is 30,000 (20,000 + 10,000) and its midpoint is 25,000.

Table 6.4. Calculation of the arithmetic mean in the interval series

| Average per capita monetary income, rubles per month | Share of total population, % $f_i$ | Midpoints of intervals $x_i$ | $x_i f_i$ |
|---|---|---|---|
| Up to 1 000 | 4.1 | 500 | 2 050 |
| 1 000-2 000 | 8.6 | 1 500 | 12 900 |
| 2 000-4 000 | 12.9 | 3 000 | 38 700 |
| 4 000-6 000 | 13.0 | 5 000 | 65 000 |
| 6 000-8 000 | 10.5 | 7 000 | 73 500 |
| 8 000-10 000 | 27.8 | 9 000 | 250 200 |
| 10 000-20 000 | 12.7 | 15 000 | 190 500 |
| 20 000 and above | 10.4 | 25 000 | 260 000 |
| Total | 100.0 | - | 892 850 |

Then the average per capita monthly income is $\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{892\,850}{100} = 8\,928.5$ rub.

The most common type of average is the arithmetic mean.

Simple arithmetic mean

The simple arithmetic mean is the average in whose calculation the total volume of a characteristic in the data set is distributed equally among all the units of the set. For example, the average annual output per employee is the amount of output each employee would account for if the whole volume of output were distributed equally among all employees of the organization. The simple arithmetic mean is calculated by the formula:

$$\bar{x} = \frac{\sum x_i}{n}$$

The simple arithmetic mean is equal to the ratio of the sum of the individual values of a characteristic to the number of units in the population.

Example 1. A team of 6 workers earns 3, 3.2, 3.3, 3.5, 3.8 and 3.1 thousand rubles a month.

Find the average salary.
Solution: (3 + 3.2 + 3.3 + 3.5 + 3.8 + 3.1) / 6 = 3.32 thousand rubles.

Weighted arithmetic mean

If the data set is large and forms a distribution series, the weighted arithmetic mean is calculated. This is how the weighted average price per unit of production is determined: the total cost of production (the sum of the products of quantity and unit price) is divided by the total quantity of production.

This can be written as the following formula:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$

The weighted arithmetic mean is equal to the ratio of the sum of the products of each value of the characteristic and its frequency to the sum of all the frequencies. It is used when the variants of the population under study occur an unequal number of times.

Example 2. Find the average monthly wage of a workshop worker.

Average wages can be obtained by dividing the total wages by the total number of workers:

Answer: 3.35 thousand rubles.

Arithmetic mean for interval series

When calculating the arithmetic mean for an interval variation series, first determine the midpoint of each interval as the half-sum of its upper and lower boundaries, and then calculate the average of the entire series. An open first or last interval is assumed to have the same width as the interval adjacent to it.
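The midpoint rule, including the treatment of open intervals, can be sketched as follows. The boundaries are those of the income table earlier in this section; the use of `None` to mark an open end and the function names are assumptions made for this illustration.

```python
# Interval boundaries from the income table; None marks an open end.
intervals = [
    (None, 1_000),
    (1_000, 2_000),
    (2_000, 4_000),
    (4_000, 6_000),
    (6_000, 8_000),
    (8_000, 10_000),
    (10_000, 20_000),
    (20_000, None),
]


def close_open_intervals(intervals):
    """Give an open first or last interval the width of its neighbour."""
    closed = list(intervals)
    lower, upper = closed[0]
    if lower is None:
        width = closed[1][1] - closed[1][0]
        closed[0] = (upper - width, upper)
    lower, upper = closed[-1]
    if upper is None:
        width = closed[-2][1] - closed[-2][0]
        closed[-1] = (lower, lower + width)
    return closed


def midpoints(intervals):
    """Return the half-sum of the boundaries of each interval."""
    return [(lower + upper) / 2 for lower, upper in close_open_intervals(intervals)]


print(midpoints(intervals))
# [500.0, 1500.0, 3000.0, 5000.0, 7000.0, 9000.0, 15000.0, 25000.0]
```

The result reproduces the midpoint column of the table: the first interval is treated as 0 to 1,000 and the last as 20,000 to 30,000.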

Example 3. Determine the average age of evening students.

Averages calculated from interval series are approximate. The degree of approximation depends on how close the actual distribution of population units within each interval is to uniform.

When calculating averages, relative values (relative frequencies) can be used as weights as well as absolute ones:

$$\bar{x} = \sum x_i w_i, \qquad w_i = \frac{f_i}{\sum f_i}$$

The arithmetic mean has a number of properties that more fully reveal its essence and simplify the calculation:

1. The product of the average by the sum of the frequencies is always equal to the sum of the products of the variants by the frequencies, i.e.

$$\bar{x} \sum f_i = \sum x_i f_i$$

2. The arithmetic mean of a sum of varying quantities is equal to the sum of the arithmetic means of these quantities:

$$\overline{x + y} = \bar{x} + \bar{y}$$

3. The algebraic sum of the deviations of the individual values of the attribute from the mean is equal to zero:

$$\sum (x_i - \bar{x}) f_i = 0$$

4. The sum of the squares of the deviations of the variants from the mean is less than the sum of the squares of the deviations from any other arbitrary value $A \neq \bar{x}$, i.e.

$$\sum (x_i - \bar{x})^2 f_i < \sum (x_i - A)^2 f_i$$
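Properties 1, 3 and 4 can be checked numerically. The sketch below uses the salaries from Example 1, each taken with frequency 1. The comparison values in `other_centers` are arbitrary and chosen only for illustration.

```python
from statistics import mean

# Salaries from Example 1 (thousand rubles), each with frequency 1.
values = [3, 3.2, 3.3, 3.5, 3.8, 3.1]
x_bar = mean(values)

# Property 1: the mean times the number of units equals the total.
total_matches = abs(x_bar * len(values) - sum(values)) < 1e-9

# Property 3: deviations from the mean sum to zero (up to rounding error).
deviation_sum = sum(x - x_bar for x in values)


# Property 4: the sum of squared deviations is smallest around the mean.
def sum_of_squares(center):
    return sum((x - center) ** 2 for x in values)


other_centers = [3.0, 3.3, 3.5]  # arbitrary values different from the mean
is_minimum = all(sum_of_squares(x_bar) < sum_of_squares(a) for a in other_centers)

print(total_matches, abs(deviation_sum) < 1e-9, is_minimum)  # True True True
```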

In statistics, various types of averages are used, which are divided into two large classes:

Power means (harmonic mean, geometric mean, arithmetic mean, quadratic mean, cubic mean);

Structural means (mode, median).

To calculate power means, all available values of the feature must be used. The mode and median are determined only by the structure of the distribution, which is why they are called structural, or positional, averages. The median and mode are often used as an average characteristic in populations where calculating a power mean is impossible or impractical.

The most common type of average is the arithmetic mean. The arithmetic mean is understood as the value of a feature that each unit of the population would have if the total of all values of the feature were distributed evenly among all units of the population. Its calculation reduces to summing all values of the varying attribute and dividing the resulting sum by the total number of units in the population. For example, five workers fulfilled an order for the manufacture of parts: the first made 5 parts, the second 7, the third 4, the fourth 10 and the fifth 12. Since each variant occurs only once in the initial data, the simple arithmetic mean formula should be applied to determine the average output of one worker:

$$\bar{x} = \frac{\sum x_i}{n}$$

that is, in our example, the average output of one worker is equal to

$$\bar{x} = \frac{5 + 7 + 4 + 10 + 12}{5} = \frac{38}{5} = 7.6 \text{ parts}$$

Along with the simple arithmetic mean, the weighted arithmetic mean is also used. For example, let us calculate the average age of students in a group of 20 whose ages range from 18 to 22, where $x_i$ are the variants of the averaged feature and $f_i$ is the frequency, which shows how many times the i-th value occurs in the aggregate (Table 5.1).

Table 5.1

Average age of students

Applying the formula for the arithmetic weighted average, we get:


There is a rule for choosing the weighted arithmetic mean: if there is a series of data on two indicators, for one of which the average must be calculated, and the numerical values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated using the weighted arithmetic mean formula.

In some cases, the nature of the initial statistical data is such that calculating the arithmetic mean loses its meaning, and the only possible generalizing indicator is another type of average, the harmonic mean. With the widespread introduction of electronic computing, the computational properties of the arithmetic mean have lost their relevance in the calculation of generalizing statistical indicators. The harmonic mean, which can also be simple or weighted, has acquired great practical significance. If the numerical values of the numerator of a logical formula are known, and the values of the denominator are unknown but can be found as the quotient of one indicator divided by another, then the average is calculated using the weighted harmonic mean formula.

For example, suppose a car traveled the first 210 km at 70 km/h and the remaining 150 km at 75 km/h. The average speed over the entire 360 km journey cannot be determined with the arithmetic mean formula. The variants are the speeds on the individual sections, $x_1$ = 70 km/h and $x_2$ = 75 km/h, and the weights ($f_i$) are the corresponding segments of the path, so the products of the variants by the weights have neither physical nor economic meaning. What does have meaning here are the quotients of the path segments divided by the corresponding speeds (variants $x_i$), that is, the time spent on each section of the path ($f_i / x_i$). If the segments of the path are denoted by $f_i$, then the entire path is expressed as $\sum f_i$ and the time spent on the entire path as $\sum f_i / x_i$. The average speed can then be found as the quotient of the entire path divided by the total time spent:

$$\bar{x} = \frac{\sum f_i}{\sum \frac{f_i}{x_i}}$$

In our example, we get:

$$\bar{x} = \frac{210 + 150}{\frac{210}{70} + \frac{150}{75}} = \frac{360}{5} = 72 \text{ km/h}$$
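The car example can be reproduced with a short Python sketch. The function name `weighted_harmonic_mean` is illustrative. The unweighted arithmetic mean of the two speeds is printed only to show how it differs.

```python
# (distance in km, speed in km/h) for each section of the journey.
segments = [(210, 70), (150, 75)]


def weighted_harmonic_mean(pairs):
    """Return sum(f_i) / sum(f_i / x_i) for (f_i, x_i) pairs."""
    total_distance = sum(f for f, _ in pairs)
    total_time = sum(f / x for f, x in pairs)
    return total_distance / total_time


average_speed = weighted_harmonic_mean(segments)
naive_speed = sum(x for _, x in segments) / len(segments)

print(average_speed)  # 72.0
print(naive_speed)    # 72.5
```

Only the first value is consistent with the journey: 360 km at 72 km/h takes 5 hours, which equals the 3 hours and 2 hours actually spent on the two sections.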

If the weights of all variants ($f_i$) are equal, the simple (unweighted) harmonic mean can be used instead of the weighted one:

$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$

where $x_i$ are the individual variants and $n$ is the number of variants of the averaged feature. In the speed example, the simple harmonic mean could be applied if the path segments traveled at different speeds were equal.

Any average should be calculated so that, when it replaces each variant of the averaged feature, the value of some final, generalizing indicator associated with the averaged indicator does not change. So, when the actual speeds on individual sections of the path are replaced with their average value (the average speed), the total distance should not change.

The form (formula) of the average is determined by the nature (mechanism) of the relationship between this final indicator and the average. The final indicator, whose value should not change when the variants are replaced with their average, is therefore called the defining indicator. To derive the formula for the average, compose and solve an equation using the relationship between the averaged indicator and the defining one. This equation is constructed by replacing the variants of the averaged attribute (indicator) with their average value.

In addition to the arithmetic mean and harmonic mean, other types (forms) of the mean are used in statistics. They are all special cases of the power mean. If all kinds of power means are calculated for the same data, their values will turn out to be different; here the majorant rule of means applies: as the exponent of the mean increases, the mean itself also increases. The formulas most often used in practical research for calculating the various types of power means are presented in Table 5.2.

Table 5.2

Types of power averages


The geometric mean is applied when there are n growth factors. The individual values of the feature are, as a rule, relative values of dynamics built as chain quantities, that is, as the ratio of each level in the time series to the previous level. The average thus characterizes the average growth rate. The simple geometric mean is calculated by the formula

$$\bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n} = \sqrt[n]{\prod x_i}$$

The formula for the weighted geometric mean looks like this:

$$\bar{x} = \sqrt[\sum f_i]{\prod x_i^{f_i}}$$

The above formulas are identical, but one is applied at the current rates or growth rates, and the second - at the absolute values ​​of the series levels.
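As a sketch of how the geometric mean characterizes an average growth rate, the following Python snippet uses hypothetical chain growth factors (they are not taken from the source). It checks that compounding the average factor over the same number of periods reproduces the total growth.

```python
import math

# Hypothetical chain growth factors for three periods (illustrative only).
growth_factors = [1.10, 1.20, 0.95]


def geometric_mean(factors):
    """Return the n-th root of the product of n factors."""
    return math.prod(factors) ** (1 / len(factors))


average_growth = geometric_mean(growth_factors)
total_growth = math.prod(growth_factors)

# Replacing every factor with the average leaves the total growth unchanged.
print(math.isclose(average_growth ** len(growth_factors), total_growth))  # True
```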

The quadratic mean (root mean square) is used in calculations with the values of square functions and to measure the degree of variability of individual values of a feature around the arithmetic mean in distribution series. It is calculated by the formula

$$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}$$

The weighted quadratic mean is calculated using a different formula:

$$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$

The cubic mean is used in calculations with the values of cubic functions and is calculated by the formula

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3}{n}}$$

The weighted cubic mean:

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3 f_i}{\sum f_i}}$$

All the averages discussed above can be presented in the form of a general formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^k}{n}}$$

where $\bar{x}$ is the average value; $x_i$ is an individual value; $n$ is the number of units of the studied population; $k$ is an exponent that determines the type of average.

For the same initial data, the larger $k$ is in the general formula of the power mean, the larger the average. It follows that there is a regular relationship between the values of the power means:

$$\bar{x}_{\text{harm}} \le \bar{x}_{\text{geom}} \le \bar{x}_{\text{arithm}} \le \bar{x}_{\text{quadr}} \le \bar{x}_{\text{cub}}$$
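The ordering of the power means can be illustrated with a small Python sketch. It uses the output of the five workers from the earlier example. The function `power_mean` is a hypothetical helper written for this illustration, and the case k = 0 is treated as the geometric mean, which is the limiting case of the general formula.

```python
import math


def power_mean(values, k):
    """Power mean of order k; k = 0 is handled as the geometric mean."""
    n = len(values)
    if k == 0:
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1 / k)


# Parts made by the five workers in the example above.
output = [5, 7, 4, 10, 12]

orders = {"harmonic": -1, "geometric": 0, "arithmetic": 1, "quadratic": 2, "cubic": 3}
means = {name: power_mean(output, k) for name, k in orders.items()}

for name, value in means.items():
    print(f"{name:10s} {value:.3f}")

# The values grow as the exponent k grows.
ordered = list(means.values())
print(ordered == sorted(ordered))  # True
```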

The averages described above give a generalized idea of the studied aggregate, and from this point of view their theoretical, applied and cognitive value is indisputable. However, the value of an average may not coincide with any of the variants that actually exist. Therefore, in addition to the averages considered, statistical analysis also uses the values of specific variants that occupy a well-defined position in an ordered (ranked) series of values of a feature. Among these values, the most common are the structural, or descriptive, averages: the mode (Mo) and the median (Me).

The mode is the value of a feature that occurs most often in a given population. In a variation series, the mode is the most frequent value of the ranked series, i.e., the variant with the highest frequency. The mode can be used to determine which stores are visited most often or what the most common price for a product is. It shows the size of a feature characteristic of a significant part of the population, and for an interval series it is determined by the formula

$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}$$

where $x_0$ is the lower boundary of the modal interval; $h$ is the width of the interval; $f_m$ is the frequency of the modal interval; $f_{m-1}$ is the frequency of the previous interval; $f_{m+1}$ is the frequency of the next interval.
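A minimal Python sketch of the interval mode formula follows. The data are hypothetical, with equal interval widths, and the function name `interval_mode` is chosen only for this example. A missing neighbouring interval is treated as having zero frequency, which is an assumption of the sketch.

```python
def interval_mode(intervals):
    """Mode of an interval series given as (lower, upper, frequency) tuples.

    Assumes equal interval widths and a single interval with the highest
    frequency. A missing neighbour is treated as having zero frequency.
    """
    freqs = [f for _, _, f in intervals]
    m = freqs.index(max(freqs))  # index of the modal interval
    f_prev = freqs[m - 1] if m > 0 else 0
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0
    lower, upper, f_m = intervals[m]
    h = upper - lower
    return lower + h * (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next))


# Hypothetical distribution (illustrative only).
series = [(0, 10, 10), (10, 20, 25), (20, 30, 40), (30, 40, 30), (40, 50, 15)]

print(interval_mode(series))  # 26.0
```

Here the modal interval is 20 to 30, so the mode is 20 + 10 * 15 / (15 + 10) = 26.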

The median is the variant located in the center of the ranked series. The median divides the series into two equal parts so that the same number of population units lies on either side of it. In one half of the units of the population the value of the varying attribute is less than the median, and in the other half it is greater. In other words, the median is the element whose value is greater than or equal to one half of the elements of the distribution series and, at the same time, less than or equal to the other half. The median gives a general idea of where the values of the feature are concentrated, that is, where their center is located.

The descriptive nature of the median shows in the fact that it characterizes the quantitative boundary of the values of the varying attribute that half of the population units have. Finding the median for a discrete variation series is easy. If ordinal numbers are assigned to all units of the series, then for an odd number of members n the ordinal number of the median variant is (n + 1) / 2. If the number of members of the series is even, the median is the average of the two variants with ordinal numbers n / 2 and n / 2 + 1.
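The rule for a discrete series can be sketched in Python as follows. The function `median_of_series` is written for this illustration, and the two data sets are the worker output and the salaries used in the earlier examples.

```python
def median_of_series(values):
    """Median of a discrete series using the ordinal-number rule."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the variant with ordinal number (n + 1) / 2, counting from 1.
        return ordered[mid]
    # Even n: the average of the variants with ordinal numbers n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of_series([5, 7, 4, 10, 12]))              # 7
print(median_of_series([3, 3.2, 3.3, 3.5, 3.8, 3.1]))   # 3.25
```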

To determine the median in an interval variation series, first find the interval in which it lies (the median interval). This is the first interval whose accumulated sum of frequencies is equal to or exceeds the half-sum of all frequencies of the series. The median of the interval variation series is then calculated using the formula

$$Me = x_0 + h \cdot \frac{\frac{1}{2}\sum f - S_{m-1}}{f_m}$$

where $x_0$ is the lower boundary of the median interval; $h$ is the width of the interval; $f_m$ is the frequency of the median interval; $\sum f$ is the number of members of the series; $S_{m-1}$ is the accumulated sum of frequencies of the intervals preceding the median interval.
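As an illustration, the formula can be applied to the income distribution from the table earlier in this section. In this sketch the open intervals are closed at 0 and 30,000 rubles, which is an assumption consistent with the midpoints used in that table. The function name `interval_median` is hypothetical.

```python
# (lower bound, upper bound, share of population in %) from the income table.
# The open first and last intervals are closed at 0 and 30,000 (assumption).
income = [
    (0, 1_000, 4.1),
    (1_000, 2_000, 8.6),
    (2_000, 4_000, 12.9),
    (4_000, 6_000, 13.0),
    (6_000, 8_000, 10.5),
    (8_000, 10_000, 27.8),
    (10_000, 20_000, 12.7),
    (20_000, 30_000, 10.4),
]


def interval_median(intervals):
    """Median of an interval series given as (lower, upper, frequency) tuples."""
    half = sum(f for _, _, f in intervals) / 2
    accumulated = 0.0  # sum of frequencies before the current interval
    for lower, upper, f in intervals:
        if accumulated + f >= half:
            # First interval whose accumulated frequency reaches the half-sum.
            return lower + (upper - lower) * (half - accumulated) / f
        accumulated += f
    raise ValueError("the series is empty")


print(round(interval_median(income), 2))  # 8064.75
```

The median interval is 8,000 to 10,000 rubles, since the accumulated share before it is 49.1% and it brings the total to 76.9%.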

Along with the median, other values of variants that occupy a well-defined position in the ranked series are used to characterize the structure of the studied population more fully. These include quartiles and deciles. Quartiles divide the series by the sum of frequencies into 4 equal parts, and deciles into 10 equal parts. There are three quartiles and nine deciles.

The median and mode, in contrast to the arithmetic mean, do not smooth out individual differences in the values of the varying attribute and are therefore additional and very important characteristics of the statistical population. In practice, they are often used instead of or alongside the average. It is especially advisable to calculate the median and mode when the studied population contains a number of units with a very large or very small value of the varying attribute. Such values, which are not typical of the aggregate, influence the arithmetic mean but do not affect the median and mode, which makes the latter very valuable indicators for economic and statistical analysis.
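The effect of an atypical value can be seen in a short Python sketch. The wage figures are hypothetical and chosen only to illustrate the point; the standard `statistics` module is used for all three measures.

```python
from statistics import mean, median, mode

# Hypothetical monthly wages (illustrative only).
wages = [30, 32, 32, 35, 36]
# The same series with the largest wage replaced by an atypical value.
wages_with_outlier = [30, 32, 32, 35, 500]

print(mean(wages), median(wages), mode(wages))
print(mean(wages_with_outlier), median(wages_with_outlier), mode(wages_with_outlier))
```

In this example the arithmetic mean rises from 33 to 125.8, while the median and the mode both remain 32.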