The values of a characteristic differ from one unit of a statistical population to another: for example, the wages of workers of the same profession at an enterprise are not the same over the same period, market prices for the same product differ, and crop yields vary across the farms of a region. Therefore, to determine the value of a characteristic that is typical of the entire studied population, average values are calculated.
An average value is a generalizing characteristic of a set of individual values of a certain quantitative attribute.

A population studied by a quantitative attribute consists of individual values that are influenced both by common causes and by individual conditions. In the average, the deviations specific to individual values cancel out. The average, being a function of the set of individual values, represents the entire population by a single value and reflects what is common to all of its units.

An average calculated for a population of qualitatively homogeneous units is called a typical average. For example, one can calculate the average monthly salary of an employee of a particular professional group (miner, doctor, librarian). Of course, the monthly wages of miners differ from one another and from the average wage because of differences in qualifications, length of service, time worked per month and many other factors. However, the average reflects the main factors that determine the wage level, while the differences arising from the individual characteristics of each employee cancel each other out. The average wage thus reflects the typical wage level for a given type of worker. Before a typical average is obtained, the population should be analyzed for qualitative homogeneity. If the population consists of distinct parts, it should be divided into typical groups; otherwise the result is as uninformative as the proverbial average temperature across a hospital.

Averages used as characteristics of heterogeneous populations are called system averages. Examples are the average gross domestic product (GDP) per capita, the average consumption of various groups of goods per person, and other similar values that characterize the state as a single economic system.

The average should be calculated for populations containing a sufficiently large number of units. This condition is necessary for the law of large numbers to take effect, so that random deviations of individual values from the general trend cancel each other out.

Types of averages and how to calculate them

The choice of the type of average is determined by the economic content of the indicator and by the initial data. However, any average should be calculated so that, when it replaces each variant of the averaged attribute, the final, generalizing indicator associated with the averaged one (usually called the defining indicator) does not change. For example, when the actual speeds on separate sections of a route are replaced by their average speed, the total distance traveled by the vehicle in the same time should not change; when the actual wages of individual employees of an enterprise are replaced by the average wage, the wage fund should not change. Consequently, in each specific case, depending on the nature of the available data, there is only one true average value of the indicator that is adequate to the properties and essence of the socio-economic phenomenon under study.
The most commonly used are the arithmetic mean, harmonic mean, geometric mean, root mean square and cubic mean.
The listed averages belong to the class of power means and are united by the general formula:
$$\bar{x} = \sqrt[m]{\frac{\sum_{i=1}^{n} x_i^{m}}{n}}$$
where $\bar{x}$ is the average value of the studied attribute;
m is the exponent (order) of the mean;
$x_i$ is the current value (variant) of the averaged attribute;
n is the number of values.
Depending on the value of the exponent m, the following types of power means are distinguished:
for m = -1 - harmonic mean;
for m = 0 - geometric mean (as the limit of the formula when m tends to 0);
for m = 1 - arithmetic mean;
for m = 2 - root mean square;
for m = 3 - cubic mean.
For the same initial data, the larger the exponent m in the above formula, the larger the mean value:
$$\bar{x}_{\text{harm}} \le \bar{x}_{\text{geom}} \le \bar{x}_{\text{arith}} \le \bar{x}_{\text{sq}} \le \bar{x}_{\text{cub}}$$
This property of power means, to increase as the exponent of the defining function increases, is called the majorant rule of means.
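The ordering of the power means can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `power_mean` and the sample data are assumptions made for illustration, and the case m = 0 is handled as the geometric mean.

```python
import math

def power_mean(values, m):
    """Simple power mean of order m for positive values.

    m = 0 is treated as the geometric mean (the limit of the formula as m -> 0).
    """
    n = len(values)
    if m == 0:
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** m for x in values) / n) ** (1 / m)

# Hypothetical positive observations, chosen only for illustration.
data = [2.0, 4.0, 4.0, 5.0]

# Harmonic, geometric, arithmetic, root mean square and cubic means.
means = {m: power_mean(data, m) for m in (-1, 0, 1, 2, 3)}
for m, value in means.items():
    print(f"m = {m:>2}: {value:.4f}")
```

For any set of positive values that are not all equal, the printed means increase with m; when all values are equal, every power mean coincides with that common value.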
Each of these means can take two forms: simple and weighted.
The simple form is used when the mean is calculated from primary (ungrouped) data; the weighted form is used when the mean is calculated from secondary (grouped) data.

Arithmetic mean

The arithmetic mean is used when the volume of the attribute for the whole population is the sum of the individual values of the varying attribute. Note that if the type of average is not specified, the arithmetic mean is implied. Its logical formula is:
$$\bar{x} = \frac{\text{total volume of the varying attribute}}{\text{number of units in the population}}$$

The simple arithmetic mean is calculated from ungrouped data by the formula:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_N}{N} \quad \text{or} \quad \bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j,$$
where $x_j$ are the individual values of the attribute;
j is the ordinal number of the observation unit characterized by the value $x_j$;
N is the number of observation units (population size).
Example. In the lecture "Summary and grouping of statistical data" the results of observing the work experience of a team of 10 people were considered. Let's calculate the average work experience of the team members from the observations 5, 3, 5, 4, 3, 4, 5, 4, 2, 4: $\bar{x} = \frac{5 + 3 + 5 + 4 + 3 + 4 + 5 + 4 + 2 + 4}{10} = \frac{39}{10} = 3.9$.

Chronological averages are also calculated with the simple arithmetic mean formula if the time intervals for which the attribute values are given are equal.
Example. The volume of products sold amounted to 47 monetary units in the first quarter, 54 in the second, 65 in the third and 58 in the fourth. The average quarterly turnover is (47 + 54 + 65 + 58) / 4 = 56 monetary units.
If the chronological series gives moment indicators (levels recorded at specific moments), then when calculating the average they are replaced by the half-sums of the values at the beginning and end of the period.
If there are more than two moments and the intervals between them are equal, the average is calculated using the chronological mean formula
$$\bar{x} = \frac{\frac{x_1}{2} + x_2 + \dots + x_{n-1} + \frac{x_n}{2}}{n - 1}$$
where n is the number of moments.
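The chronological mean for a moment series with equal intervals can be sketched as follows. The function name `chronological_mean` and the stock levels are hypothetical and serve only as an illustration.

```python
def chronological_mean(levels):
    """Average of a moment series observed at equally spaced moments.

    The first and last levels enter with weight 1/2, and the sum is divided
    by the number of intervals between moments, n - 1.
    """
    n = len(levels)
    if n < 2:
        raise ValueError("at least two moments are required")
    total = levels[0] / 2 + sum(levels[1:-1]) + levels[-1] / 2
    return total / (n - 1)

# Hypothetical stock levels recorded on the first day of four consecutive months.
stock = [100, 120, 110, 130]
print(chronological_mean(stock))
```

With only two moments the function reduces to the half-sum of the opening and closing levels, which matches the rule for a single period stated above.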
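When the data are grouped by attribute values (i.e., a discrete variation series is constructed), the weighted arithmetic mean is calculated using either the frequencies $f_i$ or the relative frequencies $w_i$ with which specific values of the attribute are observed; the number of distinct values (k) is significantly smaller than the number of observations (N).
$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \quad \text{or} \quad \bar{x} = \frac{\sum_{i=1}^{k} x_i w_i}{\sum_{i=1}^{k} w_i}$$
where k is the number of groups of the variation series,
i is the number of the group of the variation series.
Since $\sum_{i=1}^{k} f_i = N$ and $\sum_{i=1}^{k} w_i = 1$, we get the formulas used for practical calculations:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{k} x_i f_i \quad \text{and} \quad \bar{x} = \sum_{i=1}^{k} x_i w_i$$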
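Example. Let's calculate the average work experience of the team members for the grouped series. Grouping the ten observations above gives the values 2, 3, 4 and 5 with frequencies 1, 2, 4 and 3.
a) using frequencies:
$$\bar{x} = \frac{2 \cdot 1 + 3 \cdot 2 + 4 \cdot 4 + 5 \cdot 3}{10} = \frac{39}{10} = 3.9$$
b) using relative frequencies:
$$\bar{x} = 2 \cdot 0.1 + 3 \cdot 0.2 + 4 \cdot 0.4 + 5 \cdot 0.3 = 3.9$$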
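The equivalence of the simple form, the form weighted by frequencies and the form weighted by relative frequencies can be sketched in a few lines. The data are the ten work-experience observations quoted in the earlier example; the variable names are assumptions made for illustration.

```python
from collections import Counter

# The ten work-experience observations quoted in the earlier example.
experience = [5, 3, 5, 4, 3, 4, 5, 4, 2, 4]

# Simple form: ungrouped data.
simple_mean = sum(experience) / len(experience)

# Discrete variation series: value -> frequency f_i.
freq = Counter(experience)
n_total = sum(freq.values())

# a) weighted form with absolute frequencies
mean_by_freq = sum(x * f for x, f in freq.items()) / n_total

# b) weighted form with relative frequencies w_i = f_i / N
rel_freq = {x: f / n_total for x, f in freq.items()}
mean_by_rel_freq = sum(x * w for x, w in rel_freq.items())

print(simple_mean, mean_by_freq, mean_by_rel_freq)
```

All three results agree up to floating-point rounding, because grouping only reorganizes the same sum.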

When the data are grouped by intervals, i.e., presented as an interval distribution series, the middle of each interval is taken as the attribute value when calculating the arithmetic mean, on the assumption that the population units are uniformly distributed within the interval. The calculation is carried out according to the formulas:
$$\bar{x} = \frac{\sum_{i=1}^{k} x'_i f_i}{\sum_{i=1}^{k} f_i} \quad \text{or} \quad \bar{x} = \sum_{i=1}^{k} x'_i w_i$$
where $f_i$ are the frequencies and $w_i$ the relative frequencies of the intervals, and $x'_i$ is the middle of the interval: $x'_i = \frac{x_i^{\text{low}} + x_i^{\text{up}}}{2}$,
where $x_i^{\text{low}}$ and $x_i^{\text{up}}$ are the lower and upper boundaries of the interval (provided that the upper boundary of an interval coincides with the lower boundary of the next one).

Example. Let us calculate the arithmetic mean of the interval variation series built on the basis of the results of the study of the annual wages of 30 workers (see the lecture "Summary and grouping of statistical data").
Table 1 - Interval variation series of distribution (columns: intervals, UAH; frequency, people; middle of the interval, UAH).
The arithmetic means calculated from the initial data and from the interval variation series may not coincide because the attribute values are distributed unevenly within the intervals. In this case, for a more accurate calculation of the weighted arithmetic mean, one should use not the midpoints of the intervals but the simple arithmetic means calculated for each group (group averages). The average calculated from the group averages with the weighted formula is called the general average.
The arithmetic mean has a number of properties.
1. The sum of the deviations of the variants from the mean, weighted by the frequencies, equals zero:
$$\sum (x_i - \bar{x}) f_i = 0$$
2. If all the variants are increased or decreased by the amount A, the mean also increases or decreases by the same amount A:
$$\frac{\sum (x_i \pm A) f_i}{\sum f_i} = \bar{x} \pm A$$
3. If each variant is multiplied or divided by B, the mean is also multiplied or divided by B:
$$\frac{\sum (x_i B) f_i}{\sum f_i} = \bar{x} B, \qquad \frac{\sum \frac{x_i}{B} f_i}{\sum f_i} = \frac{\bar{x}}{B}$$
4. The sum of the products of the variants and the frequencies equals the product of the mean and the sum of the frequencies:
$$\sum x_i f_i = \bar{x} \sum f_i$$
5. If all frequencies are divided or multiplied by the same number c, the arithmetic mean does not change:
$$\frac{\sum x_i \frac{f_i}{c}}{\sum \frac{f_i}{c}} = \bar{x}$$
6. If the frequencies in all intervals are equal to each other, the weighted arithmetic mean equals the simple arithmetic mean:
$$\frac{\sum x_i f_i}{\sum f_i} = \frac{\sum x_i}{k}$$
where k is the number of groups of the variation series.
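The properties listed above can be verified numerically. This is a minimal sketch on a hypothetical discrete series; the helper `weighted_mean` and the constants A and B are assumptions chosen for illustration.

```python
# Hypothetical discrete series: variants x_i with frequencies f_i.
x = [2, 3, 4, 5]
f = [1, 2, 4, 3]

def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(x_i * f_i) / sum(f_i)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

mean = weighted_mean(x, f)
A, B = 3, 2  # arbitrary shift and scale constants

# Property 1: frequency-weighted deviations from the mean sum to zero.
dev_sum = sum((xi - mean) * fi for xi, fi in zip(x, f))

# Property 2: shifting every variant by A shifts the mean by A.
shifted = weighted_mean([xi + A for xi in x], f)

# Property 3: scaling every variant by B scales the mean by B.
scaled = weighted_mean([xi * B for xi in x], f)

# Property 5: rescaling all frequencies leaves the mean unchanged.
rescaled = weighted_mean(x, [fi / 10 for fi in f])

# Property 6: with equal frequencies the weighted mean equals the simple mean.
equal_w = weighted_mean(x, [7] * len(x))
simple = sum(x) / len(x)

print(mean, dev_sum, shifted, scaled, rescaled, equal_w, simple)
```

Because of floating-point rounding, the deviation sum may print as a value extremely close to zero rather than exactly zero.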

Using the properties of the mean makes it easier to calculate.
Suppose that all variants (x) are first reduced by the same number A and then divided by B. The greatest simplification is achieved when the midpoint of the interval with the highest frequency is chosen as A and the interval width (for series with equal intervals) is chosen as B. The quantity A is called the origin, so this way of calculating the average is called the method of counting from a conditional zero, or the method of moments.
After this transformation we obtain a new variation series whose variants are $x'_i = \frac{x_i - A}{B}$. Their arithmetic mean, called the first-order moment, is expressed by the formula $m_1 = \frac{\sum x'_i f_i}{\sum f_i}$ and, by the second and third properties of the arithmetic mean, equals the mean of the original variants reduced first by A and then divided by B, i.e. $m_1 = \frac{\bar{x} - A}{B}$.
To obtain the actual average (the average of the original series), multiply the first-order moment by B and add A:
$$\bar{x} = m_1 B + A$$

The calculation of the arithmetic mean by the method of moments is illustrated by the data in Table 2.
Table 2 - Distribution of workers of the enterprise shop by length of service

Work experience, years    Number of workers    Middle of the interval
0 – 5                     ...                  2.5
5 – 10                    ...                  7.5
10 – 15                   ...                  12.5
15 – 20                   ...                  17.5
20 – 25                   ...                  22.5
25 – 30                   ...                  27.5

Find the first-order moment $m_1$ ... Then, knowing that A = 17.5 and B = 5, we calculate the average length of service of the shop workers as $\bar{x} = m_1 \cdot 5 + 17.5$.
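The method of moments can be sketched in code. The interval boundaries follow Table 2 and the constants A = 17.5 and B = 5 come from the text, but the worker counts below are hypothetical, since the original frequencies are not given; the result therefore illustrates the procedure, not the answer to the original example.

```python
# Interval boundaries follow Table 2; the worker counts are hypothetical,
# because the original frequencies are not given in the text.
intervals = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
workers = [4, 10, 16, 40, 20, 10]  # assumed frequencies f_i

A = 17.5  # conditional zero: midpoint of the interval with the highest frequency
B = 5     # interval width

midpoints = [(lo + hi) / 2 for lo, hi in intervals]

# Transformed variants x'_i = (x_i - A) / B become small whole numbers.
transformed = [(x - A) / B for x in midpoints]

# First-order moment: weighted mean of the transformed variants.
m1 = sum(t * f for t, f in zip(transformed, workers)) / sum(workers)

# Return to the original scale.
mean_by_moments = m1 * B + A

# Direct weighted mean of the midpoints, for comparison.
mean_direct = sum(x * f for x, f in zip(midpoints, workers)) / sum(workers)

print(m1, mean_by_moments, mean_direct)
```

The transformation matters mainly for hand calculation: the transformed variants are small integers centered on zero, so the products are easy to compute, and the final step restores the original scale.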

Harmonic mean
As shown above, the arithmetic mean is used to calculate the average value of an attribute when its variants x and their frequencies f are known.
If the statistical information does not contain the frequencies f for the individual variants x of the population but gives their products $x f$, the weighted harmonic mean formula is applied. To calculate the average, let us denote $x_i f_i = M_i$, whence $f_i = \frac{M_i}{x_i}$. Substituting these expressions into the formula for the weighted arithmetic mean, we obtain the formula for the weighted harmonic mean:
$$\bar{x} = \frac{\sum_{i=1}^{k} M_i}{\sum_{i=1}^{k} \frac{M_i}{x_i}}$$
where $M_i$ is the volume (weight) of the attribute values in the interval with number i (i = 1, 2, ..., k).

Thus, the harmonic mean is used when it is not the variants themselves that are summed but their reciprocals: $\frac{1}{x_i}$.
When the weight of each variant equals one, i.e., each individual value of the reciprocal attribute occurs once, the simple harmonic mean is applied:
$$\bar{x} = \frac{N}{\sum_{j=1}^{N} \frac{1}{x_j}}$$
where $\frac{1}{x_j}$ are the individual reciprocal values, each occurring once;
N is the number of variants.
If harmonic means $\bar{x}_1$ and $\bar{x}_2$ are available for two parts of the population of sizes $N_1$ and $N_2$, the overall mean for the entire population is calculated by the formula:
$$\bar{x} = \frac{N_1 + N_2}{\frac{N_1}{\bar{x}_1} + \frac{N_2}{\bar{x}_2}}$$
and is called the weighted harmonic mean of group means.

Example. During trading on the currency exchange, three transactions were concluded in the first hour. Data on the amount of hryvnias sold and the hryvnia exchange rate against the US dollar are given in Table 3 (columns 2 and 3). Determine the average exchange rate of the hryvnia against the US dollar for the first hour of trading.
Table 3 - Data on the course of trading on the currency exchange

The average dollar rate is determined as the ratio of the amount of hryvnias sold in all transactions to the amount of dollars acquired in the same transactions. The total amount of hryvnias sold is known from column 2 of the table, and the number of dollars bought in each transaction is found by dividing the amount of hryvnias sold by the rate (column 4). In total, 22 million dollars were purchased in the three transactions. The average hryvnia exchange rate per dollar is therefore the total hryvnia amount sold divided by these 22 million dollars.
The resulting value is valid, because substituting it for the actual hryvnia exchange rates in the transactions does not change the total amount of hryvnia sales, which acts as the defining indicator.
If the arithmetic mean of the rates were used for the calculation, then at that rate the purchase of 22 million dollars would require 110.66 million hryvnias, which does not correspond to reality.

Geometric mean
The geometric mean is used to analyze the dynamics of phenomena and makes it possible to determine the average growth rate. When calculating the geometric mean, the individual values of the attribute are relative indicators of dynamics constructed as chain values, i.e., as the ratio of each level to the previous one.
The simple geometric mean is calculated by the formula:
$$\bar{x}_{\text{geom}} = \sqrt[N]{x_1 \cdot x_2 \cdots x_N} = \sqrt[N]{\prod_{j=1}^{N} x_j}$$
where $\prod$ is the product sign,
N is the number of averaged values.
Example. The number of registered crimes increased 1.57 times over 4 years: 1.08 times in the 1st year, 1.1 times in the 2nd, 1.18 times in the 3rd and 1.12 times in the 4th. The average annual growth rate of the number of crimes is then $\sqrt[4]{1.08 \cdot 1.1 \cdot 1.18 \cdot 1.12} = \sqrt[4]{1.57} \approx 1.12$, i.e., the number of registered crimes grew by an average of 12% annually.
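The example above can be reproduced with a short sketch. The growth rates are those given in the example; the variable names are assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math

# Chain growth rates for the four years, as given in the example.
growth = [1.08, 1.10, 1.18, 1.12]

total_growth = math.prod(growth)                # overall growth over the period
avg_growth = total_growth ** (1 / len(growth))  # simple geometric mean

# Defining property: applying the average rate every year
# reproduces the overall growth.
reproduced = avg_growth ** len(growth)

print(round(total_growth, 2), round(avg_growth, 4))
```

The check on `reproduced` reflects the defining-indicator idea: replacing each annual rate with the geometric mean leaves the overall growth unchanged, which the arithmetic mean of the rates would not guarantee.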





To calculate the weighted root mean square, we determine $x_i^2$ and $x_i^2 f_i$ and enter them into the table. The average deviation of the product length from the given norm is then:
$$\bar{x}_{\text{sq}} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$

The arithmetic mean would be unsuitable in this case, since deviations of opposite sign would cancel and the result would be a zero deviation.
The application of the root mean square is discussed later, in connection with measures of variation.
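The contrast between the arithmetic mean and the root mean square of signed deviations can be sketched as follows. The deviations and counts are hypothetical, because the data of the original example are not available.

```python
import math

# Hypothetical deviations of product length from the norm (mm) and
# the number of products showing each deviation.
deviations = [-2, -1, 0, 1, 2]
counts = [3, 10, 24, 10, 3]

n = sum(counts)

# Weighted arithmetic mean: positive and negative deviations cancel.
arith = sum(d * c for d, c in zip(deviations, counts)) / n

# Weighted root mean square: squaring removes the sign before averaging.
rms = math.sqrt(sum(d ** 2 * c for d, c in zip(deviations, counts)) / n)

print(arith, rms)
```

With a symmetric distribution like this one the arithmetic mean is zero even though most products deviate from the norm, while the root mean square gives a positive typical size of the deviation.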

The most common form of statistical indicator used in socio-economic research is the average value, a generalized quantitative characteristic of an attribute of a statistical population. Average values act, as it were, as "representatives" of the entire series of observations. In many cases the average can be determined through the initial ratio of the average (IRA), or its logical formula:
$$\text{IRA} = \frac{\text{total value or volume of the averaged attribute}}{\text{number of units (size) of the population}}$$
So, for example, to calculate the average wage of the employees of an enterprise, the total wage fund must be divided by the number of employees:
$$\text{average wage} = \frac{\text{wage fund}}{\text{number of employees}}$$
The numerator of the initial ratio of the average is its defining indicator. For the average wage, this defining indicator is the payroll. For each indicator used in socio-economic analysis, only one true initial ratio can be composed for calculating the average. It should also be added that, to estimate the standard deviation more accurately for small samples (fewer than 30 elements), the denominator of the expression under the root should use n - 1 rather than n.

Example 6.1. Let's calculate the average salary of employees of a small enterprise.

Table 6.1. Employee wages
No.   Salary, rub.    No.   Salary, rub.
1     5 950           11    7 000
2     6 790           12    5 950
3     6 790           13    6 790
4     5 950           14    5 950
5     7 000           15    6 790
6     6 790           16    7 000
7     5 950           17    6 790
8     7 000           18    7 000
9     6 790           19    7 000
10    6 790           20    5 950

To calculate the average wage, sum the wages accrued to all employees of the enterprise (i.e., find the payroll) and divide by the number of employees:
$$\bar{x} = \frac{132\,020}{20} = 6\,601 \text{ rub.}$$

Now let us add just one more person to the population, the director of the enterprise, with a salary of 50 000 rubles. The calculated average then becomes completely different:
$$\bar{x} = \frac{132\,020 + 50\,000}{21} \approx 8\,668 \text{ rub.}$$

As you can see, it exceeds 7 000 rubles, i.e., it is greater than every value of the attribute except for one single observation.
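The effect of a single anomalous observation can be reproduced with a short sketch. The salaries are taken from Table 6.1 and the director's salary from the text; the variable names are assumptions made for illustration.

```python
from statistics import mean

# Salaries from Table 6.1, in rubles (employees 1-10, then 11-20).
salaries = [5950, 6790, 6790, 5950, 7000, 6790, 5950, 7000, 6790, 6790,
            7000, 5950, 6790, 5950, 6790, 7000, 6790, 7000, 7000, 5950]

# Add the director with a salary of 50 000 rubles.
with_director = salaries + [50_000]

mean_staff = mean(salaries)
mean_all = mean(with_director)

# Number of people earning less than the new average.
below = sum(1 for s in with_director if s < mean_all)

print(mean_staff, round(mean_all, 2), below, len(with_director))
```

Counting how many observations lie below the recalculated mean makes the loss of representativeness concrete: the average no longer describes the typical employee.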

To prevent such cases in practice and keep the average meaningful (with the director included, the average in Example 6.1 no longer serves as a generalizing characteristic of the population, as it should), anomalous, sharply outlying observations should either be excluded from the analysis, thereby making the population homogeneous, or the population should be divided into homogeneous groups, with averages calculated for each group and the group averages analyzed instead of the overall average.

The arithmetic mean is calculated either as a simple or as a weighted value.

When calculating the average wage from the table of Example 6.1, we added up all the values of the attribute and divided the sum by their number. We write this calculation as the formula for the simple arithmetic mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where $x_i$ are the variants (individual values of the attribute);

n is the number of units in the population.

Example 6.2. Now let's group the data from the table in Example 6.1, i.e., construct a discrete variation series of the distribution of workers by wage level. The grouping results are presented in the table.

Let's write the expression for calculating the average wage in a more compact form:
$$\bar{x} = \frac{5\,950 \cdot 6 + 6\,790 \cdot 8 + 7\,000 \cdot 6}{6 + 8 + 6} = \frac{132\,020}{20} = 6\,601 \text{ rub.}$$

In Example 6.2, the formula for the weighted arithmetic mean was applied:
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
where $f_i$ are frequencies showing how many times the value $x_i$ of the attribute occurs among the units of the population.

It is convenient to calculate the weighted arithmetic mean in a table, as shown below (Table 6.3):

Table 6.3. Calculation of the arithmetic mean in a discrete series (the first two columns are initial data, the third is the calculated indicator)
Salary, rub. (x_i)    Number of employees, people (f_i)    Payroll, rub. (x_i f_i)
5 950                 6                                    35 700
6 790                 8                                    54 320
7 000                 6                                    42 000
Total                 20                                   132 020

It should be noted that the simple arithmetic mean is used when the data are not grouped, or when they are grouped but all frequencies are equal.

Often the observation results are presented as an interval distribution series (see Table 6.4 in Example 6.3). Then, when calculating the average, the midpoints of the intervals are taken as $x_i$. If the first and last intervals are open (lack one of the boundaries), they are conventionally "closed" by assigning them the width of the adjacent interval, i.e., the first is closed using the width of the second, and the last using the width of the penultimate one.

Example 6.3. Based on the results of a sample survey of one of the population groups, we will calculate the size of the average per capita money income.

In the table below, the middle of the first interval is 500. Indeed, the width of the second interval is 1 000 (2 000 - 1 000); the lower boundary of the first interval is then 0 (1 000 - 1 000), and its middle is 500. We do the same with the last interval and take 25 000 as its middle: the width of the penultimate interval is 10 000 (20 000 - 10 000), so the upper boundary of the last interval is 30 000 (20 000 + 10 000) and its middle is 25 000.

Table 6.4. Calculation of the arithmetic mean in the interval series
Average per capita monetary income, rubles per month    Share of total population, % (f_i)    Midpoint of interval (x_i)    x_i f_i
Up to 1 000                                             4.1                                   500                           2 050
1 000-2 000                                             8.6                                   1 500                         12 900
2 000-4 000                                             12.9                                  3 000                         38 700
4 000-6 000                                             13.0                                  5 000                         65 000
6 000-8 000                                             10.5                                  7 000                         73 500
8 000-10 000                                            27.8                                  9 000                         250 200
10 000-20 000                                           12.7                                  15 000                        190 500
20 000 and above                                        10.4                                  25 000                        260 000
Total                                                   100.0                                 -                             892 850

The average per capita monthly income is then
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{892\,850}{100} = 8\,928.5 \text{ rub.}$$
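The calculation for Table 6.4, including the closing of the open intervals, can be sketched as follows. The interval boundaries and shares come from the table; the function name `close_open_intervals` and the use of `None` for a missing boundary are assumptions made for illustration.

```python
# Rows of Table 6.4: (lower bound, upper bound, share of population in %).
# None marks the missing boundary of an open interval.
rows = [
    (None, 1000, 4.1),
    (1000, 2000, 8.6),
    (2000, 4000, 12.9),
    (4000, 6000, 13.0),
    (6000, 8000, 10.5),
    (8000, 10000, 27.8),
    (10000, 20000, 12.7),
    (20000, None, 10.4),
]

def close_open_intervals(rows):
    """Close open end intervals using the width of the adjacent interval."""
    closed = [list(r) for r in rows]
    if closed[0][0] is None:
        width = closed[1][1] - closed[1][0]
        closed[0][0] = closed[0][1] - width
    if closed[-1][1] is None:
        width = closed[-2][1] - closed[-2][0]
        closed[-1][1] = closed[-1][0] + width
    return closed

closed = close_open_intervals(rows)
midpoints = [(lo + hi) / 2 for lo, hi, _ in closed]
shares = [share for _, _, share in closed]

# Weighted arithmetic mean of the midpoints, weighted by population shares.
mean_income = sum(x * f for x, f in zip(midpoints, shares)) / sum(shares)
print(midpoints[0], midpoints[-1], round(mean_income, 1))
```

The result is approximate by construction: it assumes incomes are spread uniformly within each interval and depends on how the open intervals are closed.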

Simple arithmetic mean

The simple arithmetic mean is the value of an attribute that each unit would have if the total volume of the attribute in the data were distributed equally among all units of the population. Thus, the average annual output per employee is the amount of output that each employee would account for if the entire volume of output were distributed equally among all employees of the organization. The simple arithmetic mean is calculated by the formula:
$$\bar{x} = \frac{\sum x_i}{n}$$
The simple arithmetic mean equals the ratio of the sum of the individual values of the attribute to the number of values in the population.

Example 1. A team of 6 workers receives 3, 3.2, 3.3, 3.5, 3.8 and 3.1 thousand rubles a month.

Find the average salary.
Solution: (3 + 3.2 + 3.3 + 3.5 + 3.8 + 3.1) / 6 ≈ 3.32 thousand rubles.

Weighted arithmetic mean

If the volume of the data set is large and represents a distribution series, then a weighted arithmetic mean is calculated. This is how the weighted average price per unit of production is determined: the total cost of production (the sum of the products of its quantity by the price of a unit of production) is divided by the total amount of production.

We represent this in the form of the following formula:
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
The weighted arithmetic mean equals the ratio of the sum of the products of each attribute value and its frequency to the sum of all frequencies. It is used when the variants of the studied population occur an unequal number of times.

Example 2. Find the average monthly wage of a workshop worker.

Average wages can be obtained by dividing the total wages by the total number of workers:

Answer: 3.35 thousand rubles.

Arithmetic mean for interval series

When calculating the arithmetic mean for an interval variation series, first determine the average for each interval, as the half-sum of the upper and lower boundaries, and then - the average of the entire series. In the case of open intervals, the value of the lower or upper interval is determined by the size of the intervals adjacent to them.

Example 3. Determine the average age of evening students.

Averages calculated from interval series are approximate. The degree of their approximation depends on the extent to which the actual distribution of population units within the interval approaches uniform.

When calculating averages, not only absolute but also relative values (relative frequencies) can be used as weights:
$$\bar{x} = \sum x_i w_i, \qquad w_i = \frac{f_i}{\sum f_i}$$

The arithmetic mean has a number of properties that reveal its essence more fully and simplify its calculation:

1. The product of the mean and the sum of the frequencies always equals the sum of the products of the variants and the frequencies, i.e.
$$\bar{x} \sum f_i = \sum x_i f_i$$

2. The arithmetic mean of a sum of varying quantities equals the sum of the arithmetic means of these quantities:
$$\overline{x + y} = \bar{x} + \bar{y}$$

3. The algebraic sum of the deviations of the individual attribute values from the mean equals zero:
$$\sum (x_i - \bar{x}) f_i = 0$$

4. The sum of the squared deviations of the variants from the mean is less than the sum of the squared deviations from any other arbitrary value a, i.e.
$$\sum (x_i - \bar{x})^2 f_i < \sum (x_i - a)^2 f_i, \quad a \ne \bar{x}$$

that is, in our example, the average output of one worker is equal to

Along with the simple arithmetic mean, the weighted arithmetic mean is studied. For example, let's calculate the average age of students in a group of 20 whose ages range from 18 to 22, where $x_i$ are the variants of the averaged attribute and $f_i$ is the frequency showing how many times the i-th value occurs in the population (Table 5.1).

Table 5.1

Average age of students

Applying the formula for the arithmetic weighted average, we get:

There is a certain rule for choosing the weighted arithmetic mean: if there is a series of data on two indicators, for one of which the average must be calculated, and the numerical values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated using the weighted arithmetic mean formula.

In some cases the nature of the initial statistical data is such that calculating the arithmetic mean loses its meaning, and the only suitable generalizing indicator is another type of average, the harmonic mean. With the widespread use of electronic computing, the computational properties of the arithmetic mean have lost their relevance for calculating generalizing statistical indicators. The harmonic mean, which can also be simple or weighted, has acquired great practical significance. If the numerical values of the numerator of the logical formula are known while the values of the denominator are unknown but can be found as the quotient of one indicator divided by another, the average is calculated using the weighted harmonic mean formula.

For example, suppose a car traveled the first 210 km at 70 km/h and the remaining 150 km at 75 km/h. The average speed over the entire 360 km journey cannot be determined with the arithmetic mean formula. Since the variants are the speeds on the individual sections, $x_1$ = 70 km/h and $x_2$ = 75 km/h, and the weights ($f_i$) are the corresponding sections of the route, the products of the variants and the weights have neither physical nor economic meaning. What is meaningful here are the quotients of the route sections divided by the corresponding speeds (variants $x_i$), that is, the time spent on each section ($f_i / x_i$). If the route sections are denoted by $f_i$, the entire route is $\sum f_i$ and the time spent on the entire route is $\sum \frac{f_i}{x_i}$. The average speed is then the quotient of the entire route and the total time spent:
$$\bar{x} = \frac{\sum f_i}{\sum \frac{f_i}{x_i}}$$

In our example, we get:
$$\bar{x} = \frac{210 + 150}{\frac{210}{70} + \frac{150}{75}} = \frac{360}{5} = 72 \text{ km/h}$$
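The car example can be checked with a minimal sketch of the weighted harmonic mean. The speeds and distances come from the text; the function name `weighted_harmonic_mean` is an assumption made for illustration.

```python
def weighted_harmonic_mean(values, volumes):
    """Harmonic mean of `values` weighted by the volumes f_i.

    Each volume divided by its value gives the implied denominator
    (here: distance / speed = travel time).
    """
    return sum(volumes) / sum(f / x for x, f in zip(values, volumes))

speeds = [70, 75]       # km/h on each section
distances = [210, 150]  # km covered at each speed

avg_speed = weighted_harmonic_mean(speeds, distances)

# Defining indicator check: the average speed times the total time
# must give back the total distance.
total_time = sum(d / v for v, d in zip(speeds, distances))

print(avg_speed, avg_speed * total_time)
```

The final check mirrors the defining-indicator requirement: driving the whole route at the average speed for the same total time covers the same 360 km.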

If all the weights (f) are equal when the harmonic mean is used, the simple (unweighted) harmonic mean can be used instead of the weighted one:
$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$
where $x_i$ are the individual variants and n is the number of variants of the averaged attribute. In the speed example, the simple harmonic mean could be applied if the route sections traveled at different speeds were equal.

Any average value should be calculated so that when it replaces each variant of the averaged feature, the value of some final, generalizing indicator, which is associated with the averaged indicator, does not change. So, when replacing the actual speeds on individual sections of the path with their average value (average speed), the total distance should not change.

The form (formula) of the average value is determined by the nature (mechanism) of the relationship of this final indicator with the average, therefore the final indicator, the value of which should not change when replacing the options with their average value, is called defining indicator. To derive the formula for the average, you need to compose and solve an equation using the relationship of the averaged indicator with the determining one. This equation is constructed by replacing the variants of the averaged attribute (indicator) with their average value.

The mode is the value of an attribute that occurs most often in a given population. For a variation series, the mode is the most frequent value of the ranked series, i.e., the variant with the highest frequency. The mode can be used to determine which stores are visited most often or the most common price for a product. It shows the size of the attribute typical of a significant part of the population and, for an interval series, is determined by the formula
$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}$$
where $x_0$ is the lower boundary of the modal interval; h is the width of the interval; $f_m$ is the frequency of the modal interval; $f_{m-1}$ is the frequency of the previous interval; $f_{m+1}$ is the frequency of the next interval.

The median is the variant located in the center of the ranked series. The median divides the series into two equal parts so that the same number of population units lies on either side of it: in one half of the units the value of the varying attribute is less than the median, in the other it is greater. The median is thus an element whose value is greater than or equal to one half of the elements of the distribution series and, at the same time, less than or equal to the other half. The median gives a general idea of where the attribute values are concentrated, in other words, where their center is located.

The descriptive nature of the median is manifested in the fact that it characterizes the quantitative boundary of the values ​​of the varying attribute, which half of the population units have. The problem of finding the median for a discrete variation series is easy to solve. If we assign ordinal numbers to all units of the series, then the ordinal number of the median variant is determined as (n +1) / 2 with an odd number of members n. If the number of members of the series is an even number, then the median will be the average of the two options with ordinal numbers n/ 2 and n/ 2 + 1.
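The ordinal-number rule for a discrete series can be sketched as follows. The function name `median_of_series` and the sample lists are assumptions made for illustration.

```python
def median_of_series(values):
    """Median of a discrete series using the ordinal-number rule."""
    ranked = sorted(values)
    n = len(ranked)
    if n % 2 == 1:
        # Ordinal number (n + 1) / 2, converted to a 0-based index.
        return ranked[(n + 1) // 2 - 1]
    # Average of the variants with ordinal numbers n/2 and n/2 + 1.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2

odd_case = median_of_series([5, 3, 5, 4, 3, 4, 5, 4, 2])  # nine values
even_case = median_of_series([1, 2, 3, 10])               # four values
print(odd_case, even_case)
```

In the even case the extreme value 10 has no influence on the result, which depends only on the two central variants.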

When determining the median of an interval variation series, the interval containing it (the median interval) is determined first. This is the first interval whose cumulative frequency equals or exceeds half the sum of all frequencies of the series. The median of the interval variation series is calculated using the formula
$$Me = x_0 + h \cdot \frac{\frac{1}{2}\sum f - S_{m-1}}{f_m}$$
where $x_0$ is the lower boundary of the median interval; h is the width of the interval; $f_m$ is the frequency of the median interval; $\sum f$ is the number of members of the series;

$S_{m-1}$ is the cumulative sum of the members of the series preceding the median interval.
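The median formula for an interval series with equal widths can be sketched as follows. The function name `interval_median` and the sample frequencies are hypothetical and serve only as an illustration.

```python
from itertools import accumulate

def interval_median(lower_bounds, width, freqs):
    """Median of an interval series with equal interval widths.

    Implements Me = x0 + h * (sum(f) / 2 - S_prev) / fm, where S_prev is the
    cumulative frequency of the intervals before the median interval.
    """
    half = sum(freqs) / 2
    cumulative = list(accumulate(freqs))
    # Median interval: the first one whose cumulative frequency reaches half.
    m = next(i for i, s in enumerate(cumulative) if s >= half)
    s_prev = cumulative[m - 1] if m > 0 else 0
    return lower_bounds[m] + width * (half - s_prev) / freqs[m]

# Hypothetical series: intervals 0-5, 5-10, ..., 25-30 with assumed frequencies.
lower = [0, 5, 10, 15, 20, 25]
freqs = [4, 10, 16, 40, 20, 10]

median = interval_median(lower, 5, freqs)
print(median)
```

The formula interpolates linearly inside the median interval, which rests on the same assumption of uniform distribution within intervals that underlies the interval mean.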

Along with the median, other values of the variants that occupy a well-defined position in the ranked series are used for a fuller characterization of the structure of the studied population. These include quartiles and deciles. Quartiles divide the series by the sum of frequencies into 4 equal parts, and deciles into 10 equal parts. There are three quartiles and nine deciles.

The median and mode, unlike the arithmetic mean, do not smooth out individual differences in the values of the varying attribute and are therefore additional and very important characteristics of a statistical population. In practice they are often used instead of or alongside the mean. It is especially advisable to calculate the median and mode when the studied population contains some units with very large or very small values of the varying attribute. These values, which are not typical of the population, affect the arithmetic mean but do not affect the median and mode, which makes the latter very valuable indicators for economic and statistical analysis.