{"id":7,"date":"2024-04-03T23:09:22","date_gmt":"2024-04-03T20:09:22","guid":{"rendered":"https:\/\/sisu.ut.ee\/measurement\/31-normal-distribution\/"},"modified":"2026-04-13T18:35:03","modified_gmt":"2026-04-13T15:35:03","slug":"31-normal-distribution","status":"publish","type":"page","link":"https:\/\/sisu.ut.ee\/measurement\/31-normal-distribution\/","title":{"rendered":"3.1. The Normal distribution"},"content":{"rendered":"<p><strong>Brief summary:<\/strong> This lecture starts by generalizing that all measured values are <strong>random quantities<\/strong> from the point of view of mathematical statistics. The most important distribution in measurement science \u2013 the <strong>Normal distribution<\/strong> \u2013 is then explained: its importance, the parameters of the Normal distribution (<strong>mean<\/strong> and <strong>standard deviation<\/strong>). The initial definitions of <strong>standard uncertainty<\/strong> (<em>u\u00a0<\/em>), <strong>expanded uncertainty<\/strong> (<em>U\u00a0<\/em>) and <strong>coverage factor<\/strong> (<em>k\u00a0<\/em>) are given. 
A link between these concepts and the Normal distribution is created.<\/p>\n<p style=\"text-align: center;\"><\/p><div class=\"ratio ratio-16x9 mb-3\"><div class=\"video-placeholder-wrapper video-placeholder-wrapper--16x9\">\n\t\t\t    <div class=\"video-placeholder d-flex justify-content-center align-items-center\">\n\t\t\t        <div class=\"overlay text-white p-2 w-100 text-center d-block justify-content-center align-items-center\">\n\t\t\t            <div>To view third-party content, please accept cookies.<\/div>\n\t\t\t            <button class=\"btn btn-secondary btn-sm mt-1 consent-change\">Change consent<\/button>\n\t\t\t        <\/div>\n\t\t\t    <\/div>\n\t\t\t<\/div>\n<\/div>\n<h4 style=\"text-align: center;\"><strong>The Normal distribution<\/strong><br>\n<a style=\"line-height: 1.6em;\" href=\"http:\/\/www.uttv.ee\/naita?id=17589\" target=\"_blank\" rel=\"noopener\">http:\/\/www.uttv.ee\/naita?id=17589<\/a><\/h4>\n<p style=\"text-align: center;\"><a href=\"https:\/\/www.youtube.com\/watch?v=N-F6leWyNZk\" target=\"_blank\" rel=\"noopener\">https:\/\/www.youtube.com\/watch?v=N-F6leWyNZk<\/a><\/p>\n<p><span data-teams=\"true\">From the point of view of mathematical statistics, all measured values are random quantities.<\/span> Random quantities can have different values. This was demonstrated in the lecture using the example of pipetting. If pipetting with the same pipette of nominal volume 10 ml is repeated multiple times, all the pipetted volumes are around 10 ml but are still slightly different. If a sufficiently large number of repeated measurements are carried out and the pipetted volumes <a href=\"#\" data-bs-toggle=\"modal\" data-bs-target=\"#popup-modal\" data-title=\"[1]\" data-content='It is fair to ask: how do we know the individual pipetted volumes if the pipette always \u201ctells\u201d us just that the volume is 10 ml? 
In fact, if we have only the pipette and no other (more accurate) means of measuring volume, then we cannot know how much the volumes differ from each other or from the nominal volume. However, if a more accurate method is available then this is possible. In the case of pipetting a very suitable and often used more accurate method is weighing. A more accurate value of the pipetted volume can be found by weighing the pipetted liquid (most often water) and dividing the obtained mass by the density of water at the temperature of the water. Water is used in such experiments because the densities of water at different temperatures are very accurately known (see e.g. &lt;a href=\"http:\/\/en.wikipedia.org\/wiki\/Properties_of_water#Density_of_water_and_ice\" target=\"_blank\" rel=\"noopener\"&gt;http:\/\/en.wikipedia.org\/wiki\/Properties_of_water#Density_of_water_and_ice&lt;\/a&gt;).'>[1]<\/a>\u00a0are plotted according to how frequently they are encountered, it becomes evident that although random, the values are still governed by some underlying relationship between volume and frequency: the maximum probability of a volume lies somewhere between 10.005 and 10.007 ml, and the probability gradually decreases towards smaller and larger volumes. This relationship is called the <strong>distribution function\u00a0<\/strong>(the more exact term is probability density function).<\/p>\n<p>There are numerous distribution functions known to mathematicians and many of them are encountered in nature, i.e. they describe certain natural processes. In measurement science the most important distribution function is the <strong style=\"line-height: 1.6em;\">normal distribution<\/strong> (also known as the Gaussian distribution). Its importance stems from the so-called <strong style=\"line-height: 1.6em;\">Central limit theorem<\/strong>. 
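The effect of the Central limit theorem can be illustrated with a short simulation. This is only a sketch with hypothetical numbers (a nominal 10 ml volume and uniform, i.e. clearly non-normal, uncertainty contributions of up to ±0.01 ml):

```python
import random
import statistics

random.seed(42)

def simulated_result(n_sources: int) -> float:
    # Each uncertainty source is modeled as a uniformly distributed
    # (clearly non-normal) deviation of up to +/-0.01 ml around a
    # nominal 10 ml volume; the numbers are hypothetical.
    return 10.0 + sum(random.uniform(-0.01, 0.01) for _ in range(n_sources))

# 10 000 simulated measurement results, each influenced by 5 sources
results = [simulated_result(5) for _ in range(10_000)]
mean = statistics.fmean(results)
sd = statistics.stdev(results)

# The sum of 5 uniform(-0.01, 0.01) terms has standard deviation
# 0.01 * sqrt(5 / 3), i.e. about 0.0129 ml
print(f"mean = {mean:.4f} ml, sd = {sd:.4f} ml")
```

Plotting a histogram of `results` (e.g. with matplotlib) reproduces the bell shape discussed below, even though each individual contribution is uniform.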
For measurements it can be worded in a simplified way as follows: if a measurement result is simultaneously influenced by many uncertainty sources, then, as the number of uncertainty sources approaches infinity, the distribution function of the measurement result approaches the normal distribution, irrespective of the distribution functions of the factors\/parameters describing the individual uncertainty sources. In practice the distribution function of the result becomes indistinguishable from the normal distribution already when there are 3-5 (depending on the situation) significantly contributing\u00a0<a id=\"jpopup-716\" class=\"jpopup_dialog\" title=\"[2]\" href=\"#\">[2]<\/a><span id=\"jpopup-716\" class=\"jpopup\" data-mce-mark=\"1\">Significantly contributing uncertainty sources are the important uncertainty sources. We have already qualitatively seen in section 2 that different uncertainty sources have different \u201cimportance\u201d. In the coming lectures we will also see how the \u201cimportance\u201d of an uncertainty source (its uncertainty contribution) can be quantitatively\u00a0expressed.<\/span>\u00a0uncertainty sources. This explains why in so many cases measured quantities have normal distribution and why most of the mathematical basis of measurement science and measurement uncertainty estimation is based on the normal distribution.<\/p>\n<div>\n<p style=\"text-align: center;\"><strong><img loading=\"lazy\" decoding=\"async\" width=\"549\" height=\"391\" class=\"alignnone wp-image-61\" title=\"3-1.png\" src=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-1.png\" alt=\"3-1.png\" srcset=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-1.png 549w, https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-1-300x214.png 300w\" sizes=\"auto, (max-width: 549px) 100vw, 549px\"><\/strong><\/p>\n<p style=\"text-align: center;\"><strong>Scheme 3.1. 
The normal distribution curve of quantity <em>Y<\/em> with mean value <em>y<\/em><sub>m<\/sub> and standard deviation <em>s<\/em>.<\/strong><\/p>\n<div>\n<p>The normal distribution curve has a bell-shaped appearance (Scheme 3.1) and is expressed by equation 3.1:<\/p>\n<table class=\"table table-hover\" border=\"0\" align=\"center\">\n<tbody>\n<tr>\n<td><img loading=\"lazy\" decoding=\"async\" width=\"240\" height=\"83\" class=\"alignnone wp-image-59\" style=\"margin-right: auto; margin-left: auto;\" title=\"valem3-1.png\" src=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/valem3-1.png\" alt=\"valem3-1.png\"><\/td>\n<td>(3.1)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>In this equation <em style=\"line-height: 1.6em;\">f\u00a0<\/em>(\u00a0<em style=\"line-height: 1.6em;\">y\u00a0<\/em>) is the probability (or more exactly \u2013 probability density) that the measurand <em style=\"line-height: 1.6em;\">Y<\/em> has value <em style=\"line-height: 1.6em;\">y<\/em>. <em style=\"line-height: 1.6em;\">y<\/em><sub>m<\/sub> is the <strong style=\"line-height: 1.6em;\">mean<\/strong> value of the <strong style=\"line-height: 1.6em;\">population<\/strong> and <em style=\"line-height: 1.6em;\">s<\/em> is the <strong style=\"line-height: 1.6em;\">standard deviation<\/strong> of the population. <em style=\"line-height: 1.6em;\">y<\/em><sub>m<\/sub> characterizes the position of the normal distribution on the <em style=\"line-height: 1.6em;\">Y<\/em> axis, <em style=\"line-height: 1.6em;\">s<\/em> characterizes the width (spread) of the distribution function, which is determined by the scatter of the data points. 
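Equation 3.1 can be written out directly in code. A minimal sketch, assuming hypothetical pipetting values <em>y</em><sub>m</sub> = 10.006 ml and <em>s</em> = 0.002 ml, which also checks numerically that the area under the curve is 1:

```python
import math

def normal_pdf(y: float, y_m: float, s: float) -> float:
    # Equation 3.1: f(y) = exp(-(y - y_m)^2 / (2 s^2)) / (s * sqrt(2 pi))
    return math.exp(-((y - y_m) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

y_m, s = 10.006, 0.002  # hypothetical mean and standard deviation, in ml

# The curve peaks at y = y_m and falls off symmetrically on both sides
assert normal_pdf(y_m, y_m, s) > normal_pdf(y_m + s, y_m, s)

# Crude numerical integration over y_m +/- 6s: the total area under
# the curve is (very nearly) 1, as the normalization factors guarantee
step = s / 100
area = sum(normal_pdf(y_m - 6 * s + i * step, y_m, s) * step for i in range(1200))
print(f"area = {area:.4f}")
```

Note that `normal_pdf` returns a probability density, not a probability, so its values can exceed 1; only the area under the curve is normalized.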
The mean and standard deviation are the two parameters that fully determine the shape of the normal distribution curve of a particular random quantity.\u00a0The constants 2 and\u00a0<span class=\"math-tex\">\\(\\pi\\)<\/span>\u00a0are normalization factors, which are present in order to make the overall area under the curve equal to 1.<\/p>\n<\/div>\n<\/div>\n<div>\n<p>The word \u201cpopulation\u201d here means that we would need to do an infinite number of measurements in order to obtain the true <em style=\"line-height: 1.6em;\">y<\/em><sub>m<\/sub> and <em style=\"line-height: 1.6em;\">s<\/em> values. In reality we always operate with a limited number of measurements, so that the mean value and standard deviation that we have from our experiments are in fact <strong style=\"line-height: 1.6em;\">estimates<\/strong> of the true mean and true standard deviation. The larger the number of repeated measurements, the more reliable the estimates are. The number of parallel measurements is therefore very important and we will return to it in several other parts of this course.<\/p>\n<\/div>\n<div>\n<p>The normal distribution and the standard deviation are the basis for the definition of <strong>standard uncertainty<\/strong>. Standard uncertainty, denoted by <em>u<\/em>, is the uncertainty expressed at the standard deviation level, i.e., uncertainty with roughly 68.3% coverage probability (i.e. the probability of the true value falling within the uncertainty range is roughly 68.3%). A probability of 68.3% is often too low for practical applications. Therefore, the uncertainty of measurement results is in most cases reported not as standard uncertainty but as <strong>expanded uncertainty<\/strong>. 
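The estimates discussed above are obtained from a limited sample in the usual way. A sketch with hypothetical repeated pipetting results (values chosen to match the 10.005-10.007 ml region mentioned earlier):

```python
import statistics

# Hypothetical repeated pipetting results, in ml
volumes = [10.006, 10.004, 10.007, 10.005, 10.008, 10.005, 10.006, 10.004]

# Estimate of the population mean y_m
mean_estimate = statistics.fmean(volumes)

# Estimate of the population standard deviation s: statistics.stdev
# divides by (n - 1), the appropriate estimator for a limited sample
sd_estimate = statistics.stdev(volumes)

print(f"mean = {mean_estimate:.4f} ml, s = {sd_estimate:.4f} ml")
```

With more repeated measurements these estimates converge towards the true population values.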
Expanded uncertainty, denoted by <em>U<\/em>, is obtained by multiplying standard uncertainty by a <strong>coverage factor<\/strong>,<a id=\"jpopup-508\" class=\"jpopup_dialog\" title=\"[3]\" href=\"#\">[3]<\/a><span id=\"jpopup-508\" class=\"jpopup\">This definition of expanded uncertainty is simplified. A more rigorous definition goes via the combined standard uncertainty and is introduced in section 4.4.<\/span>\u00a0denoted by <em>k<\/em>, which is a number larger than 1. If the coverage factor is, e.g., 2 (the most commonly used value) then in the case of a normally distributed measurement result the coverage probability is roughly 95.5%. These probabilities can be regarded as the fractions that the areas of the respective segments form of the total area under the curve, as illustrated by the following scheme:<\/p>\n<div>\n<p style=\"text-align: center;\"><strong><img loading=\"lazy\" decoding=\"async\" width=\"585\" height=\"395\" class=\"alignnone wp-image-62\" title=\"3-2.png\" src=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-2.png\" alt=\"3-2.png\" srcset=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-2.png 585w, https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/3-2-300x203.png 300w\" sizes=\"auto, (max-width: 585px) 100vw, 585px\"><\/strong><\/p>\n<p style=\"text-align: center;\"><strong>Scheme 3.2. The same normal distribution curve as in Scheme 3.1 with 2<em>s<\/em> and 3<em>s<\/em> segments indicated.<\/strong><\/p>\n<p>Since the exponential function never returns zero, the value of <em>f\u00a0<\/em>(\u00a0<em>y\u00a0<\/em>) (eq 3.1) is greater than zero for any value of <em>y<\/em>. This is the reason why uncertainty with 100% coverage is (almost) never possible.<\/p>\n<p>It is important to stress that these percentages hold only if the measurement result is normally distributed. As said above, very often it is. 
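The coverage probabilities quoted above follow directly from the normal distribution: for a coverage factor <em>k</em> the coverage probability equals erf(<em>k</em>/&radic;2). A short sketch checking this:

```python
import math

def coverage_probability(k: float) -> float:
    # For a normally distributed result, the probability that the
    # true value lies within y_m +/- k*s is erf(k / sqrt(2))
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"k = {k}: {100 * coverage_probability(k):.1f}% coverage")
```

This yields roughly 68.3% for <em>k</em> = 1, 95.5% for <em>k</em> = 2 and 99.7% for <em>k</em> = 3, matching the segments indicated in Scheme 3.2.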
There are, however, important cases when the measurement result is not normally distributed. In most of those cases the distribution function has \u201cheavier tails\u201d, meaning that the expanded uncertainty at, e.g., the <em style=\"line-height: 1.6em;\">k<\/em> = 2 level will correspond to a coverage probability lower than 95.5% (e.g. 92%). The issue of the distribution of the measurement result will be addressed later in this course.<\/p>\n<p><a href=\"https:\/\/sisu.ut.ee\/measurement\/self-test-3-1\/\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-60\" style=\"margin-right: auto; margin-left: auto;\" title=\"selftest.png\" src=\"https:\/\/sisu.ut.ee\/wp-content\/uploads\/sites\/18\/selftest.png\" alt=\"selftest.png\" width=\"104\" height=\"41\"><\/a><\/p>\n<\/div>\n<div>\n<p>***<br>\n[1]\u00a0It is fair to ask: how do we know the individual pipetted volumes if the pipette always \u201ctells\u201d us just that the volume is 10 ml? In fact, if we have only the pipette and no other (more accurate) means of measuring volume, then we cannot know how much the volumes differ from each other or from the nominal volume. However, if a more accurate method is available then this is possible. In the case of pipetting a very suitable and often used more accurate method is weighing. A more accurate value of the pipetted volume can be found by weighing the pipetted liquid (most often water) and dividing the obtained mass by the density of water at the temperature of the water. Water is used in such experiments because the densities of water at different temperatures are very accurately known (see e.g. 
<a href=\"http:\/\/en.wikipedia.org\/wiki\/Properties_of_water#Density_of_water_and_ice\" target=\"_blank\" rel=\"noopener\">http:\/\/en.wikipedia.org\/wiki\/Properties_of_water#Density_of_water_and_ice<\/a>).<\/p>\n<p>[2]\u00a0Significantly contributing uncertainty sources are the important uncertainty sources. We have already qualitatively seen in section 2 that different uncertainty sources have different \u201cimportance\u201d. In the coming lectures we will also see how the \u201cimportance\u201d of an uncertainty source (its uncertainty contribution) can be quantitatively\u00a0expressed.<\/p>\n<p>[3]\u00a0This definition of expanded uncertainty is simplified. A more rigorous definition goes via the\u00a0<em style=\"line-height: 1.6em;\">combined<\/em>\u00a0standard uncertainty and is introduced in section 4.4.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Brief summary: This lecture starts by generalizing that all measured values are random quantities from the point of view of mathematical statistics. 
The most important distribution in measurement science \u2013 the Normal distribution \u2013 is then explained: its importance, the &#8230;<\/p>\n","protected":false},"author":14,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"class_list":["post-7","page","type-page","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/pages\/7","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/comments?post=7"}],"version-history":[{"count":4,"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/pages\/7\/revisions"}],"predecessor-version":[{"id":1017,"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/pages\/7\/revisions\/1017"}],"wp:attachment":[{"href":"https:\/\/sisu.ut.ee\/measurement\/wp-json\/wp\/v2\/media?parent=7"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}