| {{Other uses|Reliability (disambiguation){{!}}Reliability}}
| | |
| In [[psychometrics]], '''reliability''' is the overall consistency of a measure. A measure is said to have high reliability if it produces similar results under consistent conditions. For example, measurements of people’s height and weight are often extremely reliable.<ref>{{cite book|last=Carlson|first=Neil R.|display-authors=etal|title=Psychology : the science of behaviour|year=2009|publisher=Pearson|location=Toronto|isbn=978-0-205-64524-4|edition=4th Canadian ed.}}</ref><ref name="themasb.org">The [[Marketing Accountability Standards Board]] (MASB) endorses this definition as part of its ongoing [http://www.themasb.org/common-language-project/ Common Language: Marketing Activities and Metrics Project].</ref>
| |
| | |
| ==Types==
| |
| There are several general classes of reliability estimates:
| |
| *'''[[Inter-rater reliability]]''' assesses the degree of agreement between two or more raters in their appraisals.
| |
| *'''[[Test-retest reliability]]''' assesses the degree to which test scores are consistent from one test administration to the next. Measurements are gathered from a single rater who uses the same methods or instruments and the same testing conditions.<ref name="themasb.org"/> This includes [[intra-rater reliability]].
| |
| *'''Inter-method reliability''' assesses the degree to which test scores are consistent when there is a variation in the methods or instruments used. This allows inter-rater reliability to be ruled out. When dealing with [[Form (document)|forms]], it may be termed '''parallel-forms reliability'''.<ref name=socialresearchmethods>[http://www.socialresearchmethods.net/kb/reltypes.php Types of Reliability] The Research Methods Knowledge Base. Last Revised: 20 October 2006</ref>
| |
| *'''[[Internal consistency]] reliability''' assesses the consistency of results across items within a test.<ref name=socialresearchmethods/>
| |
| | |
| ==Difference from validity==
| |
| Reliability does not imply [[validity (statistics)|validity]]. That is, a measure that yields consistent results is not necessarily measuring what it is intended to measure. For example, while there are many reliable tests of specific abilities, not all of them would be valid for predicting, say, job performance. In terms of [[accuracy and precision]], reliability is analogous to precision, while validity is analogous to accuracy.
| |
| | |
| While reliability does not imply [[validity (statistics)|validity]], a lack of reliability does place a limit on the overall validity of a test. A test that is not perfectly reliable cannot be perfectly valid, either as a means of measuring attributes of a person or as a means of predicting scores on a criterion. While a reliable test may provide useful valid information, a test that is not reliable cannot possibly be valid.<ref name=David>{{cite book|last=Davidshofer|first=Kevin R. Murphy, Charles O.|title=Psychological testing : principles and applications|year=2005|publisher=Pearson/Prentice Hall|location=Upper Saddle River, N.J.|isbn=0-13-189172-3|edition=6th ed.}}</ref>
| |
| | |
| An example often used to illustrate the difference between reliability and validity in the experimental sciences involves a common [[bathroom scale]]. If someone who is 200 pounds steps on a scale 5 times and gets readings of "15", "250", "95", "140", and "500", then the scale is not reliable. If the scale consistently reads "150", then it is reliable, but not valid. If it reads "200" each time, then the measurement is both reliable and valid.
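The bathroom-scale example can be sketched in a few lines of Python. This is only an illustration: the function name <code>describe_scale</code> and the 5-pound tolerance are assumptions made for the sketch, not part of any standard definition.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 200  # pounds, as in the example above


def describe_scale(readings, true_value, tolerance=5.0):
    """Classify a scale as reliable and/or valid from repeated readings.

    `tolerance` (in pounds) is an arbitrary threshold chosen for this sketch.
    """
    spread = pstdev(readings)                # consistency of the readings
    bias = abs(mean(readings) - true_value)  # distance of the average from the truth
    reliable = spread <= tolerance
    # An unreliable scale cannot be valid, so validity requires reliability.
    valid = reliable and bias <= tolerance
    return reliable, valid


erratic = describe_scale([15, 250, 95, 140, 500], TRUE_WEIGHT)
biased = describe_scale([150, 150, 150, 150, 150], TRUE_WEIGHT)
accurate = describe_scale([200, 200, 200, 200, 200], TRUE_WEIGHT)
print(erratic, biased, accurate)
```

The erratic readings happen to average exactly 200, which shows why a correct average alone does not make an inconsistent instrument valid.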
| |
| | |
| ==General model==
| |
| | |
| In practice, testing measures are never perfectly consistent. Theories of test reliability have been developed to estimate the effects of inconsistency on the accuracy of measurement. The basic starting point for almost all theories of test reliability is the idea that test scores reflect the influence of two sorts of factors:<ref name =David />
| |
| | |
| 1. '''Factors that contribute to consistency:''' stable characteristics of the individual or the attribute that one is trying to measure
| |
| | |
| 2. '''Factors that contribute to inconsistency:''' features of the individual or the situation that can affect test scores but have nothing to do with the attribute being measured.
| |
| | |
| These factors include:<ref name =David />
| |
| | |
| * Temporary but general characteristics of the individual: health, fatigue, motivation, emotional strain
| |
| * Temporary and specific characteristics of the individual: comprehension of the specific test task, specific tricks or techniques of dealing with the particular test materials, fluctuations of memory, attention or accuracy
| |
| * Aspects of the testing situation: freedom from distractions, clarity of instructions, interaction of personality, sex, or race of examiner
| |
| * Chance factors: luck in selection of answers by sheer guessing, momentary distractions
| |
| | |
| The goal of estimating reliability is to determine how much of the variability in test scores is due to '''errors in measurement''' and how much is due to variability in '''true scores'''.<ref name =David /> | |
| | |
| A '''true score''' is the replicable feature of the concept being measured. It is the part of the observed score that would recur across different measurement occasions in the absence of error.
| |
| | |
| '''Errors of measurement''' are composed of both [[random error]] and [[systematic error]]. They represent the discrepancies between scores obtained on tests and the corresponding true scores.
| | |
| This conceptual breakdown is typically represented by the simple equation:
| |
| | |
| : <big>'''''Observed test score = true score + errors of measurement'''''</big>
| |
| | |
| ==Classical test theory==
| |
| | |
| The goal of reliability theory is to estimate errors in measurement and to suggest ways of improving tests so that errors are minimized.
| |
| | |
| The central assumption of reliability theory is that measurement errors are essentially random. This does not mean that errors arise from random processes. For any individual, an error in measurement is not a completely random event. However, across a large number of individuals, the causes of measurement error are assumed to be so varied that measurement errors act as random variables.<ref name =David />
| |
| | |
| If errors have the essential characteristics of random variables, then it is reasonable to assume that errors are equally likely to be positive or negative, and that they are not correlated with true scores or with errors on other tests.
| |
| | |
| It is assumed that:<ref>{{cite book|last=Gulliksen|first=Harold|title=Theory of mental tests|year=1987|publisher=L. Erlbaum Associates|location=Hillsdale, N.J.|isbn=978-0-8058-0024-1}}</ref>
| |
| | |
| 1. Mean error of measurement = 0
| |
| | |
| 2. True scores and errors are uncorrelated
| |
| | |
| 3. Errors on different measures are uncorrelated
| |
| | |
| Reliability theory shows that the variance of obtained scores is simply the sum of the variance of '''true scores''' plus the variance of '''errors of measurement'''.<ref name =David />
| |
| | |
| : <math> \sigma^2_X = \sigma^2_T + \sigma^2_E </math>
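The step behind this identity can be written out as a short sketch. Writing the observed score as <math>X = T + E</math> (true score plus error), the general rule for the variance of a sum gives a covariance term, which vanishes under the assumption that true scores and errors are uncorrelated:

: <math> \sigma^2_X = \operatorname{Var}(T + E) = \sigma^2_T + \sigma^2_E + 2\operatorname{Cov}(T, E) = \sigma^2_T + \sigma^2_E \quad \text{since } \operatorname{Cov}(T, E) = 0 </math>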
| |
| | |
| This equation suggests that test scores vary as the result of two factors:
| |
| | |
| 1. Variability in true scores
| |
| | |
| 2. Variability due to errors of measurement.
| |
| | |
| The reliability coefficient <math>\rho_{xx'} </math> provides an index of the relative influence of true and error scores on attained test scores. In its general form, the reliability coefficient is defined as the ratio of ''true score'' variance to the total variance of test scores. Equivalently, it is one minus the ratio of the ''error score'' variance to the ''observed score'' variance:
| |
| | |
| : <math> \rho_{xx'} = \frac{\sigma^2_T}{\sigma^2_X} = 1 - \frac{ \sigma^2_E }{ \sigma^2_X } </math>
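A small simulation may help make the coefficient concrete. The sketch below assumes normally distributed true scores and errors with made-up parameters (mean 100, true-score SD 15, error SD 5); in real testing the true scores are not observable, so this only illustrates the definition and is not an estimation method.

```python
import random
from statistics import pvariance


def simulate_reliability(n=10_000, true_sd=15.0, error_sd=5.0, seed=0):
    """Simulate X = T + E and return both forms of the reliability coefficient.

    All parameter values are hypothetical choices for this illustration.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]  # mean error of 0
    observed = [t + e for t, e in zip(true_scores, errors)]

    var_t = pvariance(true_scores)
    var_e = pvariance(errors)
    var_x = pvariance(observed)
    # Ratio form and "one minus" form of the same coefficient.
    return var_t / var_x, 1.0 - var_e / var_x


ratio_form, complement_form = simulate_reliability()
theoretical = 15.0**2 / (15.0**2 + 5.0**2)  # 225 / 250 = 0.9
print(ratio_form, complement_form, theoretical)
```

In a finite sample the two forms differ slightly, because the sample covariance between true scores and errors is not exactly zero.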
| |
| | |
| Unfortunately, there is no way to directly observe or calculate the '''true score''', so a variety of methods are used to estimate the reliability of a test.
| |
| | |
| Some examples of the methods to estimate reliability include [[test-retest reliability]], [[internal consistency]] reliability, and ''parallel-test reliability''. Each method comes at the problem of figuring out the source of error in the test somewhat differently.
| |
| | |
| ==Item response theory==
| |
| It was well known to classical test theorists that measurement precision is not uniform across the scale of measurement. Tests tend to distinguish better among test-takers with moderate trait levels and worse among high- and low-scoring test-takers. [[Item response theory]] extends the concept of reliability from a single index to a function called the ''information function''. The IRT information function is the inverse of the squared conditional standard error of measurement at any given trait level.
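As a rough illustration of an information function, the sketch below assumes a two-parameter logistic (2PL) model, which the text above does not specify. The item parameters and the function names are hypothetical. Under the usual IRT convention, the standard error at a trait level is the reciprocal of the square root of the test information at that level.

```python
from math import exp, sqrt


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def total_information(theta, items):
    """Sum of 2PL item information values, a^2 * P * (1 - P), at trait level theta."""
    total = 0.0
    for a, b in items:
        p = p_correct(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total


def standard_error(theta, items):
    """Conditional standard error of measurement: 1 / sqrt(information)."""
    return 1.0 / sqrt(total_information(theta, items))


# Hypothetical (a, b) parameters for a five-item test centred on theta = 0.
ITEMS = [(1.2, -1.0), (1.0, -0.5), (1.5, 0.0), (1.0, 0.5), (1.2, 1.0)]

for theta in (-3.0, 0.0, 3.0):
    print(theta, total_information(theta, ITEMS), standard_error(theta, ITEMS))
```

With these assumed items, information peaks near the middle of the trait range and falls off toward the extremes, matching the pattern described above.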
| |
| | |
| ==Estimation==
| |
| | |
| The goal of estimating reliability is to determine how much of the variability in test scores is due to errors in measurement and how much is due to variability in true scores.
| |
| | |
| Four practical strategies have been developed that provide workable methods of estimating test reliability.<ref name =David />
| |
| | |
| 1. '''[[Test-retest reliability]] method''': directly assesses the degree to which test scores are consistent from one test administration to the next. | |
| | |
| It involves:
| |
| | |
| * Administering a test to a group of individuals
| |
| | |
| * Re-administering the same test to the same group at some later time
| |
| | |
| * Correlating the first set of scores with the second
| |
| | |
| The correlation between scores on the first test and scores on the retest, computed as the [[Pearson product-moment correlation coefficient]], is used to estimate the reliability of the test; see also [[item-total correlation]].
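The three steps above can be sketched as follows. The score lists are invented for illustration, and <code>pearson_r</code> is a plain implementation of the product-moment formula written for this sketch rather than a call to any particular library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for six people on the first administration and the retest.
first_scores = [12, 15, 9, 20, 17, 11]
retest_scores = [13, 14, 10, 19, 18, 10]

test_retest_estimate = pearson_r(first_scores, retest_scores)
print(round(test_retest_estimate, 3))
```

The same function applies to the parallel-forms method if the second list holds scores on the alternate form instead of a retest.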
| |
| | |
| 2. '''Parallel-forms method''':
| |
| | |
| The key to this method is the development of alternate test forms that are equivalent in terms of content, response processes and statistical characteristics. For example, alternate forms exist for several tests of general intelligence, and these tests are generally seen as equivalent.<ref name =David />
| |
| | |
| With the parallel test model it is possible to develop two forms of a test that are equivalent in the sense that a person’s true score on form A would be identical to their true score on form B. If both forms of the test were administered to a number of people, differences between scores on form A and form B may be due to errors in measurement only.<ref name =David />
| |
| | |
| It involves:
| |
| | |
| * Administering one form of the test to a group of individuals
| |
| | |
| * At some later time, administering an alternate form of the same test to the same group of people
| |
| | |
| * Correlating scores on form A with scores on form B
| |
| | |
| The correlation between scores on the two alternate forms is used to estimate the reliability of the test.
| |
| | |
| This method provides a partial solution to many of the problems inherent in the '''[[test-retest reliability]] method'''. For example, since the two forms of the test are different, [[carryover effect]] is less of a problem. Reactivity effects are also partially controlled: although taking the first test may change responses to the second test, it is reasonable to assume that the effect will not be as strong with alternate forms of the test as with two administrations of the same test.<ref name =David />
| |
| | |
| However, this technique has its disadvantages:
| |
| | |
| * It may be very difficult to create several alternate forms of a test
| |
| * It may also be difficult if not impossible to guarantee that two alternate forms of a test are parallel measures
| |
| | |
| 3. '''Split-half method''':
| |
| | |
| This method treats the two halves of a measure as alternate forms. It provides a simple solution to the problem that the '''parallel-forms method''' faces: the difficulty in developing alternate forms.<ref name =David />
| |
| | |
| It involves:
| |
| | |
| * Administering a test to a group of individuals
| |
| * Splitting the test in half
| |
| * Correlating scores on one half of the test with scores on the other half of the test
| |
| | |
| The correlation between these two split halves is used in estimating the reliability of the test. This half-test reliability estimate is then stepped up to the full test length using the [[Spearman–Brown prediction formula]].
| |
| | |
| There are several ways of splitting a test to estimate reliability. For example, a 40-item vocabulary test could be split into two subtests, the first one made up of items 1 through 20 and the second made up of items 21 through 40. However, the responses from the first half may be systematically different from responses in the second half due to an increase in item difficulty and fatigue.<ref name =David /> | |
| | |
| In splitting a test, the two halves would need to be as similar as possible, both in terms of their content and in terms of the probable state of the respondent. The simplest method is to adopt an odd-even split, in which the odd-numbered items form one half of the test and the even-numbered items form the other. This arrangement guarantees that each half will contain an equal number of items from the beginning, middle, and end of the original test.<ref name =David />
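An odd-even split with the Spearman–Brown step-up can be sketched as below. The response matrix is invented (six respondents, six items scored 0 or 1) and the function names are chosen for this sketch. The step-up for doubling test length is <code>2r / (1 + r)</code>, where <code>r</code> is the correlation between the two halves.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half estimate; each row holds one respondent's item scores."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical responses: 1 = correct, 0 = incorrect.
RESPONSES = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(split_half_reliability(RESPONSES), 3))
```

A first-half versus second-half split would only require changing the two slicing expressions, which makes it easy to compare how sensitive the estimate is to the choice of split.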
| |
| | |
| 4. '''[[Internal consistency]]''': assesses the consistency of results across items within a test. The most common internal consistency measure is [[Cronbach's alpha]], which is usually interpreted as the mean of all possible split-half coefficients.<ref name="Cortina">Cortina, J.M., (1993). What Is Coefficient Alpha? An Examination of Theory and Applications. ''Journal of Applied Psychology, 78''(1), 98–104.</ref> Cronbach's alpha is a generalization of an earlier form of estimating internal consistency, [[Kuder–Richardson Formula 20]].<ref name="Cortina" /> Although Cronbach's alpha is the most commonly used measure, it is subject to some common misconceptions.<ref>Ritter, N. (2010). Understanding a widely misunderstood statistic: Cronbach's alpha. Paper presented at Southwestern Educational Research Association (SERA) Conference 2010, New Orleans, LA (ED526237).</ref>
| |
| <ref>{{cite journal|first1=R.|last1=Eisinga|first2=M.|last2=Te Grotenhuis|first3=B.|last3=Pelzer|title=The reliability of a two-item scale: Pearson, Cronbach or Spearman-Brown? |journal= International Journal of Public Health|year=2012|volume=58|issue=4|pages=637-642|doi= 10.1007/s00038-012-0416-3}}</ref>
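Cronbach's alpha can be computed directly from an item-score matrix. The sketch below uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), on an invented six-item data set; the function name and the data are assumptions made for illustration.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(item_scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))               # one tuple of scores per item
    item_variances = [pvariance(col) for col in columns]
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses: 1 = correct, 0 = incorrect.
RESPONSES = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(cronbach_alpha(RESPONSES), 3))
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same alpha, since the correction factor cancels in the ratio.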
| |
| | |
| These measures of reliability differ in their sensitivity to different sources of error and so need not be equal. Also, reliability is a property of the ''scores of a measure'' rather than the measure itself and is thus said to be ''sample dependent''. Reliability estimates from one sample might differ from those of a second sample (beyond what might be expected due to sampling variations) if the second sample is drawn from a different population, because the true variability is different in this second population. (This is true of measures of all types: yardsticks might measure houses well yet have poor reliability when used to measure the lengths of insects.)
| |
| | |
| Reliability may be improved by clarity of expression (for written assessments), lengthening the measure,<ref name="Cortina" /> and other informal means. However, formal psychometric analysis, called item analysis, is considered the most effective way to increase reliability. This analysis consists of computing '''item difficulties''' and '''item discrimination''' indices, the latter involving correlations between each item and the sum of the item scores of the entire test. If items that are too difficult, too easy, or have near-zero or negative discrimination are replaced with better items, the reliability of the measure will increase.
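A minimal item-analysis sketch follows. It assumes dichotomous items, takes item difficulty as the proportion of correct responses, and takes discrimination as the uncorrected correlation between each item and the total score. The data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def item_analysis(item_scores):
    """Return (difficulty, discrimination) pairs, one per item.

    Difficulty is the proportion answering correctly; discrimination is the
    correlation of the item with the total test score.
    """
    totals = [sum(row) for row in item_scores]
    results = []
    for column in zip(*item_scores):
        difficulty = sum(column) / len(column)
        discrimination = pearson_r(list(column), totals)
        results.append((difficulty, discrimination))
    return results


# Hypothetical responses: 1 = correct, 0 = incorrect.
RESPONSES = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

for number, (difficulty, discrimination) in enumerate(item_analysis(RESPONSES), start=1):
    print(number, round(difficulty, 2), round(discrimination, 2))
```

Items answered identically by everyone have no variance, so their discrimination is undefined and this sketch would fail on them with a division by zero; such items would need to be screened out first.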
| |
| | |
| * <math>R(t) = 1 - F(t).</math>
| |
| | |
| * <math>R(t) = \exp(-\lambda t).</math> (where <math>\lambda</math> is the failure rate)
| |
| | |
| ==See also==
| |
| * [[Coefficient of variation]]
| |
| * [[Homogeneity (statistics)]]
| |
| * [[Test-retest reliability]]
| |
| * [[Internal consistency]]
| |
| * [[Levels of measurement]]
| |
| * [[Accuracy and precision]]
| |
| * [[Reliability (disambiguation)|Reliability]] disambiguation page
| |
| * [[Reliability theory]]
| |
| * [[Reliability engineering]]
| |
| * [[Reproducibility]]
| |
| * [[Validity (statistics)]]
| |
| | |
| {{More footnotes|date=July 2010}}
| |
| | |
| ==References==
| |
| {{Reflist}}
| |
| | |
| ==External links==
| |
| * [http://www.uncertainty-in-engineering.net Uncertainty models, uncertainty quantification, and uncertainty processing in engineering]
| |
| * [http://www.visualstatistics.net/Statistics/Principal%20Components%20of%20Reliability/PCofReliability.asp The relationships between correlational and internal consistency concepts of test reliability]
| |
| * [http://www.visualstatistics.net/Statistics/Reliability%20Negative/Negative%20Reliability.asp The problem of negative reliabilities]
| |
| {{Use dmy dates|date=September 2010}}
| |
| | |
| {{DEFAULTSORT:Reliability (Statistics)}}
| |
| [[Category:Comparison of assessments]]
| |
| [[Category:Psychometrics]]
| |
| [[Category:Market research]]
| |
| [[Category:Educational psychology research methods]]
| |
| [[Category:Reliability analysis|*]]
| |
| | |
| [[pl:Rzetelność (metodologia nauki)#Rzetelność w psychometrii]]
| |