# Psychological Measurement, Ch6: Validity

## 1 Basic Concepts of Validity

### What Is Validity?

Interpretation:

- The validity of a test concerns what the test measures and how well it does so. (Anne Anastasi)
- It tells us what can be inferred from test scores. (Anne Anastasi)

Figure 6-1: One Funny Picture

- Validity can be defined as the agreement between a test score or measure and the quality it is believed to measure. (Robert M. Kaplan, Dennis P. Saccuzzo) Does the test measure what it is supposed to measure?
- Validity is the evidence for inferences made about a test score. (AERA, APA, NCME, Standards for Educational and Psychological Testing)

Validity is affected by random and systematic errors: both kinds of error reduce the accuracy of the test.

Mathematical definition of validity: the validity coefficient is the ratio of the variance related to the trait being measured to the observed score variance,

$$\text{validity coefficient} = \frac{S_V^2}{S_X^2} \tag{6-1}$$

where $S_V^2$ denotes the variance related to the trait measured and $S_X^2$ the observed score variance.

### Comparing Validity with Reliability

- If the reliability of a test is low, its validity is usually low too.
- If the reliability of a test is high, its validity is not necessarily high.

Figure 6-2: Components of the Variance of Observed Scores

Reliability is a necessary precondition for validity, and validity represents the ultimate purpose of the test.

### Three Types of Validity

- Criterion-related validity
- Content-related validity
- Construct-related validity

Note: The most recent standards emphasize that validity is a unitary concept. The use of categories does not imply that there are distinct forms of validity.

### Factors Affecting Validity

- The test itself
- Test administration and scoring
- Examinees
- The criterion chosen for criterion validity

Effects of the test itself:

- Whether the items are stated clearly
- Whether the items represent the trait measured
- Whether the length of the test is adequate
- Whether the test difficulty is appropriate

Test administration and scoring:

- Whether the sample is representative (heterogeneous)
- Whether the
testing conditions are appropriate and whether unexpected disturbances occur
- Whether the tester administers the test according to the manual
- Whether the test instructions for examinees are clear
- Whether the scoring system is objective and standardized

Examinees:

- Interest in and motivation for the test
- Emotional state and attitude during testing
- State of physical health
- Experience with tests

The criterion chosen for criterion validity

## 2 Content Validity and Construct Validity

### Content Validity

Interpretation:

- Content validity involves the careful definition of the domain of behaviors to be measured by the test and the logical design of items to cover all the important areas of the domain.
- The purpose of a content validation is to assess whether the items adequately represent a performance domain or construct of specific interest.
- It is established through a rational analysis of the content of a test.

Steps for content validation using expert judgment:

1. Defining the performance domain of interest
2. Selecting a panel of qualified experts in the content domain
3. Providing a structured framework for the process of matching items to the performance domain
4. Collecting and summarizing the data from the matching process

Application:

- Content validity is most often employed with achievement tests, so the performance domain is often defined by a list of instructional objectives.
- Content validity is also applicable to certain occupational tests designed for employee selection and classification.

Table 6-1: Table of Instructional Objectives

| | Knowledge | Comprehension | Application | Analysis | Evaluation | Synthesis | Sum |
|---|---|---|---|---|---|---|---|
| Chapter 1 | | | 8 | 2 | | | 10 |
| Chapter 2 | | 10 | 6 | 2 | 10 | | 28 |
| Chapter 3 | 3 | 6 | 2 | 4 | 7 | | 22 |
| Chapter 4 | 2 | 9 | 12 | 6 | 5 | 6 | 40 |
| Sum | 5 | 25 | 28 | 14 | 22 | 6 | 100 |

Distinction: face validity. Face validity refers to what a test appears superficially to measure, not to what the test actually measures.

### Construct Validity

Interpretation: The construct validity of a test is the extent to which the test may be said to measure a theoretical construct or trait.

#### What Is a Construct?

Each construct is
developed to explain and organize observed response consistencies. It derives from established interrelationships among behavioral measures.

Examples: scholastic aptitude, intelligence, verbal fluency, anxiety, depression, self-esteem, etc.

Construct validation has focused attention on the role of psychological theory in test construction and on the need to formulate hypotheses that can be proved or disproved in the validation process. (Anne Anastasi)

Procedures for construct validation:

- Correlations between a measure of the construct and designated criterion variables
- Internal consistency
- Differentiation between groups
- Developmental changes
- Factor analysis
- Multitrait-multimethod matrix

Multitrait-Multimethod Matrix

| Method | Trait | 1-A | 1-B | 1-C | 2-A | 2-B | 2-C | 3-A | 3-B | 3-C |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 True-False | A Sex Guilt | .95 | | | | | | | | |
| | B Hostility Guilt | .28 | .86 | | | | | | | |
| | C Morality-Conscience | .58 | .39 | .92 | | | | | | |
| 2 Forced-Choice | A Sex Guilt | .86 | .32 | .57 | .95 | | | | | |
| | B Hostility Guilt | .30 | .90 | .40 | .39 | .76 | | | | |
| | C Morality-Conscience | .52 | .31 | .86 | .55 | .26 | .84 | | | |
| 3 Incomplete Sentences | A Sex Guilt | .73 | .10 | .43 | .64 | .17 | .37 | .48 | | |
| | B Hostility Guilt | .10 | .63 | .17 | .22 | .67 | .19 | .15 | .41 | |
| | C Morality-Conscience | .35 | .16 | .52 | .31 | .17 | .56 | .41 | .30 | .58 |

Example: how to search for evidence for a supposed intelligence test

1. State the theoretical hypotheses of the test:
   1. Intelligence grows with age.
   2. IQ is relatively stable.
   3. Intelligence is substantially related to school achievement.
   4. Intelligence is affected by heredity.
2. Administer the test to the population and analyze the data.
3. Judge:
   - whether test scores increase with age;
   - whether IQ and school achievement are correlated;
   - whether IQs remain stable across a time interval;
   - whether the correlation between MZ twins is higher than the correlation between DZ twins.

## 3 Criterion-Related Validity

### Concepts

#### 1 Interpretation of Criterion-Related Validity

- It is the degree to which test scores can be related to a criterion.
- It indicates the effectiveness of a test in predicting an individual's performance in specified activities.

Two types:

- Predictive validity refers to the degree to which test scores predict criterion measurements that will be made at some point in the future.
- Concurrent
validity refers to the relationship between test scores and criterion measurements made at the time the test was given.

#### 2 What Is a Criterion?

The criterion is some behavior that the test scores are used to predict. For example, grade point average can serve as the criterion for a school admissions test.

Problems with criteria:

- The reliability of the criterion
- The validity of the criterion
- Whether it can be measured
- Criterion contamination

Commonly used criteria:

- Academic achievement (for intelligence tests)
- Performance in specialized training (for special aptitude tests)
- Job performance
- Contrasted groups (for personality tests and domain-referenced tests)
- Psychiatric diagnosis (for personality tests)
- Ratings (by schoolteachers, job supervisors)
- Previously available tests

### Procedures of Criterion-Related Validation

- Validity coefficient
- Discrimination between two groups

#### 1 Estimating the Validity Coefficient

Pearson product-moment correlation coefficient

Exercise 1: Suppose that 10 male applicants took a job interest test and were then hired as salesmen by a company. The job interest test score (X) and the first-year sales amount (Y, in units of ten thousand) of each applicant are listed in the following table.

Table 6-2: 10 Applicants' Test Scores and Sales Amounts

| Examinee | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| X | 30 | 34 | 32 | 47 | 20 | 24 | 27 | 25 | 22 | 16 |

Y: 2 5 3 8 3 4 0 7 1 2 2 3 5 2 8 1 2

Biserial correlation coefficient (for the correlation between a continuous variable and a dichotomous variable):

$$r_b = \frac{\bar{X}_p - \bar{X}_q}{S_t} \cdot \frac{pq}{Y} \tag{6-2}$$

where

- $p$ is the proportion of examinees who score 1 on the dichotomous variable;
- $q$ is equal to $1 - p$;
- $\bar{X}_p$ is the mean score on the continuous variable of the examinees who score 1 on the dichotomous variable;
- $\bar{X}_q$ is the mean score on the continuous variable of the examinees who score 0 on the dichotomous variable;
- $S_t$ is the standard deviation of the scores of all examinees on the continuous variable;
- $Y$ is the ordinate of the standard normal curve at the z score associated with the value of $p$.

Research case: use $r_b$ to estimate the validity of the first
application of the WISC-R in Shanghai.

Data:

- The number of first-level middle school students is 66.
- The number of second-level middle school students is 286.
- The mean IQ of the first-level students is 114.
- The mean IQ of the second-level students is 96.
- The standard deviation of all students' IQs is 14.53.
- If $p = .1875$, then $Y = .2685$.

Exercise 2: A group of middle school students took a math test. The mean score of the students who had been taught with the higher math program is 60.188, and their number is 382. The mean score of the students who had taken the normal program is 47.429, and their number is 618. The standard deviation for all students is 11.910. Please estimate the validity coefficient of the math test.

#### 2 Discrimination Between Two Groups

- Compare the means of the two groups: t test, with its degrees of freedom.
- Compute the amount of overlap between the two groups:
  1. Compute the number of examinees in one group (usually a contrasted group) whose test scores are higher than the mean of the other group. Compute the proportion of examinees whose test scores are higher than the mean of the other group. Then calculate the ratio of the two numbers.
  2. Compute the overlap percentage of the score distributions of the two groups.

## 4 Application of the Validity Coefficient

### Predicting the Criterion Score

#### 1 Establish the Regression Equation

$$\hat{Y} = a + bX$$

where $\hat{Y}$ is the predicted criterion score for an examinee, $X$ is the test score of the examinee, $b$ is the regression coefficient, and $a$ is the intercept. For the least-squares line of $Y$ on $X$,

$$b = r_{XY}\,\frac{S_Y}{S_X}, \qquad a = \bar{Y} - b\,\bar{X}$$

with $r_{XY}$ the validity coefficient, $S_X$ and $S_Y$ the standard deviations, and $\bar{X}$ and $\bar{Y}$ the means of the test and criterion scores.

Example:

Figure 6-3: 100 Examinees' Scores on the Job Aptitude Test and Actual Performance Scores

If an applicant gets 6 on the test, we can use the regression equation to predict his future job performance.

Exercise 3: Suppose a group of high school students took a job interest test. The researcher obtained these statistics. The validity coefficient is 0.6. If John got 54 points on the job interest test, what would his criterion score (job performance) be?

#### 2 Estimate Error

Standard error of estimate: the error of estimate shows the
margin of error to be expected in the individual's predicted criterion score as a result of the imperfect validity of the test.

$$S_{est} = S_Y \sqrt{1 - r_{XY}^2}$$

where $S_Y$ is the standard deviation of the criterion scores and $r_{XY}$ is the validity coefficient.

Coefficient of determination: $r_{XY}^2$, indicating the proportion of the variance of the criterion scores that is related to the variance of the predictor test scores.

#### 3 Establish the Approximate Interval for an Actual Criterion Score Y

### Validity Coefficient and Classification Decision

Figure 6-4: Scatter Plot of the Predictor and Criterion Scores (axes $X$ and $Y$, with cutoff scores $X_c$ and $Y_c$)

Basic concepts:

- Cutoff scores
- Valid acceptance
- Valid rejection
- False acceptance
- False rejection

Four rates:

- Base rate: the proportion of successful applicants selected without the use of a test
- Selection ratio: the proportion of applicants who must be accepted
- Hit rate: the percentage of predictions that are correct
- Success ratio: the proportion of selected applicants who succeed

Table 6-3: Taylor-Russell Table for a Base Rate of .60
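
The four rates listed above can be computed directly from the counts in the four quadrants formed by the two cutoff scores. The following is a minimal Python sketch under stated assumptions: the names `DecisionCounts` and `decision_rates` are hypothetical, the counts are invented for illustration, scores exactly at a cutoff are treated as passing, and the base rate is read as the proportion of all applicants who succeed on the criterion.

```python
from dataclasses import dataclass


@dataclass
class DecisionCounts:
    """Applicant counts in the four quadrants (hypothetical container)."""
    valid_acceptance: int   # test score >= Xc and criterion score >= Yc
    false_acceptance: int   # test score >= Xc but criterion score < Yc
    valid_rejection: int    # test score < Xc and criterion score < Yc
    false_rejection: int    # test score < Xc but criterion score >= Yc


def decision_rates(c: DecisionCounts) -> dict:
    """Return base rate, selection ratio, hit rate and success ratio."""
    total = (c.valid_acceptance + c.false_acceptance
             + c.valid_rejection + c.false_rejection)
    selected = c.valid_acceptance + c.false_acceptance
    successful = c.valid_acceptance + c.false_rejection
    return {
        "base_rate": successful / total,        # succeed regardless of the test
        "selection_ratio": selected / total,    # accepted on the test
        "hit_rate": (c.valid_acceptance + c.valid_rejection) / total,
        "success_ratio": c.valid_acceptance / selected,
    }


# Invented counts for 100 applicants, for illustration only.
counts = DecisionCounts(valid_acceptance=38, false_acceptance=7,
                        valid_rejection=33, false_rejection=22)
rates = decision_rates(counts)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts the base rate is .60 and the success ratio is about .84, so selecting on the test raises the proportion of successful applicants above what would be expected without it.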
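
The prediction steps in Section 4 (regression equation, standard error of estimate, approximate interval) can be sketched in Python as follows. This is an illustration, not the procedure shown on the original slides: it assumes the usual least-squares line of Y on X, the function names are hypothetical, every number except the idea of a validity coefficient is invented, and the 1.96 multiplier assumes roughly normal prediction errors for a 95% interval.

```python
import math


def regression_line(r_xy, mean_x, sd_x, mean_y, sd_y):
    """Intercept a and slope b of the least-squares line Y_hat = a + b * X."""
    b = r_xy * sd_y / sd_x
    a = mean_y - b * mean_x
    return a, b


def standard_error_of_estimate(sd_y, r_xy):
    """Spread of actual criterion scores around the predicted score."""
    return sd_y * math.sqrt(1.0 - r_xy ** 2)


# Invented statistics, for illustration only.
r_xy = 0.6                    # validity coefficient
mean_x, sd_x = 50.0, 10.0     # predictor test
mean_y, sd_y = 70.0, 8.0      # criterion

a, b = regression_line(r_xy, mean_x, sd_x, mean_y, sd_y)
y_hat = a + b * 60.0          # predicted criterion score for a test score of 60
s_est = standard_error_of_estimate(sd_y, r_xy)
interval = (y_hat - 1.96 * s_est, y_hat + 1.96 * s_est)
determination = r_xy ** 2     # share of criterion variance related to the test

print(f"Y_hat = {y_hat:.2f}, S_est = {s_est:.2f}")
print(f"approximate 95% interval: {interval[0]:.2f} to {interval[1]:.2f}")
print(f"coefficient of determination: {determination:.2f}")
```

Even with a validity coefficient of 0.6, the standard error of estimate here is 80% of the criterion standard deviation, which is why an interval rather than a single predicted score is reported.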
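
Returning to the WISC-R research case in Section 3, the biserial correlation can be checked numerically with the figures given there. The sketch below is a hedged illustration: the function name `biserial_r` is hypothetical, and the optional computation of the normal ordinate with `statistics.NormalDist` is an addition for comparison with the tabled value Y = .2685 stated in the case.

```python
from statistics import NormalDist


def biserial_r(mean_1, mean_0, sd_all, p, y=None):
    """Biserial correlation between a continuous and a dichotomous variable.

    mean_1, mean_0: means of the groups scoring 1 and 0 on the dichotomy
    sd_all: standard deviation of all examinees on the continuous variable
    p: proportion of examinees scoring 1; y: normal ordinate (computed if None)
    """
    q = 1.0 - p
    if y is None:
        # Ordinate of the standard normal curve at the z score that splits
        # the distribution into the proportions p and q.
        z = NormalDist().inv_cdf(q)
        y = NormalDist().pdf(z)
    return (mean_1 - mean_0) / sd_all * (p * q / y)


# Figures from the research case.
n_first, n_second = 66, 286
p = n_first / (n_first + n_second)            # 0.1875

r_b = biserial_r(114, 96, 14.53, p, y=0.2685)  # tabled ordinate from the case
r_b_computed = biserial_r(114, 96, 14.53, p)   # ordinate computed numerically

print(round(r_b, 3))
```

Both versions give a biserial correlation of about .70; the small difference between them comes only from rounding in the tabled ordinate.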