Abstract:
The 1999 Standards for Educational and Psychological Testing defines validity as the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Although quite explicit, there are ways in which this definition lacks precision, consistency, and clarity. The history of validity has taught us that ambiguity risks oversimplification, misunderstanding, inadequate validation, and the inevitable potential for inappropriate interpretation and use of results. This article identifies ways in which the spirit of the Standards can be clarified, with the intention of reducing these risks. The article provides an elaboration of the consensus definition, invoking a narrow, technical sense of validity, unique to the professions of educational and psychological measurement and assessment; an assessment-based decision-making procedure is valid if the argument for interpreting assessment outcomes (under stated conditions and in terms of stated conclusions) as measures of the attribute entailed by the decision is sufficiently strong.
Abstract:
The focus article provided me with an opportunity to unpack the consensus definition of validity and to explore its implications in the light of recent debates. I proposed an elaboration of the consensus definition, which was intended to express the spirit of the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) with increased precision, highlighting a range of features including the following:
1. People assess in order to make decisions and, therefore, validity is ultimately a property of assessment-based decision-making procedures.
2. Validity is a property of a procedure that describes its potential to support good measurement and, therefore, good decision making.
Abstract:
I am very grateful to the commentators for their observations and challenges: Baird, Coe, Cresswell, von Davier, and Walker. Most comments focused on aspects of my alternative framework for thinking about linking, so I will restrict my response accordingly. I will begin by summarizing the general proposal, which will help to clarify subsequent points of detail.
Abstract:
In "Talk and Gesture as Process Data," Bryan Maddox (a) demonstrates how video-ethnographic methods can inform our understanding of assessment events and (b) considers how evidence from talk and gesture can inform validation practice. He notes that, despite strong arguments concerning its importance, this kind of response process (RP) evidence has been neglected in routine validation research. He argues, in contrast, that with advances in computer-based testing and digital technology, micro-analytic investigations, such as his own, promise to transform validation practice. Adopting an ethnographic approach, Maddox makes the case for a novel form of RP evidence, based on talk and gesture. He also makes a more general case for the importance of RP evidence (see also Ercikan & Pellegrino, 2017; Zumbo & Hubley, 2017). So, to what extent does this micro-analytic approach, with its focus on talk and gesture during assessment events, constitute a relevant and valuable source of RP evidence? And, more generally, how likely is it that we are on the brink of a revolution in the use of RP evidence within validation research?