Abstract:
The 1999 Standards for Educational and Psychological Testing defines validity as the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Although quite explicit, there are ways in which this definition lacks precision, consistency, and clarity. The history of validity has taught us that ambiguity risks oversimplification, misunderstanding, inadequate validation, and the inevitable potential for inappropriate interpretation and use of results. This article identifies ways in which the spirit of the Standards can be clarified, with the intention of reducing these risks. The article provides an elaboration of the consensus definition, invoking a narrow, technical sense of validity, unique to the professions of educational and psychological measurement and assessment; an assessment-based decision-making procedure is valid if the argument for interpreting assessment outcomes (under stated conditions and in terms of stated conclusions) as measures of the attribute entailed by the decision is sufficiently strong.
Abstract:
In this update of Clark and Watson (1995), we provide a synopsis of the major points of our earlier article and discuss issues in scale construction that have become more salient as clinical and personality assessment has progressed over the past quarter-century. It remains true that the primary goal of scale development is to create valid measures of underlying constructs and that Loevinger's theoretical scheme provides a powerful model for scale development. We again discuss practical issues to help developers maximize their measures' construct validity, reiterating the importance of (a) clear conceptualization of target constructs, (b) an overinclusive initial item pool, (c) paying careful attention to item wording, (d) testing the item pool against closely related constructs, (e) choosing validation samples thoughtfully, and (f) emphasizing unidimensionality over internal consistency. We have added (g) consideration of the hierarchical structures of personality and psychopathology in scale development, discussion of (h) codeveloping scales in the context of these structures, (i) "orphan" and "interstitial" constructs, which do not fit neatly within these structures, (j) problems with "conglomerate" constructs, and (k) developing alternative versions of measures, including short forms, translations, informant versions, and age-based adaptations. Finally, we have expanded our discussions of (l) item-response theory and of external validity, emphasizing (m) convergent and discriminant validity, (n) incremental validity, and (o) cross-method analyses, such as questionnaires and interviews. We conclude by reaffirming that all mature sciences are built on the bedrock of sound measurement and that psychology must redouble its efforts to develop reliable and valid measures.
Abstract:
The current article enhances the test validation process by addressing important issues with the quantifying construct validity (QCV) procedure. The QCV procedure is intended to help researchers systematically and objectively evaluate the degree to which a pattern of convergent and discriminant validity correlations corresponds to a priori hypotheses. Although the QCV procedure holds promise as a psychometric tool and has enjoyed some use, at least three factors have likely limited the frequency and accuracy of its use: questions regarding its role and utility in test validation, a lack of clarity about its key concepts, and a lack of integration with widely available statistical software. We address these important issues and provide psychometrically grounded recommendations for applying the QCV procedure. This work facilitates the understanding, computation, and useful application of the QCV procedure, and ultimately it is intended to enhance work in test validation.
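The pattern-matching logic behind QCV can be sketched as follows. All correlation values below are invented for illustration, and the simple pattern-correlation index shown is only one piece of the full procedure (contrast-based significance tests are omitted):

```python
import numpy as np

# Illustrative a priori predictions: how strongly a new scale is
# expected to correlate with five marker constructs (invented values).
predicted = np.array([0.60, 0.50, 0.30, 0.10, -0.20])

# Illustrative observed convergent/discriminant correlations from a
# hypothetical validation sample.
observed = np.array([0.55, 0.42, 0.25, 0.05, -0.15])

# Fisher z-transform before comparing correlation patterns.
z_pred = np.arctanh(predicted)
z_obs = np.arctanh(observed)

# Pattern-match index: the correlation between the predicted and
# observed patterns; values near 1 indicate a close match.
r_alerting = np.corrcoef(z_pred, z_obs)[0, 1]
```

A high value here would indicate that the observed validity correlations track the a priori hypotheses closely, which is the kind of evidence QCV is designed to quantify.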
Abstract:
BACKGROUND: The skills required for laparoscopic surgery are amenable to simulator-based training. Several computerized devices are now available. We hypothesized that the LAPSIM simulator can be shown to distinguish novice from experienced laparoscopic surgeons, thus establishing construct validity. METHODS: We tested residents of all levels and attending laparoscopic surgeons. The subjects were tested on eight software modules. Pass/fail (P/F), time (T), maximum level achieved (MLA), tissue damage (TD), motion, and error scores were compared using the t-test and analysis of variance. RESULTS: A total of 54 subjects were tested. The most significant difference was found when we compared the most (seven attending surgeons) and least experienced (10 interns) subjects. Grasping showed significance at P/F and MLA (p < 0.03). Clip applying was significant for P/F, MLA, motion, and errors (p < 0.02). Laparoscopic suturing was significant for P/F, MLA, T, TD, as was knot error (p < 0.05). This finding held for novice, intermediate, and expert subjects (p < 0.05) and for suturing time between attending surgeons and residents (postgraduate year [PGY] 1-4) (p < 0.05). CONCLUSIONS: LAPSIM has construct validity to distinguish between expert and novice laparoscopists. Suture simulation can be used to discriminate between individuals at different levels of residency and expert surgeons.
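The expert-versus-novice comparison described above can be sketched with a Welch two-sample t statistic. The timing data below are invented for illustration and are not the study's measurements:

```python
import numpy as np

# Invented task-completion times (seconds) on a suturing module.
novices = np.array([412.0, 388, 455, 430, 401, 468, 390, 445, 420, 436])
experts = np.array([268.0, 295, 240, 310, 255, 282, 301])

def welch_t(a, b):
    # Welch's two-sample t statistic (no equal-variance assumption).
    va, vb = a.var(ddof=1), b.var(ddof=1)
    return (a.mean() - b.mean()) / np.sqrt(va / len(a) + vb / len(b))

# A large positive t (novices slower than experts) is the kind of
# group difference that supports construct validity of the metric.
t_stat = welch_t(novices, experts)
```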
Abstract:
Constructs and indicators are central to the efforts of many researchers who seek to build and test theories and articulate rich narratives about real-world phenomena. For this reason, an extensive discourse exists about their nature. Increasingly, this discourse has become fraught with controversy. Using Bunge's (1977, 1979) ontology, I examine the nature of constructs and indicators as they are discussed in the extant literature. I define these concepts precisely, disentangle conceptual from measurement issues, and point to ways that discourse about them could better proceed. I show that unidimensional constructs, multidimensional constructs, dimensions, and indicators are all properties in general of a class of things. I also show that only three types of indicators exist: synonyms of the focal construct and succeeding or preceding properties in a pre-order of properties that includes the focal construct. I examine ontologically the notions of content validity, convergent validity, discriminant validity, and internal-consistency reliability and show their problematic nature. I introduce two new concepts, scope validity and the level of concomitance of indicators, that have rigorous ontological foundations. Together, they provide an improved foundation for assessing the construct validity of a set of indicators.
Abstract:
Interprofessional education has been receiving attention as a result of research suggesting the benefits of interprofessional collaboration in healthcare. In Hong Kong, the implementation of the Interprofessional Team-based Learning programme provides an implicit call to study the psychometric properties of the Readiness for Interprofessional Learning Scale (RIPLS) to clarify whether it is a valid measure in the Chinese undergraduate healthcare context. This study examines the psychometric properties of the RIPLS among predominantly Chinese undergraduate healthcare students in Hong Kong. Using within- and between-network approaches to construct validity, we investigated the applicability of the English version of the RIPLS among 469 predominantly Hong Kong Chinese students with competence in the English language. These participants were drawn from complementary health professional programmes (biomedical sciences, Chinese medicine, medicine, nursing, and pharmacy) at two universities in Hong Kong. The within-network test results indicated that the RIPLS had good internal consistency reliability. Results of the confirmatory factor analysis lend support to the hypothesized four-factor structure, although one item obtained a non-significant factor loading. The between-network test also suggests that the various subscales of the RIPLS correlated systematically with theoretically relevant constructs: collective efficacy, team impact on quality of learning, and team impact on clinical reasoning ability. The RIPLS is a valid measure for estimating Chinese undergraduate healthcare students' readiness to engage in interprofessional learning.
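The within-network internal consistency check mentioned above is typically computed as Cronbach's alpha. A minimal sketch, using simulated responses to four items driven by a single latent factor (all data are assumptions for illustration):

```python
import numpy as np

def cronbach_alpha(items):
    # items: (respondents x items) response matrix.
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Simulated data: 200 respondents, four items reflecting one latent
# readiness factor plus item-specific noise (illustrative only).
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
items = latent + 0.7 * rng.normal(size=(200, 4))
alpha = cronbach_alpha(items)
```

Values of alpha around 0.8 or above are conventionally read as good internal consistency for a subscale.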
Abstract:
During the last 20 years, health literacy has been promoted as an important determinant of individual and group differences in health outcomes. Despite a definition and pattern of associations with health outcomes highly similar to 'g' (i.e., the general cognitive ability factor), health literacy has been conceptualized as a distinct construct. This study evaluates the conceptual and empirical distinctiveness of health literacy. A sample of 167 students from a southeastern urban university (117 females and 50 males) between the ages of 18 and 53 (M = 21.31, SD = 5.61) completed a cognitive ability battery, three health literacy tests, two knowledge tests, and a questionnaire assessing 12 health behaviors and health outcomes. Across 36 tests of criterion-related validity, cognitive ability had an effect in all 36 cases, whereas the health literacy tests showed measurable incremental validity in only 6 of 36 cases. Factor analysis revealed only three factors, defined by the traditional ability tests, with the health literacy measures loading on the ability factors as predicted by the content validity analysis. There was no evidence of a health literacy factor. The combined results from a comparative content analysis, an empirical factor analysis, and an incremental validity analysis cast doubt on the uniqueness of a health literacy construct. It is suggested that measures of health literacy are simply domain-specific contextualized measures of basic cognitive abilities. Implications for linking these literatures and increasing our understanding of the influence of intellectual factors on health are discussed.
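The incremental validity test described above is usually run as a hierarchical regression: fit the outcome on cognitive ability alone, then add health literacy and inspect the change in R². A minimal sketch on simulated data built to mirror the article's hypothesis (all variable names and effect sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: the outcome depends on general ability g, while
# health literacy hl is mostly redundant with g (illustrative only).
n = 300
g = rng.normal(size=n)
hl = 0.8 * g + 0.2 * rng.normal(size=n)
outcome = 0.5 * g + rng.normal(size=n)

def r_squared(X, y):
    # R^2 from an ordinary least-squares fit with an intercept.
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_g = r_squared(g.reshape(-1, 1), outcome)
r2_both = r_squared(np.column_stack([g, hl]), outcome)
delta_r2 = r2_both - r2_g  # incremental validity of hl over g
```

Under this data-generating assumption, delta_r2 stays near zero: health literacy adds essentially nothing beyond general ability, which is the pattern the study reports for most criteria.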
Abstract:
We review contemporary best practice for developing and validating measures of constructs in the organizational sciences. The three basic steps in scale development are: (a) construct definition, (b) choosing operationalizations that match the construct definition, and (c) obtaining empirical evidence to confirm construct validity. While summarizing this 3-step process [i.e., Define-Operationalize-Confirm], we address many issues in establishing construct validity and provide a checklist for journal reviewers and authors when evaluating the validity of measures used in organizational research. Among other points, we pay special attention to construct conceptualization, acknowledging existing constructs, improving existing measures, multidimensional constructs, macro-level constructs, and the need for independent samples to confirm construct validity and measurement equivalence across subpopulations.
Abstract:
Drawing on 50 unique samples (from 37 studies), the authors used meta-analytical techniques to assess the extent to which job burnout and employee engagement are independent and useful constructs. The authors found that (a) dimension-level correlations between burnout and engagement are high, (b) burnout and engagement dimensions exhibit a similar pattern of association with correlates, and (c) controlling for burnout in meta-regression equations substantively reduced the effect sizes associated with engagement. These findings suggest that doubts about the functional distinctiveness of the dimensions underlying burnout and engagement cannot be dismissed as pure speculation.
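The dimension-level correlations pooled in such a meta-analysis are typically combined via the Fisher z-transform with sample-size-based weights. A minimal sketch with invented per-sample values (not the 50 samples from the article):

```python
import numpy as np

# Invented per-sample correlations between an engagement dimension
# and a burnout dimension, with sample sizes (illustrative only).
r = np.array([-0.62, -0.55, -0.70, -0.48, -0.66])
n = np.array([210, 154, 302, 120, 260])

# Fisher z-transform, pool with inverse-variance weights (n - 3),
# then back-transform to the correlation metric.
z = np.arctanh(r)
w = n - 3
r_bar = np.tanh(np.sum(w * z) / np.sum(w))
```

A pooled correlation this strongly negative is the kind of result that raises doubts about whether the two constructs are functionally distinct.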
Abstract:
Psychometric validity requires construct, predictive, and content validity. However, existing methods for ensuring content validity are limited in their ability to identify attributes missing from a content domain or attributes that should be removed from a content domain. In particular, rather than capturing how consumers conceptualize a market construct, existing methods capture how consumers respond to how researchers conceptualize a market construct. The authors propose a market-based procedure for capturing how consumers conceptualize market constructs, including metrics for (1) identifying attributes missing from a content domain and (2) identifying attributes consumers consider outside a content domain.