Psychological Development and Education (心理发展与教育), 2011, Vol. 27, Issue 2: 210-215.


A Summary and Outlook of Research on Decision Consistency Indices for Criterion-Referenced Tests

陈平, 李珍, 辛涛, 高慧健   

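  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875
  • Publication date: 2011-03-15 Release date: 2011-03-15
  • Corresponding author: XIN Tao (辛涛), E-mail: xintao@bnu.edu.cn
  • Funding:
    Program for New Century Excellent Talents in University, Ministry of Education (NCET-07-0097); National Education Sciences Planning, Special Project on Examinations (GFA097004)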

A Review of Decision Consistency Indices of Criteria-Reference Test

CHEN Ping, LI Zhen, XIN Tao, GAO Hui-jian   

  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875
  • Online:2011-03-15 Published:2011-03-15

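Abstract: Decision consistency is the degree to which examinees are classified consistently across two parallel test administrations, and it is an important indicator of the quality of a criterion-referenced test. To date, researchers have proposed dozens of methods for estimating decision consistency indices, based on classical measurement models and item response models, and have compared the strengths and weaknesses of these methods. Because the methods differ in their underlying models and in their assumptions about the score distribution, each is suited to different testing situations. Future research should validate the existing methods and explore how decision consistency can be applied in educational measurement, giving educational and psychological measurement practitioners a basis for estimating the decision consistency of their tests.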

Keywords: decision consistency, reliability, p coefficient, Kappa coefficient

Abstract: This paper presents an overview of procedures for estimating single-administration decision consistency indices, which are an important quality standard for criterion-referenced tests. Researchers have proposed dozens of estimation methods based on classical test theory or item response theory, and have made some comparisons among them. Future studies should focus on validating these methods and exploring their application in educational measurement, providing psychometricians with a basis for choosing the appropriate estimation method for decision consistency in a particular situation.

Key words: decision consistency, reliability, index p, index Kappa
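
The two indices named in the keywords can be illustrated with a minimal sketch. It assumes the simplest setting, in which the same examinees have actually been classified on two parallel forms; the single-administration methods reviewed in the paper estimate these quantities from one form and are not reproduced here. The formulas used are the standard agreement proportion p and Cohen's Kappa, which the abstract does not spell out, and the function name `decision_consistency` and the sample classifications are hypothetical.

```python
from collections import Counter


def decision_consistency(form_a, form_b):
    """Return (p, kappa) for two equal-length lists of category labels.

    form_a[i] and form_b[i] are the categories assigned to examinee i
    on two parallel forms (an assumption of this sketch).
    """
    if len(form_a) != len(form_b) or not form_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(form_a)

    # p: proportion of examinees placed in the same category on both forms.
    p = sum(a == b for a, b in zip(form_a, form_b)) / n

    # p_chance: agreement expected by chance, from the marginal proportions.
    count_a, count_b = Counter(form_a), Counter(form_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    # Kappa: agreement beyond chance, relative to the maximum possible.
    kappa = (p - p_chance) / (1 - p_chance)
    return p, kappa


# Hypothetical pass/fail classifications for ten examinees.
form_a = ["pass", "pass", "fail", "pass", "fail",
          "fail", "pass", "pass", "fail", "pass"]
form_b = ["pass", "fail", "fail", "pass", "fail",
          "pass", "pass", "pass", "fail", "pass"]

p, kappa = decision_consistency(form_a, form_b)
print(f"p = {p:.3f}, kappa = {kappa:.3f}")  # p = 0.800, kappa = 0.583
```

In this example eight of the ten examinees receive the same classification on both forms, so p is 0.8; chance agreement is 0.52, which brings Kappa down to about 0.583.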

CLC number:

  • G449