Teaching Courseware: Language Testing

Language Testing
Wei Beibei

Language Testing
• Introduction to language testing
• Stages of test construction
• Testing language skills and elements
• Common testing techniques
• Interpreting test scores
• Achieving beneficial backwash

I. Introduction
• Definition of terms: test, measurement, evaluation
• Approaches to language testing
• Test purposes
• Types of tests
• Criteria of tests

1. Definition of terms: test, measurement, evaluation

Test
Carroll (1968) provides the following definition of a test:
• A psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual.
Anastasi (1962): a test is essentially an objective and standardized measure of a sample of behavior. Three elements (Liu Runqing: p. 4):
• a sample of behavior
• objective measurement
• standardized measurement

Measurement
Bachman (1999): Measurement in the social sciences is the process of quantifying the characteristics of persons according to explicit procedures and rules.
Three distinguishing features:
• quantification
• characteristics
• explicit rules and procedures
Stevens (1951): measurement is the assignment of numbers to things according to rules, that is, the process of assigning numbers or symbols to the attributes of things according to fixed rules. Three elements (Liu: p. 2):
• things and their attributes
• the assignment of numbers or symbols
• rules

Evaluation
Weiss (1972): Evaluation can be defined as the systematic gathering of information for the purpose of making decisions.
Evaluation is the systematic process of gathering information, analyzing it, and interpreting it in order to reach a decision. Compared with measurement and testing, it is broader in meaning and more comprehensive.

Relationships among the three (Liu: p. 5)
[Figure: diagram of the relationships among tests, measurement, and evaluation, with areas 1 to 5]
An example of evaluation that does not involve either tests or measures (area 1) is the use of qualitative descriptions of student performance for diagnosing learning problems. An example of a non-test measure for evaluation (area 2) is a teacher ranking used for assigning grades, while an example of a test used for purposes of evaluation (area 3) is the use of an achievement test to determine student progress. The most common non-evaluative uses of tests and measures are for research purposes.
An example of tests that are not used for evaluation (area 4) is the use of a proficiency test as a criterion in second language acquisition research. Finally, assigning code numbers to subjects in second language research according to native language is an example of a non-test measure that is not used for evaluation (area 5). In summary, then, not all measures are tests, not all tests are evaluative, and not all evaluation involves either measurement or tests.

Bachman: Neither measures nor tests are in and of themselves evaluative, and evaluation need not involve measurement or testing.

2. Approaches to language testing
• the essay-translation approach
• the structuralist approach
• the integrative approach
• the communicative approach

(1) J. B. Heaton

The essay-translation approach
This approach is commonly referred to as the pre-scientific stage of language testing. No special skill or expertise in testing is required: the subjective judgment of the teacher is considered to be of paramount importance. Tests usually consist of essay writing, translation, and grammatical analysis. The tests also have a heavy literary and cultural bias.

Features of the essay-translation approach
• No special testing skill or expertise is required; the test relies mainly on the teacher's subjective judgment.
• Papers consist mainly of essay writing, translation, and grammatical analysis.
• Test content has a strong literary and cultural flavor.
• Items call for written answers and must be marked by hand.

The structuralist approach
This approach is characterized by the view that language learning is chiefly concerned with the systematic acquisition of a set of habits. It draws on the work of structural linguistics, in particular the importance of contrastive analysis and the need to identify and measure the learner's mastery of the separate elements of the target language: phonology, vocabulary and grammar.
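Features of the structuralist approach
• It stresses testing the separate elements of language, such as phonology, vocabulary, and grammar, in isolation from context; the skills of listening, speaking, reading, and writing can also be tested separately.
• It adopts the psychometric approach, stressing the reliability and objectivity of the test; its typical form is the multiple-choice item, with each item testing one element.
• It lends itself to statistical analysis after the test.

The integrative approach
This approach involves the testing of language in context and is thus concerned primarily with meaning and the total communicative effect of discourse. Integrative tests are often designed to assess the learner's ability to use two or more skills simultaneously. They are best characterized by the use of cloze testing and of dictation. Oral interview, translation and essay writing are also included in many integrative tests.

Features of the integrative approach
• It stresses that language should be tested in context.
• It does not deliberately separate the individual language skills but stresses the combined assessment of two or more skills; item types include gap-filling, dictation, translation, and essay writing, which measure the learner's language ability as a whole.

The communicative approach
This approach to language testing is sometimes linked to the integrative approach. However, although both approaches emphasize the importance of the meaning of utterances rather than their form and structure, there are nevertheless fundamental differences between the two approaches. Communicative tests are concerned primarily with how language is used in communication. Language 'use' is often emphasized to the exclusion of language 'usage'.

Features of the communicative approach
• Unlike the integrative approach, it places more emphasis on the use of language in communication than on usage (the form and structure of language).
• Some communicative tests nevertheless include items on language usage.
• Communicative tests are built on an analysis of learners' needs and stress authenticity (for example, BEC).

(2) Liu Runqing: theoretical models of language testing (p. 19)
• Pre-scientific testing (first generation)
• Psychometric-structuralist testing (second generation)
• Communicative language testing (third generation), also called psycholinguistic-sociolinguistic testing

(3) The Canale and Swain model (1980)
Communicative competence consists of four components:
(1) Grammatical competence: knowledge of phonology, vocabulary, grammar, and so on, which is needed to understand and express the literal meaning of language.
(2) Sociolinguistic competence: the ability to understand and produce language that is appropriate in both form and meaning in different social settings.
(3) Discourse competence: the ability to combine language form and content.
(4) Strategic competence: the ability to open a conversation, keep it going, adjust and shift the topic, and bring it to a close.
A weakness of this model is that it does not state how the four competences relate to one another. In addition, treating strategic competence merely as a compensatory ability seems to overlook the strategic use of language in normal communication.

The framework of communicative language ability (CLA) includes three components: language competence, strategic competence, and psychophysiological mechanisms.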
(Liu: p. 23)

(4) Bachman (1990): Communicative Language Ability
[Figure: components of communicative language ability in language use]
• Knowledge structures (knowledge of the world)
• Language competence (knowledge of the language)
• Strategic competence
• Psychophysiological mechanisms
• Context of situation

Components of Bachman's communicative language ability
[Figure: tree diagram] Language competence divides into organizational competence and pragmatic competence. The sub-competences shown in the diagram, with their components, are:
• Grammatical competence: syntax, morphology, phonology
• Textual competence: rhetorical organization, cohesion
• Semantic competence: semantic properties, literal meaning, implied meaning
• Functional competence: ideational, manipulative, heuristic, imaginative
• Sociolinguistic competence: sensitivity to dialect and variety, sensitivity to differences in register, ability to interpret and use cultural references and figures of speech, sensitivity to naturalness

Bachman's model of language use
[Figure: flow diagram] The elements shown are:
• Situational assessment
• Goal: to interpret or express speech with a specific function, form, and content
• Language competence: organizational competence, pragmatic competence
• Psychophysiological mechanisms
• Planning process: retrieving items from the store of language knowledge
• Plan: organizing the items so that they lead toward the communicative goal
• Execution: a neurological and physiological process
• Utterance: expressing or interpreting language

3. Test purposes
Liu Runqing (p. 6):
• Purposes of diagnosis and backwash
• Purposes of comparison and selection
• Purposes of placement
• Purposes of research or survey
Arthur Hughes:
• to measure language proficiency regardless of any language courses that candidates may have followed
• to discover how far students have achieved the objectives of a course of study
• to diagnose students' strengths and weaknesses, to identify what they know and what they do not know
• to assist placement of students by identifying the stage or part of a teaching program most appropriate to their ability

4. Types of tests
(1) According to different learning periods (Heaton: 171-173)
• Placement test
• Classroom test
• Mid-term test
• End-of-term test
(2) According to test purposes (Liu: pp. 8-16)
• Progress test
• Proficiency test
• Achievement/Attainment test
• Aptitude test
• Diagnostic test
(3) According to test methods
• Discrete-point test
• Integrative test
(4) According to interpretations of test scores
• Norm-referenced test
• Criterion-referenced test
(5) According to scoring methods
• Subjective test
• Objective test
(6) Other types of test
• Communicative testing
• Pragmatic test

5. Criteria of tests
• Validity
• Reliability
• Power/Difficulty
• Discrimination
• Practicality
• Backwash effects

Validity
The validity of a test is the extent to which it measures what it is supposed to measure and nothing else. In other words, validity concerns whether a test examines what its designer intends to examine, and to what extent.

Discuss the following items:
• "Is photography an art or a science?" Discuss.
• "The mind is in its own place, and itself can make a Heaven of Hell, a Hell of a Heaven." (Milton) Discuss.
• Use the following words in sentences: courageous, choosy, acceptable, complicated, etc.
A. John is a very courageous boy.
B. John, the captain of our team, is courageous.
C. I have a courageous father.

Factors of validity
• Face validity
• Content validity
• Construct validity
• Empirical validity
• Concurrent validity
• Predictive validity

Face validity
• If a test item looks right to other testers, teachers, moderators, and testees, it can be described as having at least face validity.
• Face validity refers to the surface credibility of a test, or its acceptability to the public.
• Zou Shen: a test has face validity when it appears to test the intended skill or ability. (A written paper used to test pronunciation and intonation, for example, has low face validity.)

Content validity
• A test is said to have content validity if its content constitutes a representative sample of the language skill, structures, etc. with which it is meant to be concerned.
• Content validity concerns whether the test covers what the syllabus specifies, or how far the items represent the target they are meant to measure.
• Is the content of a test related to its objective or purpose?
• Are the test items representative?
• Is the content appropriate or suitable for the testees?

Construct validity
• If a test has construct validity, it is capable of measuring certain specific characteristics in accordance with a theory of language behavior and learning.
• Construct validity concerns whether the test is based on a sound view of language (including views of language learning and of language use). "Construct" here does not mean the layout of the paper or the ordering of the items; it means the theoretical basis of the whole test.

Empirical validity
• This validity is obtained as a result of comparing the results of the test with the results of some criterion measure.
• Empirical (statistical) validity is obtained by comparing test results with the results of other measures. It can be subdivided into concurrent validity and predictive validity.

Concurrent validity
• If the results of the test are compared with the results of some criterion measure such as:
- an existing test, known or believed to be valid and given at the same time; or
- the teacher's ratings or any other such form of independent assessment given at the same time,
then results obtained by either of the above two methods are measures of the test's concurrent validity in respect of the particular criterion used.
• In other words, concurrent validity is established when the test and the criterion are administered at about the same time.
• Concurrent validity is the coefficient obtained by comparing the results of one test with those of another test given at the same time or close to it, or with the teacher's assessment of the students. For example, if end-of-term examination scores are compared with scores on a Band 4 examination that has just been taken, and the two sets of scores are similar, the end-of-term test has high concurrent validity (provided the Band 4 examination is itself highly valid).

Predictive validity
• If the results of the test are compared with the results of some criterion measure such as:
- the subsequent performance of the testees on a certain task measured by some valid test; or
- the teacher's ratings or any other such form of independent assessment given later,
then results obtained by either of these two methods are measures of the test's predictive validity in respect of the particular criterion used.
• In other words, predictive validity concerns the degree to which a test can predict the testees' future performance or success.
• Predictive validity concerns the predictive power of a test: how far its results can forecast what is likely to happen later, or whether the test can predict students' future performance or results.

Reliability
A test is said to be reliable if it is consistent in its measurements. Reliability refers to the dependability and stability of test results. For example, if the same paper is given to the same group of students two or more times and the results agree closely, the test has high reliability.

Methods of checking test reliability (Liu Runqing: p. 211; Heaton: p. 163)
• test/retest method
• split-half method
• parallel forms method

Test/retest method
This method is to re-administer the same test after a lapse of time. It is often impracticable since certain students will benefit more than others by a familiarity with the type and format of the test.
Moreover, in addition to changes in performance resulting from the memory factor, personal factors such as motivation and differential maturation will also account for differences in the performances of certain students.

Split-half method
This method estimates a different kind of reliability from that estimated by the test/retest procedure. It is based on the principle that, if an accurate measuring instrument were broken into two equal parts, the measurements obtained with one part would correspond exactly to those obtained with the other. (Heaton: 164)

Parallel forms method
This method is to administer parallel forms of the test to the same group. This assumes that two similar versions of a particular test can be constructed: such tests must be identical in the nature of their sampling, difficulty, length, rubrics, etc. Only after a full statistical analysis of the tests and all the items contained in them can the tests safely be regarded as parallel. If the correlation between the two tests is high, then the tests can be termed reliable.

Factors affecting the reliability of a test (Heaton: p. 162)
• the extent of the sample of material selected for testing
• the administration of the test

Factors affecting test reliability (Liu Runqing: p. 214)
• number of items
• nature of the items
• item discrimination
• distribution of scores
• item difficulty
• objectivity of scoring
• testing time

Power/Difficulty
Difficulty refers to how hard or easy each item in a test is. To judge the quality of a paper, besides the two key indicators of reliability and validity, one should also study the difficulty of the items, expressed as the index of difficulty (facility value).

Formula for the difficulty value
• The difficulty of an item is usually written P. P is in fact the proportion of candidates who answer the item correctly. Suppose there are 10 candidates and 8 of them answer an item correctly; the difficulty value of that item is:
P = number answering correctly / number of candidates = 8 / 10 = 0.8

Formula for subjective items
• Suppose a writing task carries a full mark of 20 and the mean score of all candidates on it is 16; the difficulty value of the task is:
P = mean score / full mark = 16 / 20 = 0.8

[Figure: normal distribution curve] (Liu Runqing: p. 217)

Discrimination
Discrimination of a test is its capability to discriminate among the different candidates and to reflect the differences in the performance of the individuals in the group. At item level, discrimination is the degree to which an item separates candidates of different ability.

Methods of calculating item discrimination (Liu: p. 221)
• the formula method
• the point-biserial correlation coefficient
• the biserial correlation coefficient

Practicality
A good test is practical. It stays within financial limitations and time constraints, and it is easy to administer, score, and interpret.
Practicality refers to whether a test is convenient to use and feasible to administer.

Factors affecting practicality (Heaton: 167)
• the length of time available for the administration of the test
• the answer sheet and the stationery used
• the test situation
• the necessary equipment
• the presentation of the test paper

Backwash effects
The term backwash (also sometimes referred to as washback) refers to the effects of a test on teaching and learning. If a test has good backwash effects, it will exert a good influence on the learning and teaching that takes place before the test.

Discussion
• What is the relationship among tests, measurement, and evaluation?
• According to J. B. Heaton, what are the four main approaches to testing? What are their features?
• Consider any tests with which you are familiar. Assess each of them in terms of the various kinds of validity.

II. Stages of test construction
• Deciding on test purpose, type, content, items, and framework
• Writing specifications for the test
• Writing and revising the test
• Further considerations

1. Deciding on test purpose, type, content, items, and framework

Purpose and type of the test
According to different purposes, tests can be classified into several types:
• Progress test
• Achievement/Attainment test
• Proficiency test
• Aptitude test
• Diagnostic test

Test content-1
Tim (2000): Establishing test content involves careful sampling from the domain of the test, that is, the set of tasks or the kinds of behaviors in the criterion setting, as informed by our understanding of the test construct. (Tim: p. 25)

Test content-2
B. Bloom: six levels of educational objectives
• Knowledge
• Comprehension
• Application
• Analysis
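
Returning to the item statistics in section I.5, the following is a minimal sketch of how the difficulty value P, a point-biserial discrimination coefficient, and a split-half reliability estimate could be computed. The two difficulty calculations reuse the figures from the slides (8 of 10 candidates correct; mean 16 out of 20). Everything else is an assumption added for illustration: the function names, the sample score data, the odd/even split of items, and the Spearman-Brown step-up applied to the half-test correlation. The slides name these methods but do not give their formulas.

```python
from statistics import mean


def difficulty_objective(item_scores):
    """P for a right/wrong item: proportion of candidates scoring 1."""
    return sum(item_scores) / len(item_scores)


def difficulty_subjective(scores, full_mark):
    """P for a subjectively marked item: mean score divided by full mark."""
    return mean(scores) / full_mark


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def point_biserial(item_scores, total_scores):
    """Point-biserial coefficient: the Pearson correlation between a
    0/1 item score and each candidate's total test score."""
    return pearson(item_scores, total_scores)


def split_half_reliability(responses):
    """Split-half estimate from a candidates x items matrix of 0/1 scores.

    Assumption: items are split odd/even, and the half-test correlation r
    is stepped up to full length with Spearman-Brown, 2r / (1 + r).
    """
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    r = pearson(odd_half, even_half)
    return 2 * r / (1 + r)


# Figures from the slides: 8 of 10 candidates answer the item correctly.
print(difficulty_objective([1] * 8 + [0] * 2))        # 0.8
# Writing task marked out of 20; these invented scores average 16.
print(difficulty_subjective([14, 16, 18, 16], 20))    # 0.8

# Invented data: five candidates' scores on one item and on the whole test.
print(point_biserial([1, 1, 1, 0, 0], [9, 8, 7, 4, 3]))

# Invented data: four candidates, four right/wrong items.
print(split_half_reliability([[1, 1, 1, 1],
                              [1, 1, 1, 0],
                              [1, 0, 0, 0],
                              [0, 0, 0, 0]]))
```

A P close to 1 marks an easy item and a P close to 0 a hard one. A high positive point-biserial value suggests that candidates who do well on the whole test also tend to get the item right, which is what a discriminating item should show.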