BASED ON SCORE INTERPRETATION
1. Norm-Referenced Testing
A test that compares a person's score against the scores of a group of people who have already taken the same exam.
A norm-referenced test is a standardized test that compares a student's test performance with that of a sample of similar students who have taken the same test. After constructing a test, the test developers administer it to a standardization sample of students using the same administration and scoring procedures for all students; this is what makes the administration and scoring "standardized." The scores obtained by the standardization sample are called norms. Norms include a variety of types of scores, and they are the reference against which students' scores are compared when the test is later administered.
Once test developers
standardize a norm-referenced test, examiners can administer it to students
with similar characteristics to the norm group and can compare the scores of
these students with those of the norm group. Norm-referenced standardized tests
can use local, state, or national norms as a base. Because of the comparison of
scores between a norm group and other groups of students, a norm-referenced
test provides information on the relative standing of students.
When assessing students with disabilities, evaluators should employ caution before making comparisons or interpretations based on established norms. Typical norms may be used for interpretations that compare the relative performance of students with disabilities against the general population of students. However, when comparisons or interpretations involve the level or degree of disability, normative data should come from the population to which the comparisons are made.
Test manuals should provide
sufficient details about the normative group so that test users can make
informed judgments about the appropriateness of the norm sample (American
Educational Research Association et al., 1999).
Purpose:
The major reason for using a norm-referenced test (NRT) is to classify students. NRTs are designed to
highlight achievement differences between and among students to produce a dependable
rank order of students across a continuum of achievement from high achievers to
low achievers (Stiggins, 1994). School systems might want to classify students
in this way so that they can be properly placed in remedial or gifted programs.
These types of tests are also used to help teachers select students for
different ability level reading or mathematics instructional groups.
With norm-referenced tests,
a representative group of students is given the test prior to its availability
to the public. The scores of the students who take the test after publication
are then compared to those of the norm group. Tests such as the California
Achievement Test (CTB/McGraw-Hill), the Iowa Test of Basic Skills (Riverside),
and the Metropolitan Achievement Test (Psychological Corporation) are normed
using a national sample of students. Because norming a test is such an
elaborate and expensive process, the norms are typically used by test
publishers for seven years. All students who take the test during that seven-year period have their scores compared to the original norm group.
Content: The content of an NRT is selected according to how well it ranks students from high achievers to low, that is, how well it discriminates among students.
The normal curve represents the norm or
average performance of a population and the scores that are above and below the
average within that population. The norms for a test include percentile ranks,
standard scores, and other statistics for the norm group on which the test was
standardized. A certain percentage of the norm group falls within various
ranges along the normal curve. Depending on the range within which test scores
fall, scores correspond to various descriptors ranging from deficient to
superior.
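To make the arithmetic concrete, here is a minimal Python sketch, assuming a normally distributed norm group; the function names are ours, and the mean of 100 with a standard deviation of 15 is simply the convention used by many standard-score scales:

```python
import math

def z_score(raw, mean, sd):
    """Standard score: how many SDs a raw score sits above the norm-group mean."""
    return (raw - mean) / sd

def percentile_rank(z):
    """Percentile rank under the normal curve, via the normal CDF."""
    return 50 * (1 + math.erf(z / math.sqrt(2)))

# Illustrative numbers only: a raw score of 115 on a scale normed to
# mean 100 and SD 15.
z = z_score(115, 100, 15)      # 1.0
print(percentile_rank(z))      # ~84.1: above about 84% of the norm group
```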
The Graduate Record Exam (GRE)
The GRE is taken by college students wishing to enter graduate schools. The
test items are included in an actual exam after they are analyzed and
determined to discriminate appropriately. The following quote describes the "test development process" for the GRE:
The General Test is composed of questions
formulated by specialists in various fields. New questions are pretested in
actual tests under standard testing conditions. Questions appearing in a test
for the first time are analyzed for usefulness and potential weaknesses; they
are not used in computing scores. Questions that perform satisfactorily become
part of a pool from which new editions of the General Test are assembled at a
future date.
Other examples: IQ tests, the TOEFL, the CAT, the CTBS, and the SAT.
2. Criterion-Referenced Testing
A criterion-referenced
test is a test that provides a basis for determining a candidate's level of
knowledge and skills in relation to a well-defined domain of content. Often one
or more performance standards are set on the test score scale to aid in test
score interpretation. Criterion-referenced tests, a type of test introduced by
Glaser (1962) and Popham and Husek (1969), are also known as domain-referenced
tests, competency tests, basic skills tests, mastery tests, performance tests
or assessments, authentic assessments, objective-referenced tests,
standards-based tests, credentialing exams, and more. What all of these tests
have in common is that they attempt to determine a candidate's level of
performance in relation to a well-defined domain of content. This can be
contrasted with norm-referenced tests, which determine a candidate's
level of the construct measured by a test in relation to a well-defined
reference group of candidates, referred to as the norm group. So it might be
said that criterion-referenced tests permit a candidate's score to be
interpreted in relation to a domain of content, and norm-referenced tests
permit a candidate's score to be interpreted in relation to a group of examinees.
The first interpretation is content-centered, and the second interpretation is
examinee-centered.
Purpose:
While norm-referenced tests ascertain the rank of students, criterion-referenced tests (CRTs) determine "...what test takers can do and what they know, not how they compare to others" (Anastasi, 1988, p. 102). CRTs report how well students are doing
relative to a pre-determined performance level on a specified set of
educational goals or outcomes included in the school, district, or state curriculum.
Educators or policy makers
may choose to use a CRT when they wish to see how well students have learned
the knowledge and skills which they are expected to have mastered. This
information may be used as one piece of information to determine how well the
student is learning the desired curriculum and how well the school is teaching
that curriculum.
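As a minimal illustration of this content-centered interpretation, the sketch below labels each objective against a fixed cut score; the 80% threshold and the objective names are invented for illustration, not drawn from any real curriculum:

```python
# Hypothetical sketch: in a CRT, each examinee is judged against a fixed
# cut score per objective, never against other examinees.
CUT_SCORE = 0.80  # assumed mastery threshold: 80% of items correct

def mastery_report(proportion_correct_by_objective):
    """Label each objective 'mastered' or 'not yet' against the cut score."""
    return {objective: ("mastered" if p >= CUT_SCORE else "not yet")
            for objective, p in proportion_correct_by_objective.items()}

print(mastery_report({"fractions": 0.90, "decimals": 0.65}))
# -> {'fractions': 'mastered', 'decimals': 'not yet'}
```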
Content: The content of a CRT is determined by how well it matches the learning outcomes deemed most
important. Although no test can measure everything of importance, the content
selected for the CRT is selected on the basis of its significance in the
curriculum.
A Sample CRM (Criterion-Referenced Measure): The Performance Assessment
Performance assessment is most appropriate for determining the progress of smaller numbers of students on higher-order learning tasks. For performance assessments, students are tasked with creating or presenting a unique product or solution (a paper, a design, an oral presentation, a hands-on experiment). They are given standards or expected criteria prior to their performance. The standards are used to create rubrics or scales for use by instructors or raters in assessing student products or presentations.
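A rubric of this kind can be pictured as a small data structure. The sketch below assumes an invented three-criterion rubric scored on a 1-to-4 scale; a rater's scores are checked against the scale and summed:

```python
# Invented rubric: each criterion is rated on a fixed scale (low, high).
RUBRIC_SCALE = {
    "content accuracy": (1, 4),
    "organization": (1, 4),
    "delivery": (1, 4),
}

def score_product(ratings):
    """Validate a rater's per-criterion scores, then total them."""
    for criterion, value in ratings.items():
        low, high = RUBRIC_SCALE[criterion]
        if not low <= value <= high:
            raise ValueError(f"{criterion} score out of range")
    return sum(ratings.values())

print(score_product({"content accuracy": 4, "organization": 3, "delivery": 3}))  # 10
```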
Classroom quizzes and exams that are based on course objectives are other examples of criterion-referenced measures. Quizzes and exams can be norm-referenced, however, if the instructor purposely selects items that discriminate among students.
Another example is the PIAT (Peabody Individual Achievement Test).
BASED ON TESTING MODE
1. Direct Test
A test is said to be direct
when the test actually requires the candidate to demonstrate ability in the
skill being sampled. It is a performance test. For example, if we wanted to
find out if someone could drive a vehicle, we would test this most effectively
by actually asking him to drive the vehicle. In language terms, if we wanted to
test whether someone could write an academic essay, we would ask him to do just
that. In terms of spoken interaction, we would require candidates to
participate in oral activities that replicated as closely as possible [and this
is the problem] all aspects of real-life language use, including time
constraints, dealing with multiple interlocutors, and ambient noise. Attempts
to reproduce aspects of real life within tests have led to some interesting scenarios.
Such tests include:
· Role-playing
· Information-gap tasks
· Reading authentic texts, listening to authentic texts
· Writing letters, reports, form filling, and note-taking
· Summarising
Direct tests are task-oriented rather than test-oriented; they require the ability to use language in real situations, and they should therefore have a good formative effect on your future teaching methods and help you with curriculum writing. However, they do call for skill and judgment on the part of the teacher.
2. Indirect Test
An indirect
test measures the ability or knowledge that underlies the skill we are trying
to sample in our test. So, for example, you might test someone on the Highway
Code in order to determine whether he is a safe and law-abiding driver [as is
now done as part of the UK driving test]. An example from language learning
might be to test the learners’ pronunciation ability by asking them to match
words that rhymed with each other.
One of these words sounds different from the others. Underline it.
Door, law, though, pore
This is essentially knowledge
about the target language [or recognition of target language items] rather than
actual performance in the language. Indirect testing is controversial,
and views on it vary, but it is clear that many of the claims made for it in
the past cannot be readily substantiated. It does not give any direct
indication of the candidates’ oral proficiency, accuracy, or appropriateness of
pronunciation. In many instances, an indirect approach involves the testing of
enabling skills at a micro-level. Thus, in terms of spoken interaction, we
might seek to test learners by asking them to write down what they would
actually say in a given situation.
BASED ON CONCEPT OF LANGUAGE ABILITY
1. Discrete-Point Testing
Discrete-point tests are based on an analytical view of language: language is divided up so that components of it may be tested separately. Discrete-point tests aim to achieve high reliability by testing a large number of discrete items. From these separated parts, you can form an opinion which is then applied to language as a whole. You may recognise some of the following discrete-point tests:
1. Phoneme recognition
2. Yes/No, True/False answers
3. Spelling
4. Word completion
5. Grammar items
6. Most multiple-choice tests
Such tests have a downside in that they take language out of context and usually bear no relationship to the concept or use of whole language.
2. Integrative Testing
In order to overcome the above defect, you should consider integrative tests. Such tests usually require the testees to demonstrate simultaneous control over several aspects of language, just as they would in real language-use situations. Examples of integrative tests that you may be familiar with include:
1. Cloze tests (a cloze passage can be generated mechanically; see the sketch after this list)
2. Dictation
3. Translation
4. Essays and other coherent writing tasks
5. Oral interviews and conversation
6. Reading, or other extended samples of real text
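As noted in the first item above, a fixed-ratio cloze passage can be produced mechanically, commonly by deleting every fifth to seventh word. The toy sketch below (function name and details are ours) gaps every seventh word and keeps an answer key:

```python
import re

def make_cloze(text, n=7):
    """Replace every nth word with a numbered blank.
    Returns the gapped text and the answer key."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(re.sub(r"\W", "", words[i]))  # bare word for the key
        words[i] = f"__({len(answers)})__"
    return " ".join(words), answers

passage = ("Integrative tests require learners to draw on several aspects "
           "of language at once, much as they would in genuine communication.")
gapped, key = make_cloze(passage, n=7)
print(gapped)
print(key)
```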
3. Communicative Testing
Since the late 1970s and early 1980s, the communicative approach to language teaching has gained dominance. What is actually meant by ‘communicative ability’ is still a matter of academic interest and research. Broadly speaking, communicative ability should encompass the following skills:
· Grammatical competence: how grammar rules are actually applied in written and oral real-life language situations.
· Sociolinguistic competence: knowing the rules of language use, such as turn-taking during conversational discourse, or using appropriate language for a given situation.
· Strategic competence: being able to use appropriate verbal and non-verbal communication strategies.
Communicative tests are concerned not only with these different aspects of knowledge but also with the testees' ability to demonstrate them in actual situations. So, how should you go about setting a communicative test?
Firstly, you should attempt to replicate real-life situations within which communicative ability can be tested as representatively as possible. There is a strong emphasis on the purpose of the test, and the importance of context is recognised. There should be both authenticity of task and genuineness of texts. Tasks ought to be as direct as possible. When engaged in oral assessment, you should attempt to reflect the interactive nature of normal speech and also assess the pragmatic skills being used.
Communicative tests are both direct and integrative. They attempt to focus on the expression and understanding of the functional use of language rather than on the more limited mastery of language form found in discrete-point tests.
The theoretical status of communicative testing is still subject to criticism in some quarters, yet as language teachers see the positive benefits accruing from such testing, communicative tests are becoming more and more widely accepted. They will not only help you to develop communicative classroom competence but also bridge the gap between teaching, testing, and real life. They are useful tools in curriculum development and in the assessment of future needs, as they aim to reflect real-life situations. For participating teachers and students this can only be beneficial.
4. Performance Testing
Performance test or assessment
is a term that is commonly used in place of, or with, authentic assessment.
Performance assessment requires students to demonstrate their knowledge,
skills, and strategies by creating a response or a product (Rudner &
Boston, 1994; Wiggins, 1989). Rather than choosing from several multiple-choice
options, students might demonstrate their literacy abilities by conducting
research and writing a report, developing a character analysis, debating a
character's motives, creating a mobile of important information they learned,
dramatizing a favorite story, drawing and writing about a story, or reading
aloud a personally meaningful section of a story. For example, after completing
a first-grade theme on families in which students learned about being part of a
family and about the structure and sequence of stories, students might
illustrate and write their own flap stories with several parts, telling a story
about how a family member or friend helped them when they were feeling sad.
The formats for performance
assessments range from relatively short answers to long-term projects that
require students to present or demonstrate their work. These performances often
require students to engage in higher-order thinking and to integrate many
language arts skills. Consequently, some performance assessments are longer and
more complex than more traditional assessments. Within a complete assessment
system, however, there should be a balance of longer performance assessments
and shorter ones.
BASED ON COVERAGE OF THE MATERIAL AND TIME ALLOTMENT
1. Power Test
On a power test, the student is given
sufficient time to finish the test. Some students may not answer all the
questions, but this is because they are unable to do so, not because they were
rushed. Most classroom tests are power tests: the length has been set to permit
all students to complete the test.
2. Speed Test
On a speed
test, the student works against time. A typical speed test is the typing test
in which the student tries to improve his or her rate of words per minute. A
language test that is so long that students are unable to finish within the time
allotted and that contains items of more or less equal difficulty throughout
the test would be considered a speed test. For instance, the reading and
translation test given for doctoral candidates is frequently a speed test: the
candidates must finish the translation within a specific time limit.
BASED ON ITEM PRESENTATION
1. Computer-Adaptive Test
Computer-adaptive
testing (CAT) is a technologically advanced method of assessment in which the
computer selects and presents test items to examinees according to the
estimated level of the examinee's language ability. The basic notion of an
adaptive test is to mimic automatically what a wise examiner would normally do.
Specifically, if an examiner asked a question that turned out to be too
difficult for the examinee, the next question asked would be considerably
easier. This approach stems from the realization that we learn little about an
individual's ability if we persist in asking questions that are far too
difficult or far too easy for that person. We learn the most about an
examinee's ability when we accurately direct our questions at the current level
of the examinee's ability (Wainer, 1990, p. 10).
Thus, in a CAT, the
first item is usually of a medium-difficulty level for the test population. An
examinee who responds correctly will then receive a more difficult item. An
examinee who misses the first item will be given an easier question. And so it
goes, with the computer algorithm adjusting the selection of the items
interactively to the successful or failed responses of the test taker.
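The loop below is a deliberately simplified sketch of that up-and-down logic. An operational CAT re-estimates ability after every answer using an item response theory model; here the level merely steps by a fixed amount, and all names are invented:

```python
def run_cat(item_bank, answer, n_items=10, start=0.0, step=0.5):
    """item_bank: list of (difficulty, item) pairs; answer(item) -> bool.
    Returns the final difficulty level as a rough ability estimate."""
    bank = list(item_bank)
    level = start                     # first item is of medium difficulty
    for _ in range(min(n_items, len(bank))):
        # pick the unused item whose difficulty is closest to the current level
        i = min(range(len(bank)), key=lambda j: abs(bank[j][0] - level))
        _difficulty, item = bank.pop(i)
        if answer(item):
            level += step             # correct -> a harder item next
        else:
            level -= step             # wrong -> an easier item next
    return level

# Toy usage: an examinee who can answer everything except the hardest item.
print(run_cat([(-1.0, "easy"), (0.0, "medium"), (1.0, "hard")],
              answer=lambda item: item != "hard", n_items=3))
```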
Advantages:
In a CAT, each examinee takes a
unique test that is tailored to his or her ability level. Questions that have low information value about the test taker's proficiency are avoided. The result
of this approach is higher precision across a wider range of ability levels
(Carlson, 1994, p. 218). In fact, CAT was developed to eliminate the
time-consuming and inefficient (and traditional) test that presents easy
questions to high-ability persons and excessively difficult questions to
low-ability testees. Other advantages of CAT include the following:
- Self-Pacing. CAT allows test takers to work
at their own pace. The speed of examinee responses could be used as
additional information in assessing proficiency, if desired and warranted.
- Challenge. Test takers are challenged by
test items at an appropriate level; they are not discouraged or annoyed by
items that are far above or below their ability level.
- Immediate Feedback. The test can be scored
immediately, providing instantaneous feedback for the examinees.
- Improved Test Security. The computer contains the
entire item pool, rather than merely those specific items that will make
up the examinee's test. As a result, it is more difficult to artificially
boost one's scores by merely learning a few items or even types of items
(Wainer, 1990). However, in order to achieve improved security, the item
pool must be sufficiently large to ensure that test items do not reappear
with a frequency sufficient to allow examinees to memorize them.
- Multimedia Presentation. Tests can include text,
graphics, photographs, and even full-motion video clips, although
multimedia CAT development is still in its infancy.
2. Paper-Based Test
Paper-based testing uses traditional printed tests to assess students' abilities. It is commonly used in class during teaching and learning activities, for example in daily tests and quizzes.
BASED ON TEST-MAKER
1. Teacher-Made Test
Teacher-made tests are prepared by a teacher for use with particular groups of students, with regard to the curriculum being taught. A teacher-made test may reveal specific areas of instruction in which students need remedial help.
2. Standardized Test
Standardized tests take the form of a
series of questions with multiple choice answers which can be filled out by
thousands of test takers at once and quickly
graded using scanning machines. The test is
designed to measure test takers against each
other and against a standard. Standardized tests are used to assess progress in schools, eligibility for institutions of higher education, and placement in programs suited to students' abilities. Many parents and educators have criticized standardized testing, arguing that it is not a fair
measure of the abilities of the test taker, and
that standardized testing, especially high-stakes testing, should be minimized or abolished
altogether.
Standardized tests can either be on
paper or on a computer. The test taker is
provided with a question, statement, or problem, and expected to select one of
the choices below it as an answer. Sometimes the answer is straightforward;
when asked what two plus two is, a student would select “four” from the list of
available answers. The answer is not always so clear, as many tests include
more theoretical questions, like those involving a short passage that the test taker is asked to read. The student is instructed
to pick the best available answer, and at the end of a set time period, answer
sheets are collected and scored.
There are some advantages
to standardized tests. They are cheap, very quick
to grade, and they allow analysts to look at a wide sample of individuals. For
this reason, they are often used to measure the progress of a school, by
comparing standardized test
results with students from other schools. However, standardized
tests are ultimately not a very good measure of individual student performance
and intelligence, because the system is extremely simplistic. A standardized test can
measure whether or not a student knows when the Magna Carta was written, for
example, but it cannot determine whether or not the student has absorbed and
thought about the larger issues surrounding the historical document.
Studies on the format of standardized tests have suggested that many of them
contain embedded cultural biases which make them inherently more difficult for
children outside the culture of the test writers.
Although most tests are analyzed for obvious bias and offensive terms,
subconscious bias can never be fully eliminated. Furthermore, critics have
argued that standardized tests do not allow a
student to demonstrate his or her skills of reasoning, deductive logic, critical thinking, and creativity. For this reason,
some tests integrate short essays. These essays often receive only brief attention from graders, whose opinions on how an essay should be scored frequently vary widely.
Finally, many concerned
parents and educators disapprove of the practice of high-stakes testing. When a
standardized test
is used alone to determine whether or not a student should advance a grade,
graduate, or be admitted to school, this is known as high-stakes testing.
Often, school accreditation or teacher promotion rests on the outcome of standardized tests alone, an issue of serious concern
to many people. Critics of high-stakes testing believe that other factors, such as classroom performance, interviews, class work, and observations, should also be taken into account in these decisions.