DIRECTIONS e-journal

Bessie Mitsikopoulou
EVALUATION CRITERIA FOR THE KPG WRITING TEST IN ENGLISH

This article is concerned with the evaluation criteria for the writing test paper of the English exam, i.e., Module 2. The philosophy of this test paper is based on a functional view of language and a genre approach to writing assessment. Genre-based approaches emphasize the social constructedness of language and acknowledge that, while language is produced by individuals, its shape and structure are to a great extent socially determined. In everyday life we write as members of specific communities, producing texts which conform to different social rules (influenced by a variety of contextual factors, such as who is writing what to whom and for what purpose). Similarly, in assessment conditions, we ask our candidates to produce scripts which take specific contextual factors into consideration. In fact, the rubrics (instructions) of the writing activities require candidates to assume a role in order to produce a text of a particular text type, addressing a specific audience and meeting a predefined communicative purpose; that is, the rubrics determine to a great extent the content, the text organization and the language to be used in a candidate’s script.

Consequently, when evaluating scripts, KPG raters are trained to assess candidates’ ability to produce language which is appropriate for the situational context rather than simply correct in terms of form.

A second rule of the KPG writing evaluation system is to have achievement as the starting point rather than failure. In other words, raters are trained to consider what candidates have managed to accomplish rather than what they have failed to do. In fact, the KPG rater guide and the rating grid serve as tools to help raters focus on candidates’ communicative performance rather than the grammatical, lexical, spelling or punctuation errors they might have made when writing. In assessing performance, raters are guided to consider what candidates have done, how well they have responded to the writing activity and the degree to which they have used structures and forms appropriately, given the context, which is always specified.

Evaluation criteria

There are three evaluation criteria, which aim at helping raters focus not merely on sentence grammar and lexis, but on discourse and text as well.

Evaluation criterion 1

The first evaluation criterion has to do with task completion and is directly related to contextual features, i.e. the communicative purpose of the produced script and its appropriateness in terms of genre, register and style.

The notion of genre is used here to refer to both text type and generic process. A genre as text type (e.g. news report, email, recipe, film review) is characterised by relatively stable structural forms (e.g. particular beginnings, middles and ends), particular ways of organizing information (e.g. in paragraphs or in bullet points) and lexicogrammatical features and patterns used according to the social purpose of the text. However, texts are not only determined by an overall social purpose; they are also formed out of the dynamics of social processes, such as to instruct, to argue, to narrate, to describe or to explain. Each one of these processes is associated with different language features; for example, a newspaper article whose purpose is to report a racist event employs different language features from an article whose purpose is to argue against racism. Raters are trained to assess the degree to which a script has addressed the task set and has developed it in terms of theme/topic. They also consider whether candidates have produced an appropriate text type and responded to the required generic processes.

For example, let us say that the writing activity requires candidates to produce a report (text type) which informs readers (generic process) about the work of a volunteer programme (theme), and that instead of producing a report some candidates produced a letter. This would be partially acceptable if these candidates managed at least to inform readers about the volunteer programme’s work. However, if instead of informing readers about the work that the programme involves candidates provide information about the advantages of joining the programme, they may have failed to meet Criterion 1, especially because it is likely that they have not used appropriate register and style, which are additional requirements of this criterion, depending of course on the level of language proficiency being tested.

Evaluation criterion 2

Criterion 2 is related to text grammar (text organization, coherence and cohesion). The notion of text grammar as understood here addresses issues above the sentence level. Raters are trained to assess the degree to which candidates have managed to produce a coherent and cohesive script. Coherence refers to the presentation of ideas in a logical and understandable way. Candidates are expected to produce coherent texts by drawing on knowledge of how to organize and present their ideas from their previous experience as text producers and from their experience as readers. For instance, they know that events in a story are presented in chronological order, while arguments in an essay are often presented in terms of their importance (starting from the least important and moving to the most important arguments, or the opposite).

Candidates are also expected to produce cohesive texts. Cohesion refers to the ways in which one part of a text is linked to another, and it can be achieved in a script in a variety of ways: through the use of connectives (linking words and expressions), pronoun reference, repetition of key words, etc.

Connectives indicate how an idea presented in one sentence relates to the next one (e.g. in an antithetical way through the use of connectives such as but, on the other hand, however, etc). The use of connectives is an indicator of writing development: the more advanced a candidate’s level of language competence, the more complex the logical connectives s/he uses for the construction of complex sentences. However, some candidates make repeated use of certain formal connectives (e.g. in addition, furthermore, to conclude) which belong in a formal essay, and they may use them inappropriately in various other genres as well.

Reference is another way through which cohesion may be achieved. The use of personal pronouns is the most common way of maintaining reference while avoiding the repetition of names. Raters are trained to take into account the fact that control of reference is an indicator of how well the flow of information from one sentence to the next, or from one part of a script to another, is maintained.

Tense consistency refers to appropriate use and control of tenses in a script. The use of tense changes from genre to genre. For example, factual descriptions are generally written in the present tense, while narratives are generally written in the past tense. Raters are trained to assess a script’s use and control of tense, including any tense shifts within the same script.

Overall, the criterion of text grammar considers how all parts of a script are structured, organized and coded, in order to make it effective for the purposes of a particular communicative context.

Evaluation criterion 3

Criterion 3 is related to sentence grammar and lexical features. Sentence grammar refers to the use of language or lexicogrammar according to formal rules of grammar, syntax and morphology. Raters are trained to assess candidates’ writing performance considering the features of their sentences; e.g., correctness of clause pattern, subject-verb agreement, verb form, prepositions, articles, plurals, etc. Errors are expected in varying degrees, depending on the level. However, raters are trained not to seriously penalize errors as long as they do not interfere with intelligibility. The importance of errors which violate rules of formal grammar is assessed on the basis of whether a script manages to convey a socially meaningful message despite those errors.

Concerning lexical features, different text types require use of different types of vocabulary as well as a different range of vocabulary (or repetition), depending on determining categories such as topic, purpose and audience. For example, an academic report will use a range of technical vocabulary including nominalizations and technical noun groups; a literary description, on the other hand, will use descriptive verbs, adjectives and adverbs, and affective language in order to create an emotive effect on the reader. Raters should assess vocabulary appropriacy in terms of a specific text type and lexical range (for more advanced levels). Spelling errors are considered differently, depending on whether or not they interfere with intelligibility and depending on the level.

Punctuation and writing conventions are also assessed when considering Criterion 3.

Applying the evaluation criteria

Rating is carried out on the basis of a rating grid, which guides raters through a procedure that moves from a holistic view (i.e. overall impression) to finer points of assessment. Though the grid for each level is different, all grids have been designed on the basis of the same philosophy and can therefore be applied using the same methodology. The idea behind the grid is to have a zone-based assessment rather than to subtract points on the basis of how many errors a script has. The decision the rater has to make, by moving from right to left on the grid and from left to right, is whether the script is fully satisfactory (responding to all three evaluation criteria), moderately satisfactory (satisfying some of the criteria or satisfying them partially) or unsatisfactory (partly responding to a limited number of criteria or points of the criteria).

The application of the rating grid is generally a demanding process. In order to ensure reliable assessment and marking, the system needs to make sure that during the script rating process, raters use the rating grid systematically and correctly. In order to achieve this goal the system takes a variety of measures. The two most important ones are:

1. To produce material that provides very concrete guidelines as to how to assess and mark KPG writing scripts. It is for this reason that for every exam period a very detailed Rater Guide is produced and given (free of charge) to KPG raters. This includes analyses of the evaluation criteria, the rating grids and guidelines regarding how to mark the scripts of the particular exam period with articulated expected outcomes and real sample scripts which have already been marked. To help further, a Handbook for this purpose is being prepared. It is an RCeL publication, to appear within the RCeL publication series, of which the editors are Bessie Dendrinos & Kia Karavas. The Handbook (edited by myself) is entitled The KPG Writing Test in English.

2. To systematically work with raters. In the last issue of the ELT News, the “KPG Corner” presented the KPG script rating programme which runs every exam period before script assessment begins. The overall aim of this programme is to train raters to be able to reliably mark the scripts which have resulted from the writing activities of that period’s exams. In order to enhance assessment reliability, KPG scripts are evaluated and marked by two raters, and the final mark of each script is the average of their marks. If there is a high discrepancy between the two marks, the script is re-evaluated and the incident is treated as a cause for concern. The raters involved are informed. Of course, to avoid such incidents, raters are consistently supervised by highly qualified coordinators who are also assigned the responsibility of evaluating the raters for intra- and inter-rater reliability.
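
The double-marking procedure described in point 2 can be illustrated with a minimal sketch. The averaging of the two raters' marks comes from the description above; everything else is an assumption made purely for illustration: the function name final_mark, the numeric marks, and in particular the discrepancy threshold, since this article does not state how large a difference counts as a "high discrepancy".

```python
# Hypothetical threshold: the article does not say how large a gap between
# the two raters' marks counts as a "high discrepancy".
DISCREPANCY_THRESHOLD = 4


def final_mark(mark_a, mark_b, threshold=DISCREPANCY_THRESHOLD):
    """Combine the marks of two raters for one script.

    Returns a pair (final_mark, needs_re_evaluation). When the two marks
    differ by more than the (assumed) threshold, no final mark is produced
    and the script is flagged for re-evaluation instead.
    """
    if abs(mark_a - mark_b) > threshold:
        # Marks too far apart: flag the script rather than average them.
        return None, True
    # Normal case: the final mark is the average of the two marks.
    return (mark_a + mark_b) / 2, False


# Illustrative marks only; the marking scale is not given in the article.
print(final_mark(12, 14))
print(final_mark(5, 15))
```

In this sketch the threshold is a single parameter, so a different notion of "high discrepancy" can be tried without changing the averaging step.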
