Xenia Delieza
MONITORING AND EVALUATING THE KPG
SPEAKING TEST
Introduction
Oral assessment of language proficiency is a complex and largely subjective process in which many variables, or facets, have been found to affect the quality and quantity of candidates’ language output and the rating of their performance. Ultimately, such variability threatens the validity, reliability and fairness of the oral test procedure. The role
and linguistic behaviour of the
interlocutor during the oral exam has
been highlighted by many researchers as
a major variable which can potentially
affect candidate output and examiner
rating. With this in mind, the KPG English Team has undertaken not only systematic examiner training but also quality assessment of examiner conduct, through the English Speaking Test Observation Project (ESTOP).
This project was
launched in November 2005, aiming to
identify whether and to what extent
examiners follow oral test conduct
rules,
adhere to the test
guidelines and carry out the oral test
as instructed. In other words, by having specially trained professionals observe examiners as they test candidates, with the help of purpose-built observation tools, the English Team wanted to obtain information about the efficiency of oral test administration, about examiner conduct, about the applicability of the oral assessment criteria and about the usefulness of the marking grids. The information obtained has been essential for the development and refinement of the oral test and for the training and evaluation of oral examiners. The results from this first phase (November 2005) informed the design of Observation Phase 2, in May 2006, which in turn led to four more observation phases: May 2007, November 2007, May 2008 and November 2008.
To date, six observation phases have been carried out, and for each one a new, refined observation form has been produced, based on the findings of the previous phase.
As one can see in the
table below, during
these six observation phases 1,948 oral
examiners were observed examining 6,755
candidates.
| Phase | Observers | Level | Examiners | Candidates |
|---|---|---|---|---|
| PHASE 1: November 2005 (Levels B2 & C1) | 25 | B2 | 138 | 470 |
| | | C1 | 98 | 288 |
| PHASE 2: May 2006 (Levels B2 & C1) | 33 | B2 | 155 | 540 |
| | | C1 | 118 | 418 |
| PHASE 3: May 2007 (Levels B1, B2 & C1) | 32 | B1 | 35 | 132 |
| | | B2 | 156 | 588 |
| | | C1 | 105 | 342 |
| PHASE 4: November 2007 (Levels B1, B2 & C1) | 42 | B1 | 50 | 201 |
| | | B2 | 177 | 753 |
| | | C1 | 100 | 339 |
| PHASE 5: May 2008 (Levels A1-2, B1, B2 & C1) | 48 | A1-2 | 45 | 184 |
| | | B1 | 60 | 193 |
| | | B2 | 182 | 612 |
| | | C1 | 136 | 440 |
| PHASE 6: November 2008 (Levels A1-2, B1, B2 & C1) | 41 | A1-2 | 51 | 113 |
| | | B1 | 55 | 154 |
| | | B2 | 187 | 659 |
| | | C1 | 100 | 329 |

Table 1: The KPG observation project in numbers
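For readers who want to check the totals against Table 1, the per-phase figures can be tallied directly. The short Python sketch below is illustrative only, not part of the project's own tooling; it simply stores the figures from Table 1 and reproduces the stated totals of 1,948 examiners and 6,755 candidates.

```python
# Per-phase figures from Table 1: {phase: {level: (examiners, candidates)}}
table_1 = {
    "Phase 1 (Nov 2005)": {"B2": (138, 470), "C1": (98, 288)},
    "Phase 2 (May 2006)": {"B2": (155, 540), "C1": (118, 418)},
    "Phase 3 (May 2007)": {"B1": (35, 132), "B2": (156, 588), "C1": (105, 342)},
    "Phase 4 (Nov 2007)": {"B1": (50, 201), "B2": (177, 753), "C1": (100, 339)},
    "Phase 5 (May 2008)": {"A1-2": (45, 184), "B1": (60, 193),
                           "B2": (182, 612), "C1": (136, 440)},
    "Phase 6 (Nov 2008)": {"A1-2": (51, 113), "B1": (55, 154),
                           "B2": (187, 659), "C1": (100, 329)},
}

# Sum examiners and candidates across all phases and levels.
examiners = sum(e for phase in table_1.values() for e, _ in phase.values())
candidates = sum(c for phase in table_1.values() for _, c in phase.values())

print(examiners, candidates)  # 1948 6755
```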
How is observation conducted?
During observation, selected trained professionals are assigned to different examination centres to monitor the oral test without interfering with the procedure in any way. Watching as third parties, observers fill in their forms before, during and after each oral test. The observation forms are designed so that each one is used for only one test session, and observers are instructed to monitor each examiner twice, i.e. with two pairs of candidates.
The project has so far been conducted in a random selection of examination centres around Greece,[1] with observers moving from one examination room to another for as long as the examination sessions last, from morning to afternoon.
When their observation work is completed, observers send their forms to the English Team, where the information is processed and the data analysed. The qualitative and quantitative results are included in a report which is taken into account by the speaking test development team, by those responsible for designing the next phase of observation and by those responsible for the examiner training programme.
The observation forms
The tools prepared for this project, i.e. the observation forms, are structured as checklists, with specific categories and subcategories. Respondents circle YES/NO or tick each item, and there is also space for open-ended remarks next to certain items. The content of these forms helps the English Team to elicit information about the candidates (age, sex, literacy level and how well they did on which tasks). More importantly, however, the forms are designed to elicit information regarding the examiners and their conduct: their choice of tasks, whether or not they use time effectively, how they apply the marking criteria, etc. Finally, they elicit information regarding examiners’ language use and whether or not they alter task rubrics and thus interfere with candidates’ language output.
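To make the structure of these checklists concrete, the sketch below models one observation record along the lines described above: YES/NO items grouped into categories, with optional open-ended remarks. The field names and the example item are hypothetical illustrations, not the actual layout of the KPG forms.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ChecklistItem:
    # A single YES/NO item, with optional space for an open-ended remark.
    prompt: str
    answer: Optional[bool] = None   # circled YES (True) or NO (False)
    remark: Optional[str] = None    # free-text note, where provided

@dataclass
class ObservationForm:
    # One form per test session; observers monitor each examiner twice,
    # so each examiner yields two such records.
    level: str                      # e.g. "B2"
    categories: dict[str, list[ChecklistItem]] = field(default_factory=dict)

# Hypothetical example: one item from an "examiner conduct" category.
form = ObservationForm(level="B2")
form.categories["Examiner conduct"] = [
    ChecklistItem("Did the examiner alter the task rubric?", answer=True,
                  remark="Added an introductory question"),
]
```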
A summary of some of the results[2]
The findings of the observation project proved valuable in many respects. Firstly, they verified what the English Team suspected regarding the frequency of examiner interventions and their potential effects on the validity, reliability and fairness of the test as a whole. Secondly, they highlighted the need to introduce changes in the examiner training programme, so as to limit examiner intervention. Thirdly, the findings revealed that examiners cannot be divided into those who systematically intervene and those who do not; rather, their interference depends on a number of factors, such as the candidate’s level of competence and quality of performance, the stage of the test, etc.
More specifically, the findings reveal that examiners most frequently change task rubrics (by using an introductory question, adding a question of their own or expanding the original question with added information) in the first activity of the lower-level exams. The interpretation is that examiners tend to do this to reduce candidates’ anxiety and to facilitate language output. In the other two activities of the B1 and B2 level exams, examiners also tend to tamper with task rubrics, but less frequently than at lower levels. Interventions mainly take the form of expanding the original task rubric or simplifying it through the use of examples, in order to help candidates understand task requirements and to ensure that they respond to the demands of the task.
A general conclusion is that the higher the level of the oral test, the lower the intensity of the examiners’ interference; during the C1 level speaking test, intervention is only sporadic.
The importance of the observation project for the KPG oral test
The information elicited from the ESTOP has proven extremely useful for the KPG test developers in many ways, especially because the results have contributed to the improvement of test content. In other words, the speaking tasks now take into consideration, among other things, the results of the observation project. Furthermore, the guidelines for how to conduct the speaking test have been shaped by the observation results. One important outcome was the introduction of an Interlocutor Frame, designed to tackle the problem of variation in examiner performance.
The ESTOP has been constructive on a variety of other levels too. Firstly, it has allowed the English Team to evaluate examiners’ performance, which is very important since the ultimate aim of the system is to establish and maintain a certified body of trained examiners. Secondly, insights from the project have been crucial for the preparation of examiner training material.
For all the reasons above, and for others that will be discussed in future publications, it has become obvious that structured observation is a practical and effective way to monitor and assess both the speaking test and its examiners.
References
Karavas, E., & Delieza, X. (2009). On site observation of KPG oral examiners: Implications for oral examiner training and evaluation. Apples – Journal of Applied Language Studies, 3(1), 51-77.

[2] For a more detailed presentation of results, see Karavas & Delieza (2009).