On evaluation design pt 2.

Third set of principles

The other thing to remember is that even if you're leading the evaluation, it's not your evaluation. One thing you don't want to create is an "us and them" division within a project, where teachers provide data for the researchers. Education research should be designed with the end users – educators – in mind, and they know best what they need to know. And everyone in the project is bound to have a good idea about research questions (I refrained from saying "better idea" but that's probably true too). So the research questions, survey design and sources of data all need to be collaboratively created, with practitioners and (if they're interested) students. If other practitioners want to contribute to the evaluation and to the writing of the report and any papers coming out of the project, they have a right to do that too, and should be included.

I know of some projects (none I've been involved with, thankfully) where academics have just gone to ground with the results and, months later, have a paper published without offering anyone else within the project the opportunity to be involved and get a publication out of it. Which isn't on. The AMORES project brought some of the schoolchildren along to the final conference. This shouldn't really be exceptional, but it still is. Arguably it's the learners who are the rationale for doing all of the research in the first place. (A competing argument is that it's our mortgage lenders who are the rationale for doing it, but that's another post entirely.)

So .. #3 evaluation design should be egalitarian, inclusive, participative.

Now is probably a good time to mention ethics, as it brings together all of the principles we've discussed so far.

Obviously everyone who takes part in the project needs to be protected. Everyone taking part has the right to anonymity, so usually I get students to adopt a pseudonym for all interactions. There's a piece of paper somewhere that matches pseudonym to real name (in case a student forgets and needs to look it up), but that never goes online and never leaves the classroom. Protecting the identities of staff is also important if that's what they want, as is acknowledging their participation if that's what they want instead; just remember to ask which it is. But ethics is really the underlying reason why you want the evaluation to be useful (you're ethically obliged to put something back into the sector if you're taking time and resources from it) and to be egalitarian (everyone deserves a chance to be published and to have a creative input to the process).

So #4 Be ethical
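As an aside, here's a minimal, purely illustrative sketch of that pseudonym bookkeeping (Python is my choice, and the file name and pseudonym format are hypothetical). In practice the students usually pick their own pseudonyms; the point is simply that the key linking names to pseudonyms lives in a single local file – the stand-in for that piece of paper – and nowhere else.

```python
import csv
import random

def build_key(names, key_path="pseudonym_key.csv"):
    """Assign each student a neutral pseudonym and write the key to one local file.

    The key file stands in for the "piece of paper" that matches pseudonym to
    real name: it stays on the classroom machine and never goes online.
    """
    shuffled = list(names)
    random.shuffle(shuffled)  # don't let pseudonym order mirror the class register
    key = {name: f"student-{i + 1:02d}" for i, name in enumerate(shuffled)}
    with open(key_path, "w", newline="") as f:
        csv.writer(f).writerows(key.items())
    return key

# Hypothetical usage; every data file after this point carries only the pseudonyms.
key = build_key(["First Student", "Second Student", "Third Student"])
```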

The fifth set of principles is possibly the most difficult to put in place. Every principle put in place up to now has led to a whole set of different data, from different sources, that just happen to be around, contributed by, and perhaps analysed by, a lot of different people. At this stage, it could be seen to be a bit of a mess.

However, that's where the skill of the evaluator comes into its own. It's taking these disparate sets of data and looking for commonalities, differences, comparisons, and even single case studies that stand out and elucidate an area on their own. The strength of having such disparate sets of data is that they are:

#5.1 eclectic, multimodal, mixed methodologically

However, it's still necessary to put a minimum (remember, light touch) of more robust evaluation in place at the core, in the form of a survey or questionnaire. This needs to contain a pre- and post-test and be open to quantitative analysis (some people only take numbers seriously); a sketch of this sort of analysis follows below. This runs against the idea of being aligned with practice and opportunistic, as it's an imposed minimum of participation, but as long as it's not too onerous, I don't think it's too much to ask. Usually, though, this is the bit that requires the most struggle to get done.

So .. #5.2 quantitative comparative analysis, requiring only a minimum of imposed involvement from practitioners to complete, provides an essential safeguard that ensures the robustness of the research
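To make that concrete, here's a minimal sketch of the kind of pre-/post-test comparison I mean, assuming (hypothetically) that the responses sit in two CSV files with one row per pseudonymised participant and one column per 5-point Likert item, scored 1–5, with blanks for "don't know" and "N/A". The file and column names are my own invention, and a paired t-test would do just as well as the Wilcoxon test if the data warrant it.

```python
import pandas as pd
from scipy.stats import wilcoxon

# Hypothetical file names: one row per pseudonym, one column per Likert item (1-5),
# with blanks where the response was "don't know" or "N/A".
pre = pd.read_csv("pre_survey.csv", index_col="pseudonym")
post = pd.read_csv("post_survey.csv", index_col="pseudonym")

# Compare only the participants who completed both surveys.
common = pre.index.intersection(post.index)
pre, post = pre.loc[common], post.loc[common]

for item in pre.columns:
    # Drop pairs where either response is missing.
    paired = pd.concat([pre[item], post[item]], axis=1, keys=["pre", "post"]).dropna()
    if len(paired) < 2 or (paired["pre"] == paired["post"]).all():
        continue  # the Wilcoxon test is undefined when every difference is zero
    stat, p = wilcoxon(paired["pre"], paired["post"])
    print(f"{item}: n={len(paired)}, median pre={paired['pre'].median()}, "
          f"median post={paired['post'].median()}, p={p:.3f}")
```

The particular test isn't the point; the point is that the same small instrument, given before and after, yields numbers that can be compared directly.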

However, this is not the only robust aspect. Even though the remainder of the data are opportunistic, they are so wide-ranging that they will inevitably provide qualitative data in sufficient quantity, and with sufficient triangulation, that this would be an effective evaluation in itself. It's just good to have some numbers in there too.

Making the best of these elements, post hoc, is the most difficult aspect of this style of evaluation, and requires a bit of time just sifting through everything and working out what it is you've actually got. Allow a week without actually getting anything concrete done. It's OK, it's just part of the process. It requires the evaluator to synthesise the findings from each set of data, and therefore to be

#5.3 flexible, creative, patient

As Douglas Adams once said (though he was quoting Gene Fowler) “Writing is easy. All you do is stare at a blank sheet of paper until drops of blood form on your forehead.”


Finally, the outputs. Both the BIM Hub project and the AMORES project produced their evaluation reports in the same two forms. Given the aims of being both useful and methodologically robust, I think having the outputs in these two forms is essential.

Typically these two forms are:

A “how to” guide – the AMORES one is at this link:

http://www.amores-project.eu/news/why-the-amores-teaching-methodology-is-the-secret-ingredient-to-teaching-literature

The BIM Hub one is here:

http://bim-hub.lboro.ac.uk/guidance-notes/introduction/

Both of these summarise the key points of learning from the project, in a form that lets practitioners adopt this learning and incorporate it into their own practice.

However, backing up these documents are fuller evaluation reports detailing the data and analysis, showing how these points of learning were arrived at, and providing the evidential basis for the claims. It isn't essential that people read these, but they do provide the authority for the statements made in the summary documents.

Finally, both projects also include visual materials that contribute to the evidence. In the BIM Hub project, these are recordings of the meetings the students held, showing how their abilities developed over time. For the AMORES project, there are dozens of examples of the students' digital artefacts. In short, when you're publishing the evaluation you also want to reassure your audience that you haven't just made the whole thing up.

i.e. the final principle: generate artefacts during the project so that at the end you can show that it is a real project, with real students, doing real stuff


On evaluation design pt 1.

Some thoughts on my approach to evaluation design

I've just finished another internal evaluation of a project. This time it's the AMORES project (http://www.amores-project.eu/). Reflecting on the evaluation, and on the similarities with the previous evaluation I did, led me to some realisations about the sort of evaluations I conduct, how they are designed, and what their essential elements are. I thought I'd collect these together into a couple of blog posts, mainly so that the next time I design one, I can remember the best of what I did before.

I should specify that I’m discussing particularly internal evaluation. For those not familiar with educational projects, most of them have two evaluation strands. One is the external evaluation; this is conducted by someone outside of the project who examines how well the project functioned, whether it met its goals or not, how well communications worked within it, and so on. It’s part of the Quality Assurance, compliance and accountability process.

The internal evaluation asks questions of the learners, teachers and anyone else involved with the educational aspects to identify good practice, look for tips that can be passed on, and encapsulate the overall experience for the learners and educators. In short, it’s there to answer the research questions addressed by the project.

There’s a good deal of overlap between the two, but they are essentially different things, and should be done by different people. You merge the two at your peril, as part of the external evaluation is to address the success of the internal evaluation. And you do really need both to be done.

I've been the internal evaluator on 13 education projects now, but the last two (the other one was the BIM Hub project, http://bim-hub.lboro.ac.uk/) were very similar in evaluation design, and I think I've cracked the essential elements of what an internal evaluation should look like.

Part of the issue with being an internal evaluator is that, even though you’re part of the project team, you’re not (usually) one of the teachers. And teachers on projects have their own agenda, which is to teach (obviously) and, quite rightly, this takes precedence over all the analysis, research and general nosiness that a researcher wants to conduct.

For this reason, an evaluation design needs to be as unobtrusive as possible. Most education activities generate a lot of data in themselves: artefacts, essays, recordings of teaching sessions; all of these can be used without placing any additional burden on the learners or teachers. Sometimes the evaluation can drive some of the learning activities. For example, you need students' perceptions of their learning, so you set a reflective essay as an assignment. You need something to disseminate, so you set students the task of creating a video about their experiences, which can also serve as evaluation data. And when we've done this, not only has it proved to be a very useful set of data, it has also been an excellent learning opportunity for the students. Teaching generates a lot of data already, too, such as grades, results of literacy testing, pupil premium figures and tracer studies. As long as the institution releases the data, this is stuff you can use with no impact on the learners or teachers.

So here’s the first set of criteria. Evaluations must be:

Unobtrusive, opportunistic, aligned with teaching practice

The second set of criteria is related to actually having an evaluation that makes sense. There's no point gathering a set of data that are more than you can deal with (having said that, every project I've done has). Also, the data you collect have to be targeted towards finding out something that will be of use to other practitioners once you've finished the project (I'll come to outputs later). The RUFDATA approach is a good one here. There's also no point trying to gather so many data that no-one will look at the surveys you're distributing, or complete them if they start them. For survey length, the principles that seem to work are:

Quantitative questions – no more than one page (and use a 5-point Likert scale, obviously – anything else looks ridiculous – but add "don't know" and "N/A" as options too); there's a sketch of encoding those responses for analysis after this list.

Free text questions: well, no-one wants to write an essay, and if it's on paper you'll have to transcribe the answers at some point anyway. As far as numbers go, a good rule of thumb is that if it's a number you'd see in a movie title, it's OK. So seven, or a dozen, or even 13, is fine. More than that is pushing it (and if you're going to ask 451 or 1138 questions then full marks for movie trivia, but minus several million for being a smart arse). The point of the movie title thing is that if you see your research questions as characters in the narrative you're going to weave, then you don't want to overcrowd your story anyway; putting too many in becomes pointless. You want all your questions to be Yul Brynners rather than Brad Dexters.

So: useful, targeted, light touch, practicable
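For what it's worth, here's a minimal sketch of encoding those quantitative responses, assuming (hypothetically) a raw export with the Likert labels as text in each cell; the file name, index column and wording of the scale are my own assumptions rather than anything tied to a particular survey tool. "Don't know" and "N/A" simply fall off the scale and become missing values, rather than being forced onto it.

```python
import pandas as pd

# Hypothetical wording of the scale; adjust to match the actual questionnaire.
LIKERT = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
    # "Don't know" and "N/A" are deliberately absent: they become missing
    # values (NaN) rather than being squeezed onto the 1-5 scale.
}

raw = pd.read_csv("survey_export.csv", index_col="pseudonym")  # hypothetical export
coded = raw.apply(lambda col: col.map(LIKERT))

# Simple per-item summaries, ignoring the "don't know" / "N/A" responses.
print(coded.describe().T[["count", "mean", "50%"]])
```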

A third set of principles is based around the question: whose research is it anyway? That will be covered when we reconvene in the next post.