By Tom Ling
European Evaluation Society
Years ago, I was talking to an experienced development professional about integrated nutrition programmes. I observed that the approach taken by their organisation differed from the approach taken by another leading organisation. I was curious to understand why these approaches differed, and asked if evaluation could identify good (or even better) practice. The answer was polite enough but disturbing: ‘I think, Tom, after 30 years in international development, I know what works’. The logic was implicit but clear: ‘I don’t need evaluators to tell me how to do my job’. To be honest, I have sympathy for this view. I believe the evaluation community should be more helpful. Here I suggest two ways we might be more useful: first, by putting complex programmes much more clearly in their social context and, second, by co-producing responses alongside practitioners and other decision makers.
Decision makers have not always been well served by what the evaluation ecosystem produces. What help does this ecosystem offer the professional from the previous paragraph? In the case of integrated nutrition programmes, she might think it is not at all helpful. Undernutrition contributes globally to 45 per cent of preventable deaths in children under 5, but the evaluation community is a long way from providing coherent evidence to deliver better care for children. Some years after the conversation mentioned above, a systematic review reinforced the concerns of our practitioner:
There is substantial evidence of positive nutrition outcomes resulting from integrating nutrition-specific interventions into nutrition-sensitive programmes. However, there is paucity of knowledge on establishing and sustaining effective integration of nutrition intervention in fragile context.
‘What works in what context’ is a helpful starting point, and developing middle-range theories based on understanding the programme theory of change and its context is an important part of evaluation practice. However, the application of this mantra to highly complex programmes in ‘rugged’ operating environments raises multiple problems. These problems start, I believe, from designing the evaluation around a narrow theory of change and then asking: ‘based on this individual evaluation, what can we say about what works in what context?’ This immediately sets us off on the wrong foot.
The importance of context was highlighted in the seminal work of Pawson and Tilley. However, it is not clear that evaluators have taken ‘context’ sufficiently seriously. Most often, a complex programme is a very small event in a very large system. Where the programme bears fruit and delivers benefits, it is because of how it lands in, and works with, this system. The primary causal driver is often not the programme but the social systems it is part of and contributes to. In these complex circumstances, programmes will often have the following characteristics (see Woolcock, 2022):
Implementing practitioners have considerable discretion when delivering the programme;
‘Success’ depends upon multiple transactions and negotiations across different individuals and organisations;
The aims, resources and imperatives of the programme are only part of what drives behaviours, including behaviours of the intended beneficiaries; and
Intended beneficiaries are not defined by the programme but they have agency which they use in ways which may have nothing to do with the programme (and individual-level behavioural theory may be especially unhelpful in this context).
Addressing this involves putting human agency more firmly at the centre of evaluation. It is people who make things happen, not programmes. But although people drive change, they do not choose their social circumstances. These circumstances include (among other things) the unequal distribution of resources and power. They are not immutable, but they account for the observed patterning of social life. We need to draw more heavily on social science (among other sciences) to bring this patterning into evaluations.
So where does this leave our development professional who we met in the first paragraph? To be helpful, evaluations should relate to practitioner experience in three ways:
The theory of change should include a deep understanding of the social circumstances which led to the problems arising in the first place and thwarted previous efforts,
The analysis should use social science and resist over-individualising behavioural explanations without ignoring the importance of human agency, and
We should understand programmes as small events in large systems and attribute causality to social processes and not to programme logics.
And would our practitioner be satisfied with this? Well, to some extent but perhaps not entirely. The next piece of the ‘what works in what context’ jigsaw should involve paying much more attention to building scientific knowledge over time. We should move away from asking only ‘did the programme work?’ and towards also asking ‘what have we learned about how better to deliver the Sustainable Development Goals (SDGs) and the other pressing challenges of our age?’ Development professionals should be part of a learning system that uses evaluation to help answer questions that they think are important, using evidence that contributes to better informed judgements and decisions.
Finally, even if we engage more fully with understanding the complex environments of international development, evaluators should, I believe, take more responsibility for collaborating with practitioners. Over two decades ago, Ziman raised the question of how professionals from a scientific background could communicate better with decision makers in politics and law. I would extend this to include how evaluators (especially those seeking to draw more heavily on social science) should engage with practitioners in international development. Ziman says:
Scientists who are only accustomed to the scientific mode of disputation are not well prepared for the debating rituals of transcientific controversies. They bring into the proceedings the scientific expertise and presentational skills which have stood them well professionally and find that these do not work as usual. That is to say, their accustomed rhetorical style, shaped and refined in purely scientific arenas, just does not succeed …
Our aim as evaluators should be to respond positively to the ‘different rhetorical styles’ of practitioners (including, I should add, a greater willingness to challenge some of these rhetorical styles) so that practitioners (and policy makers) have more confidence that evaluations can help them do their jobs better. For this to succeed we need to redesign the evaluation ecosystem: reconsidering how we frame problems, how we design and conduct evaluations, how we include young and emerging evaluators and engage their energy and creativity, and how we communicate and make sense of our findings. Contributing to this is where Eval4Action and other leading parts of our evaluation community add great value. We need to reach out to the funders and users of our evaluations as part of this redesign. In this, we need to be less mesmerised by the need to be independent, and more concerned with how we contribute to turning around our stalled SDGs and building a just transition to a better future.
Editor's note: This blog was written during Tom Ling’s tenure as the President of the European Evaluation Society.
Tom Ling has over 30 years of experience in designing, managing, and delivering complex evaluations focused on innovation, impact and quality. His clients have included UK Government Departments and agencies, the European Commission, UNDP, OECD, the World Bank, and many others. He is a senior research leader and head of evaluation at RAND Europe. In addition to his current role at RAND Europe, Tom has worked as head of evaluation at Save the Children and as a senior research fellow at the National Audit Office, and has held various academic posts including Professor Emeritus at Anglia Ruskin University. He is the former President of the European Evaluation Society and an advisor to the World Bank’s Global Evaluation Initiative. Tom can be reached via LinkedIn and email at firstname.lastname@example.org.
 The term ‘evaluation ecosystem’ refers to the interlocking processes through which evaluation needs are identified, evaluations are commissioned, suitable evaluation providers identified, proposals submitted, evaluations conducted, and evaluation results published and used.
 Abdullahi, L.H., Rithaa, G.K., Muthomi, B. et al. (2021). ‘Best practices and opportunities for integrating nutrition specific into nutrition sensitive interventions in fragile contexts: a systematic review.’ BMC Nutr 7, 46. https://doi.org/10.1186/s40795-021-00443-1
 A ‘rugged’ environment is one which is highly variable and unpredictable, such that we cannot easily transfer lessons about best practice from one context to another. See: Pritchett, L., Samji, S., and Hammer, J. (2012). ‘It’s all about MeE: Using structured experiential learning (‘e’) to crawl the design space.’ Helsinki: UNU-WIDER Working Paper No. 2012/104
 Pawson, R. and Tilley, N. (1997). Realistic Evaluation. London: Sage.
 Woolcock, M. (2022). ‘Will It Work Here? Using Case Studies to Generate ‘Key Facts’ About Complex Development Programs.’ In J. Widner, M. Woolcock, & D. Ortega Nieto (Eds.), The Case for Case Studies: Methods and Applications in International Development (Strategies for Social Inquiry, pp. 87-116). Cambridge: Cambridge University Press. doi:10.1017/9781108688253.006
 Or as Marx noted in The Eighteenth Brumaire of Louis Bonaparte: “Men make their own history, but they do not make it just as they please; they do not make it under circumstances chosen by themselves, but under circumstances directly encountered, given and transmitted from the past.”
 Bourdieu, P. (1984). Distinction: A Social Critique of the Judgement of Taste. London, Routledge.
 Ziman, J. (2000). ‘Are debatable scientific questions debatable?’ Social Epistemology, vol. 14, nos. 2-3, 187-199.