EDUCATION REVIEW

Chelimsky, E. & Shadish, W. R. (Eds.). (1997). Evaluation for the 21st Century: A Handbook. Thousand Oaks, CA: SAGE Publications, Inc.

Reviewed by Daniel J. Heck, University of Illinois at Urbana-Champaign

February 19, 1998

Evaluation as an academic discipline, a profession, and a government function has only developed in the past four decades in the United States and in several other industrially developed nations. In many nations, however, evaluation is in its infancy as a standardized pursuit; and certainly on a global scale, evaluation is only beginning to enter the scene. Eleanor Chelimsky and William Shadish's Evaluation for the 21st Century: A Handbook examines the maturation of evaluation in economically advanced democracies alongside the advances in evaluation now budding in developing nations, and the escalation of evaluation approaches toward traditional social programs alongside the growth in applications of evaluation approaches to many non-traditional areas. The organization of the volume uses several lenses to structure its explorations, capitalizing on the experience and expertise of evaluators, theorists, and methodologists worldwide.

The chapters in Chelimsky and Shadish's volume derive from the International Evaluation Conference held in Vancouver, British Columbia, Canada in November, 1996. The conference represented one of the first large international events for the field of evaluation. The chapters of the resulting book are a distilled collection of diverse thinking about evaluation from evaluators, academics, and auditors from eleven nations and the World Bank.

Evaluation for the 21st Century provides a prospective view of the field of evaluation in the coming century through a divergent examination of how the field came to be where it is today and how its theories, principles, and methods are being applied across nations, disciplines, topics, and institutions. The book implicitly calls for those who study, practice, and use evaluation to broaden their view of the applications and functions of evaluation with an eye to the future. Its prospective view in this sense entails expanding our vision in time, as the practice and use of evaluation engages more lasting phenomena, in space, as evaluation is undertaken in more program areas and nations, and in scope, as local, national, regional, and global issues are increasingly recognized as pertinent to all evaluation studies. At the same time, the editors and authors urge evaluators to capitalize on unifying forces in the field.

The editors acquaint the reader with the major threads running through each section of the book with analytical summaries of the chapters. These summaries both identify the themes that connect the chapters in each section and highlight many of the most important contributions that each author makes. Each section independently offers important contributions to the field, which I will touch on briefly, while a number of cross-cutting themes connect various chapters throughout the volume. As the book is organized, each section stands well on its own, as a reader might expect in a handbook, but it is the cross-cutting themes relating historical developments, political implications for evaluation, purposes for evaluation, novel applications of evaluation, methodologies, and theoretical insights that will likely invite the reader to return to Evaluation for the 21st Century as a valuable source book.

The book challenges the reader to consider the shape evaluation may take in the new century; the roles that evaluation might play in local, national, international, and global matters; and some challenges that likely lie ahead for the still emerging field of evaluation. In particular, the many issues of changing political context with increasingly global concerns of sustainable development in the post-Cold War period must be considered in terms of what knowledge, understanding, and action evaluation might offer. Moreover, evaluators must contemplate how these issues will impact the field of evaluation. As such forces are likely to expand the perspectives in an already eclectic field, the editors and authors challenge evaluators to transform debates over methodological and epistemological conflict into serious pursuit of integrated approaches that respect pluralistic value systems.

I will not attempt to replicate here the introductory summaries that the editors provide to each section of Evaluation for the 21st Century, but it is worth briefly delineating a few of the strongest messages the book advances through each section. Chelimsky's preface on the future of evaluation and her chapter on the current shape of the field along with Thomas Cook's examination of lessons learned in the young history of evaluation lay out an issue that runs throughout the collection--as evaluation emerges in political context it must capitalize on diversity in the field while seeking and fostering unity.

Several chapters on auditing and evaluation along with two regarding performance measurement demonstrate that as traditional financial auditing, which focuses on inputs and expenditures, moves toward performance auditing, which focuses on outcomes and impacts, evaluative approaches are increasingly being used in auditing institutions. In particular, the common public calls for more affordable, effective, and efficient government across many nations have created a need for institutionalized evaluation functions in parallel with government auditing.

The next two sections of the volume contain a wealth of examples of how evaluation is newly developing in many nations (e.g., Denmark, People's Republic of China); how evaluation is being applied to global issues (e.g., gender issues, sustainable development, environmental issues) on a local, national, and global scale; and how evaluation approaches are being used to advance knowledge, understanding, and action in topical areas (e.g., human rights violations, the Chernobyl nuclear accident) beyond the usual reach of program evaluation. These sections offer and encourage divergent thinking about where, how, and why evaluation is employed and to what its unique perspectives might be applied. Chelimsky argues that unique strengths of evaluation are its propensity to question the status quo and "its skepticism about conventional wisdom" (p. 25). The diversity of evaluation developments, applications, and uses that permeates these two sections represents a self-renewal of evaluation as a field through the reflexive employment of evaluative doubt and questioning toward certain traditional boundaries of the field.

While Evaluation for the 21st Century is by no means a comprehensive compendium of methodologies currently in use, the several chapters on current methodologies represent an interesting array of emerging techniques that demonstrate well how eclectic thinking about the purposes of evaluation and the nature of knowledge and valuing supports an eclectic collection of methodologies. In terms of matching evaluation approaches to situations, cluster evaluation (James R. Sanders) is argued to be as well-matched to large-scale, multi-site programs as repeated measures, single case evaluation (Mansoor A. F. Kazi) is to individual case management in social work. In terms of the nature of knowledge and value, scientific realist evaluation (Ray Pawson & Nick Tilley) is argued to provide strong information about the nature of social programs understood from a critical realist perspective, while empowerment evaluation (David M. Fetterman) is just as convincingly argued, from a critical perspective, to connect knowledge with action in a democratizing function.

Finally, the volume concludes with two nuggets from influential figures in the international development of evaluation over the past several decades. Bob Stake describes the near inevitability, if not the necessity, for evaluators to accept the role of advocate for certain of the programs they evaluate. Michael Scriven argues against the advocate role, preferring instead a strict separation of the role of evaluator--one who judges merit, worth, and quality--from the role of evaluation consultant--one who provides evaluation-related services, which might include advocacy. The two pieces highlight an important and, to be sure, abiding debate about the nature of evaluation. Moreover, placed side by side, the two chapters provide a fascinating comparison of two writing styles and methods of argument that Stake and Scriven have developed to art forms.

I choose to highlight three cross-cutting themes in the volume: the influence of political context on evaluation, expanding functions of evaluation, and the role of evaluation in a just, democratic society. Each of these themes is discussed in the introductory chapters of the book, along with others. I select them because they cross the somewhat blurred lines between topics of evaluation and methods of evaluation and as such represent deeply permeating trends in the field likely to hold sway during the coming century.

The political context. Across national boundaries, public sentiment for affordable and accountable government has created a new political climate for evaluation. Authors who have worked in auditing and evaluation from Canada (L. Denis Desautels), Sweden (Inga-Britt Ahlenius), the United States (Joseph S. Wholey), and the United Kingdom (Caroline Mawhood) show that public opinion and legislative action have combined to produce imperatives for evaluation of a broader range of government functions as well as methods that place a premium on attribution of results and impacts to specific appropriations and allocations of funds for particular government programs. Elimination of duplication and waste in efforts to streamline government characterizes the merging of evaluation and auditing functions across nations.

Yet it is more than common trends across nations that characterize the changing political context of evaluation. Global issues of common concern will also impact the field. In two chapters, W. Haven North, Charles A. Zraket, and William Clark explore issues related to evaluation of global environmental impact. North's examination of the evaluation of the initial phase of the Global Environmental Facility (GEF) of the World Bank and United Nations demonstrates that the political contentions underlying efforts to create transnational and transorganizational bodies such as the GEF are as much a concern for evaluators as the topical concerns of the bodies. Zraket and Clark explore more broadly environmental impact and sustainability as a new topic for evaluation. Their conclusions highlight the need to recognize and respect other juxtaposed and highly political global issues, such as industrial and economic development, in any attempt to evaluate environmental impact and sustainability. Moreover, they determine, evaluation of environmental issues on the global scale will require innovation in international collaboration to create, maintain, and share information systems. Methodologically, the cluster evaluation approach (James R. Sanders) may provide a good fit as evaluators contend with political differences surrounding generally agreed upon overarching goals. The approach is designed to target both the relatively autonomous local level and the guiding global level of large-scale programs. It includes building political consensus around common goals from the top-down and bottom-up while respecting the political need for local autonomy in order to foster innovation and application of the program in ways that are responsive to local concerns. Finally, attention to sharing information and successful innovations is highlighted.

Another perspective on global issues is taken in a captivating piece by Masafumi Nagao on the influence of global trends on local community programs. Nagao's focus is on Oguni, a small, isolated mountain community in Japan. Oguni, like many rural communities around the world, is dealing with the local implications of global trends toward urbanization of economic functions, transiency of youth, and international competition in open markets. Oguni has initiated a comprehensive response to rebuild and transform the community through new uses of its principal resources (native lumber and spectacular scenery) and the inauguration of highly visible cultural programs (a music festival and youth exchange program). In the process, Oguni has also engaged other global issues, most notably the role of women in society. Nagao contends that any local evaluation situation must attend to the interdependence of local and global issues, actions, and reactions. By what methods might evaluators undertake such a challenge? One possibility is suggested by Lois-ellin Datta: multimethod approaches that include case studies. Datta's exploration of the methodology centers on an evaluation of the H-2A program in the United States, an effort to improve agricultural productivity through the granting of temporary immigration permits to farm workers when suitable local labor is not available. The program was further intended to protect the rights and wages of domestic farm workers. H-2A, like the Oguni development initiative, confronted global issues of immigration and fair wages at very local levels. In response, the evaluation included both broad-scale methods to examine the nationwide impact of H-2A in a multitude of agricultural areas and in-depth case studies of specific counties in a specific agricultural industry. The case studies revealed the dependency of the program's success on local political conditions. Efforts to discourage domestic farm workers from seeking employment were common, as was the continued hiring of illegal immigrants. How local decisions were intertwined with global issues of immigration and labor rights was illuminated through case studies combined with more traditional methods of survey and document analysis.

Functions of evaluation. It is generally agreed that evaluation is undertaken for the purpose of judging the merit or worth of something, but whether that function is a worthwhile end in itself is hotly debated. A number of other possible functions emerge in the explorations and examples in Evaluation for the 21st Century. In some sense, each of these functions is a response to calls for evaluation to claim and demonstrate its own merit or worth, reiterated in this volume by a number of authors. The common claim for the worth of the functions I draw out of the book--informing policy and practice, building capacity for self-renewal, synthesizing information from multiple sources, and establishing sustainability--is that evaluation provides unique foundations for information, knowledge, and action otherwise unavailable.

The need to match evaluation approaches to evaluation situations for the purpose of informing policy and practice is presented in Mansoor A. F. Kazi's chapter on single case evaluation in social work in the United Kingdom. Kazi highlights how methodological choices in evaluation interact with the choices made in practice. While fixed, repeated measures intervention designs, which were originally intended by the policy requiring evaluation, would have provided greater evidence of causality and generalizability in the traditional senses, the practice of social work demands great flexibility in intervention and appropriate policies need to allow for such flexibility. Used for the purpose of informing and enriching social work practice and policy, the evaluation approach chosen had to remain responsive to the needs of a field in which fixed designs failed to capture appropriately the dynamics of practice. Rather than require fixed designs with fixed intervention schedules, ultimately, the approach taken was to allow designs to emerge in response to the needs of consumers. Synthesized evaluative information across cases with common features, but deriving from practical decisions, Kazi argues, is far better for informing policy and practice.

Eduardo Wiesner examines the function of evaluation to inform policy and practice on a vastly different scale in his chapter on evaluating the reform agendas of developing nations. In Wiesner's analysis, policies and practices of reform typically include privatization, fiscal and political decentralization, and altered spending priorities. However, he asserts, each of these movements only supports development in conjunction with institutions and organizations operating in well-functioning markets that include competition. Evaluation can be misused, he warns, if institutions and organizations are simply held to unvalidated performance indicators rather than to standards of excellence benchmarked in well-functioning markets and to comparative standards of efficiency that result from competition. Independent evaluation, Wiesner argues, is uniquely suited to informing the policies and practices of development programs by providing results- and efficiency-oriented information needed to create and maintain functioning markets.

An imperative of foreign aid programs in Sweden (Kristina Svensson) and of World Bank development programs (Robert Picciotto) is the development of capacity for self-renewal in recipient nations. Included in the capacity for self-renewal is the development of evaluation functions, which the authors view as an extension of the evaluation of aid and development programs. The notion of building evaluation capacity for the purpose of self-renewal also underlies the methodological approach David M. Fetterman calls empowerment evaluation. Empowerment evaluation engages program participants in generating and selecting evaluation questions, collecting and analyzing information, and making value judgments. The empowerment evaluator acts as a facilitator, consultant, and critic to the participant-evaluators. In many respects, the same functions are being called for in aid and development programs, especially as evaluations are expected to include local evaluators and the development of enduring evaluation technologies and capacities in recipient nations.

As evaluation grows within and across national boundaries, the potential contributions of information and understanding extend beyond the evaluative findings in individual studies. Surely, syntheses of information within nations and regions, within topical areas, and around global issues can be greater than the sum of individual studies. Syntheses, however, present one of the greatest challenges to evaluation in the coming century. Methodological and epistemological pluralism in the field will likely make the process and outcomes of syntheses difficult and controversial, but the potential contribution outweighs such obstacles. Zraket and Clark particularly highlight the need for and benefits of syntheses in their chapter on evaluation of environmental impact. Their recommendations include a need to encourage large-scale and small-scale government evaluations along with independent professional and citizen evaluations in order to draw on complementary strengths. They further emphasize a need to coordinate and integrate evaluation efforts, access to information systems, and connections between knowledge and action so that the production and use of syntheses can be most effectively facilitated.

How such syntheses might be undertaken in one context is the subject of Judith A. Droitcour's chapter which draws on an example from medical research. Droitcour lays out the strengths and shortcomings of randomized experimental designs and designs utilizing naturalistic information in various treatments for breast cancer. While randomized designs favor conclusions regarding causality, they may lack generalizability to practice due to differences in the range of subjects included and differences from practice that cannot be replicated. While a design using naturalistic information may provide generalizability, it is unlikely to support causal inferences due to uncontrolled biases in the information. Utilizing these strengths and weaknesses in a complementary synthesis design, Droitcour shows that cross-design syntheses not only reach beyond the potential of single studies but also beyond meta-analysis of studies of the same design.

Democracy and justice. Chelimsky asserts in the book's introduction that the nature of evaluation includes "dedication to democratic reform on the basis of knowledge" (p. 25). Many of the contributing authors echo the inescapable link between evaluation and democracy. The implications of that link are evident both in the methods described and in the topics engaged by evaluation in examples included in Evaluation for the 21st Century.

Michael Bamberger explores the study of gender issues in projects funded by the World Bank in Tunisia through the Bank's International Development Fund program with Tunisia's Ministry of Women's Affairs. Bamberger points out that the World Bank has recognized that successful development depends on the democratic participation of women in development projects. However, existing roles of women present a number of challenges for evaluation of their participation in development programs. Baseline data is difficult to collect due to the many non-paid economic functions women perform. Control groups are difficult to choose or construct due to the interdependence and systemic nature of many development projects. Data collection can be challenging due to expectations for and roles of women in a society that limits the access of outsiders to women and limits women's freedom to speak and express views to outsiders. The challenges of evaluative study of women's democratic participation in developing nations are exacerbated by their very lack of participation in democratic processes.

Another topic related to democracy and justice examined in the book is Ignacio Cano's compelling description of the potential and challenges of applying evaluation to the study of massive human rights violations. In attempts to provide democratic justice to the victims of violations, the judicial model is limited to examinations of each individual case separately. Evaluation offers the possibility of examining massive violations collectively. Cano suggests that the information derived from such studies may be used in attempts to apply democratic justice, but that the very process of evaluating violations is useful for confronting and ameliorating suffering as victims are given a democratic voice that has been suppressed by violations. Cano reminds the reader, too, that democratic principles necessitate that the value and significance of each individual case cannot be subordinated to the aggregation of information on many cases. Another democratic ideal made explicit in the chapter is the need to use knowledge to avoid violations proactively in other places and times.

In terms of methodologies that reflect democratic principles the volume provides an interesting range of possibilities. Pawson and Tilley's scientific realist evaluation is grounded in a view of democracy built on a progressive knowledge base about human agency and capacity that can be applied to continuous improvement of social programs and functions. Sanders's cluster evaluation reflects widespread notions about representative democracy and its commitment to common goals of society with respect for local and individual variation. Fetterman's empowerment evaluation derives from a participatory view of direct democracy, in which all players become engaged in decisions and actions. Within the broad concept of democracy reside multiple variations and ideologies. In the coming century, evaluators should expect that evaluation situations and approaches will reflect that variation.

What else might the 21st century hold for evaluation? Evaluation for the 21st Century collects a number of examples and ideas about the directions evaluation might take in the coming century, and many of the predictions advanced in the book probably will come to pass, if they are not already upon us. The editors and authors recognize that part of the strength of evaluation lies in its variety and eclecticism in method, topical area, and epistemology. They also suggest that evaluation can be further strengthened by expanding variety and eclecticism while seeking unifying forces in the field.

I propose here two additional forces likely to impact the field of evaluation in the coming century, if they are not already being felt. First, an intriguing possibility arises when evaluators discuss building evaluation capacity in developing nations. I would agree that assisting in the building of capacity and sharing lessons learned from past experience are appropriate roles for nations and organizations with existing capacities to play. However, I suggest serious caution in this role. Two possibly essential components of capacity building are the need to make certain mistakes in order to learn from them and the need to confront novel situations in order to innovate evaluation approaches suited to them. Nations and organizations committed to building evaluation capacity in developing nations will tread a fine line between replicating what exists elsewhere and building capacities that will advance the field of evaluation in terms of topical areas and methodologies that at present cannot be conceived.

Second, the ontologies, epistemologies, methodologies, and axiologies underlying evaluation theory and practice have been questioned, advanced, and expanded greatly in the past forty years. Yet the field is only now recognizing the critical and deconstructionist voices of feminist, minority, and gay and lesbian scholars. The 21st century is quite likely to witness great advances in evaluation as these voices are making their way into or are being invited into discussions of evaluation theory and practice. I do not view these critiques and deconstructions as divisive, but rather as necessary to advance the field and to provide a unity that is truly built on diversity and pluralism.

Evaluation for the 21st Century is a challenging book. It prompts the reader to reconsider the boundaries of evaluation in terms of location, scope, scale, method, topic, and purpose. It balances the imperative for pluralistic voice in the field with the challenge of discovering unity. The push and pull of those two forces will undoubtedly characterize advances in the field of evaluation in the new century.