Throughout the world, people and organizations come together to address issues that matter to them. For example, some community partnerships have formed to reduce substance abuse, teen pregnancy, or violence.
Quantity refers to the amount of evidence gathered in an evaluation. It is difficult to imagine that a model is acceptable for an application without some level of performance evaluation showing that the model matches field observations, or at least that its results match the results of another well-established model.
Here we introduce the idea of evaluation and some of the major terms and issues in the field. Evaluation utilizes many of the same methodologies used in traditional social research, but because evaluation takes place within a political and organizational context, it requires group skills, management ability, political dexterity, sensitivity to multiple stakeholders, and other skills that social research in general does not rely on as much. Evaluability assessment can be used here, as well as standard approaches for selecting an appropriate evaluation design.
Logic models are hypothesized descriptions of the chain of causes and effects (see Causality) leading to an outcome of interest.
- Evaluation is a methodological area that is closely related to, but distinguishable from, more traditional social research.
- Evaluation research can be defined as a type of study that uses standard social research methods for evaluative purposes, as a specific research methodology, and as an assessment process that employs special techniques unique to the evaluation of social programs.
An impact evaluation provides information about the impacts produced by an intervention - positive and negative, intended and unintended, direct and indirect. If an impact evaluation fails to systematically undertake causal attribution, there is a greater risk that the evaluation will produce incorrect findings and lead to incorrect decisions. For example, deciding to scale up when the programme is actually ineffective or effective only in certain limited situations, or deciding to exit when a programme could be made to work if limiting factors were addressed.
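Many development agencies use the definition of impacts provided by the Development Assistance Committee of the Organisation for Economic Co-operation and Development (OECD-DAC): positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended.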
An impact evaluation can be undertaken to improve or reorient an intervention, that is, for formative purposes. While many formative evaluations focus on processes, impact evaluations can also be used formatively if an intervention is ongoing.
For example, the findings of an impact evaluation can be used to improve implementation of a programme for the next intake of participants by identifying critical elements to monitor and tightly manage. Most often, impact evaluation is used for summative purposes. An impact evaluation should only be undertaken when its intended use can be clearly identified and when it is likely to be able to produce useful findings, taking into account the availability of resources and the timing of decisions about the intervention under investigation.
An evaluability assessment might need to be done first to assess these aspects. It is also important to consider the timing of an impact evaluation. When conducted belatedly, the findings come too late to inform decisions. When done too early, it will provide an inaccurate picture of the impacts. Regardless of the type of evaluation, it is important to think through who should be involved, why, and how in each step of the evaluation process to develop an appropriate and context-specific participatory approach.
Participation can occur at any stage of the impact evaluation process: in deciding to do an evaluation, in its design, in data collection, in analysis, in reporting and, also, in managing it. Being clear about the purpose of participatory approaches in an impact evaluation is an essential first step towards managing expectations and guiding implementation.
Is the purpose to ensure that the voices of those whose lives should have been improved by the programme or policy are central to the findings? Is it to ensure a relevant evaluation focus? Is it to build ownership of a donor-funded programme? These, and other considerations, would lead to different forms of participation by different combinations of stakeholders in the impact evaluation.
The underlying rationale for choosing a participatory approach to impact evaluation can be either pragmatic or ethical, or a combination of the two. The rationale is pragmatic when participation is expected to lead to a better evaluation. Participatory approaches can be used in any impact evaluation design. In other words, they are not exclusive to specific evaluation methods or restricted to quantitative or qualitative data collection and analysis. The starting point for any impact evaluation intending to use participatory approaches lies in clarifying what value this will add to the evaluation itself and to the people who would be closely involved, as well as the potential risks of their participation.
Three questions need to be answered in each situation. Only after addressing these can the issue of how to make impact evaluation more participatory be addressed. Like any other evaluation, an impact evaluation should be planned formally and managed as a discrete project, with decision-making processes and management arrangements clearly described from the beginning of the process.
Determining causal attribution is a requirement for calling an evaluation an impact evaluation. The design options (whether experimental, quasi-experimental, or non-experimental) all need significant investment in preparation and early data collection, and cannot be done if an impact evaluation is limited to a short exercise conducted towards the end of intervention implementation. Hence, it is particularly important that impact evaluation is addressed as part of an integrated monitoring, evaluation and research plan and system that generates and makes available a range of evidence to inform decisions.
The evaluation purpose refers to the rationale for conducting an impact evaluation. Evaluations that are being undertaken to support learning should be clear about who is intended to learn from it, how they will be engaged in the evaluation process to ensure it is seen as relevant and credible, and whether there are specific decision points around where this learning is expected to be applied. Evaluations that are being undertaken to support accountability should be clear about who is being held accountable, to whom and for what.
Evaluation relies on a combination of facts and values. Evaluative criteria specify the values that will be used in an evaluation and, as such, help to set boundaries. Other commonly used evaluative criteria are about equity, gender equality, and human rights. Some are used for particular types of development interventions, such as humanitarian assistance: coverage, coordination, protection, and coherence.
On their own, such criteria are insufficiently defined to be applied systematically and in a transparent manner to make evaluative judgements about the intervention. The evaluative criteria should be clearly reflected in the evaluation questions the evaluation is intended to address.
Impact evaluations should be focused around answering a small number of high-level key evaluation questions (KEQs) that will be answered through a combination of evidence. These questions should be clearly linked to the evaluative criteria. A range of more detailed mid-level and lower-level evaluation questions should then be articulated to address each evaluative criterion in detail.
All evaluation questions should be linked explicitly to the evaluative criteria to ensure that the criteria are covered in full. The KEQs also need to reflect the intended uses of the impact evaluation. Equity concerns require that impact evaluations go beyond simple average impact to identify for whom and in what ways the programmes have been successful.
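As a minimal sketch of what going beyond the average can look like, the Python snippet below reports the mean change for each subgroup alongside the overall mean. The function name `impact_by_group`, the group labels, and the income figures are all hypothetical and invented for illustration; they do not come from any actual evaluation.

```python
from statistics import mean


def impact_by_group(records):
    """Return the overall mean change and the mean change per subgroup.

    `records` is a list of (group_label, change_in_outcome) pairs.
    """
    by_group = {}
    for group, change in records:
        by_group.setdefault(group, []).append(change)
    overall = mean([change for _, change in records])
    per_group = {group: mean(changes) for group, changes in by_group.items()}
    return overall, per_group


# Hypothetical changes in household income (arbitrary units).
records = [
    ("poorest", -10), ("poorest", -5),
    ("middle", 20), ("middle", 25),
    ("richest", 60), ("richest", 30),
]

overall, per_group = impact_by_group(records)
print("overall mean change:", overall)
for group, change in per_group.items():
    print(f"{group}: {change}")
```

In this made-up data the overall mean is positive while the mean for the poorest group is negative, which is the kind of pattern an average-only summary would hide.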
Within the KEQs, it is also useful to identify the different types of questions involved — descriptive, causal and evaluative. It should also be noted that some impacts may be emergent, and thus, cannot be predicted.
Evaluation, by definition, answers evaluative questions, that is, questions about quality and value. This is what makes evaluation so much more useful and relevant than the mere measurement of indicators or summaries of observations and stories. One way of doing so is to use a specific rubric that defines different levels of performance or standards for each evaluative criterion, deciding what evidence will be gathered and how it will be synthesized to reach defensible conclusions about the worth of the intervention.
At the very least, it should be clear what trade-offs would be appropriate in balancing multiple impacts or distributional effects.
Since development interventions often have multiple impacts, which are distributed unevenly, this is an essential element of an impact evaluation. For example, should an economic development programme be considered a success if it produces increases in household income but also produces hazardous environmental impacts?
Should it be considered a success if the average household income increases but the income of the poorest households is reduced? Quality refers to how good something is; value refers to how good it is in terms of the specific situation, in particular taking into account the resources used to produce it and the needs it was supposed to address. Evaluative reasoning is required to synthesize these elements to formulate defensible judgements.
Evaluative reasoning is a requirement of all evaluations, irrespective of the methods or evaluation approach used.
An evaluation should have a limited set of high-level questions which are about performance overall. Each of these KEQs should be further unpacked by asking more detailed questions about performance on specific dimensions of merit and sometimes even lower-level questions. Evaluative reasoning is the process of synthesizing the answers to lower- and mid-level questions into defensible judgements that directly answer the high-level questions.
Evaluations produce stronger and more useful findings if they not only investigate the links between activities and impacts but also investigate links along the causal chain between activities, outputs, intermediate outcomes and impacts. A theory of change should be used in some form in every impact evaluation. It can be used with any research design that aims to infer causality, can draw on a range of qualitative and quantitative data, and can support triangulation of the data arising from a mixed methods impact evaluation.
When planning an impact evaluation and developing the terms of reference, any existing theory of change for the programme or policy should be reviewed for appropriateness, comprehensiveness and accuracy, and revised as necessary. It should continue to be revised over the course of the evaluation should either the intervention itself or the understanding of how it works — or is intended to work — change. Some interventions cannot be fully planned in advance, however — for example, programmes in settings where implementation has to respond to emerging barriers and opportunities such as to support the development of legislation in a volatile political environment.
For some interventions, it may be possible to document the emerging theory of change as different strategies are trialled and adapted or replaced. In other cases, there may be a high-level theory of how change will come about.
Elsewhere, its fundamental basis may revolve around adaptive learning, in which case the theory of change should focus on articulating how the various actors gather and use information together to make ongoing improvements and adaptations. The evaluation may confirm the theory of change or it may suggest refinements based on the analysis of evidence.
An impact evaluation can check for success along the causal chain and, if necessary, examine alternative causal paths. For example, failure to achieve intermediate results might indicate implementation failure; failure to achieve the final intended impacts might be due to theory failure rather than implementation failure.
This has important implications for the recommendations that come out of an evaluation. In cases of implementation failure, it is reasonable to recommend actions to improve the quality of implementation; in cases of theory failure, it is necessary to rethink the whole strategy for achieving impacts.
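The distinction between implementation failure and theory failure can be sketched as a small decision rule. The function below is a deliberately simplified illustration; its name, its two boolean inputs, and the wording of its messages are assumptions made here, and a real evaluation would weigh evidence at each link of the causal chain rather than reduce it to two flags.

```python
def diagnose_causal_chain(intermediate_results_achieved, final_impacts_achieved):
    """Suggest a tentative reading of where the causal chain may have broken.

    Both arguments are booleans summarizing the evaluation evidence.
    The returned string is a prompt for further inquiry, not a verdict.
    """
    if final_impacts_achieved and intermediate_results_achieved:
        return "no failure indicated: results were achieved along the causal chain"
    if final_impacts_achieved:
        # Impacts appeared without the expected intermediate results.
        return "impacts achieved by another route: examine alternative causal paths"
    if not intermediate_results_achieved:
        return "possible implementation failure: consider improving implementation quality"
    return "possible theory failure: consider rethinking the strategy for achieving impacts"


print(diagnose_causal_chain(False, False))
print(diagnose_causal_chain(True, False))
```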
The evaluation methodology sets out how the key evaluation questions (KEQs) will be answered. It specifies designs for causal attribution, including whether and how comparison groups will be constructed, and methods for data collection and analysis.
This definition does not require that changes are produced solely or wholly by the programme or policy under investigation (UNEG). Using a combination of these strategies can usually help to increase the strength of the conclusions that are drawn.
Well-chosen and well-implemented methods for data collection and analysis are essential for all types of evaluations. Impact evaluations need to go beyond assessing the size of the effects to identify for whom and in what ways a programme has been successful. The framework includes how data analysis will address assumptions made in the programme theory of change about how the programme was thought to produce the intended results. In a true mixed methods evaluation, this includes using appropriate numerical and textual analysis methods and triangulating multiple data sources and perspectives in order to maximize the credibility of the evaluation findings.
Start the data collection planning by reviewing to what extent existing data can be used. After reviewing currently available information, it is helpful to create an evaluation matrix (see below) showing which data collection and analysis methods will be used to answer each KEQ and then identify and prioritize data gaps that need to be addressed by collecting new data.
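One way to picture this step is to hold the evaluation matrix as a mapping from each KEQ to the data sources planned for it, and then list the sources that are not yet available. The sketch below is illustrative only: the function name `find_data_gaps`, the example KEQ wording, and the choice of which sources count as available are all hypothetical.

```python
def find_data_gaps(evaluation_matrix, available_sources):
    """Return, for each KEQ, the planned data sources that are not yet available."""
    gaps = {}
    for keq, planned_sources in evaluation_matrix.items():
        missing = [s for s in planned_sources if s not in available_sources]
        if missing:
            gaps[keq] = missing
    return gaps


# Hypothetical matrix: each KEQ is matched to the methods meant to answer it.
evaluation_matrix = {
    "KEQ 1: Was the programme implemented as intended?": [
        "observation of programme implementation",
        "key informant interviews",
    ],
    "KEQ 2: Did participants' outcomes change?": [
        "programme participant survey",
    ],
}

# Hypothetical result of reviewing existing data.
available_sources = {"key informant interviews"}

for keq, missing in find_data_gaps(evaluation_matrix, available_sources).items():
    print(keq, "-> new data needed from:", ", ".join(missing))
```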
Evaluation matrix: matching data collection to key evaluation questions. The matrix lists example key evaluation questions (KEQs) against data collection methods: programme participant survey, key informant interviews, and observation of programme implementation. There are many different methods for collecting data. A key reason for mixing methods is that it helps to overcome the weaknesses inherent in each method when used alone. It also increases the credibility of evaluation findings when information from different data sources converges.
Good data management includes developing effective processes for consistently collecting and recording data, storing data securely, cleaning data, and transferring data. The particular analytic framework and the choice of specific data analysis methods will depend on the purpose of the impact evaluation and the type of KEQs that are intrinsically linked to this.
For answering causal KEQs, there are essentially three broad approaches to causal attribution analysis: (1) counterfactual approaches; (2) consistency of evidence with causal relationship; and (3) ruling out alternatives (see above).
The latter definition emphasizes acquiring and assessing information rather than assessing worth or merit, because all evaluation work involves collecting and sifting through data and making judgements about the validity of the information and of the inferences we derive from it, whether or not an assessment of worth or merit results. There are many types of evaluations that do not necessarily result in an assessment of worth or merit: descriptive studies, implementation analyses, and formative evaluations, to name a few. The process of data analysis and the evaluation report are also given attention.
Many of the research strategies proposed in this report require investments that are perhaps greater than have been previously contemplated. The panel recommends that any intensive evaluation of an intervention be conducted on a subset of projects selected according to explicit criteria. One design question is how small a treatment difference it is essential to detect if it is present. These functions require not just bureaucratic oversight but appropriate scientific expertise.
Models for evaluative research. Types of Evaluation
This setting usually engenders a great need for cooperation between those who conduct the program and those who evaluate it. This need for cooperation can be particularly acute in the case of AIDS prevention programs because those programs have been developed rapidly to meet the urgent demands of a changing and deadly epidemic. Although the characteristics of AIDS intervention programs place some unique demands on evaluation, the techniques for conducting good program evaluation do not need to be invented.
Two decades of evaluation research have provided a basic conceptual framework for undertaking such efforts. In this chapter the panel provides an overview of the terminology, types, designs, and management of research evaluation. The following chapter provides an overview of program objectives and the selection and measurement of appropriate outcome variables for judging the effectiveness of AIDS intervention programs.
These issues are discussed in detail in the subsequent, program-specific Chapters 3 - 5. The term evaluation implies a variety of different things to different people.
Evaluation is a systematic process that produces a trustworthy account of what was attempted and why; through the examination of results (the outcomes of intervention programs), it answers questions such as "What was done?"
These questions differ in the degree of difficulty of answering them. An evaluation that tries to determine the outcomes of an intervention and what those outcomes mean is a more complicated endeavor than an evaluation that assesses the process by which the intervention was delivered. Both kinds of evaluation are necessary because they are intimately connected: to establish a project's success, an evaluator must first ask whether the project was implemented as planned and then whether its objective was achieved.
Questions about a project's implementation usually fall under the rubric of process evaluation. If the investigation involves rapid feedback to the project staff or sponsors, particularly at the earliest stages of program implementation, the work is called formative evaluation. Questions about effects or effectiveness are often variously called summative evaluation, impact assessment, or outcome evaluation, the term the panel uses.
Formative evaluation is a special type of early evaluation that occurs during and after program design but before the program is broadly implemented. Formative evaluation is used to understand the need for the intervention and to make tentative decisions about how to implement or improve it. During formative evaluation, information is collected and then fed back to program designers and administrators to enhance program development and maximize the success of the intervention.
For example, formative evaluation may be carried out through a pilot project before a program is implemented at several sites. A pilot study of a community-based organization (CBO), for example, might be used to gather data on problems involving access to and recruitment of targeted populations and the utilization and implementation of services; the findings of such a study would then be used to modify, if needed, the planned program.
Another example of formative evaluation is the use of a "story board" design of a TV message that has yet to be produced. A story board is a series of text and sketches of camera shots that are to be produced in a commercial.
To evaluate the effectiveness of the message and forecast some of the consequences of actually broadcasting it to the general public, an advertising agency convenes small groups of people to react to and comment on the proposed design. Once an intervention has been implemented, the next stage of evaluation is process evaluation, which addresses broad questions such as "What was done?"
When intervention programs continue over a long period of time (as is the case for some of the major AIDS prevention programs), measurements at several times are warranted to ensure that the components of the intervention continue to be delivered by the right people, to the right people, in the right manner, and at the right time.
Process evaluation can also play a role in improving interventions by providing the information necessary to change delivery strategies or program objectives in a changing epidemic. Research designs for process evaluation include direct observation of projects, surveys of service providers and clients, and the monitoring of administrative records.
The panel notes that the Centers for Disease Control (CDC) is already collecting some administrative records on its counseling and testing program and community-based projects. The panel believes that this type of evaluation should be a continuing and expanded component of intervention projects to guarantee the maintenance of the projects' integrity and responsiveness to their constituencies.
The purpose of outcome evaluation is to identify consequences and to establish that consequences are, indeed, attributable to a project. This type of evaluation answers questions such as "What outcomes were observed?" After a body of findings has been accumulated from such evaluations, it may be fruitful to launch another stage of evaluation: cost-effectiveness analysis (see Weinstein et al.).
Like outcome evaluation, cost-effectiveness analysis also measures program effectiveness, but it extends the analysis by adding a measure of program cost.
The panel believes that consideration of cost-effectiveness analysis should be postponed until more experience is gained with formative, process, and outcome evaluation of the CDC AIDS prevention programs. Process and outcome evaluations require different types of research designs, as discussed below. Formative evaluations, which are intended to both assess implementation and forecast effects, use a mix of these designs.
To conduct process evaluations on how well services are delivered, data need to be gathered on the content of interventions and on their delivery systems. Suggested methodologies include direct observation, surveys, and record keeping. Direct observation designs include case studies, in which participant-observers unobtrusively and systematically record encounters within a program setting, and nonparticipant observation, in which long, open-ended or "focused" interviews are conducted with program participants.
Surveys, whether censuses of the whole population of interest or samples, elicit information through interviews or questionnaires completed by project participants or potential users of a project. For example, surveys within community-based projects can collect basic statistical information on project objectives, what services are provided, to whom, when, how often, for how long, and in what context.
Record keeping consists of administrative or other reporting systems that monitor use of services. Standardized reporting ensures consistency in the scope and depth of data collected.
To use the media campaign as an example, the panel suggests using standardized data on the use of the AIDS hotline to monitor public attentiveness to the advertisements broadcast by the media campaign. These designs are simple to understand, but they require expertise to implement. For example, observational studies must be conducted by people who are well trained in how to carry out on-site tasks sensitively and to record their findings uniformly.
Observers can either complete narrative accounts of what occurred in a service setting or they can complete some sort of data inventory to ensure that multiple aspects of service delivery are covered.
These types of studies are time consuming and benefit from corroboration among several observers. The use of surveys in research is well-understood, although they, too, require expertise to be well implemented. As the program chapters reflect, survey data collection must be carefully designed to reduce problems of validity and reliability and, if samples are used, to design an appropriate sampling scheme.
Record keeping or service inventories are probably the easiest research designs to implement, although preparing standardized internal forms requires attention to detail about salient aspects of service delivery. Research designs for outcome evaluations are meant to assess principal and relative effects.
Ideally, to assess the effect of an intervention on program participants, one would like to know what would have happened to the same participants in the absence of the program. Because it is not possible to make this comparison directly, inference strategies that rely on proxies have to be used.
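The comparison described above is often written in potential-outcomes notation. The symbols below are introduced here for illustration and do not appear in the source: $Y_i(1)$ stands for participant $i$'s outcome with the program, $Y_i(0)$ for the outcome the same participant would have had without it, and $\tau_i$ for the effect of the program on that participant.

```latex
\tau_i = Y_i(1) - Y_i(0)
```

Only one of the two potential outcomes can ever be observed for a given participant, which is why the unobserved one has to be approximated by a proxy.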
Scientists use three general approaches to construct proxies for use in the comparisons required to evaluate the effects of interventions: (1) nonexperimental methods, (2) quasi-experiments, and (3) randomized experiments.
The first two are discussed below, and randomized experiments are discussed in the subsequent section. The most common form of nonexperimental design is a before-and-after study. In this design, pre-intervention measurements are compared with equivalent measurements made after the intervention to detect change in the outcome variables that the intervention was designed to influence.
Although the panel finds that before-and-after studies frequently provide helpful insights, the panel believes that these studies do not provide sufficiently reliable information to be the cornerstone for evaluation research on the effectiveness of AIDS prevention programs.
The panel's conclusion follows from the fact that the postintervention changes cannot usually be attributed unambiguously to the intervention. Quasi-experimental and matched control designs provide a separate comparison group. In these designs, the control group may be selected by matching nonparticipants to participants in the treatment group on the basis of selected characteristics.
It is difficult to ensure the comparability of the two groups even when they are matched on many characteristics because other relevant factors may have been overlooked or mismatched, or they may be difficult to measure. In some situations, it may simply be impossible to measure all of the characteristics of the units.
Matched control designs require extraordinarily comprehensive scientific knowledge about the phenomenon under investigation in order for evaluators to be confident that all of the relevant determinants of outcomes have been properly accounted for in the matching.
Three types of information or knowledge are required: (1) knowledge of intervening variables that also affect the outcome of the intervention and, consequently, need adjustment to make the groups comparable; (2) measurements on all intervening variables for all subjects; and (3) knowledge of how to make the adjustments properly, which in turn requires an understanding of the functional relationship between the intervening variables and the outcome variables.
Satisfying each of these information requirements is likely to be more difficult than answering the primary evaluation question, "Does this intervention produce beneficial effects?" Given the size and the national importance of AIDS intervention programs and given the state of current knowledge about behavior change in general and AIDS prevention in particular, the panel believes that it would be unwise to rely on matching and adjustment strategies as the primary design for evaluating AIDS intervention programs.
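To make the matching idea concrete, the sketch below pairs each participant with the nonparticipant closest to them on a single measured characteristic. It is a simplified illustration, not the panel's procedure: greedy one-to-one nearest-neighbour matching is only one of several possible matching rules, and the function name, the record layout, and the use of age as the matching variable are assumptions made here.

```python
def match_controls(participants, nonparticipants, key):
    """Greedy 1:1 nearest-neighbour matching, without replacement, on one covariate.

    Each record is a dict; `key` names the numeric characteristic to match on.
    Returns a list of (participant, matched_nonparticipant) pairs.
    """
    available = list(nonparticipants)
    pairs = []
    for person in participants:
        if not available:
            break  # no nonparticipants left to match
        best = min(available, key=lambda c: abs(c[key] - person[key]))
        available.remove(best)
        pairs.append((person, best))
    return pairs


# Hypothetical records, invented for illustration.
participants = [{"id": "p1", "age": 20}, {"id": "p2", "age": 40}]
nonparticipants = [
    {"id": "c1", "age": 45},
    {"id": "c2", "age": 22},
    {"id": "c3", "age": 30},
]

for treated, control in match_controls(participants, nonparticipants, "age"):
    print(treated["id"], "matched with", control["id"])
```

The pairs are similar only on the characteristic that was measured and matched on; anything left out of `key` can still differ between the groups, which is the weakness the text describes.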
With differently constituted groups, inferences about results are hostage to uncertainty about the extent to which the observed outcome actually results from the intervention and is not an artifact of intergroup differences that may not have been removed by matching or adjustment. A remedy to the inferential uncertainties that afflict nonexperimental designs is provided by randomized experiments.
In such experiments, one singly constituted group is established for study. A subset of the group is then randomly chosen to receive the intervention, with the other subset becoming the control. The two groups are not identical, but they are comparable. Because they are two random samples drawn from the same population, they are not systematically different in any respect, which is important for all variables—both known and unknown—that can influence the outcome.
Dividing a singly constituted group into two random and therefore comparable subgroups cuts through the tangle of causation and establishes a basis for the valid comparison of respondents who do and do not receive the intervention. Randomized experiments provide for clear causal inference by solving the problem of group comparability, and may be used to answer the evaluation questions "Does the intervention work?" and "What works better?"
Which question is answered depends on whether the controls receive an intervention or not. When the object is to estimate whether a given intervention has any effects, individuals are randomly assigned to the project or to a zero-treatment control group. The control group may be put on a waiting list or simply not get the treatment.
This design addresses the question, "Does it work?" When the object is to compare variations on a project, the design addresses the question, "What works better?" A randomized experiment requires that individuals, organizations, or other treatment units be randomly assigned to one of two or more treatments or program variations. Random assignment ensures that the estimated differences between the groups so constituted are statistically unbiased; that is, that any differences in effects measured between them are a result of treatment.
The absence of statistical bias in groups constituted in this fashion stems from the fact that random assignment ensures that there are no systematic differences between them, differences that can and usually do affect groups composed in ways that are not random.
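The random assignment described above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions (two arms of roughly equal size, individual-level assignment, outcomes compared by a simple difference in means); the function names and the example values are hypothetical.

```python
import random
from statistics import mean


def randomize(units, seed=None):
    """Randomly split one group of units into a treatment and a control subgroup."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treated_outcomes, control_outcomes):
    """Estimate the treatment effect as mean(treated) minus mean(control)."""
    return mean(treated_outcomes) - mean(control_outcomes)


# Hypothetical participant identifiers.
treatment, control = randomize(range(10), seed=1)
print("treatment group:", sorted(treatment))
print("control group:", sorted(control))

# Hypothetical outcome scores measured after the intervention.
print(difference_in_means([3, 5], [1, 3]))
```

Because chance alone decides who lands in which subgroup, no characteristic of the units, measured or unmeasured, can systematically steer the assignment.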
To improve interventions that are already broadly implemented, the panel recommends the use of randomized field experiments of alternative or enhanced interventions. Under certain conditions, the panel also endorses randomized field experiments with a nontreatment control group to evaluate new interventions. In the context of a deadly epidemic, ethics dictate that treatment not be withheld simply for the purpose of conducting an experiment.
Nevertheless, there may be times when a randomized field test of a new treatment with a no-treatment control group is worthwhile. One such time is during the design phase of a major or national intervention.
Before a new intervention is broadly implemented, the panel recommends that it be pilot tested in a randomized field experiment. The panel considered the use of experiments with delayed rather than no treatment. A delayed-treatment control group strategy might be pursued when resources are too scarce for an intervention to be widely distributed at one time.
For example, a project site that is waiting to receive funding for an intervention would be designated as the control group. If it is possible to randomize which projects in the queue receive the intervention, an evaluator could measure and compare outcomes after the experimental group had received the new treatment but before the control group received it. The panel believes that such a design can be applied only in limited circumstances, such as when groups would have access to related services in their communities and when conducting the study is likely to lead to greater access or better services.
For example, a study cited in Chapter 4 used a randomized delayed-treatment experiment to measure the effects of a community-based risk reduction program. However, such a strategy may be impractical for several reasons. Although randomized experiments have many benefits, the approach is not without pitfalls.
In the planning stages of evaluation, it is necessary to contemplate certain hazards, such as the Hawthorne effect and differential project dropout rates. Precautions must be taken either to prevent these problems or to measure their effects. Fortunately, there is some evidence suggesting that the Hawthorne effect is usually not very large (Rossi and Freeman). Attrition is potentially more damaging to an evaluation, and it must be limited if the experimental design is to be preserved.
If sample attrition is not limited in an experimental design, it becomes necessary to account for the potentially biasing impact of the loss of subjects in the treatment and control conditions of the experiment. The statistical adjustments required to make inferences about treatment effectiveness in such circumstances can introduce uncertainties that are as worrisome as those afflicting nonexperimental and quasi-experimental designs.
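One simple way to measure the dropout problem discussed above is to compare attrition rates across the arms of the experiment. The sketch below is an illustration only: the function name `attrition_summary` and the enrollment and completion counts are hypothetical, and the report does not prescribe this calculation. A large gap between arms signals that the groups remaining at follow-up may no longer be comparable.

```python
def attrition_summary(enrolled, completed):
    """Return per-arm attrition rates, overall attrition, and the
    largest gap between arms (differential attrition).

    enrolled / completed: dicts mapping arm name -> number of units.
    """
    rates = {
        arm: (enrolled[arm] - completed[arm]) / enrolled[arm]
        for arm in enrolled
    }
    total_enrolled = sum(enrolled.values())
    total_lost = total_enrolled - sum(completed.values())
    overall = total_lost / total_enrolled
    differential = max(rates.values()) - min(rates.values())
    return rates, overall, differential

# Hypothetical counts, invented for illustration.
enrolled = {"treatment": 200, "control": 200}
completed = {"treatment": 170, "control": 150}

rates, overall, differential = attrition_summary(enrolled, completed)
for arm, rate in rates.items():
    print(f"{arm}: {rate:.0%} lost")
print(f"overall: {overall:.0%}, differential: {differential:.0%}")
```

Reporting both figures keeps the two concerns separate: overall attrition reduces the effective sample size, while differential attrition threatens the comparability that randomization was meant to secure.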
Thus, the panel's recommendation of the selective use of randomized design carries an implicit caveat: To realize the theoretical advantages offered by randomized experimental designs, substantial efforts will be required to ensure that the designs are not compromised by flawed execution. Another pitfall of randomization is its appearance of unfairness or unattractiveness to participants and the controversial legal and ethical issues it sometimes raises.
Often, what is being criticized is the control of project assignment of participants rather than the use of randomization itself. In deciding whether random assignment is appropriate, it is important to consider the specific context of the evaluation and how participants would be assigned to projects in the absence of randomization.
The Federal Judicial Center offers five threshold conditions for the use of random assignment. The parent committee has argued that these threshold conditions apply in the case of AIDS prevention programs (see Turner, Miller, and Moses). Although randomization may be desirable from an evaluation and ethical standpoint, and acceptable from a legal standpoint, it may be difficult to implement from a practical or political standpoint.
Again, the panel emphasizes that questions about the practical or political feasibility of the use of randomization may in fact refer to the control of program allocation rather than to the issues of randomization itself.
In fact, when resources are scarce, it is often more ethical and politically palatable to randomize allocation rather than to allocate on grounds that may appear biased. It is usually easier to defend the use of randomization when the choice has to do with assignment to groups receiving alternative services than when the choice involves assignment to groups receiving no treatment.
For example, in comparing a testing and counseling intervention that offered a special "skills training" session in addition to its regular services with a counseling and testing intervention that offered no additional component, random assignment of participants to one group rather than another may be acceptable to program staff and participants because the relative values of the alternative interventions are unknown.
The more difficult issue is the introduction of new interventions that are perceived to be needed and effective in a situation in which there are no services. An argument that is sometimes offered against the use of randomization in this instance is that interventions should be assigned on the basis of need (perhaps as measured by rates of HIV incidence or of high-risk behaviors). But this argument presumes that the intervention will have a positive effect, which is unknown before evaluation, and that relative need can be established, which is a difficult task in itself.
The panel recognizes that community and political opposition to randomization to zero treatments may be strong and that enlisting participation in such experiments may be difficult. This opposition and reluctance could seriously jeopardize the production of reliable results if it is translated into noncompliance with a research design. The feasibility of randomized experiments for AIDS prevention programs has already been demonstrated, however (see the review of selected experiments in Turner, Miller, and Moses). The substantial effort involved in mounting randomized field experiments is repaid by the fact that they can provide unbiased evidence of the effects of a program.
The unit of assignment of an experiment may be an individual person, a clinic, or a larger unit such as a class or a community. The treatment unit is selected at the earliest stage of design. Variations of units are illustrated in the following four examples of intervention programs.
1. Two different pamphlets (A and B) on the same subject are distributed to individuals. The outcome to be measured is whether the recipient returns a card asking for more information.
2. Two curricula (A and B) are taught in different high school classes. The outcome to be measured is a score on a knowledge test.
3. Of all clinics for sexually transmitted diseases (STDs) in a large metropolitan area, some are randomly chosen to introduce a change in the fee schedule. The outcome to be measured is the change in patient load.
4. A coordinated set of community-wide interventions, involving community leaders, social service agencies, the media, community associations, and other groups, is implemented in one area of a city. Outcomes are knowledge (as assessed by testing at drug treatment centers and STD clinics) and condom sales in the community's retail outlets.
In example 1, the treatment unit is an individual person who receives pamphlet A or pamphlet B. If either "treatment" is applied again, it would be applied to a person. In example 2, the high school class is the treatment unit; everyone in a given class experiences either curriculum A or curriculum B.
If either treatment is applied again, it would be applied to a class. The treatment unit is the clinic in example 3, and in example 4, the treatment unit is a community. The consistency of the effects of a particular intervention across repetitions justly carries a heavy weight in appraising the intervention. It is important to remember that repetitions of a treatment or intervention are the number of treatment units to which the intervention is applied.
This is a salient principle in the design and execution of intervention programs as well as in the assessment of their results. The adequacy of the proposed sample size (number of treatment units) has to be considered in advance. Adequacy depends mainly on two factors. Many formal methods for considering and choosing sample size exist. Practical circumstances occasionally allow choosing between designs that involve units at different levels; thus, a classroom might be the unit if the treatment is applied in one way, but an entire school might be the unit if the treatment is applied in another.
When both approaches are feasible, the use of a power analysis for each approach may lead to a reasoned choice. There is some controversy about the advantages of randomized experiments in comparison with other evaluative approaches. It is the panel's belief that when a well executed randomized study is feasible, it is superior to alternative kinds of studies in the strength and clarity of whatever conclusions emerge, primarily because the experimental approach avoids selection biases.
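The power analysis mentioned above, used to choose between designs with units at different levels, can be sketched roughly as follows. This is not the panel's method: it uses a common normal-approximation formula for comparing two group means, plus the standard design-effect adjustment for grouped units. The function names, the effect size, the class and school sizes, and the intraclass correlations are all assumed values chosen for illustration.

```python
from math import ceil
from statistics import NormalDist

def individuals_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate individuals needed per arm for a two-sided
    comparison of two means (normal approximation).

    effect_size: expected difference divided by the standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

def design_effect(cluster_size, icc):
    """Variance inflation when whole groups, not individuals, are assigned.

    icc: intraclass correlation (similarity of members within a group).
    """
    return 1 + (cluster_size - 1) * icc

def clusters_per_arm(effect_size, cluster_size, icc, alpha=0.05, power=0.80):
    """Number of groups (e.g., classrooms or schools) needed per arm."""
    n = individuals_per_arm(effect_size, alpha, power)
    inflated = n * design_effect(cluster_size, icc)
    return ceil(inflated / cluster_size)

# Assumed inputs: a medium standardized effect, classes of 25 students,
# schools of 400 students, and illustrative intraclass correlations.
print(individuals_per_arm(0.5))
print(clusters_per_arm(0.5, cluster_size=25, icc=0.05))
print(clusters_per_arm(0.5, cluster_size=400, icc=0.02))
```

Running the calculation for each candidate unit shows how many classrooms or schools each design would require, which can then be weighed against what is practical to recruit.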
Experiments in medical research shed light on the advantages of carefully conducted randomized experiments. The Salk vaccine trials are a successful example of a large, randomized study. In a double-blind test of the polio vaccine, children in various communities were randomly assigned to two treatments, either the vaccine or a placebo.
By this method, the effectiveness of Salk vaccine was demonstrated in one summer of research (Meier). A sufficient accumulation of relevant, observational information, especially when collected in studies using different procedures and sample populations, may also clearly demonstrate the effectiveness of a treatment or intervention.
The process of accumulating such information can be a long one, however. When a well-executed randomized study is feasible, it can provide evidence that is subject to less uncertainty in its interpretation, and it can often do so in a more timely fashion.
In the midst of an epidemic, the panel believes it proper that randomized experiments be one of the primary strategies for evaluating the effectiveness of AIDS prevention efforts.
After the reasons for conducting evaluation research are discussed, the general principles and types are reviewed. Other aspects of evaluation research considered are the steps of planning and conducting an evaluation study and the measurement process, including the gathering of statistics and the use of data collection techniques. The process of data analysis and the evaluation report are also given attention. Evaluation research should enhance knowledge and decision making and lead to practical applications.
Throughout the world, people and organizations come together to address issues that matter to them. For example, some community partnerships have formed to reduce substance abuse, teen pregnancy, or violence. Alliances among community people have also focused on promoting urban economic development, access to decent housing, and quality education. These initiatives try to improve the quality of life for everyone in a community.
Often, they do this in two ways. Initiatives use universal approaches -- that is, they try to reach everyone who could possibly be affected by the concern. They also use targeted approaches, which try to affect conditions for people who are at higher risk for the problem. Through these two approaches, initiatives try to change people's behavior, such as using illegal drugs, being physically active, or caring for children. They also might go deeper and try to change the conditions, such as the availability of drugs, or opportunity for drugs or daycare, under which these behaviors occur.
Community health promotion is a process that includes many things at many levels. For example, efforts use multiple strategies, such as providing information about the problem or improving people's access to assistance. They also operate at multiple levels, including individuals, families and organizations, and through a variety of community sectors, such as schools, businesses, and religious organizations.
All of this works together to make small but widespread changes in the health of the community. The goal is to promote healthy behaviors by making them easier to do and more likely to meet with positive reinforcement. Many different models describe how best to promote community health and development. While how things should be done differs in each model, the basic goal of these and other community approaches is the same.
They aim to increase opportunities for community members to work together to improve their quality of life. Unfortunately, only modest information on the effectiveness of community-based initiatives exists. That's because evaluation practice hasn't fully caught up with a recent shift towards community control of programs.
Although there are models for studying community health efforts, community initiatives are often evaluated using research methods borrowed from clinical trials and other researcher-controlled techniques. While these methods work very well in the fields for which they were developed, they're not necessarily a "good fit" for evaluating community work. It's like trying to put a square peg into a round hole -- with a lot of work, you might be able to do it, but it will never be as smooth as you want.
New ideas about community evaluation have their roots in several different models and traditions. Research in these traditions actively involves community members in designing and conducting the evaluation.
They all have two primary goals: understanding what is going on, and empowering communities to take care of themselves. The methods differ in the balance they strike between these two ends. In this section, we'll look at models, methods, and applications of community evaluation in understanding and improving comprehensive community initiatives.
We'll start with a look at some of the reasons why community groups should evaluate their efforts. Then, we'll describe some of the major challenges to evaluation. We'll also describe a model of community initiatives as catalysts for change.
Then, we'll discuss some principles, assumptions, and values that guide community evaluation and outline a "logic model" for our KU Center for Community Health and Development's system of evaluation. We'll also make some specific recommendations to practitioners and policymakers about how these issues can be addressed. Finally, we'll end with a discussion examining some of the broad issues and opportunities in community evaluation. There are many good reasons for a community group to evaluate its efforts.
When done properly, evaluation can improve efforts to promote health and development at any level -- from a small local nonprofit group to a statewide or even national effort. Evaluation offers advantages for groups of almost any size. Although there are a lot of advantages to evaluating community efforts, that doesn't mean it's an easy thing to do. There are some serious challenges that make it difficult to do a meaningful evaluation of community work.
Despite the challenges that evaluation poses, our belief is that it is a very worthwhile pursuit. In order to minimize these challenges, the KU Center for Community Health and Development has developed a model and some principles that may provide guidance for people trying to evaluate the work done in their community. Although different community groups have different missions, many of them use the same logic model or framework: that of a community initiative as a catalyst for change.
This type of community initiative tries to transform specific parts of the community. They change programs, policies, and practices to make healthy behaviors more likely for large numbers of people.
Below, we offer a model of what occurs in a comprehensive community initiative and its results. This model is nonlinear -- that is, community partnerships don't just do one thing at a time. Instead, they take part in many interrelated activities that occur simultaneously. A new initiative to reduce the risks for youth violence, for example, may be refining its action plan while pursuing relatively easy changes in the community, such as posting billboards that warn people of the results of gang-related violence.
The components of the model are also interrelated -- that is, they can't be taken separately. They are all part of the same puzzle. For example, collaborative planning should decide what needs to happen in the community. That, in turn, should guide community action and change.
Important community actions may be adapted to fit local conditions, and then kept going through policy changes, public funding, or other means of institutionalization. Also important in this model is the idea that success breeds success.
If a community is able to successfully bring about changes, its capacity to create even more community changes related to the group's mission should improve. This, in turn, may affect more distal outcomes -- the long term goals the group is working for. Finally, successful comprehensive initiatives or their components may serve as examples for other communities. The goals and expectations of community initiatives vary.
A community may have a single, narrowly defined mission, such as increasing children's immunizations against disease. It may also have much broader goals that involve several different objectives. For example, members of an initiative may wish to work on two problems, such as reducing child abuse and domestic violence, which share common risk and protective factors. Some communities have a relatively free hand in deciding what to do. Other partnerships may be required by grantmakers to use "tried and true" strategies or interventions.
Some initiatives try hybrid approaches that combine the use of these "tried and true" methods with the role of a catalyst. They do this by implementing core components, such as sexuality education and peer support for preventing adolescent pregnancy, along with developing new community changes, such as enhancing access to contraceptives, that are related to the group's desired outcomes.
Different initiatives will modify programs to make them work well in their community. For example, different groups might want to develop supervised alternative activities for teens to make their taking part in risky behavior, such as unsafe sex or drug abuse, less likely.
However, different communities may start any one of a variety of interventions, such as expanding recreational opportunities, offering summer jobs, or developing community gardens.
Adapting interventions to fit community needs has several advantages. First of all, it creates an approach that "belongs" to community members -- it's something they are proud of, that they feel they created -- it's really theirs.
Second, because it has been modified to fit the community's needs, the program or policy is more likely to remain in existence. Finally, through changing interventions to fit local needs, community members improve their ability to take care of their own problems. If a comprehensive community initiative or a program or policy that is part of it proves to be successful over a long period, it may be used as an example that other communities can follow.
For example, comprehensive interventions for reducing risks for cardiovascular diseases, or specific parts of the intervention such as increasing access to lower fat foods, might be held up as examples for other groups. Leaders of nonprofit organizations need to know what works, what makes it work, and what doesn't work.
That way, local efforts can learn from other community-based projects and demonstrations, and adopt some of what experience and research suggest are the "best practices" in the field. When we look at the process of supporting and evaluating community initiatives, we need to look at what our ideas are based on. The following principles, assumptions, and values serve as the foundation for these processes. You'll notice that they reflect the challenges of addressing both of the major aims of evaluation: understanding community initiatives while empowering the community to address its concerns.
Community initiatives often function as catalysts for change in which community members and organizations work together to improve the quality of life.
Community initiatives are complex and ever-changing, and they must be analyzed on multiple levels. Community initiatives help launch interventions that are planned and implemented by community members. Community evaluation must understand and reflect the issue, and the context in which it is happening.
Community evaluation information should be linked to questions of importance to key stakeholders. Community evaluation should improve community members' ability to understand what's going on, improve practices, and increase self-determination. Community initiatives engage community members and organizations as catalysts for change: they transform the community to have a better quality of life.
Community evaluation is based on the premise that community initiatives are very complex. To be effective, they need many levels of intervention. Researchers try to understand the issue, the history of the initiative, and the community in which it operates.
Ideally, local initiatives are planned and implemented with the involvement of many community members, including those from diverse backgrounds. Because of this, community evaluation is a participatory process involving a lot of collaboration and negotiation among many different people. Evaluation should take place from the beginning of an initiative. That way, it can offer ongoing information and feedback to better understand and improve the initiative.
Evaluation priorities (that is, what to evaluate) should be based on what's of most importance to community members, grantmakers, and the field.