The Evidence Generation initiative’s team of Graduate Fellows worked with John Jay faculty and staff to consult with nonprofit community-based organizations in the youth justice sector in order to improve each agency’s readiness for rigorous evaluation. Before they began working with affiliated agencies, the Graduate Fellows received training in applied evaluation skills. They applied these skills in building the analytic capacities of participating agencies.
The process began when we visited an affiliated agency to observe routine operations, interview staff, and collect documents and other materials to assist us in formulating our approach. The process was organized as seven steps. Steps 1 and 2 represented the “diagnostic phase,” while Steps 3 through 6 made up the “implementation phase.” Agencies completing the first six steps then had the option of continuing to a seventh step, which was an extended opportunity for affiliated agencies to propose one or more special projects to our team.

Step 1. PEvGen
The Evidence Generation initiative helped affiliated agencies to develop the skills and resources needed to create a better evidence base. The first step in the process was to determine whether the agency already possessed some of the key skills and resources. The EvGen team conducted this assessment with a tool developed by John Jay, called the PEvGen, or “Protocol for Evidence Generation.”
PEvGen was inspired by the well-known program evaluation tool from Vanderbilt University called the “Standardized Program Evaluation Protocol,” or “SPEP.” The SPEP was developed by Mark Lipsey and his colleagues as a protocol for scoring youth-serving agencies on how well they use practices that have already been shown to be effective in reducing recidivism. The SPEP is widely admired for its flexibility – it can be applied in a variety of different programs and practice models. The SPEP, however, cannot be used to address or guide the measurement of all agency practices and procedures. It is designed for behavioral interventions focused on recidivism, and it scores program models based on already-existing research.
In a nod to the SPEP, the Evidence Generation initiative developed the PEvGen tool as a checklist as well. It is a systematic checklist for assessing the extent to which an agency has developed the tools and resources needed for high-quality evaluation. The tool reviews the key elements necessary for an agency to participate in an evaluation and it assigns values for the agency’s performance on these dimensions, resulting in an overall score.
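To make the checklist idea concrete, a protocol of this kind can be sketched as a simple scoring routine. The dimension names and point values below are hypothetical illustrations, not the actual PEvGen items or weights.

```python
# Hypothetical sketch of a checklist-style readiness protocol.
# The dimension names and maximum point values are illustrative only;
# they are not the actual PEvGen dimensions.

PEVGEN_DIMENSIONS = {
    "written_program_model": 2,       # hypothetical max points
    "routine_data_collection": 3,
    "staff_trained_in_data_entry": 2,
    "outcome_measures_defined": 3,
}

def score_agency(ratings):
    """Sum the rated points, capped at each dimension's maximum."""
    total = 0
    for dimension, max_points in PEVGEN_DIMENSIONS.items():
        total += min(ratings.get(dimension, 0), max_points)
    return total

max_score = sum(PEVGEN_DIMENSIONS.values())
ratings = {"written_program_model": 2, "routine_data_collection": 1}
print(f"{score_agency(ratings)} / {max_score}")  # 3 / 10
```

The essential feature is the one the text describes: each readiness dimension receives a value, and the values roll up into a single overall score.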
Step 2. Documentation of Routine Practices
The second step in the EvGen process was to document an agency’s routine practices. We began by visiting affiliated agencies and programs, where we examined documents, interviewed staff, and compiled other information needed to document the routine practices of the agency or program. This served a variety of purposes:
- Forming a clear understanding of how the program or practice model works. In other words, what actually happens on a day-to-day basis?
- Encouraging the agency to articulate its practices and procedures in a very detailed way.
- Allowing agency staff members to describe their accumulated experience in working for the program, information that is not often found in a brochure, website, or agency mission statement.
- Helping staff to discover whether actual agency practices differ from stated routines, giving the agency a more practical understanding of its own work.
The documentation of routine practices played an important role in guiding subsequent activities. Documenting routine practices can reveal the sort of problems that would undermine an actual collaboration between the agency and a team of outside researchers.
Step 3. Theory of Change
A theory of change, sometimes called a program theory, is a set of testable propositions about how a program is supposed to affect a set of conditions or behaviors. A theory of change describes the process by which change is produced, articulating how and why a set of activities is expected to affect participants. To accomplish the goal or goals embodied by a theory of change, the theory must be plausible, doable, testable, and meaningful.
A properly developed theory of change should guide a program’s daily activities and it should provide a clear framework for evaluation. This includes the development of data collection routines and measures that should demonstrate the effectiveness of a program’s intervention. With an appropriate theory of change, an evaluation project is more likely to measure program components in a way that leads to strong conclusions about cause and effect.
An effective theory of change should be based on knowledge of the research literature and best practices, with strategies designed to address a specific problem in a specific context. Theories of change should be developed by systematically organizing what is known about a particular problem and how a program is designed to solve the problem.
Four basic steps can be used to develop a theory of change:
- State the problem that needs to be addressed;
- Identify the program’s goals and objectives and how they address the problem;
- Specify what actions will be taken to achieve those goals and objectives; and
- Clarify the rationale for taking those actions.
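The four development steps above can be captured as a simple structured record. The field names mirror the steps; the example program content is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TheoryOfChange:
    """Minimal record mirroring the four development steps above."""
    problem: str                                   # Step 1: the problem to address
    goals: list = field(default_factory=list)      # Step 2: goals and objectives
    actions: list = field(default_factory=list)    # Step 3: actions to achieve them
    rationale: str = ""                            # Step 4: why those actions

# Hypothetical example program, for illustration only.
toc = TheoryOfChange(
    problem="High rates of school disengagement among referred youth",
    goals=["Increase school attendance among participants"],
    actions=["Weekly mentoring sessions", "Family outreach calls"],
    rationale="Mentoring is associated with improved attendance in prior research",
)
```

Writing the theory down in this structured form makes the cause-and-effect claims explicit and testable, rather than leaving them implicit in a mission statement.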
Many organizations report that they have already developed a theory of change, or something like it. In many cases, however, a researcher will find that an agency’s theory of change is more a statement of aspirations than a detailed articulation of cause and effect. The EvGen initiative helped affiliated agencies to develop theories of change compatible with evaluation research. We never characterized theories of change as synonymous with mission statements.
Step 4. Logic Model
A logic model is a detailed, visual depiction of a program’s underlying theory of change, illustrating how the activities of a specific program lead to certain outcomes. Logic models are tools for determining which components of a program are designed to achieve specific outcomes, and how each particular component fits into the overall program strategy. The content of a logic model depends on the purpose, context, and intended clients or targets of the program.
At a minimum, however, a logic model should include:
- Inputs (resources);
- Activities (services offered or strategies pursued);
- Outputs (immediate, measurable results from each activity);
- Outcomes (intermediate and long-term results of each activity); and
- Impact (the long-term and/or systemic change achieved).
When designing logic models, it is often helpful to begin with a series of “if/then” statements as a means of clarifying how the goals of the program can be achieved through its proposed activities. This can also serve as a means of designing activities that will lead to outcomes the program can realistically achieve within the designated time. Logic models are most useful when they can be linked to specific evaluation measures.
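The five components listed above can be strung together mechanically to produce the kind of “if/then” chain just described. The five keys come from the list above; the example program content is a hypothetical illustration.

```python
# The five components come from the logic-model list above; the example
# program content is hypothetical.
logic_model = {
    "inputs": ["Two trained mentors"],
    "activities": ["Weekly mentoring sessions"],
    "outputs": ["Sessions delivered each month"],
    "outcomes": ["Improved school attendance at six months"],
    "impact": ["Reduced justice-system involvement"],
}

def if_then_statements(model):
    """Render adjacent components as the 'if/then' chain described above."""
    keys = list(model.keys())
    return [
        f"If {model[a][0]}, then {model[b][0]}."
        for a, b in zip(keys, keys[1:])
    ]

for line in if_then_statements(logic_model):
    print(line)
```

Reading the chain aloud is a quick plausibility check: any link that sounds implausible points to an activity or outcome that needs rethinking, or to a missing measure.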
Step 5. Measurement Matrix
Since the principal goal of the Evidence Generation initiative was to assist organizations in developing evidence to support their program activities, a review of their existing measures was an important product of our process. Specific measures were linked to each agency’s logic model by a measurement matrix that summarized the measures and the data necessary to create them.
In many cases, affiliated agencies already collect a large quantity of information, and much of it can be used to produce essential measures. The measurement matrix identifies these existing sources of information. Few agencies, on the other hand, are able to collect the type and amount of data necessary to create appropriate measures for all of their key program components. Our measurement matrix identified the gaps and we used it to discuss recommended strategies for data enhancement with staff from affiliated agencies.
Each matrix included several ratings that reflected our assessment of data reliability and importance. “Reliability” estimates whether an agency is capable of collecting accurate and consistent measures based on existing routines. “Importance” balances what can be learned from individual measures against our assessment of how difficult it would be for an organization to collect reliable data.
The measurement matrix also helps agencies to distinguish between process and outcome measures. Both types of measurement, of course, are important for evaluation research.
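A measurement matrix of this kind can be represented as a small table. The measures, data sources, 1–3 rating scale, and gap rule below are illustrative assumptions, not the initiative’s actual format.

```python
# Hypothetical measurement-matrix rows. Columns: measure, linked logic-model
# component, data source, reliability (1-3), importance (1-3), measure type.
matrix = [
    ("Sessions attended", "outputs", "attendance log", 3, 3, "process"),
    ("School attendance rate", "outcomes", "school records", 2, 3, "outcome"),
    ("Family engagement", "outcomes", None, 1, 2, "outcome"),  # no source yet
]

def data_gaps(rows):
    """Flag measures with no existing data source: candidates for data enhancement."""
    return [measure for (measure, _, source, *_rest) in rows if source is None]

print(data_gaps(matrix))  # ['Family engagement']
```

The gap list plays the role described in the text: it identifies where existing data collection falls short, so that data-enhancement strategies can be discussed with agency staff.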
Step 6. Recommendations
After completing Steps 1 through 5, the Evidence Generation team crafted a specific set of recommendations for each affiliated agency. The recommendations reflected what we learned about the agency’s needs, together with our suggestions for how the agency could proceed with building up its evaluation capacity.
Recommendations were crafted with full knowledge of the constraints facing each agency and what we believed were its most pressing concerns. If an agency was in serious need of basic data collection, the recommendations focused on that task. If an agency had sound data resources but needed to begin implementing a strategy for formal evaluation, the recommendations said so. The recommendations drew on material from all previous steps, including the results of the PEvGen, the theory of change, the logic model, the routine practices document, and the measurement matrix.
Step 7. Special Projects
Once an affiliated agency had worked through the first six steps of the Evidence Generation process, the agency was invited to begin proposing special projects. Agencies were eligible to participate in more than one project over time, but it was likely that they would be selected for at most one special project per year.
The special projects in Step 7 had several benefits: 1) they allowed the Evidence Generation initiative to work with each agency over a long period and at a natural pace, according to each agency’s expectations and needs for assistance; 2) they allowed us to spread our efforts among affiliated agencies more evenly and more strategically; and 3) they allowed us to assist agencies with tasks that were important for future evaluation efforts but not directly related to the first six steps of the process.
Implementation is Key
The purpose of the entire Evidence Generation process was to facilitate each affiliated agency’s pursuit of evidence-informed practice, and ultimately, to empower the consistent implementation of evidence-informed practices. Our approach was compatible with that of the Ontario Centre of Excellence for Child and Youth Mental Health in Canada.