A routine part of most administrators’ jobs is data collection for reporting: funders and agencies need to know, in numbers, that their money is being spent effectively. Internally, museums find value in knowing whom they reach and to what extent they achieve their outcome goals. Granting organizations and other funders increasingly require proof that learning has occurred.
Pegging success to numbers creates a troubling outcomes-based environment in which statistics on visitors, revenue, and school groups become benchmarks to reach and exceed. The chief report many site managers make is the increase in event attendance from last year to this year. Your continued budget depends on that increase, not on the quality of your programs, which makes these quantitative assessments of visitation the measure of success. The problem is that qualitative measurements of engagement, relationship building, or community good are hard to come by and will not satisfy most oversight bodies.
This kind of data collection and reporting is not really the type of evaluation I’m talking about. (And I point this out because too many people don’t know the difference.) I mean the use of focus groups, surveys, prototypes, and interviews to collect information on visitor expectations, behavior, and the ways they engage (or potentially engage) with exhibits and programs. But the history museums I am familiar with are (necessarily) obsessed with collecting visitor data for reporting, and do not really have time to make other evaluation tools a part of their institutional cultures. I asked a friend who works at my favorite state history museum whether evaluation is part of exhibit development. He replied, with a knowing roll of the eyes, “yeah, we don’t do that.” No time, no staff, no budget, and no appreciation among administrators for how evaluation tools can enhance the quality of the museum visit. Why prove that learning occurs when the only thing that matters is that people pay admission as they walk through the door?
Joy Kubarek sympathizes in her article “Building Staff Capacity to Evaluate in Museum Education”: “how can museum educators, who already juggle multiple responsibilities, evaluate the learning outcomes of their programs and subsequently demonstrate their impact to others?” (9) Kubarek worked to embed an evaluation culture at her institution, the Shedd Aquarium (again, not a history museum), and described a two-pronged approach.
First, evaluation programs shouldn’t serve just one exhibit or event, but should “establish valid, reliable measurements of learning at the aquarium including consistent qualitative and quantitative metrics for learning.” Evaluations, then, should not serve only a single instance; they should also contribute to data sets and predictive models that support commonly understood indicators and interpretive goals across the institution. (In my history museum, I wouldn’t use the term “metrics for learning”; I would instead develop a robust metric for meaningfulness and engagement.)
Second, Kubarek advocates building staff capacity to foster an evaluation culture. Don’t just hire an evaluator for a one-off project; give staff the tools and the time to do evaluations themselves. She suggests an evaluation “Toolkit” that offers staff members simple guidelines for scoping a study; templates for instruments, including consistent survey questions and coding schemes; observation checklists; training materials; access to an evaluation professional; standards for evidence; and schedules for analysis, review, and reflection. The key is making evaluation tools accessible to staff across the museum and, more importantly, ensuring that data and evidence are used to demonstrate effectiveness. That elusive state, staff buy-in, will follow.
My takeaways:
*Data needs to be an essential product. It proves achievement of outcome goals to funders and stakeholders.
*Staff need to be trained to incorporate evaluation techniques into all parts of the process. This leads to staff ownership. (In turn, it has to be understood and insisted upon by administrators.)
*Standards for questions, tools, and techniques need to be established (yet remain flexible) so that comparisons may be made across the institution, a common dataset established, and the institution may coalesce around a new way to understand common interpretive outcome goals.
In part 2, I’ll think through potential evaluations on a certain Civil War exhibit.
See: Kubarek, Joy. “Building Staff Capacity to Evaluate in Museum Education.” Journal of Museum Education 40(1): 8–12.
(Please note: Evaluation is a highly complex and vibrant discipline in the museum world, and my descriptions here are beneath basic. The problem, however, is that history museums, of all the museum types, seem to employ evaluation the least. Some do, granted, but outside of Colonial Williamsburg, the National Park Service, and several places around New York City, you’d be hard-pressed to find one that does so consistently. As far as I know, no history museum that I have any relationship with or knowledge of has evaluation as part of its institutional culture. This is what I’m advocating for here.)