Archive for August, 2012

Measuring/Managing Social Value

August 28, 2012

From my December 1, 2008 personal journal, written not long after the October 2008 SoCap conference. I’ve updated a few things that have changed in the intervening years.

Over the last month, I’ve been digesting what I learned at the Social Capital Markets conference at Fort Mason in San Francisco, and at the conference I attended just afterward, Bioneers, in Marin County. Bioneers (www.Bioneers.org) could be called Natural Capital Markets: it was much like the Social Capital Markets conference, with only a slight shift in emphasis and plenty of discussion of social value.

The main thing that impressed me at both of these conferences, apart from what I already knew about the caring passion I share with so many, is the huge contrast between that passion and the quality of the data that so many are basing major decisions on. Seeing this made me step back and think harder about how to shape my message.

First, though it may not seem like it initially, there is incredible practical value to be gained from taking the trouble to construct good measures. We do indeed manage what we measure, so whatever we measure becomes what we manage. If what we measure has nothing to do with our mission, vision, or values, then what we manage won’t have anything to do with them, either. And when the numbers we use as measures do not actually represent a constant unit amount that adds up the way the numbers do, we don’t have a clue what we’re measuring and could be managing just about anything.

This is not the way to proceed. First take-away: ask for more from your data. Don’t let it mislead you with superficial appearances. Dig deeper.

Second, to put it a little differently: percentages, scores, counts per capita, and the like are not measures with the same meaning or quality as measures of height, weight, time, temperature, or voltage. For over 50 years, however, we have been constructing measures mathematically equivalent to physical measures from ability tests, surveys, assessments, checklists, and similar instruments. The technical literature on this is widely available, and the methods have been mainstream for decades at ETS, ACT, and state and national departments of education around the world.
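To make that contrast concrete, here is a minimal sketch of my own (in Python, with made-up proportions) showing why a percentage is not an interval unit: the same 10-point gain covers very different distances on the log-odds (logit) scale in which the methods discussed below construct measures.

```python
# A minimal sketch (my illustration, not anyone's production code) of why
# percentages are not interval measures: equal percentage gains correspond
# to unequal gains on the underlying log-odds (logit) scale.
import math

def logit(p):
    """Log-odds of a proportion: an interval-scale re-expression of a percentage."""
    return math.log(p / (1.0 - p))

# A 10-point gain near the middle vs. near the top of the percentage scale:
print(round(logit(0.60) - logit(0.50), 2))  # 0.41 logits
print(round(logit(0.95) - logit(0.85), 2))  # 1.21 logits: same "10%", ~3x the distance
```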

Second take-away: did I say you should ask for more from your data? You can get it. A lot of people already do, though I don’t think they’re asking for nearly as much as they could get.

Third, though the massive numbers of percentages, scores, and counts per capita are not the measures we seek, they are indeed exactly the right place to start. I have seen over and over again, in education, health care, sociology, human resource management, and most recently in the UN Millennium Development Goals data, that people do know exactly what data will form a proper basis for the measurement systems they need.

Third take-away: (one more time!) ask for more from your data. It may conceal a wealth beyond what you ever guessed.

So what are we talking about? There are methods for creating measures whose numbers verifiably stand for a substantive unit amount that adds up the way one-inch blocks do (probabilistically, and within a range of error). If the instrument is properly calibrated and administered, the size and meaning of the unit will not change across the individuals or samples measured. You can reduce data volume dramatically, not only with no loss of information but with false appearances of information either indicated as error or flagged for further attention. You can calibrate a continuum of less to more that is reliably and reproducibly associated with, annotated by, and interpreted through your own indicators. And you can equate different collections of indicators that measure the same thing so that they do so in the same unit.
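For the curious, here is a rough simulation of that invariance claim, my own sketch rather than any package’s method: responses are generated from the dichotomous Rasch model, and item difficulties are then calibrated from two separate person samples using Rasch’s pairwise comparison approach. The two sets of estimates agree, within error, in the same logit unit.

```python
# Sketch: item calibrations from two independent samples should match,
# within error, when the data fit the Rasch model. Illustration only.
import math
import random

random.seed(1)
TRUE_BETAS = [-1.5, -0.5, 0.0, 0.5, 1.5]   # "true" item difficulties, in logits

def simulate(n_persons):
    """Generate dichotomous responses from the Rasch model."""
    data = []
    for _ in range(n_persons):
        theta = random.gauss(0.0, 1.0)      # person measure
        data.append([1 if random.random() < 1.0 / (1.0 + math.exp(-(theta - b)))
                     else 0 for b in TRUE_BETAS])
    return data

def pairwise_calibrate(data):
    """Pairwise estimation: beta_i - beta_j is approximated by log(n_ji / n_ij),
    where n_ij counts persons who succeed on item i while failing item j."""
    k = len(data[0])
    estimates = []
    for i in range(k):
        diffs = []
        for j in range(k):
            if i == j:
                continue
            n_ij = sum(1 for row in data if row[i] == 1 and row[j] == 0)
            n_ji = sum(1 for row in data if row[j] == 1 and row[i] == 0)
            if n_ij > 0 and n_ji > 0:
                diffs.append(math.log(n_ji / n_ij))
        estimates.append(sum(diffs) / k)    # dividing by k centers at mean difficulty
    return estimates

sample_a = pairwise_calibrate(simulate(500))
sample_b = pairwise_calibrate(simulate(500))
for t, a, b in zip(TRUE_BETAS, sample_a, sample_b):
    print(f"true {t:+.2f}   sample A {a:+.2f}   sample B {b:+.2f}")
```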

Different agencies using the same, different, or mixed collections of indicators in different countries or regions could assess their measures for comparability, and if they are of satisfactory quality, equate them so they measure in the same unit. That is, well-designed instruments written and administered in different languages routinely have their items calibrate in the same order and positions, giving the same meaning to the same unit of measurement. For instance, see the recent issue of the Journal of Applied Measurement ([link]) devoted to reports on the OECD’s Programme for International Student Assessment.
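The equating step itself can be sketched in a few lines, using made-up calibration values purely for illustration: when two instruments share a few anchor items, the mean difference in the anchors’ calibrations shifts one instrument onto the other’s scale.

```python
# Common-item equating sketch (illustrative values, not real calibrations):
# anchor items shared by two instruments put both on one scale.
form_a = {"item1": -0.8, "item2": 0.1, "item3": 0.9}       # logits, form A's scale
form_b = {"item2": 0.4, "item3": 1.2, "item4": -0.3}       # logits, form B's scale

anchors = form_a.keys() & form_b.keys()                    # items both forms share
shift = sum(form_a[i] - form_b[i] for i in anchors) / len(anchors)

form_b_equated = {i: b + shift for i, b in form_b.items()} # now in form A's unit
print(round(shift, 2))   # -0.3: form B's origin sat 0.3 logits above form A's
print(form_b_equated)    # anchors land at their form A positions
```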

This is not a data analysis strategy. It is an instrument calibration strategy. Once calibrated, the instrument can be deployed. We need to monitor its structure, but the point is to create a tool people can take out into the world and use like a thermometer or clock.

I’ve just been looking at Charity Navigator (for instance, [link]) and the UN’s Millennium Development Goals ([link]), along with the databases that have been assembled as measures of progress toward those goals ([link]). I would suppose these web sites show data in forms people are generally familiar with, so I’m working up analyses of the UN data to use as teaching tools.

You don’t have to take my word for any of this. It’s been documented ad nauseam in the academic literature for decades. Those interested can find out more than they ever wanted to know at http://www.Rasch.org, in the Wikipedia entry on the Rasch model, in the articles and books at JAMPress.com, or in dozens of academic journals and hundreds of books. Though I’ve done my share of adding to that literature, I’m less interested in continuing to do so than I am in making a tangible contribution to improving people’s lives.

Sorry to go on like this. I meant to keep this short. Anyway, there it is.

PS, for real geeks: for those of you serious about learning about measurement as it is rigorously and mathematically defined, look into taking Everett Smith’s measurement course at Statistics.com ([link]) or David Andrich’s academic units at the University of Western Australia ([link]). Available software includes Mike Linacre’s Winsteps, Andrich’s RUMM, and, at UC Berkeley, Mark Wilson’s ConQuest.

The methods Ev, Mike, David, and Mark teach have repeatedly been proven, both in mathematical theory and in real life, to be both necessary and sufficient in the construction of meaningful, practical measurement. Any number of ways of defining objectivity in measurement have been shown to reduce to the mathematical models they use. Why all the Chicago stuff? Because of Ben Wright. I’m helping (again) to organize a conference in his honor, to be held in Chicago next March. His work won him a Career Achievement Award from the Association of Test Publishers, and the coming conference will celebrate his foundational contributions to computerized measurement in health care.

As a final note, for those of you fearing reductionistic meaninglessness, look into my philosophical work. But enough…

Review of “Advancing Social Impact Investments Through Measurement”

August 24, 2012

Over the last few days, I have been reading several of the most recent issues of the Community Development Investment Review, especially volume 7, number 2, edited by David Erickson of the Federal Reserve Bank of San Francisco, which reports the proceedings of the March 21, 2011 conference in Washington, DC on advancing social impact investments through measurement. I am so glad to see this work that I am (truly) fairly trembling with excitement. I feel as though I’ve finally made my way home. There are so many points of contact, it’s hard to know where to start. After several days of concentrated deep breathing and close study of the CDIR, it’s now possible to formulate some coherent thoughts to share.

The CDIR papers start to sort out the complex issues involved in clarifying how measurement might contribute to the integration of impact investing and community development finance. I am heartened by the statement that “The goal of the Review is to bridge the gap between theory and practice and to enlist as many viewpoints as possible—government, nonprofits, financial institutions, and beneficiaries.” On the other hand, the omission of measurement scientists from that list of viewpoints adds another question to my long list of questions as to why measurement science is so routinely ignored by the very people who proclaim its importance. The situation is quite analogous to demanding more frequent conversational interactions from colleagues while ignoring the invention of the telephone and failing to provide the tools and network connections that would make those conversations possible.

The aims shared by the CDIR contributors and myself are evident in the fact that David Erickson opens his summary of the March 21, 2011 conference with the same quote from Robert Kennedy that I placed at the end of my 2009 article in Measurement (see the references below; all papers referenced are available on request if they are not already online). In that 2009 paper, in others I’ve published over the last several years, in presentations to my measurement colleagues at home and abroad, and in various entries in this blog, I take up virtually all of the major themes that arose at the DC conference: how better measurement can attract capital to needed areas, how the cost of measurement repels many investors, how government can help by means of standard setting and regulation, how diverse and ambiguous investor and stakeholder interests can be reconciled or clarified, and so on.

The difference, of course, is that I present these issues from the technical perspective of measurement and cannot speak authoritatively or specifically from the perspectives represented by the community development finance and impact investing fields. The bottom line take-away message for these fields from my perspective is this: unexamined assumptions may unnecessarily restrict assessments of problems and their potential solutions. As Salamon put it in his remarks in the CDIR proceedings from the Washington meeting (p. 43), “uncoordinated innovation not guided by a clear strategic concept can do more than lose its way: it can do actual harm.”

A clear strategic concept capable of coordinating innovations in social impact measurement is readily available. Multiple, highly valuable, and eminently practical measurement technologies have proven themselves in real world applications over the last 50 years. These technologies are well documented in the educational, psychological, sociological, and health care research literatures, as well as in the practical experience of high stakes testing for professional licensure and certification, for graduation, and for admissions.

Numerous reports show how to approach problems of quantification and standards with new degrees of rigor, transparency, meaningfulness, and flexibility. When measurement problems are not defined in terms of these technologies, solutions that may offer highly advantageous features are not considered. When the area of application is as far reaching and fundamental as social impact measurement, not taking new technologies into account is nothing short of tragic. I describe some of the new opportunities for you in a Technical Postscript, below.

In his Foreword to the CDIR proceedings issue, John Moon mentions having been at the 2009 SoCap event bringing together stakeholders from across the various social capital markets arenas. I was at the 2008 SoCap, and I came away from it with much the same impression as Moon, feeling that the palpable excitement in the air was more than tempered by the evident fact that people were often speaking at cross purposes, and that there did not seem to be a common object to the conversation. Moon, Erickson, and their colleagues have been in one position to sort out the issues involved, and I have been in another, but we are plainly on converging courses.

Though the science is in place and has been for decades, it will not and cannot amount to anything until the people who can best make use of it do so. The community development finance and impact investing fields are those people. Anyone interested in getting together for an informal conversation on topics of mutual interest should feel free to contact me.

Technical Postscript

There are at least six areas in efforts to advance social impact investments via measurement that will be most affected by contemporary methods. The first has to do with scale quality. I won’t go into the technical details, but numbers do not automatically stand for something that adds up the way they do. Mapping a substantive construct onto a number line requires specific technical expertise; there is no evidence of that expertise in any of the literature I’ve seen on social impact investing, or on measuring intangible assets. This is not an arbitrary bit of philosophical esoterica or technical nicety. This is one of those areas where the practical value of scientific rigor and precision comes into its own. It makes all the difference in being able to realize goals for measurement, investment, and redefining profit in terms of social impacts.
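For reference, the simplest of the models this literature rests on, the dichotomous Rasch model, makes that mapping explicit: the probability of a success depends only on the difference between a person measure and an item calibration, so differences in logits constitute the constant unit.

```latex
P(X_{ni} = 1 \mid \theta_n, \beta_i) = \frac{e^{\theta_n - \beta_i}}{1 + e^{\theta_n - \beta_i}}
```

Here θ_n is the measure for person n and β_i is the calibration of indicator i, both expressed in logits on a single number line.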

A second area in which thinking on social impact measurement will be profoundly altered by current scaling methods concerns the capacity to reduce data volume with no loss of information. In current systems, each indicator has its own separate metric. Data volume quickly multiplies when tracking separate organizations for each of several time periods in various locales. Given sufficient adherence to data quality and meaningfulness requirements, today’s scaling methods allow these indicators to be combined into a single composite measure—from which each individual observation can be inferred.
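To illustrate the inference in that last sentence, here is a minimal sketch with assumed (hypothetical) item calibrations: in the Rasch model the total score across calibrated indicators is a sufficient statistic, so one measure per case can replace the full record, and the expected value of each individual indicator can be read back off the measure.

```python
# Sketch of Rasch data reduction (assumed calibrations, illustration only):
# recover the measure implied by a total score, then infer each indicator back.
import math

BETAS = [-1.0, -0.3, 0.2, 0.8, 1.5]   # hypothetical calibrated indicators (logits)

def expected(theta, beta):
    """Model probability of success on one indicator."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def measure_from_score(score, betas, tol=1e-6):
    """Solve sum_i expected(theta, beta_i) = score by Newton-Raphson.
    Works for scores strictly between 0 and the number of indicators."""
    theta = 0.0
    for _ in range(100):
        ps = [expected(theta, b) for b in betas]
        resid = sum(ps) - score
        info = sum(p * (1.0 - p) for p in ps)   # Fisher information
        theta -= resid / info
        if abs(resid / info) < tol:
            break
    return theta

theta = measure_from_score(3, BETAS)                  # raw score 3 out of 5
print(round(theta, 2))                                # one number per case
print([round(expected(theta, b), 2) for b in BETAS])  # each indicator, inferred
```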

Elaborating this second point a bit further, I noted that some speakers at the 2011 conference in Washington thought reducing data volume is a matter of limiting the number of indicators that are tracked. This strategy is self-defeating, however, as having fewer independent observations increases uncertainty and risk. It would be far better to set up systems in which the metrics are designed so as to incorporate the amount of uncertainty that can be tolerated in any given decision support application.
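A back-of-envelope sketch of that uncertainty point, with illustrative numbers of my own: a Rasch measure’s standard error is roughly one over the square root of the total Fisher information, so cutting indicators inflates error rather than simplifying anything.

```python
# Sketch: fewer independent indicators means a larger standard error of
# measurement (SEM). Indicator positions here are invented for illustration.
import math

def sem(theta, betas):
    """Model standard error of measurement at measure theta."""
    info = 0.0
    for b in betas:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        info += p * (1.0 - p)                  # Fisher information per indicator
    return 1.0 / math.sqrt(info)

few  = [-0.5, 0.0, 0.5]                        # 3 well-targeted indicators
many = [b / 10.0 for b in range(-15, 16, 2)]   # 16 indicators on the same span
print(round(sem(0.0, few), 2))    # ~1.18 logits of error
print(round(sem(0.0, many), 2))   # ~0.55 logits: more indicators, less risk
```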

The third area I have in mind deals with the diverse spectrum of varying interests and preferences brought to the table by investors, beneficiaries, and other stakeholders. Contemporary approaches in measurement make it possible to adapt the content of the particular indicators (counts or frequencies of events, or responses to survey questions or test items) to the needs of the user, without compromising the comparability of the resulting quantitative measure. This feature makes it possible to mass customize the content of the metrics employed depending on the substantive nature of the needs at that time and place.
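The following is a simplified sketch of that adaptive logic, with a hypothetical indicator bank (the item names and calibrations are mine, purely for illustration): each step administers the indicator that is most informative at the current provisional measure, so different users can see different content while the resulting measures remain in one comparable unit.

```python
# Adaptive selection sketch (hypothetical indicator bank, illustration only):
# pick whichever indicator is most informative at the current estimate.
import math

def information(theta, beta):
    p = 1.0 / (1.0 + math.exp(-(theta - beta)))
    return p * (1.0 - p)                       # indicator information at theta

bank = {"wages": -1.2, "savings": -0.4, "credit": 0.3, "equity": 1.1}
theta_est = 0.0                                # provisional measure
administered = set()

for _ in range(2):                             # administer the two best-targeted
    item = max((i for i in bank if i not in administered),
               key=lambda i: information(theta_est, bank[i]))
    administered.add(item)
    print("administer:", item)                 # "credit", then "savings"
    # ...score the response and update theta_est before the next step...
```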

Fourth, it is well known that different people judging performances or assigning numbers to observations bring different personal standards to bear as they make their ratings. Contemporary measurement methods enable the evaluation and scaling of raters and judges relative to one another, when data are gathered in a manner facilitating such comparisons. The end result is a basis for fair comparisons, instead of scores that vary depending more on which rater is observing than on the quality of the performance.
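One common formalization of this is the many-facet Rasch model introduced by Linacre; in its simplest dichotomous form, a rater severity term enters the model alongside the person measure and the item calibration:

```latex
\log\left(\frac{P_{nij}}{1 - P_{nij}}\right) = \theta_n - \beta_i - \lambda_j
```

Here θ_n is the person measure, β_i the item calibration, and λ_j the severity of rater j; once the severities are calibrated, their effects can be removed, so measures no longer depend on which judge happened to be assigned.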

Fifth, much of the discussion at the conference in Washington last year emphasized the need for shared data formatting and reporting standards. As might be guessed from the four areas I’ve already described, significant advances have occurred in standard-setting methods. The CDIR proceedings suggest that the Treasury Department should house a new institute for social impact measurement standards. In a series of publications over the last few years, I have suggested the need for an Intangible Assets Metric System to NIST and NSF (see below for references and links; all papers are available on request). That suggestion comes up again in my third-prize-winning entry in the 2011 World Standards Day paper competition, sponsored by NIST and SES (the Society for Standards Professionals), entitled “What the World Needs Now: A Bold Plan for New Standards.” (See below for the link.)

Sixth, as noted by Salamon (p. 43), “metrics are not neutral. They not only measure impact, they can also shape it.” Though this is likely not exactly what Salamon meant, one of the most exciting areas of measurement applications in education in recent years, led in many ways by my colleague Mark Wilson and his group at UC Berkeley, concerns exactly this feedback loop between measurement and impact. In education, it has become apparent that test scaling reveals the order in which lessons are learned. Difficult problems that require mastery of easier problems are necessarily answered correctly less often than the easier ones. When the difficulty order of test questions in a given subject remains constant over time and across thousands of students, one may infer that the scale reveals the path of least resistance. Individualizing instruction by targeting lessons at the student’s measure has given rise to a concept of formative assessment, distinct from the summative assessment of accountability applications. I suspect this kind of distinction may also prove of value in social impact applications.
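To make the ordering claim tangible, here is a small sketch with fabricated calibrations: if the difficulty order of items reproduces from one cohort to the next, the scale is mapping a stable path from easier to harder lessons.

```python
# Sketch of the ordering check (fabricated calibrations, illustration only):
# a stable difficulty order across cohorts supports the "path" interpretation.
cohort_2010 = {"count": -2.1, "add": -0.9, "multiply": 0.4, "factor": 1.8}
cohort_2011 = {"count": -1.9, "add": -1.0, "multiply": 0.5, "factor": 1.7}

order_a = sorted(cohort_2010, key=cohort_2010.get)
order_b = sorted(cohort_2011, key=cohort_2011.get)
print(order_a == order_b)   # True: same easier-to-harder order, year over year
```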

Relevant Publications and Presentations

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2004, January 22). Bringing capital to life via measurement: A contribution to the new economics. In R. Smith (Chair), Session 3.3B, Rasch Models in Economics and Marketing. Second International Conference on Measurement in Health, Education, Psychology, and Marketing: Developments with Rasch Models, The International Laboratory for Measurement in the Social Sciences, School of Education, Murdoch University, Perth, Western Australia.

Fisher, W. P., Jr. (2005, August 1-3). Data standards for living human, social, and natural capital. In Session G: Concluding Discussion, Future Plans, Policy, etc. Conference on Entrepreneurship and Human Rights [http://www.fordham.edu/economics/vinod/ehr05.htm], Pope Auditorium, Lowenstein Bldg, Fordham University.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2008, 3-5 September). New metrological horizons: Invariant reference standards for instruments measuring human, social, and natural capital. Presented at the 12th International Measurement Confederation (IMEKO) TC1-TC7 Joint Symposium on Man, Science, and Measurement, Annecy, France: University of Savoie.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST critical national need idea white paper: Metrological infrastructure for human, social, and natural capital (Tech. Rep., http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1), http://iopscience.iop.org/1742-6596/238/1/012016/pdf/1742-6596_238_1_012016.pdf.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In N. Brown, B. Duckor, K. Draney & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 2 (pp. 1-27). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2011). Measuring genuine progress by scaling economic indicators to think global & act local: An example from the UN Millennium Development Goals project. LivingCapitalMetrics.com. Retrieved 18 January 2011, from Social Science Research Network: http://ssrn.com/abstract=1739386.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr., & Stenner, A. J. (2011, January). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 25 October 2011, from National Science Foundation: http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36.

Fisher, W. P., Jr., & Stenner, A. J. (2011, August 31 to September 2). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf, Jena, Germany.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.