Assignment from Wired’s Predict What’s Next page: “Imagine the Future of Medical Bills”

March 20, 2010

William P. Fisher, Jr.

william@livingcapitalmetrics.com
New Orleans, Louisiana
20 March 2010

Consider the following, formulated in response to Wired magazine’s 18.04 request for ideas on the future of medical bills, for possible use on the Predict What’s Next page. For background on the concepts presented here, see previous posts in this blog, such as https://livingcapitalmetrics.wordpress.com/2010/01/13/reinventing-capitalism/.

Visualize an online image of a Maieutic Renaissance Bank’s Monthly Living Capital Stock, Investment, and Income Report. The report is shown projected as a vertical plane in the space above an antique desk. Credits and debits to and from Mary Smith’s health capital account are listed, along with similar information on all of her capital accounts. Lying on the desk is a personalized MRB Living Capital Credit/Debit card, evidently somehow projecting the report from the eyes of Mary’s holographic image on it.

The report shows headings and entries for Mary Smith’s various capital accounts:

  • liquid (cash, checking and savings),
  • property (home, car, boat, rental, investments, etc.),
  • social capital (trust, honesty, commitment, loyalty, community building, etc.) credits/debits:
    • personal,
    • community’s,
    • employer’s,
    • regional,
    • national;
  • human capital:
    • literacy credits (shown in Lexiles; http://www.lexile.com),
    • numeracy credits (shown in Quantiles; http://www.quantiles.com),
    • occupational credits (hireability, promotability, retainability, productivity),
    • health credits/debits (genetic, cognitive reasoning, physical function, emotional function, chronic disease management status, etc.); and
  • natural capital:
    • carbon credits/debits,
    • local and global air, water, ecological diversity, and environmental quality share values.

Example social capital credits/debits shown in the report might include volunteering to build houses in N’Awlins Ninth Ward, tutoring fifth-graders in math, jury duty, voting, writing letters to Congress, or charitable donations (credits), on the one hand, or library fines, a parking ticket, unmaintained property, etc. (debits), on the other.

Natural capital credits might be increased or decreased depending on new efficiencies obtained in the electrical grid or in power generation, a newly installed solar panel, or by a recent major industrial accident, environmental disaster, hurricane, etc.

Mary’s share of the current value of the overall Genuine National Product, or Happiness Index, is broken out by each major form of capital (liquid, property, social, human, natural).

The monetary values of credits are shown at the going market rates, alongside the changes from last month, last year, and three years ago.

One entry could be a deferred income and property tax amount, given a social capital investment level above a recommended minimum. Another entry would show new profit potentials expressed in proportions of investments wasted due to inefficiencies, with suggestions for how these can be reduced, and with time to investment recovery and amount of new social capital generated also indicated.

The health capital portion of the report is broken out in a magnified overlay. Mary’s physical and emotional function measures are shown by an arrow pointing at a level on a vertical ruler. Other arrows point at the average levels for people her age (globally, nationally, regionally, and locally), for women overall and for women of different ages, living in different countries/cities, etc.

Mary’s diabetes-specific chronic disease management metric is shown at a high level, indicating her success in using diet and exercise to control her condition. Her life expectancy and lifetime earning potentials are shown, alongside comparable values for others.

Recent clinical visits for preventative diabetes and dental care would be shown as debits against one account and as an investment in her health capital account. The debits might be paid out of a sale of shares of stock from her quite high social or natural capital accounts, or from credits transferred from those to her checking account.

The cost of declining function over the next ten years, given typical aging patterns, is shown as lower rates of new capital investment in her stock and lower ROIs.

The cost of maintaining or improving function is shown in terms of the required investments of time and resources in exercise, equipment, etc., balanced against a constant rate of new investment and ROI.

Also shown:

A footnote could read: Given your recent completion of post-baccalaureate courses in political economy and advanced living capital finance, your increased stocks of literacy, numeracy, and occupational capital qualify you for a promotion or new positions currently compensated at annual rates 17.7% higher than your current one. Watch for tweets and beams from new investors interested in your rising stock!

A warning box: We all pay when dead capital lies unleveragable in currencies expressed in ordinal or otherwise nonstandard metrics! Visit http://www.CapitalResuscitationServices.com today to convert your unaccredited capital currencies into recognized value. (Not responsible for fraudulent misrepresentations of value should your credits prove incommensurable or counterfeit. Always check your vendor’s social capital valuations before investing in any stock offering. Go to http://www.Rasch.org for accredited capital metrics equating information, courses, texts, and consultants.)

Ad: Click here to put your occupational capital stock on the market now! Employers are bidding $$$, ¥¥¥ and €€€ on others at your valuation level!

Ad: You are only 110 Lexiles away from a literacy capital stock level on which others receive 23% higher investment returns! Enroll at BobsOnlineLiteracyCapitalBoosters.com now for your increased income tomorrow! (Past performance is not a guarantee of future results. Your returns may vary. Click here to see Bob’s current social capital valuations.)

Bottom line: Think global, act local! It is up to you to represent your shares in the global marketplace. Only you can demand the improvements you seek by shifting and/or intensifying your investments. Do so whenever you are dissatisfied with your own, your global and local business partners’, your community’s, your employer’s, your region’s, or your nation’s stock valuations.

For background on the concepts involved in this scenario, see:

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-9 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (Tech. Rep. No. http://www.livingcapitalmetrics.com/images/FisherNISTWhitePaper2.pdf). New Orleans: http://www.LivingCapitalMetrics.com.

Fisher, W. P., Jr. (2010). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 11, in press [http://www.livingcapitalmetrics.com/images/BringingHSN_FisherARMII.pdf].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Review of Spitzer’s Transforming Performance Measurement

January 25, 2010

Everyone interested in practical measurement applications needs to read Dean R. Spitzer’s 2007 book, Transforming performance measurement: Rethinking the way we measure and drive organizational success (New York, AMACOM). Spitzer describes how measurement, properly understood and implemented, can transform organizational performance by empowering and motivating individuals. Measurement understood in this way moves beyond quick fixes and fads to sustainable processes based on a measurement infrastructure that coordinates decisions and actions uniformly throughout the organization.

Measurement leadership, Spitzer says, is essential. He advocates, and many organizations have instituted, the C-suite position of Chief Measurement Officer (Chapter 9). This person is responsible for instituting and managing the four keys to transformational performance measurement (Chapters 5-8):

  • Context sets the tone by presenting the purpose of measurement as either negative (to inspect, control, report, manipulate) or positive (to give feedback, learn, improve).
  • Focus concentrates attention on what’s important, aligning measures with the mission, strategy, and with what needs to be managed, relative to the opportunities, capacities, and skills at hand.
  • Integration addresses the flow of measured information throughout the organization so that the covariations of different measures can be observed relative to the overall value created.
  • Interactivity speaks to the inherently social nature of the purposes of measurement, so that it embodies an alignment with the business model, strategy, and operational imperatives.

Spitzer takes a developmental approach to measurement improvement, providing a Measurement Maturity Assessment in Chapter 12, and also speaking to the issues of the “living company” raised by Arie de Geus’ classic book of that title. Plainly, the transformative potential of performance measurement is dependent on the maturational complexity of the context in which it is implemented.

Spitzer clearly outlines the ways in which each of the four keys and measurement leadership play into or hinder transformation and maturation. He also provides practical action plans and detailed guidelines, stresses the essential need for an experimental attitude toward evaluating change, speaks directly to the difficulty of measuring intangible assets like partnership, trust, skills, etc., and shows appreciation for the value of qualitative data.

Transforming Performance Measurement is not an academic treatise, though all sources are documented, with the endnotes and bibliography running to 25 pages. It was written for executives, managers, and entrepreneurs who need practical advice expressed in direct, simple terms. Further, the book shows no awareness of the technical capacities of measurement as these have been realized in numerous commercial applications in high stakes and licensure/certification testing over the last 50 years (Andrich, 2005; Bezruczko, 2005; Bond & Fox, 2007; Masters, 2007; Wilson, 2005). This can hardly be counted as a major criticism, since no book of this kind has yet been able to incorporate the often highly technical and mathematical presentations of advanced psychometrics.

That said, the sophistication of Spitzer’s conceptual framework and recommendations makes them remarkably ready to incorporate insights from measurement theory, testing practice, developmental psychology, and the history of science. Doing so will propel the strategies recommended in this book into widespread adoption and will be a catalyst for the emerging re-invention of capitalism. In this coming cultural revolution, intangible forms of capital will be brought to life in common currencies for the exchange of value that perform the same function performed by kilowatts, bushels, barrels, and hours for tangible forms of capital (Fisher, 2009, 2010).

Pretty big claim, you say? Yes, it is. Here’s how it’s going to work.

  • First, measurement leadership within organizations that implements policies and procedures that are context-sensitive, focused, integrated, and interactive (i.e., that have Spitzer’s keys in hand) will benefit from instruments calibrated to facilitate:
    • meaningful mapping of substantive, additive amounts of things measured on number lines;
    • data volume reductions on the order of 80-95% and more, with no loss of information;
    • organizational and individual learning trajectories defined by hierarchies of calibrated items;
    • measures that retain their meaning and values across changes in item content;
    • adapting instruments to people and organizations, instead of vice versa;
    • estimating the consistency, and the leniency or harshness, of ratings assigned by judges evaluating performance quality, with the ability to remove those effects from the performance measures made;
    • adjusting measurement precision to the needs of the task at hand, so that time and resources are not wasted in gathering too much or too little data; and
    • providing the high quality and uniform information needed for networked collective thinking able to keep pace with the demand for innovation.
  • Second, measurement leadership sensitive to the four keys across organizations, both within and across industries, will find value in:
    • establishing industry-wide metrological standards defining common metrics for the expression of the primary human, social, and natural capital constructs of interest;
    • lubricating the flow of human, social, and natural capital in efficient markets broadly defined so as to inform competitive pricing of intangible assets, products, and services; and
    • new opportunities for determining returns on investments in human, community, and environmental resource management.
  • Third, living companies need to be able to mature in a manner akin to human development over the lifespan. Theories of hierarchical complexity and developmental stage transitions that inform the rigorous measurement of cognitive and moral transformations (Dawson & Gabrielian, 2003) will increasingly find highly practical applications in organizational contexts.
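
The calibration properties listed under the first point come from the Rasch family of models treated in the sources cited above (Andrich, 2005; Bond & Fox, 2007). As a minimal sketch, not taken from Spitzer’s book, the dichotomous Rasch model expresses the probability of a successful response as a logistic function of the difference between a person measure and an item calibration, both in logits. The function names and the numbers below are hypothetical, chosen only to illustrate why measures can retain their meaning across changes in item content.

```python
import math

def rasch_probability(person_measure: float, item_difficulty: float) -> float:
    """Probability of success under the dichotomous Rasch model (logit units)."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

def log_odds(p: float) -> float:
    """Convert a probability back to log-odds (logits)."""
    return math.log(p / (1.0 - p))

def person_gap(measure_a: float, measure_b: float, item_difficulty: float) -> float:
    """Log-odds difference between two persons on one item."""
    return (log_odds(rasch_probability(measure_a, item_difficulty))
            - log_odds(rasch_probability(measure_b, item_difficulty)))

# Hypothetical person measures and item calibrations, in logits.
person_a, person_b = 1.5, 0.5
items = {"easy item": -1.0, "hard item": 2.0}

for name, difficulty in items.items():
    # The gap between the two persons is the same whichever item is used.
    print(name, round(person_gap(person_a, person_b, difficulty), 6))
```

In this sketch the comparison between the two persons comes out the same on the easy item and on the hard one, which is the sense in which calibrated instruments can be adapted to people and organizations without changing what the numbers mean.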

Leadership of the kind described by Spitzer is needed not just to make measurement contextualized, focused, integrated, and interactive—and so productive at new levels of effectiveness—but to apply systematically the technical, financial, and social resources needed to realize the rich potentials he describes for the transformation of organizations and empowerment of individuals. Spitzer’s program surpasses the usual focus on centralized statistical analyses and reports to demand the organization-wide dissemination of calibrated instruments that measure in common metrics. The flexibility, convenience, and scientific rigor of instruments calibrated to measure in units that really add up fit the bill exactly. Here’s to putting tools that work in the hands of those who know what to do with them!

References

Andrich, D. (2005). Georg Rasch: Mathematician and statistician. In K. Kempf-Leonard (Ed.), Encyclopedia of Social Measurement (Vol. 3, pp. 299-306). Amsterdam: Academic Press, Inc.

Bezruczko, N. (Ed.). (2005). Rasch measurement in health sciences. Maple Grove, MN: JAM Press.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2d edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Dawson, T. L., & Gabrielian, S. (2003, June). Developing conceptions of authority and contract across the life-span: Two perspectives. Developmental Review, 23(2), 162-218.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 11, in press [Pre-press version available at http://www.livingcapitalmetrics.com/images/BringingHSN_FisherARMII.pdf].

Masters, G. N. (2007). Special issue: Programme for International Student Assessment (PISA). Journal of Applied Measurement, 8(3), 235-335.

Spitzer, D. (2007). Transforming performance measurement: Rethinking the way we measure and drive organizational success. New York: AMACOM.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Contrasting Network Communities: Transparent, Efficient, and Invested vs Not

November 30, 2009

Different networks and different communities have different amounts of social capital going for them. As was originally described by Putnam (1993), some networks are organized hierarchically in a command-and-control structure. The top layers here are the autocrats, nobility, or bosses who run the show. Rigid conformity is the name of the game to get by. Those in power can make or break anyone. Market transactions in this context are characterized by the thumb on the scale, the bribe, and the kickback. Everyone is watching out for themselves.

At the opposite extreme are horizontal networks characterized by altruism and a sense that doing what’s good for everyone will eventually come back around to be good for me. The ideal here is a republic in which the law rules and everyone has the same price of entry into the market.

What I’d like to focus on is what’s going on in these horizontal networks. What makes one a more tightly-knit community than another? The closeness people feel should not be oppressive or claustrophobic or smothering. I’m thinking of community relations in which people feel safe, not just personally but creatively. How and when are diversity, dissent and innovation not just tolerated but celebrated? What makes it possible for a market in new ideas and new ways of doing things to take off?

And how does a community like this differ from another one that is just as horizontally structured but that does not give rise to anything at all creative?

The answers to all of these questions seem to me to hinge on the transparency, efficiency, and volume of investments in the relationships making up the networks. What kinds of investments? All kinds: emotional, social, intellectual, financial, spiritual, etc. Less transparent, inefficient, and low volume investments don’t have the thickness or complexity of the relationships that we can see through, that are well lubricated, and that are reinforced with frequent visits.

Putnam (1993, p. 183) has a very illuminating way of putting this: “The harmonies of a choral society illustrate how voluntary collaboration can create value that no individual, no matter how wealthy, no matter how wily, could produce alone.” Social capital is the coordination of thought and behavior that embodies trust, good will, and loyalty. Social capital is at play when an individual can rely on a thickly elaborated network of largely unknown others who provide clean water, nutritious food, effective public health practices (sanitation, restaurant inspections, and sewers), fire and police protection, a fair and just judiciary, electrical and information technology, affordably priced consumer goods, medical care, and who ensure the future by educating the next generation.

Life would be incredibly difficult if we could not trust others to obey traffic laws, or to do their jobs without taking unfair advantage of access to special knowledge (credit card numbers, cash, inside information), etc. But beyond that, we gain huge efficiencies in our lives because of the way our thoughts and behaviors are harmonized and coordinated on mass scales. We just simply do not have to worry about millions of things that are being taken care of, things that would completely freeze us in our tracks if they weren’t being done.

Thus, later on the same page, Putnam also observes that, “For political stability, for government effectiveness, and even for economic progress social capital may be even more important than physical or human capital.” And so, he says, “Where norms and networks of civic engagement are lacking, the outlook for collective action appears bleak.”

But what if two communities have identical norms and networks, but they differ in one crucial way: one relies on everyday language, used in conversations and written messages, to get things done, and the other has a new language, one with a heightened capacity for transparent meaningfulness and precision efficiency? Which one is likely to be more creative and innovative?

The question can be re-expressed in terms of Gladwell’s (2000) sense of the factors contributing to reaching a tipping point: the mavens, connectors, salespeople, and the stickiness of the messages. What if the mavens in two communities are equally knowledgeable, the connectors just as interconnected, and the salespeople just as persuasive, but messages are dramatically less sticky in one community than the other? In one network of networks, saying things once gets the right response 99% of the time, but in the other things have to be repeated seven times before the right response comes back even 50% of the time, and hardly anyone makes the effort to repeat things that many times. Guess which community will be safer, more creative, and thriving?
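
The arithmetic behind this contrast can be made explicit. The sketch below assumes, purely for illustration, that each repetition of a message is an independent attempt with the same chance of getting the right response; nothing in Gladwell’s account specifies this, and the function names are hypothetical. Under that assumption, a community in which seven repetitions produce the right response 50% of the time has a per-attempt success rate of under 10%, against 99% in the other.

```python
def per_attempt_rate(cumulative_rate: float, attempts: int) -> float:
    """Per-attempt success rate p such that 1 - (1 - p)**attempts equals cumulative_rate."""
    return 1.0 - (1.0 - cumulative_rate) ** (1.0 / attempts)

def expected_attempts(p: float) -> float:
    """Mean number of attempts until the first success (geometric distribution)."""
    return 1.0 / p

# Figures from the scenario in the text; independence of attempts is an assumption.
sticky = 0.99
slippery = per_attempt_rate(0.50, 7)

print(f"per-attempt rate in the less sticky community: {slippery:.4f}")
print(f"expected attempts, sticky community: {expected_attempts(sticky):.2f}")
print(f"expected attempts, less sticky community: {expected_attempts(slippery):.2f}")
```

On these assumptions the less sticky community needs roughly ten times as many attempts per successful exchange, which is one way of quantifying the cost of a language with low transparency and efficiency.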

All of this, of course, is just another way to bring out the importance of improved measurement for improving network quality and community life. As Surowiecki put it in The Wisdom of Crowds, the SARS virus was sequenced in a matter of weeks by a network of labs sharing common technical standards; without those standards, it would have taken any one of them months or years to do the same job alone. The messages these labs sent back and forth had an elevated stickiness index because they were more transparently and efficiently codified than messages were back in the days before the technical standards were created.

So the question emerges, given the means to create common languages with enhanced stickiness properties, such as we have in advanced measurement models, what kinds of creativity and innovation can we expect when these languages are introduced in the domains of human, social, and natural capital markets? That is the question of the age, it seems to me…

Gladwell, M. (2000). The tipping point: How little things can make a big difference. Boston: Little, Brown, and Company.

Putnam, R. D. (1993). Making democracy work: Civic traditions in modern Italy. Princeton, New Jersey: Princeton University Press.

Surowiecki, J. (2004). The wisdom of crowds: Why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. New York: Doubleday.

How Measurement, Contractual Trust, and Care Combine to Grow Social Capital: Creating Social Bonds We Can Really Trade On

October 14, 2009

Last Saturday, I went to Miami, Florida, at the invitation of Paula Lalinde (see her profile at http://www.linkedin.com/pub/paula-lalinde/11/677/a12) to attend MILITARY 101: Military Life and Combat Trauma As Seen By Troops, Their Families, and Clinicians. This day-long free presentation was sponsored by The Veterans Project of South Florida-SOFAR, in association with The Southeast Florida Association for Psychoanalytic Psychology, The Florida Psychoanalytic Society, the Soldiers & Veterans Initiative, and the Florida BRAIVE Fund. The goals of the session “included increased understanding of the unique experiences and culture related to the military experience during wartime, enhanced understanding of the assessment and treatment of trauma specific difficulties, including posttraumatic stress disorder, common co-occurring conditions, and demands of treatment on trauma clinicians.”

Listening to the speakers on Saturday morning at the Military 101 orientation, I was struck by what seemed to me to be a developmental trajectory implied in the construct of therapy-aided healing. I don’t recall if anyone explicitly mentioned Maslow’s hierarchy but it was certainly implied by the dysfunctionality that attends being pushed down to a basic mode of physical survival.

Also, the various references to the stigma of therapy reminded me of Paula’s arguments as to why a community-based preventative approach would be more accessible and likely more successful than individual programs focused on treating problems. (Echoes here of positive psychology and appreciative inquiry.)

In one part of the program, the ritualized formality of the soldier, family, and support groups’ stated promises to each other suggested a way of operationalizing the community-based approach. The expectations structuring relationships among the parties in this community are usually left largely unstated, unexamined, and unmanaged in all but the broadest, and most haphazard, ways (as most relationships’ expectations usually are). The hierarchy of needs and progressive movement towards greater self-actualization imply a developmental sequence of steps or stages that comprise the actual focus of the implied contracts between the members of the community. This sequence is a measurable continuum along which change can be monitored and managed, with all parties accountable for their contracted role in producing specific outcomes.

The process would begin from the predeployment baseline, taking that level of reliability and basis of trust existing in the community as what we want to maintain, what we might want to get back to, and what we definitely want to build on and surpass, in time. The contract would provide a black-and-white record of expectations. It would embody an image of the desired state of the relationships and it could be returned to repeatedly in communications and in visualizations over time. I’ll come back to this after describing the structure of the relational patterns we can expect to observe over the course of events.

The Saturday morning discussion made repeated reference to the role of chains in the combat experience: the chain of command, and the unit being a chain only as strong as its weakest link. The implication was that normal community life tolerates looser expectations, more informal associations, and involves more in the way of team interactions. The contrast between chains and teams brought to mind work by Wright (1995, 1996a, 1996b; Bainer, 1997) on the way the difficulties of the challenges we face influence how we organize ourselves into groups.

Chains tend to form when the challenge is very difficult and dangerous; here we have mountain climbers roped together, bucket brigades putting out fires, and people stretching out end-to-end over thin ice to rescue someone who’s fallen through. In combat, as was stressed repeatedly last Saturday, the chain is one requiring strict follow-through on orders and promises; lives are at risk and protecting them requires the most rigorous adherence to the most specific details in an operation.

Teams form when the challenge is not difficult and it is possible to coordinate a fluid response of partners whose roles shift in importance as the situation changes. Balls are passed and the lead is taken by each in turn, with others getting out of the way or providing supports that might be vitally important or merely convenient.

A third kind of group, packs, forms when the very nature of the problem is at issue; here, individuals take completely different approaches in an exploratory determination of what is at issue, and how it might be addressed. Examples include the Manhattan Project, for instance, where scientists following personal hunches went in their own directions looking for solutions to complex problems. Wolves and other hunting parties form packs when it is impossible to know where the game might be. And though the old joke says that the best place to look for lost keys is where there’s the most light, if you have others helping you, it’s best to split up and not be looking for them in the same place.
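
One way to see why the nature of the challenge matters so much to group form is a simple probability sketch. This is an illustrative simplification, not Wright’s own formulation: it assumes each member succeeds independently, that a chain succeeds only if every link holds, and that a pack succeeds if at least one member finds the answer. Teams are left out because their shifting roles do not reduce to either rule. The member probabilities are hypothetical.

```python
from math import prod

def chain_success(member_probs):
    """A chain holds only if every link holds (independence assumed)."""
    return prod(member_probs)

def pack_success(member_probs):
    """A pack succeeds if at least one member succeeds (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in member_probs)

# Hypothetical individual success probabilities for three group members.
members = [0.9, 0.8, 0.6]

print(f"chain: {chain_success(members):.3f}")
print(f"pack:  {pack_success(members):.3f}")
print(f"weakest member: {min(members)}, strongest member: {max(members)}")
```

Under these assumptions a chain is never more reliable than its weakest link, which fits the strict follow-through demanded in combat, while a pack is at least as likely to succeed as its strongest member, which is why splitting up pays off when nobody knows where the answer lies.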

After identifying these three major forms of organization, Wright (1996b) saw that individual groups might transition to and from different modes of organization as the nature of the problem changed. For instance, a 19th-century wagon train of settlers heading through the American West might function well as a team when everyone feels safe traveling along with a cavalry detachment, the road is good, the weather is pleasant, and food and water are plentiful. Given vulnerability to attacks by Native Americans, storms, accidents, lack of game, and/or muddy, rutted roads, however, the team might shift toward a chain formation and circle the wagons, with a later return to the team formation after the danger has passed. In the worst case scenario, disaster breaks the chain into individuals scattered like a pack to fend for themselves, with the limited hope of possibly re-uniting at some later time as a chain or team.

In the current context of the military, it would seem that deployment fragments the team, with the soldier training for a position in the chain of command in which she or he will function as a strong link for the unit. The family and support network can continue to function together and separately as teams to some extent, but the stress may require intermittent chain forms of organization. Further, the separation of the soldier from the family and support would seem to approach a pack level of organization for the three groups taken as a whole.

An initial contract between the parties would describe the functioning of the team at the predeployment stage, recognize the imminent breaking up of the team into chains and packs, and visualize the day when the team would be re-formed under conditions in which significant degrees of healing will be required to move out of the pack and chain formations. Perhaps there will be some need and means of countering the forcible boot camp enculturation with medicinal antidote therapies of equal but opposite force. Perhaps some elements of the boot camp experience could be safely modified without compromising the operational chain to set the stage for reintegrating the family and community team.

We would want to be able to draw qualitative information from all three groups as to the nature of their experiences at every stage. I think we would want to focus the information on descriptions of the extent to which each level in Maslow’s hierarchy is realized. This information would be used in the design of an assessment that would map out the changes over time, set up the evaluation framework, and guide interventions toward re-forming the team. Given their experience with the healing process, the presenters from last Saturday have obvious capacities for an informed perspective on what’s needed here. And what we build with their input would then also plainly feed back into the kind of presentation they did.

There will likely be signature events in the process that will be used to trigger new additions to the contract, as when the consequences of deployment, trauma, loss, or return relative to Maslow’s hierarchy can be predicted. That is, the contract will be a living document that changes as goals are reached or as new challenges emerge.

All of this, of course, is situated within the context of measures calibrated and shared across the community to inform contracts, treatment, expectations, etc., following the general metrological principles I outline in my published work (see references).

The aim is the consistent production of predictable amounts of impact within legally binding contractual relationships, such that the benefits produced in terms of individual functionality will attract investments from those in positions to employ those individuals, and from the wider society that wants to improve its overall level of mental health. One could imagine that counselors, social workers, and psychotherapists will sell social capital bonds at prices set by market forces on the basis of information analogous to the information currently available in financial markets, grocery stores, or auto sales lots. Instead of paying taxes, corporations would be required to have minimum levels of social capitalization. These levels might be set relative to the value the organization realizes from the services provided by public schools, hospitals, and governments in producing an educated, motivated, healthy workforce able to get to work on public roads, able to drink public water, and living in a publicly maintained quality environment.

There will be a lot more to say on this latter piece, following up on previous blogs here that take up the topic. The contractual groundwork that sets up the binding obligations for formal agreements is the thought of the day that emerged last weekend at the session in Miami. Good stuff, long way to go, as always….

References
Bainer, D. (1997, Winter). A comparison of four models of group efforts and their implications for establishing educational partnerships. Journal of Research in Rural Education, 13(3), 143-152.

Fisher, W. P., Jr. (1995). Opportunism, a first step to inevitability? Rasch Measurement Transactions, 9(2), 426 [http://www.rasch.org/rmt/rmt92.htm].

Fisher, W. P., Jr. (1996, Winter). The Rasch alternative. Rasch Measurement Transactions, 9(4), 466-467 [http://www.rasch.org/rmt/rmt94.htm].

Fisher, W. P., Jr. (1997a). Physical disability construct convergence across instruments: Towards a universal metric. Journal of Outcome Measurement, 1(2), 87-113.

Fisher, W. P., Jr. (1997b, June). What scale-free measurement means to health outcomes research. Physical Medicine & Rehabilitation State of the Art Reviews, 11(2), 357-373.

Fisher, W. P., Jr. (1998). A research program for accountable and patient-centered health status measures. Journal of Outcome Measurement, 2(3), 222-239.

Fisher, W. P., Jr. (2000). Objectivity in psychosocial measurement: What, why, how. Journal of Outcome Measurement, 4(2), 527-563 [http://www.livingcapitalmetrics.com/images/WP_Fisher_Jr_2000.pdf].

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-454.

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-179 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2008). Vanishing tricks and intellectualist condescension: Measurement, metrology, and the advancement of science. Rasch Measurement Transactions, 21(3), 1118-1121 [http://www.rasch.org/rmt/rmt213c.htm].

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Wright, B. D. (1995). Teams, packs, and chains. Rasch Measurement Transactions, 9(2), 432 [http://www.rasch.org/rmt/rmt92j.htm].

Wright, B. D. (1996a). Composition analysis: Teams, packs, chains. In G. Engelhard & M. Wilson (Eds.), Objective measurement: Theory into practice, Vol. 3 (pp. 241-264). Norwood, New Jersey: Ablex [http://www.rasch.org/memo67.htm].

Wright, B. D. (1996b). Pack to chain to team. Rasch Measurement Transactions, 10(2), 501 [http://www.rasch.org/rmt/rmt102s.htm].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Comments on the National Accounts of Well-Being

October 4, 2009

Well-designed measures of human, social, and natural capital captured in genuine progress indicators and properly put to work on the front lines of education, health care, social services, human and environmental resource management, etc. will harness the profit motive as a driver of growth in human potential, community trust, and environmental quality. But it is a tragic shame that so many well-meaning efforts ignore the decisive advantages of readily available measurement methods. For instance, consider the National Accounts of Well-Being (available at http://www.nationalaccountsofwellbeing.org/learn/download-report.html).

This report’s authors admirably say that “Advances in the measurement of well-being mean that now we can reclaim the true purpose of national accounts as initially conceived and shift towards more meaningful measures of progress and policy effectiveness which capture the real wealth of people’s lived experience” (p. 2).

Of course, as is evident in so many of my posts here and in the focus of my scientific publications, I couldn’t agree more!

But look at p. 61, where the authors say “we acknowledge that we need to be careful about interpreting the distribution of transformed scores. The curvilinear transformation results in scores at one end of the distribution being stretched more than those at the other end. This means that standard deviations, for example, of countries with higher scores, are likely to be distorted upwards. As the results section shows, however, this pattern was not in fact found in our data, so it appears that this distortion does not have too much effect. Furthermore, being overly concerned with the distortion would imply absolute faith that the original scales used in the questions are linear. Such faith would be ill-founded. For example, it is not necessarily the case that the difference between ‘all or almost all of the time’ (a response scored as ‘4’ for some questions) and ‘most of the time’ (scored as ‘3’), is the same as the difference between ‘most of the time’ (‘3’) and ‘some of the time’ (‘2’).”

This is just incredible, that the authors admit so baldly that their numbers don’t add up at the same time that they offer those very same numbers in voluminous masses to a global audience that largely takes them at face value. What exactly does it mean to most people “to be careful about interpreting the distribution of transformed scores”?

More to the point, what does it mean that faith in the linearity of the scales is ill-founded? They are doing arithmetic with those scores! There is no way to do that without assuming a constant difference between each number on the scale! Instead of offering cautions, the creators of anything as visible and important as the National Accounts of Well-Being ought to do the work needed to construct scales that measure in numbers that add up. Instead of saying they don’t know what the size of the unit of measurement is at different places on the ruler, why don’t they formulate a theory of the thing they want to measure, state testable hypotheses as to the constancy and invariance of the measuring unit, and conduct the experiments? It is not, after all, as though we do not have a mature measurement science that has been doing this kind of thing for more than 80 years.

By its very nature, the act of adding up ratings into a sum, and dividing by the number of ratings included in that sum to produce an average, demands the assumption of a common unit of measurement. But practical science does not function or advance on the basis of untested assumptions. Different numbers that add up to the same sum have to mean the same thing: 1+3+4=8=2+3+3, etc. So the capacity of the measurement system to support meaningful inferences as to the invariance of the unit has to be established experimentally.
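A small Python sketch may help show what is at stake. It assumes, purely for illustration, that the four rating categories are not equally spaced in the thing being measured; the spacing values and all names below are hypothetical and are not taken from the National Accounts of Well-Being.

```python
# Hypothetical positions of each rating category on the underlying
# continuum. These values are invented for illustration only; real
# spacing would have to be estimated from data.
HYPOTHETICAL_SPACING = {1: 0.0, 2: 1.0, 3: 1.5, 4: 3.0}


def raw_sum(ratings):
    """Add the ratings as if every step between categories were equal."""
    return sum(ratings)


def spaced_sum(ratings, spacing=HYPOTHETICAL_SPACING):
    """Add the hypothetical positions that the ratings stand for."""
    return sum(spacing[r] for r in ratings)


pattern_one = (1, 3, 4)
pattern_two = (2, 3, 3)

for pattern in (pattern_one, pattern_two):
    print(pattern, "raw sum:", raw_sum(pattern), "spaced sum:", spaced_sum(pattern))
```

Both patterns have the same raw sum, but under the invented spacing they stand for different amounts. Whether the real categories are evenly spaced or not is exactly the kind of question that has to be settled by experiment rather than assumed.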

There is no way to do arithmetic and compute statistics on ordinal rating data without assuming a constant, additive unit of measurement. Either unrealistic demands are being made on people’s cognitive abilities to stretch and shrink numeric units, or the value of the numbers as a basis for action is seriously and unnecessarily compromised.

A lot can be done to construct linear units of measurement that provide the meaningfulness desired by the developers of the National Accounts of Well-Being.

For explanations and illustrations of why scores and percentages are not measures, see https://livingcapitalmetrics.wordpress.com/2009/07/01/graphic-illustrations-of-why-scores-ratings-and-percentages-are-not-measures-part-one/.

The numerous advantages real measures have over raw ratings are listed at https://livingcapitalmetrics.wordpress.com/2009/07/06/table-comparing-scores-ratings-and-percentages-with-rasch-measures/.

To understand the contrast between dead and living capital as it applies to measures based in ordinal data from tests and rating scales, see http://www.rasch.org/rmt/rmt154j.htm.

For a peer-reviewed scientific paper on the theory and research supporting the viability of a metric system for human, social, and natural capital, see http://dx.doi.org/doi:10.1016/j.measurement.2009.03.014.

Posted today at HealthReform.gov

July 26, 2009

Any bill serious about health care reform needs to demand that the industry take advantage of readily available and dramatically improved measurement methods. We manage what we measure, and 99% of existing outcome measures are measures in name only. A kind of metric system for outcomes could provide standard product definitions, could effect huge reductions in information transaction costs, and could bring about a whole new magnitude of market efficiencies. Far from being a drag on the system, the profit motive is the best source of energy we have for driving innovation and resetting the cost-quality equation. But the disastrously low quality of our measures corrupts the data and prevents informed decision making by consumers and quality improvement experts. Any health care reform effort that does not demand improved measurement is doomed to fall far short of the potential that is within our reach. For more information, see www.Rasch.org, www.livingcapitalmetrics.com, http://dx.doi.org/10.1016/j.measurement.2009.03.014, or http://home.att.net/~rsmith.arm/RMHS_flyer.pdf.

Graphic Illustrations of Why Scores, Ratings, and Percentages Are Not Measures, Part Two

July 2, 2009

Part One of this two-part blog offered pictures illustrating the difference between numbers that stand for something that adds up and those that do not. The uncontrolled variation in the numbers that pass for measures in health care, education, satisfaction surveys, performance assessments, etc. is analogous to the variation in weights and measures found in Medieval European markets. It is well established that metric uniformity played a vital role in the industrial and scientific revolutions of the nineteenth century. Metrology will inevitably play a similarly central role in the economic and scientific revolutions taking place today.

Clients and students often express their need for measures that are manageable, understandable, and relevant. But sometimes it turns out that we do not understand what we think we understand. New understandings can make what previously seemed manageable and relevant appear unmanageable and irrelevant. Perhaps our misunderstandings about measurement will one day explain why we have failed to innovate and improve as much as we could have.

Of course, there are statistical methods for standardizing scores and proportions that make them comparable across different normal distributions, but I’ve never once seen them applied to employee, customer, or patient survey results reported to business or hospital managers. They certainly are not used in determining comparable proficiency levels of students under No Child Left Behind. Perhaps there are consultants and reporting systems that make standardized z-scores a routine part of their practices, but even if there are, why should anyone willingly base their decisions on the assumption that normal distributions have been obtained? Why not use methods that give the same result no matter how scores are distributed?

To bring the point home, if statistical standardization is a form of measurement, why don’t we use the z-scores for height distributions instead of the direct measures of how tall we each are? Plainly, the two kinds of numbers have different applications. Somehow, though, we try to make do without the measures in many applications involving tests and surveys, with the unfortunate consequence of much lost information and many lost opportunities for better communication.
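The height comparison can be sketched in a few lines of Python. The two groups of heights below are made-up numbers, and the function name is only illustrative; the point is simply that a z-score depends on the sample it is computed in, while the direct measure does not.

```python
from statistics import mean, pstdev


def z_score(value, sample):
    """Standardize a value against the mean and SD of a given sample."""
    return (value - mean(sample)) / pstdev(sample)


# Hypothetical heights in centimeters for two different groups
group_one = [160, 165, 170, 175, 180]
group_two = [175, 180, 185, 190, 195]

height = 180  # the same person, the same direct measure

print("z in group one:", round(z_score(height, group_one), 2))
print("z in group two:", round(z_score(height, group_two), 2))
```

The same 180 cm person is above average in the first group and below average in the second. The z-score says where someone stands in a particular crowd; the measure says how tall they are.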

Sometimes I wonder: if we gave managers, executives, and entrepreneurs a test on the meaning of the scores, percentages, and logits discussed in Part One, would many do any better on the parts they think they understand than on the parts they find unfamiliar? I suspect not. Some executives whose pay-for-performance bonuses are inflated by statistical accidents are going to be unhappy with what I’m going to say here, but, as I’ve been saying for years, clarifying financial implications will go a long way toward motivating the needed changes.

How could that be true? Well, consider the way we treat percentages. Imagine that three different hospitals see their patients’ percents agreement with a key survey item change as follows. Which one changed the most?

A. from 30.85% to 50.00%: a 19.15% change

B. from 6.68% to 15.87%: a 9.19% change

C. from 69.15% to 84.13%: a 14.98% change

As is illustrated in Figure 1 below, if all three pairs of survey administrations are located in the same measure distribution, the three changes can all be the same size in measured terms.

In this scenario, all the survey administrations shared the same standard deviation in the underlying measure distribution that the key item’s percentage was drawn from, and they started from different initial measures. Different ranges in the measures are associated with different parts of the sample’s distribution, and so different numbers and percentages of patients are associated with the same amount of measured change. It is easy to see that 100-unit measured gains in the range of 50-150 or 1000-1100 on the horizontal axis would scarcely amount to 1% changes, but the same measured gain in the middle of the distribution could be as much as 25%.

Figure 1. Different Percentages, Same Measures
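A minimal Python sketch may make the scenario concrete. It assumes, as the figure does, that the percentages are areas under a normal curve, and it uses the standard library’s NormalDist to convert each percentage back into a position in standard deviation units. The function and variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: each percentage is a cumulative area under a standard
# normal curve, so it maps back to a position in SD units.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def measured_change(pct_before, pct_after):
    """Return the change in SD units implied by two cumulative percentages."""
    before = STANDARD_NORMAL.inv_cdf(pct_before / 100.0)
    after = STANDARD_NORMAL.inv_cdf(pct_after / 100.0)
    return after - before


# The three hospitals' before and after percentages from the text
hospitals = {
    "A": (30.85, 50.00),
    "B": (6.68, 15.87),
    "C": (69.15, 84.13),
}

changes = {name: measured_change(b, a) for name, (b, a) in hospitals.items()}
for name, (before, after) in hospitals.items():
    print(f"{name}: {after - before:5.2f} points -> {changes[name]:.2f} SD units")
```

Under that assumption, each of the three changes works out to about half a standard deviation, even though the percentage changes range from about 9 to about 19 points.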

Figure 1 shows how the same measured gain can look wildly different when expressed as a percentage, depending on where the initial measure is positioned in the distribution. But what happens when percentage gains are situated in different distributions that have different patterns of variation?

More specifically, consider a situation in which three different hospitals see their percents agreement with a key survey item change as follows.

A. from 30.85% to 50.00%: a 19.15% change

B. from 30.85% to 50.00%: a 19.15% change

C. from 30.85% to 50.00%: a 19.15% change

Did one change more than the others? Of course, the three percentages are all the same, so we would naturally think that the three increases are all the same. But what if the standard deviations characterizing the three different hospitals’ score distributions are different?

Figure 2, below, shows that the three 19.15% changes could be associated with quite different measured gains. When the distribution is wider and the standard deviation is larger, any given percentage change will be associated with a larger measured change than in cases with narrower distributions and smaller standard deviations.

Figure 2. Same Percentage Gains, Different Measured Gains
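The same kind of sketch shows how identical percentage changes can hide different measured gains. Here the three standard deviations are hypothetical values in arbitrary measure units, chosen only to illustrate the effect; they are not read off the figure.

```python
from statistics import NormalDist


def measured_gain(pct_before, pct_after, sd):
    """Gain in measure units for a percentage change, given the distribution's SD."""
    z = NormalDist()  # standard normal, used to find positions in SD units
    return sd * (z.inv_cdf(pct_after / 100.0) - z.inv_cdf(pct_before / 100.0))


# Hypothetical standard deviations for the three hospitals' distributions
hospital_sds = {"A": 50.0, "B": 100.0, "C": 150.0}

# Every hospital reports the same change: 30.85% to 50.00%
gains = {name: measured_gain(30.85, 50.00, sd) for name, sd in hospital_sds.items()}
for name, gain in gains.items():
    print(f"{name}: SD {hospital_sds[name]:.0f} -> gain of {gain:.1f} measure units")
```

With these invented values, the hospital with three times the spread shows three times the measured gain for the very same 19.15% change.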

And if this is not enough evidence as to the foolhardiness of treating percentages as measures, bear with me through one more example. Imagine another situation in which three different hospitals see their percents agreement with a key survey item change as follows.

A. from 30.85% to 50.00%: a 19.15% change

B. from 36.96% to 50.00%: a 13.04% change

C. from 36.96% to 50.00%: a 13.04% change

Did one change more than the others? Plainly A obtains the largest percentage gain. But Figure 3 shows that, depending on the underlying distribution, A’s 19.15% gain might be a smaller measured change than either B’s or C’s. Further, B’s and C’s measures might not be identical, contrary to what would be expected from the percentages alone.

Figure 3. Percentages Completely at Odds with Measures
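One way this reversal could come about is sketched below in Python. The standard deviations are again hypothetical, assigned so that the hospital with the largest percentage gain has the narrowest distribution; nothing here is taken from the figure itself.

```python
from statistics import NormalDist


def measured_gain(pct_before, pct_after, sd):
    """Gain in measure units for a percentage change, given the distribution's SD."""
    z = NormalDist()  # standard normal, used to find positions in SD units
    return sd * (z.inv_cdf(pct_after / 100.0) - z.inv_cdf(pct_before / 100.0))


# (percent before, percent after, hypothetical SD in measure units)
hospitals = {
    "A": (30.85, 50.00, 50.0),
    "B": (36.96, 50.00, 100.0),
    "C": (36.96, 50.00, 150.0),
}

gains = {name: measured_gain(b, a, sd) for name, (b, a, sd) in hospitals.items()}
for name, (before, after, sd) in hospitals.items():
    print(f"{name}: {after - before:.2f} points, SD {sd:.0f} -> {gains[name]:.1f} measure units")
```

Under these assumptions, A has the largest percentage gain and the smallest measured gain, while B and C report identical percentages but differ in measured gain.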

Now we have a fuller appreciation of the scope of the problems associated with the changing unit size illustrated in Part One. Though we think we understand percentages and insist on using them as something familiar and routine, the world that they present to us is as crazily distorted as a carnival funhouse. And we won’t even begin to consider how things look in the context of distributions skewed toward one end of the continuum or the other! There is similarly no point at all in going to bimodal or multimodal distributions (ones that have more than one peak). The vast majority of business applications employing scores, ratings, and percentages as measures do not take the underlying distribution into account. Given the problems that arise in optimal conditions (i.e., with a normal distribution), there is no need to belabor the issue with an enumeration of all the possible things that could be going wrong. Far better to simply move on and construct measurement systems that remain invariant across the different shapes of local data sets’ particular distributions.

How could we have gone so far in making these nonsensical numbers the focus of our attention? To put things back in perspective, we need to keep in mind the evolving magnitude of the problems we face. When Florence Nightingale was deploring the lack of any available indications of the effectiveness of her efforts, a little bit of flawed information was a significant improvement over no information. Ordinal, situation-specific numbers provided highly useful information when problems emerged in local contexts on a scale that could be comprehended and addressed by individuals and small groups.

We no longer live in that world. Today’s problems require kinds of information that must be more meaningful, precise, and actionable than ever before. And not only that, this information cannot remain accessible only to managers, executives, researchers, and data managers. It must be brought to bear in every transaction and information exchange in the industry.

Information has to be formatted in the common currency of uniform metrics to make it as fluid and empowering as possible. Would the auto industry have been able to bring off a quality revolution if every worker’s toolkit was calibrated in a different unit? Could we expect to coordinate schedules easily if we each had clocks scaled in different time units? Obviously not; why should we expect quality revolutions in health care and education when nearly all of our relevant metrics are incommensurable?

Management consultants realized decades ago that information creates a sense of responsibility in the person who possesses it. We cannot expect clinicians and teachers to take full responsibility for the outcomes they produce until they have the information they need to evaluate and improve them. Existing data and systems plainly are not up to the task.

The problem is far less a matter of complex or difficult issues than it is one of culture and priorities. It often takes less effort to remain in a dysfunctional rut and deal with massive inefficiencies than it does to get out of the rut and invent a new system with new potentials. Big changes tend to take place only when systems become so bogged down by their problems that new systems emerge simply out of the need to find some way to keep things in motion. These blogs are written in the hope that we might be able to find our way to new methods without suffering the catastrophes of total system failure. One might well imagine an entrepreneurially-minded consortium of providers, researchers, payors, accreditors, and patient advocates joining forces in small pilot projects testing out new experimental systems.

To know how much of something we’re getting for our money and whether it’s a fair bargain, we need to be able to compare amounts across providers, vendors, treatment options, teaching methods, etc. Scores summed from tests, surveys, or assessments, individual ratings, and percentages of a maximum possible score or frequency do not provide this information because they are not measures. Their unit sizes vary across individuals, collections of indicators (instruments), time, and space. The consequences of treating scores and percentages as measures are not trivial. We will eventually come to see that measurement quality is the primary source of the differences between the current health care and education systems’ regional variations and endlessly spiralling costs, on the one hand, and the geographically uniform quality, costs, and improvements in the systems we will create in the future, on the other.

Markets are dysfunctional when quality and costs cannot be evaluated in common terms by consumers, providers’ quality improvement specialists, researchers, accreditors, and payers. There are widespread calls for greater transparency in purchasing decisions, but transparency is not being defined and operationalized meaningfully or usefully. As currently employed, transparency refers to making key data available for public scrutiny. But these data are almost always expressed as scores, ratings, or percentages that are anything but transparent. In addition to not adding up, these data are also usually presented in indigestibly large volumes, and are not quality assessed.

All things considered, we’re doing amazingly well with our health care and education systems given the way we’ve hobbled ourselves with dysfunctional, incommensurable measures. And that gives us real cause for hope! What will we be able to accomplish when we really put our minds to measuring what we want to manage? How much better will we be able to do when entrepreneurs have the tools they need to innovate new efficiencies? Who knows what we’ll be capable of when we have meaningful measures that stand for amounts that really add up, when data volumes are dramatically reduced to manageable levels, and when data quality is effectively assessed and improved?

For more on the problems associated with these kinds of percentages in the context of NCLB, see Andrew Dean Ho’s article in the August/September, 2008 issue of Educational Researcher, and Charles Murray’s “By the Numbers” column in the July 25, 2006 Wall Street Journal.

This is not the end of the story as to what the new measurement paradigm brings to bear. Next, I’ll post a table contrasting the features of scores, ratings, and percentages with those of measures. Until then, check out the latest issue of the Journal of Applied Measurement at http://www.jampress.org, see what’s new in measurement software at http://www.winsteps.com or http://www.rummlab.com.au, or look into what’s up in the way of measurement research projects with the BEAR group at UC Berkeley (http://gse.berkeley.edu/research/BEAR/research.html).

Finally, keep in mind that we are what we measure. It’s time we measured what we want to be.

Infrastructure and Health Care Reform

June 25, 2009

As an educator and researcher involved in the theory and application of advanced measurement methods, I am both encouraged by the (June 14) New York Times Sunday magazine’s focus on infrastructure, and chagrined at the uninformed level at which ongoing health care and economic reform discussions and analyses are taking place (as evident in the Sunday, June 21, Times editorial and business pages).

Socialistic solutions to problems in education, health care, and the economy at large are the inevitable outcome of our incomplete implementation and understanding of market capitalism. Take, for instance, the rancorous debate as to whether we should create a new public health insurance plan to compete with private plans. None of the proposals or counter proposals amount to anything more than alternate ways of manhandling health care resources toward one or another politically predetermined end. Accordingly, we find ourselves in the dilemma of choosing between equally real dangers. On the one hand, reduced payments and cost-cutting might do nothing but lower the quality and quantity of the available services, and, on the other hand, maintaining quality and quantity will eventually make health care completely unaffordable.

And here is what really gets me: apart from blind faith in the power of reduced payments to promote innovation, there is nary a word about how to set up a market infrastructure that will allow the invisible hand to do its work in bringing supply and demand efficiently into balance. Far from seeking ways in which costs can be reduced and profits enhanced at the same time, as they are in other industries, the automatic assumption in health care always seems to be that lower costs mean lower profits. We have always thought socialistically about health care, with economists, since Arrow, widely holding that health care is constitutionally incapable of sustaining a market economy. Hope that the economists are wrong appears to spring eternal, but who is doing the work to find a new way?

A new direction shows itself when we listen more closely to ourselves, and follow through on our basically valid intuitions. For instance, issues of sustainability, justice, and responsibility in the economic conversation employ the word “capital” to refer to a wide variety of resources essential to productivity, such as health, literacy, numeracy, community, and the air, water, and food services provided by nature.

The problem is that there seems to be little or no interest in figuring out how to transform this usage from an empty metaphor into a powerful tool. We similarly repeat ad nauseam the mantra, “you manage what you measure,” but almost nothing is being done to employ the highly advantageous features of advanced measurement theory and practice in the management of intangible forms of capital.

Better measurement of living capital is, however, absolutely essential to health care reform, entrepreneurial innovations in education, and to reinventing capitalism.  Instead of continuing to rely on highly variable local efforts at measuring and managing human, social, and natural capital, we need a broad program of capacity building focused on a metrological infrastructure of living capital, and its implementations.  If there is any one single blind spot that prevents us from fully learning the lessons of our recent economic disasters, it is the potential that new measurement technologies offer for reduced frictions and lower transaction costs in the intangible capital markets.

We know where to start, from two basic principles of market economics. First, we know that transaction costs are the most important costs in any market. High transaction costs can strangle a market as the flow of capital is stifled. Second, we know that innovation, essential to product development, improvements, marketing, and enhanced profitability, is almost never accomplished by an individual working in isolation. Innovation requires an environment in which it is safe to play, to make mistakes, and through which new value can be immediately and decisively recognized for what it is.

How can living capital market frictions be reduced? For starters, we could focus on effecting order-of-magnitude improvements in the meaningfulness of the metrics we use for screening, diagnosis, research, and accountability. We can do whatever arithmetic we want with the numbers we have at hand, but most of the numbers that pass for measures of health, functionality, quality of life and care, etc. do not actually stand for something that adds up. The good news is that, again, the intuitions informing our efforts so far are largely valid, and have the ball rolling in the right direction.

How can better measurement advance the cause of innovation in health care? By providing a common language that all stakeholders can think and act in together, harmoniously. Research over the last 80 years has repeatedly proven the viability of a kind of metric system for the things we measure with surveys, assessments, and tests. Such a system of universally uniform metrics would provide the common currency unifying the health care economy and establishing the basis for market self-organization. But contrary to our predominant metaphysical faith, scientifically proven results do not magically propagate themselves into the world. We have to invent and construct the systems we need.

Our efforts in this direction are stymied, as Tom Vanderbilt put it in the Times Sunday magazine on infrastructure, to the extent that we have “an inimical incuriosity” about the banal fundamentals of the systems that shape our world. We simply take dry technicalities for granted, and notice them only when they fail us. Our problem with intangibles measurement, then, is compounded by the fact that the infrastructure we are taking for granted is not just invisible or broken, it is nonexistent. Until we make the effort to build our capacity for managing health and other forms of living capital by creating reference standard common currencies for expressing, managing, and trading on their value, all of our efforts at health care reform–and at reinventing capitalism–will fall far short of what is possible.
William P. Fisher, Jr., Ph.D.
william@livingcapitalmetrics.com
http://www.LivingCapitalMetrics.com

We are what we measure.
It’s time we measured what we want to be.

How Bad Measurement Stymies Health Care Reform Efforts

June 9, 2009

or

The Strange Absence of Measurement Awareness in the Debate over Health Care Reform

It is not as though measurement never comes up as a topic when advocates of one or another approach to health care reform have their say. But awareness of what measurement can and should do is strangely absent. No one at all speaks to what is most important about measurement, and how the essentials are missing in what passes for measurement in health care. And I’m just talking basics here. We can save for another time some of the especially relevant capacities and features of metric technologies as they have evolved over the last 80 years, and as they have been in use for over 30 years.

To live up to the full meaning of the term, measures have to do some very specific things. To keep things simple, all we need to do is consider how we use measures in something as everyday as shopping in the grocery store. The first thing we expect from measures is numbers that stand for something that adds up the way they do. The second thing measures have to do is to stay the same no matter where we go.

Currently popular methods of measurement in health care do not meet either of these expectations. Ratings from surveys and assessments, counts of events, and percentages of the time that something happens are natural and intuitive places from which to begin measurement, but these numbers do not and cannot live up to our expectations as to how measures behave. To look and act like real measures, these kinds of raw data must be evaluated and transformed in specific ways, using widely available and mathematically rigorous methodologies.

None of this is news to researchers. The scientific literature is full of reports on the theory and practice of advanced measurement. The philosopher Charles Sanders Peirce described the mathematics of rigorous measurement 140 years ago. Louis Thurstone, an electrical engineer turned psychologist, took major steps toward a practical science of rigorous measurement in the 1920s. Health care admissions, graduation, and professional licensure and certification examinations have employed advanced measurement since the 1970s. There are a great many advantages that would be gained if the technologies used in health care’s own educational measurement systems were applied within health care itself.

Though we rarely stop to think about it, we all know that fair measures are essential to efficient markets. When different instruments measure in different units, market transactions are encumbered by the additional steps that must be taken to determine the value of what is being bought and sold. Health care is now so hobbled by its myriad varieties of measures that common product definitions seem beyond reach.

And we have lately been alerted to the way in which innovation is more often a product of a collective cognitive effort than it is of any one individual’s effort. For the wisdom of crowds to reach a critical mass at which creativity and originality take hold, we must have in place a common currency for the exchange of value, i.e., a universal, uniform metric calibrated so as to be traceable to a reference standard shared by all.

Since the publication of a seminal paper by Kenneth Arrow in the early 1960s, many economists have taken it for granted that health care is one industry in which common product definitions are impossible. The success of advanced measurement applications in health care research over the last 30 years contradicts that assumption.

It’s already been 14 years since I myself published a paper equating two different instruments for assessing physical functioning in physical medicine and rehabilitation. Two years later I published another paper showing that 10 different published articles reporting calibrations of four different functional assessments all showed the same calibration results for seven or eight similar items included on each instrument. What many will find surprising about this research is that consensus on the results was obtained across different samples of patients seen by different providers and rated by different clinicians on different instruments. What we have in this research is a basis for a generalized functional assessment metric.

Simply put, in that research, I showed how our two basic grocery store assumptions about measurement could be realized in the context of ratings assigned by clinicians to patients’ performances of basic physical activities and mobility skills. With measures that really add up and are as universally available as the measures we take for granted in the grocery store, we could have a system in which health care purchasers and consumers can make more informed decisions about the relationship between price and value. With such a system, quality improvement efforts could be coordinated at the point of care, on the basis of observations expressed in a familiar language.

Some years ago, quality improvement researchers asked why no health care provider had yet risen to the challenge and redefined the industry relative to quality standards, in the manner that Toyota did for the automobile industry. There have, in fact, been many who tried, both before and since that question was asked.

Health care providers have failed in their efforts to emulate Toyota in large part because the numbers taken for measures in health care are not calibrated and maintained the way the automobile industry’s metrics are. It is ironic that something as important as measurement, something that receives so much lip service, should nonetheless be so widely skipped over and taken for granted.

What we need is a joint effort on the part of the National Institutes of Health and the National Institute of Standards and Technology focused on the calibration and maintenance of the metrics health care must have to get costs under control. We need to put our money and resources where our mouths are. We will be very glad we did when we see the kinds of returns on investment (40%-400% and more) that NIST reports for metrological improvement studies in other industries.

And Here It Is: The Next Major Technological Breakthrough

May 29, 2009

How It Will Transform Your Business and Your Life

We’ve all witnessed an amazing series of events in our lifetimes, and, hopefully, we’ve learned some important lessons over the years. In business, we’ve come to see that innovation is rarely the work of one person. When the crowd has the right tools and puts its mind to the task, nothing can stop it. We’re accordingly also learning the real truth of the fact that any firm’s greatest resource is its people—there is no more effective source of new efficiencies and whole new directions. Concern for social responsibility is no longer the exclusive domain of activists, since everyone is now attuned to the susceptibility of markets to unrestrained greed. And there are increasingly good reasons for thinking that perhaps we can reverse ongoing major environmental debacles and orient our systems to profits that are sustainable over the long term.

And in our personal lives, we’ve learned the vital importance of access to learning opportunities across the lifespan, access to health care, and caring relationships. Whether we call it spiritual or not, life is hardly worth living without a sense of wonder at the very existence of the universe and all the strange things inhabiting it.

We’ve learned a few things, then. Perhaps foremost among them is that we are going to have to adapt to the changes we ourselves bring about. And given the pace of change and the plain need to do better, we don’t hear anyone repeating Lord Kelvin’s famous opinion, from the end of the nineteenth century, that pretty much everything that can be discovered has been discovered. (Though isn’t there someone at Microsoft who could top the classics “No one will ever need more than 640k memory—or more than one browser tab”?) With everything that’s happened in the 100 years or so since Kelvin’s remark, one of the big lessons that has been learned is a certain humility, at least in that regard.

Change is in the air, that’s for sure, even if it doesn’t seem that there is any one particular form of it. But in fact there is an important new technology coming on line. It isn’t really new. Viewed narrowly, it has been taking shape for over 80 years, even though its root mathematical principles go back to Plato (like so many do). And, at least in retrospect, this new technology’s major features may seem very humdrum and mundane, because they are so everyday.

So just what is going on? Speaking in Abu Dhabi on Monday, May 25, Nobel economist Paul Krugman suggested that economic recovery could come about in the wake of a new major technological breakthrough, one of the size and scope of the IT revolution of the 1990s. Other factors cited by Krugman as candidates for turning things around included more investment by major corporations, and new climate change regulations and policies.

Industry-wide systems of metrological reference standards for human, social, and natural capital fit the bill. They are a new technological breakthrough on the scale of the initial IT revolution. They would also be a natural outgrowth of existing IT systems and an extension of existing global trade standards. Such systems would also require large investments from major corporations, and would facilitate highly significant moves on climate change.

In addition, stepping beyond the solutions suggested by Krugman, systematic and objective methods of measuring living capital would help meet the widely recognized need for socially responsible and sustainable business practices. Better measurement will play a vital role in reducing transaction costs and making human, social, and natural capital markets more efficient. It will also be essential to fostering new forms of innovation, as the shared standards and common product definitions made possible by advanced measurement systems enable people to think and act together collectively in common languages.

Striking advances have been made in measurement practice in recent years. It is easy to assume that the assignment of numbers to observations suffices as measurement, and that there have been no developments worthy of note in measurement theory or practice for decades. Nothing could be further from the truth. You don’t know the first thing about what you don’t know about measurement.

I came into the study and use of mathematically rigorous measurement and instrument calibration methods from the history and philosophy of science. The principles that make rulers, weight scales, clocks, and thermometers as meaningful, convenient and practical as they are, and that drive engineering practices in high tech, for instance, are pretty well understood. What’s more, those principles have been successfully applied to tests, rating scales, and assessments for decades, primarily in high stakes graduation, admissions, and certification/licensure testing. Increasingly these principles are finding their way into health care and business.

The general public doesn’t know much about all of this because the math is pretty intense, the software is hard to use, and we have an ingrained cultural prejudice that says all we have to do is come up with numbers of some kind, and–voila!–we have measurement. Nothing could be further from the truth.

My goal in all of this is to figure out how to put tools that work in the hands of the people who need them. You don’t need a PhD in thermodynamics to read a thermometer, so we ought to be able to calibrate similar instruments for other things we want to measure. And the way transparency and accountability demands are converging with economics and technology, I think the time is ripe for new ideas properly presented.

A quick way to see the point is to recognize that fair and just measures have to represent something that adds up the way the numbers do. Numbers don’t just automatically do that. We invest huge resources in crafting good instruments in the natural sciences, but we assume anyone at all can put together a measure using counts of right answers or sums of ratings or percents of the time some event occurs. But none of these are measures. Numbers certainly always add up in the same way, but whether they are meaningful or not is a question that is rarely asked. The numbers we often take as measures of outcomes or results or processes almost never stand for something that adds up the way everyone thinks they do.

So, yes, I know we need metrics that are manageable, understandable, and relevant. And I know how quickly people’s eyes glaze over in the face of what they think are irrelevant technicalities. But eyes also tend to glaze over when something unexpected and completely different is offered. True originality is not easily categorized or recognized for what it is. And when something is fundamentally different from what people are used to, it can be rejected just because it is more trouble to make the transition to a new system than it is to remain with the existing system, no matter how dysfunctional it is.

And boy is the current way of developing and deploying business metrics dysfunctional! Did you know that the difference between 1 percent and 2 percent can represent 4-8 times the difference between 49 percent and 50 percent? Did you know that sometimes a 15% difference can stand for as much as or even a lot more than a 39% difference? Did you know that three markedly different percentage values (differences that vary by more than a standard error or even five) might actually stand for the same measured amount?
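
The unevenness of percentage differences can be illustrated with a minimal sketch in Python. It assumes a plain log-odds (logit) transformation of each percentage, which is only one common way of linearizing proportions and is not necessarily the model behind the figures quoted above. The function names are hypothetical, chosen for illustration. The size of the multiple depends on the model and the data actually used; the sketch only shows the direction of the effect, namely that a one-point difference near the floor stands for far more than a one-point difference near the middle.

```python
import math

def logit(proportion):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(proportion / (1.0 - proportion))

def logit_gap(low, high):
    """Distance between two proportions on the log-odds scale."""
    return logit(high) - logit(low)

# Both pairs differ by exactly one percentage point.
gap_extreme = logit_gap(0.01, 0.02)   # 1 percent vs. 2 percent
gap_middle = logit_gap(0.49, 0.50)    # 49 percent vs. 50 percent

print(f"1% -> 2%   : {gap_extreme:.3f} logits")
print(f"49% -> 50% : {gap_middle:.3f} logits")
print(f"ratio      : {gap_extreme / gap_middle:.1f}")
```

Under this assumption, equal steps in percentage points are not equal steps in the quantity being measured, which is why raw percentages cannot be added and subtracted as if they were measures.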

In my 25 years of experience in measurement, I have found that people often do not understand what they think they understand. And they then also turn out to be amazed at what they learn when they take the trouble to put some time and care into crafting an instrument that really measures what they’re after.

For instance, did you know that there are mathematical ways of reducing data volume that not only involve no loss of information but that actually increase the amount of actionable value? Given the way we are swimming in seas of data that do not usually mean what we think they mean, being able to experimentally make sure things add up properly at the same time we reduce the volume of numbers we have to deal with seems to me to be an eminently practical aid to understanding and manageability.

Did you know that different sets of indicators or items can measure in a common metric? Or that a large bank of items can be adaptively administered, with the instrument individually tailored and customized for each respondent, organization, or situation, all without compromising the comparability of the measures?
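
How different item sets can measure in a common metric can be sketched in Python. The sketch assumes a dichotomous Rasch model with item difficulties that have already been calibrated on one shared scale. The item values, the person’s ability of 1.0, and all function names are hypothetical, invented for illustration rather than taken from any real instrument. The same person gets a very different percent correct on an easy item set and a hard one, yet both scores convert back to the same measure.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of success under a dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def expected_score(ability, difficulties):
    """Expected raw score on a set of calibrated items."""
    return sum(rasch_probability(ability, d) for d in difficulties)

def estimate_ability(raw_score, difficulties, tolerance=1e-9, max_iter=100):
    """Maximum-likelihood ability for a raw score, by Newton-Raphson."""
    n = len(difficulties)
    if not 0 < raw_score < n:
        raise ValueError("extreme scores have no finite estimate")
    # Start from the log-odds of the score, centered on the item set.
    ability = sum(difficulties) / n + math.log(raw_score / (n - raw_score))
    for _ in range(max_iter):
        probs = [rasch_probability(ability, d) for d in difficulties]
        residual = raw_score - sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        # Limit each step to one logit to keep the iteration stable.
        step = max(-1.0, min(1.0, residual / information))
        ability += step
        if abs(step) < tolerance:
            break
    return ability

# Hypothetical item difficulties (in logits) on one shared scale.
easy_items = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_items = [0.5, 1.0, 1.5, 2.0, 2.5]

true_ability = 1.0  # hypothetical person measure
easy_score = expected_score(true_ability, easy_items)
hard_score = expected_score(true_ability, hard_items)

easy_estimate = estimate_ability(easy_score, easy_items)
hard_estimate = estimate_ability(hard_score, hard_items)

print(f"easy set: {100 * easy_score / len(easy_items):.0f}% correct -> {easy_estimate:.3f} logits")
print(f"hard set: {100 * hard_score / len(hard_items):.0f}% correct -> {hard_estimate:.3f} logits")
```

Adaptive administration rests on the same idea: as long as every item is calibrated to the shared scale, each respondent can be given a different selection of items and the resulting measures remain comparable. A real application would also need to handle integer raw scores, measurement error, and checks of data fit, none of which this sketch attempts.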

These are highly practical things to be able to do. Markets live and die on shared product definitions and shared metrics. Innovation almost never happens as a result of one person’s efforts; it is almost always a result of activities coordinated through a network structured by a common language of reference standards. We are very far from having the markets and levels of innovation we need in large part because the quality of measurement in so many business applications is so poor. But that is going to change in very short order as those most banal of subjects, measurement and metrological systems, catch fire.