Archive for the ‘pay-for-performance’ Category

Subjectivity, Objectivity, Performance Measurement and Markets

April 23, 2011

Though he attributes his insight to a colleague (George Baker), Michael Jensen has once more succinctly stated a key point I’ve repeatedly tried to convey in my blog posts. As Jensen (2003, p. 397) puts it,

…any activity whose performance can be perfectly measured objectively does not belong inside the firm. If its performance can be adequately measured objectively it can be spun out of the firm and contracted for in a market transaction.

YES!! Though nothing is measured perfectly, my message has been a series of variations on precisely this theme. Well-measured property, services, products, and commodities in today’s economy are associated with scientific, legal and financial structures and processes that endow certain representations with meaningful indications of kind, amount, value and ownership. It is further well established that the ownership of the products of one’s creative endeavors is essential to economic advancement and the enlargement of the greater good. Markets could not exist without objective measures, and thus we have the central commercial importance of metric standards.

The improved measurement of service outcomes and performances is going to create an environment capable of supporting similar legal and financial indications of value and ownership. Many of the causes of today’s economic crises can be traced to poor quality information and inadequate measures of human, social, and natural value. Bringing publicly verifiable scientific data and methods to bear on the tuning of instruments for measuring these forms of value will make their harmonization much simpler than it ever could be otherwise. Social and environmental costs and value have been relegated to the marginal status of externalities because they have not been measured in ways that made it possible to bring them onto the books and into the models.

But the stage is being set for significant changes. Decades of research calibrating objective measures of a wide variety of performances and outcomes are inexorably leading to the creation of an intangible assets metric system (Fisher, 2009a, 2009b, 2011). Meaningful and rigorous individual-level universally available uniform metrics for each significant intangible asset (abilities, health, trustworthiness, etc.) will

(a) make it possible for each of us to take full possession, ownership, and management control of our investments in and returns from these forms of capital,

(b) coordinate the decisions and behaviors of consumers, researchers, and quality improvement specialists to better match supply and demand, and thereby

(c) increase the efficiency of human, social, and natural capital markets, harnessing the profit motive for the removal of wasted human potential, lost community coherence, and destroyed environmental quality.

Jensen’s observation emerges in his analysis of performance measures as one of three factors in defining the incentives and payoffs for a linear compensation plan (the other two being the intercept and the slope of the bonus line relating salary and bonus to the performance measure targets). The two sentences quoted above occur in this broader context, where Jensen (2003, pp. 396-397) states that,

…we must decide how much subjectivity will be involved in each performance measure. In considering this we must recognize that every performance measurement system in a firm must involve an important amount of subjectivity. The reason, as my colleague George Baker has pointed out, is that any activity whose performance can be perfectly measured objectively does not belong inside the firm. If its performance can be adequately measured objectively it can be spun out of the firm and contracted for in a market transaction. Thus, one of the most important jobs of managers, complementing objective measures of performance with managerial subjective evaluation of subtle interdependencies and other factors is exactly what most managers would like to avoid. Indeed, it is this factor along with efficient risk bearing that is at the heart of what gives managers and firms an advantage over markets.

Jensen is here referring implicitly to the point Coase (1990) makes regarding the nature of the firm. A firm can be seen as a specialized market, one in which methods, insights, and systems not generally available elsewhere are employed for competitive advantage. Products are brought to market competitively by being endowed with value not otherwise available. Maximizing that value is essential to the viability of the firm.

Given conflicting incentives and the mixed messages of the balanced scorecard, managers have plenty of opportunities for creatively avoiding the difficult task of maximizing the value of the firm. Jensen (2001) shows that attending to the “managerial subjective evaluation of subtle interdependencies” is made impossibly complex when decisions and behaviors are pulled in different directions by each stakeholder’s particular interests. Other research shows that even traditional capital structures are plagued by the mismeasurement of leverage, distress costs, tax shields, and the speed with which individual firms adjust their capital needs relative to leverage targets (Graham & Leary, 2010). The objective measurement of intangible assets surely seems impossibly complex to those familiar with these problems.

But perhaps the problems associated with measuring traditional capital structures are not so different from those encountered in the domain of intangible assets. In both cases, a particular kind of unjustified self-assurance seems always to attend the mere availability of numeric data. To the unpracticed eye, numbers seem to always behave the same way, no matter if they are rigorous measures of physical commodities, like kilowatts, barrels, or bushels, or if they are currency units in an accounting spreadsheet, or if they are percentages of agreeable responses to a survey question. The problem is that, when interrogated in particular ways with respect to the question of how much of something is supposedly measured, these different kinds of numbers give quite markedly different kinds of answers.

The challenge we face is one of determining what kind of answers we want to the questions we have to ask. Presumably, we want to ask questions and get answers pertinent to obtaining the information we need to manage life creatively, meaningfully, effectively and efficiently. It may be useful then, as a kind of thought experiment, to make a bold leap and imagine a scenario in which relevant questions are answered with integrity, accountability, and transparency.

What will happen when the specialized expertise of human resource professionals is supplanted by a market in which meaningful and comparable measures of the hireability, retainability, productivity, and promotability of every candidate and employee are readily available? If Baker and Jensen have it right, perhaps firms will no longer have employees. This is not to say that no one will work for pay. Instead, firms will contract with individual workers at going market rates, and workers will undoubtedly be well aware of the market value of their available shares of their intangible assets.

A similar consequence follows for the social safety net and a host of other control, regulatory, and policing mechanisms. But we will no longer be stuck with blind faith in the invisible hand and market efficiency, following the faith of those willing to place their trust and their futures in the hands of mechanisms they only vaguely understand and cannot control. Instead, aggregate effects on individuals, communities, and the environment will be tracked in publicly available and critically examined measures, just as stocks, bonds, and commodities are tracked now.

Previous posts in this blog explore the economic possibilities that follow from having empirically substantiated, theoretically predictable, and instrumentally mediated measures embodying broad consensus standards. What we will have for human, social, and natural capital will be the same kind of objective measures that have made markets work as well as they have thus far. It will be a whole new ball game when profits become tied to human, social, and environmental outcomes.


Coase, R. (1990). The firm, the market, and the law. Chicago: University of Chicago Press.

Fisher, W. P., Jr. (2009a, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009b). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (Tech. Rep. No. New Orleans:

Fisher, W. P., Jr. (2010, 22 November). Meaningfulness, measurement, value seeking, and the corporate objective function: An introduction to new possibilities. Available at

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), in press.

Graham, J. R., & Leary, M. T. (2010, 21 December). A review of empirical capital structure research and directions for the future. Available at

Jensen, M. C. (2001, Fall). Value maximization, stakeholder theory, and the corporate objective function. Journal of Applied Corporate Finance, 14(3), 8-21.

Jensen, M. C. (2003). Paying people to lie: The truth about the budgeting process. European Financial Management, 9(3), 379-406.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at
Permissions beyond the scope of this license may be available at

Assignment from Wired’s Predict What’s Next page: “Imagine the Future of Medical Bills”

March 20, 2010

William P. Fisher, Jr.
New Orleans, Louisiana
20 March 2010

Consider the following, formulated in response to Wired magazine’s 18.04 request for ideas on the future of medical bills, for possible use on the Predict What’s Next page. For background on the concepts presented here, see previous posts in this blog, such as

Visualize an online image of a Maieutic Renaissance Bank’s Monthly Living Capital Stock, Investment, and Income Report. The report is shown projected as a vertical plane in the space above an antique desk. Credits and debits to and from Mary Smith’s health capital account are listed, along with similar information on all of her capital accounts. Lying on the desk is a personalized MRB Living Capital Credit/Debit card, evidently somehow projecting the report from the eyes of Mary’s holographic image on it.

The report shows headings and entries for Mary Smith’s various capital accounts:

  • liquid (cash, checking and savings),
  • property (home, car, boat, rental, investments, etc.),
  • social capital (trust, honesty, commitment, loyalty, community building, etc.) credits/debits:
    • personal,
    • community’s,
    • employer’s,
    • regional,
    • national;
  • human capital:
    • literacy credits (shown in Lexiles),
    • numeracy credits (shown in Quantiles),
    • occupational credits (hireability, promotability, retainability, productivity),
    • health credits/debits (genetic, cognitive reasoning, physical function, emotional function, chronic disease management status, etc.); and
  • natural capital:
    • carbon credits/debits,
    • local and global air, water, ecological diversity, and environmental quality share values.

Example social capital credits/debits shown in the report might include volunteering to build houses in N’Awlins’ Ninth Ward, tutoring fifth-graders in math, jury duty, voting, writing letters to Congress, or charitable donations (credits), on the one hand, or library fines, a parking ticket, unmaintained property, etc. (debits), on the other.

Natural capital credits might be increased by new efficiencies obtained in the electrical grid or in power generation, or by a newly installed solar panel, and decreased by a recent major industrial accident, environmental disaster, hurricane, etc.

Mary’s share of the current value of the overall Genuine National Product, or Happiness Index, is broken out by each major form of capital (liquid, property, social, human, natural).

The monetary values of credits are shown at the going market rates, alongside the changes from last month, last year, and three years ago.

One entry could be a deferred income and property tax amount, given a social capital investment level above a recommended minimum. Another entry would show new profit potentials expressed in proportions of investments wasted due to inefficiencies, with suggestions for how these can be reduced, and with time to investment recovery and amount of new social capital generated also indicated.

The health capital portion of the report is broken out in a magnified overlay. Mary’s physical and emotional function measures are shown by an arrow pointing at a level on a vertical ruler. Other arrows point at the average levels for people her age (globally, nationally, regionally, and locally), for women overall and for women of different ages, living in different countries/cities, etc.

Mary’s diabetes-specific chronic disease management metric is shown at a high level, indicating her success in using diet and exercise to control her condition. Her life expectancy and lifetime earning potentials are shown, alongside comparable values for others.

Recent clinical visits for preventative diabetes and dental care would be shown as debits against one account and as an investment in her health capital account. The debits might be paid out of a sale of shares of stock from her quite high social or natural capital accounts, or from credits transferred from those to her checking account.

The cost of declining function in the next ten years, given typical aging patterns, is shown as lower rates of new capital investment in her stock and lower ROIs.

The cost of maintaining or improving function, in terms of required investments of time and resources in exercise, equipment, etc., is balanced against a constant rate of new investments and ROI.

Also shown:

A footnote could read: Given your recent completion of post-baccalaureate courses in political economy and advanced living capital finance, your increased stocks of literacy, numeracy, and occupational capital qualify you for a promotion or new positions currently compensated at annual rates 17.7% higher than your current one. Watch for tweets and beams from new investors interested in your rising stock!

A warning box: We all pay when dead capital lies unleveragable in currencies expressed in ordinal or otherwise nonstandard metrics! Visit today to convert your unaccredited capital currencies into recognized value. (Not responsible for fraudulent misrepresentations of value should your credits prove incommensurable or counterfeit. Always check your vendor’s social capital valuations before investing in any stock offering. Go to for accredited capital metrics equating information, courses, texts, and consultants.)

Ad: Click here to put your occupational capital stock on the market now! Employers are bidding $$$, ¥¥¥ and €€€ on others at your valuation level!

Ad: You are only 110 Lexiles away from a literacy capital stock level on which others receive 23% higher investment returns! Enroll at now for your increased income tomorrow! (Past performance is not a guarantee of future results. Your returns may vary. Click here to see Bob’s current social capital valuations.)

Bottom line: Think global, act local! It is up to you to represent your shares in the global marketplace. Only you can demand the improvements you seek by shifting and/or intensifying your investments. Do so whenever you are dissatisfied with your own, your global and local business partners’, your community’s, your employer’s, your region’s, or your nation’s stock valuations.

For background on the concepts involved in this scenario, see:

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854.

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-179.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (Tech. Rep. No. New Orleans:

Fisher, W. P., Jr. (2010). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 11, in press.


Graphic Illustrations of Why Scores, Ratings, and Percentages Are Not Measures, Part Two

July 2, 2009

Part One of this two-part blog offered pictures illustrating the difference between numbers that stand for something that adds up and those that do not. The uncontrolled variation in the numbers that pass for measures in health care, education, satisfaction surveys, performance assessments, etc. is analogous to the variation in weights and measures found in Medieval European markets. It is well established that metric uniformity played a vital role in the industrial and scientific revolutions of the nineteenth century. Metrology will inevitably play a similarly central role in the economic and scientific revolutions taking place today.

Clients and students often express their need for measures that are manageable, understandable, and relevant. But sometimes it turns out that we do not understand what we think we understand. New understandings can make what previously seemed manageable and relevant appear unmanageable and irrelevant. Perhaps our misunderstandings about measurement will one day explain why we have failed to innovate and improve as much as we could have.

Of course, there are statistical methods for standardizing scores and proportions that make them comparable across different normal distributions, but I’ve never once seen them applied to employee, customer, or patient survey results reported to business or hospital managers. They certainly are not used in determining comparable proficiency levels of students under No Child Left Behind. Perhaps there are consultants and reporting systems that make standardized z-scores a routine part of their practices, but even if they are, why should anyone willingly base their decisions on the assumption that normal distributions have been obtained? Why not use methods that give the same result no matter how scores are distributed?

To bring the point home, if statistical standardization is a form of measurement, why don’t we use the z-scores for height distributions instead of the direct measures of how tall we each are? Plainly, the two kinds of numbers have different applications. Somehow, though, we try to make do without the measures in many applications involving tests and surveys, with the unfortunate consequence of much lost information and many lost opportunities for better communication.
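The contrast between z-scores and direct measures can be made concrete with a small sketch. The heights and group memberships below are made up purely for illustration, and the helper name `z_score` is hypothetical. The point is only that the same person, with the same measured height, receives a different standardized score depending on which group happens to supply the mean and standard deviation.

```python
from statistics import mean, pstdev

def z_score(value, group):
    """Standardize a value against the mean and SD of a particular group."""
    return (value - mean(group)) / pstdev(group)

# Hypothetical data: one person's height and two comparison groups (cm).
height_cm = 180
group_one = [160, 165, 170, 175, 180]
group_two = [175, 180, 185, 190, 195]

z_in_group_one = z_score(height_cm, group_one)
z_in_group_two = z_score(height_cm, group_two)

# The measure (180 cm) is the same in both cases; only the z-score moves.
print(f"Height: {height_cm} cm")
print(f"z-score relative to group one: {z_in_group_one:+.2f}")
print(f"z-score relative to group two: {z_in_group_two:+.2f}")
```

In this made-up case the person is above average in one group and below average in the other, though nothing about the person has changed. The centimeter value carries its meaning from one group to the next; the z-score does not.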

Sometimes I wonder: if we gave managers, executives, and entrepreneurs a test on the meaning of the scores, percentages, and logits discussed in Part One, would many do any better on the parts they think they understand than on the parts they find unfamiliar? I suspect not. Some executives whose pay-for-performance bonuses are inflated by statistical accidents are going to be unhappy with what I’m going to say here, but, as I’ve been saying for years, clarifying financial implications will go a long way toward motivating the needed changes.

How could that be true? Well, consider the way we treat percentages. Imagine that three different hospitals see their patients’ percent agreement with a key survey item change as follows. Which one changed the most?


A. from 30.85% to 50.00%: a 19.15% change

B. from 6.68% to 15.87%: a 9.18% change

C. from 69.15% to 84.13%: a 14.99% change

As is illustrated in Figure 1 below, given that all three pairs of administrations of the survey are included together in the same measure distribution, it is likely that the three changes were all the same size.

In this scenario, all the survey administrations shared the same standard deviation in the underlying measure distribution that the key item’s percentage was drawn from, and they started from different initial measures. Different ranges in the measures are associated with different parts of the sample’s distribution, and so different numbers and percentages of patients are associated with the same amount of measured change. It is easy to see that 100-unit measured gains in the range of 50-150 or 1000-1100 on the horizontal axis would scarcely amount to 1% changes, but the same measured gain in the middle of the distribution could be as much as 25%.


Figure 1. Different Percentages, Same Measures
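The arithmetic behind the first scenario can be checked with a short sketch. It assumes, as the scenario does, that each percentage is the area under a normal curve below a hospital's position on the measure scale, and it uses Python's standard-library `NormalDist` to convert each percentage back into a position expressed in standard deviation units. The function name `percent_to_measure` is illustrative only.

```python
from statistics import NormalDist

# Assumption: percent agreement is the area under a standard normal curve
# below a hospital's position on the measure scale (in SD units).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def percent_to_measure(percent):
    """Convert a percentage (0-100) to an SD-unit position via the inverse normal CDF."""
    return STANDARD_NORMAL.inv_cdf(percent / 100.0)

# The three hospitals from the first example: (before %, after %).
hospitals = {
    "A": (30.85, 50.00),
    "B": (6.68, 15.87),
    "C": (69.15, 84.13),
}

measured_gains = {}
for name, (before, after) in hospitals.items():
    point_change = after - before
    gain = percent_to_measure(after) - percent_to_measure(before)
    measured_gains[name] = gain
    print(f"{name}: {point_change:5.2f} point change, {gain:.2f} SD measured gain")
```

Under this assumption, all three changes come out at about half a standard deviation on the measure scale, even though the percentage changes range from roughly 9 to roughly 19 points.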

Figure 1 shows how the same measured gain can look wildly different when expressed as a percentage, depending on where the initial measure is positioned in the distribution. But what happens when percentage gains are situated in different distributions that have different patterns of variation?

More specifically, consider a situation in which three different hospitals see their percent agreement with a key survey item change as follows.

A. from 30.85% to 50.00%: a 19.15% change

B. from 30.85% to 50.00%: a 19.15% change

C. from 30.85% to 50.00%: a 19.15% change

Did one change more than the others? Of course, the three percentages are all the same, so we would naturally think that the three increases are all the same. But what if the standard deviations characterizing the three different hospitals’ score distributions are different?

Figure 2, below, shows that the three 19.15% changes could be associated with quite different measured gains. When the distribution is wider and the standard deviation is larger, any given percentage change will be associated with a larger measured change than in cases with narrower distributions and smaller standard deviations.


Figure 2. Same Percentage Gains, Different Measured Gains
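A minimal sketch of the second scenario follows. The mean of 500 and the three standard deviations (50, 100, and 200 units) are hypothetical values chosen only for illustration; they are not taken from the figure. The sketch again assumes normal distributions and uses the standard-library `NormalDist`.

```python
from statistics import NormalDist

def measured_gain(before_pct, after_pct, sd, mean=500.0):
    """Measured change implied by a percentage change in a normal distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(after_pct / 100.0) - dist.inv_cdf(before_pct / 100.0)

# Hypothetical standard deviations for the three hospitals' distributions.
standard_deviations = {"A": 50.0, "B": 100.0, "C": 200.0}

# Every hospital reports the same change: from 30.85% to 50.00%.
gains = {
    name: measured_gain(30.85, 50.00, sd)
    for name, sd in standard_deviations.items()
}

for name, gain in gains.items():
    sd = standard_deviations[name]
    print(f"{name}: SD = {sd:5.1f}, measured gain = {gain:6.1f} units")
```

With these assumed spreads, the identical 19.15-point change corresponds to measured gains of roughly 25, 50, and 100 units: the wider the distribution, the larger the measured change hiding behind the same percentage.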

And if this is not enough evidence as to the foolhardiness of treating percentages as measures, bear with me through one more example. Imagine another situation in which three different hospitals see their percent agreement with a key survey item change as follows.

A. from 30.85% to 50.00%: a 19.15% change

B. from 36.96% to 50.00%: a 13.04% change

C. from 36.96% to 50.00%: a 13.04% change

Did one change more than the others? Plainly A obtains the largest percentage gain. But Figure 3 shows that, depending on the underlying distribution, A’s 19.15% gain might be a smaller measured change than either B’s or C’s. Further, B’s and C’s measures might not be identical, contrary to what would be expected from the percentages alone.


Figure 3. Percentages Completely at Odds with Measures
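The third scenario can be sketched the same way. As before, the mean and the standard deviations assigned to hospitals A, B, and C are hypothetical, picked only to show one way the percentages and the measures could disagree. Normal distributions are assumed throughout.

```python
from statistics import NormalDist

def measured_gain(before_pct, after_pct, sd, mean=500.0):
    """Measured change implied by a percentage change in a normal distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(after_pct / 100.0) - dist.inv_cdf(before_pct / 100.0)

# (before %, after %, hypothetical SD) for each hospital.
scenarios = {
    "A": (30.85, 50.00, 50.0),
    "B": (36.96, 50.00, 100.0),
    "C": (36.96, 50.00, 200.0),
}

gains = {}
for name, (before, after, sd) in scenarios.items():
    gains[name] = measured_gain(before, after, sd)
    print(
        f"{name}: {after - before:5.2f} point change, "
        f"SD = {sd:5.1f}, measured gain = {gains[name]:5.1f} units"
    )
```

Under these assumed spreads, A's larger percentage gain corresponds to the smallest measured gain, and B and C differ from each other despite reporting identical percentages.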

Now we have a fuller appreciation of the scope of the problems associated with the changing unit size illustrated in Part One. Though we think we understand percentages and insist on using them as something familiar and routine, the world that they present to us is as crazily distorted as a carnival funhouse. And we won’t even begin to consider how things look in the context of distributions skewed toward one end of the continuum or the other! There is similarly no point at all in going to bimodal or multimodal distributions (ones that have more than one peak). The vast majority of business applications employing scores, ratings, and percentages as measures do not take the underlying distribution into account. Given the problems that arise in optimal conditions (i.e., with a normal distribution), there is no need to belabor the issue with an enumeration of all the possible things that could be going wrong. Far better to simply move on and construct measurement systems that remain invariant across the different shapes of local data sets’ particular distributions.

How could we have gone so far in making these nonsensical numbers the focus of our attention? To put things back in perspective, we need to keep in mind the evolving magnitude of the problems we face. When Florence Nightingale was deploring the lack of any available indications of the effectiveness of her efforts, a little bit of flawed information was a significant improvement over no information. Ordinal, situation-specific numbers provided highly useful information when problems emerged in local contexts on a scale that could be comprehended and addressed by individuals and small groups.

We no longer live in that world. Today’s problems require kinds of information that must be more meaningful, precise, and actionable than ever before. And not only that, this information cannot remain accessible only to managers, executives, researchers, and data managers. It must be brought to bear in every transaction and information exchange in the industry.

Information has to be formatted in the common currency of uniform metrics to make it as fluid and empowering as possible. Would the auto industry have been able to bring off a quality revolution if every worker’s toolkit was calibrated in a different unit? Could we expect to coordinate schedules easily if we each had clocks scaled in different time units? Obviously not; why should we expect quality revolutions in health care and education when nearly all of our relevant metrics are incommensurable?

Management consultants realized decades ago that information creates a sense of responsibility in the person who possesses it. We cannot expect clinicians and teachers to take full responsibility for the outcomes they produce until they have the information they need to evaluate and improve them. Existing data and systems plainly are not up to the task.

The problem is far less a matter of complex or difficult issues than it is one of culture and priorities. It often takes less effort to remain in a dysfunctional rut and deal with massive inefficiencies than it does to get out of the rut and invent a new system with new potentials. Big changes tend to take place only when systems become so bogged down by their problems that new systems emerge simply out of the need to find some way to keep things in motion. These blog posts are written in the hope that we might be able to find our way to new methods without suffering the catastrophes of total system failure. One might well imagine an entrepreneurially-minded consortium of providers, researchers, payers, accreditors, and patient advocates joining forces in small pilot projects testing out new experimental systems.

To know how much of something we’re getting for our money and whether it’s a fair bargain, we need to be able to compare amounts across providers, vendors, treatment options, teaching methods, etc. Scores summed from tests, surveys, or assessments, individual ratings, and percentages of a maximum possible score or frequency do not provide this information because they are not measures. Their unit sizes vary across individuals, collections of indicators (instruments), time, and space. The consequences of treating scores and percentages as measures are not trivial. We will eventually come to see that measurement quality is the primary source of the differences between the current health care and education systems’ regional variations and endlessly spiralling costs, on the one hand, and the geographically uniform quality, costs, and improvements in the systems we will create in the future, on the other.

Markets are dysfunctional when quality and costs cannot be evaluated in common terms by consumers, providers’ quality improvement specialists, researchers, accreditors, and payers. There are widespread calls for greater transparency in purchasing decisions, but transparency is not being defined and operationalized meaningfully or usefully. As currently employed, transparency refers to making key data available for public scrutiny. But these data are almost always expressed as scores, ratings, or percentages that are anything but transparent. In addition to not adding up, these data are also usually presented in indigestibly large volumes, and are not quality assessed.

All things considered, we’re doing amazingly well with our health care and education systems given the way we’ve hobbled ourselves with dysfunctional, incommensurable measures. And that gives us real cause for hope! What will we be able to accomplish when we really put our minds to measuring what we want to manage? How much better will we be able to do when entrepreneurs have the tools they need to innovate new efficiencies? Who knows what we’ll be capable of when we have meaningful measures that stand for amounts that really add up, when data volumes are dramatically reduced to manageable levels, and when data quality is effectively assessed and improved?

For more on the problems associated with these kinds of percentages in the context of NCLB, see Andrew Dean Ho’s article in the August/September, 2008 issue of Educational Researcher, and Charles Murray’s “By the Numbers” column in the July 25, 2006 Wall Street Journal.

This is not the end of the story as to what the new measurement paradigm brings to bear. Next, I’ll post a table contrasting the features of scores, ratings, and percentages with those of measures. Until then, check out the latest issue of the Journal of Applied Measurement at, see what’s new in measurement software at or, or look into what’s up in the way of measurement research projects with the BEAR group at UC Berkeley (

Finally, keep in mind that we are what we measure. It’s time we measured what we want to be.
