Reimagining Capitalism Again, Part III: Reflections on Greider’s “Bold Ideas” in The Nation

September 10, 2011

And so, The Nation’s “Bold Ideas for a New Economy” is disappointing for not doing more to start from the beginning identified by its own writer, William Greider. The soul of capitalism needs to be celebrated and nourished, if we are to make our economy “less destructive and domineering,” and “more focused on what people really need for fulfilling lives.” The only real alternative to celebrating and nourishing the soul of capitalism is to kill it, in the manner of the Soviet Union’s failed experiments in socialism and communism.

The article speaks the truth, though, when it says there is no point in trying to persuade the powers that be to make the needed changes. Republicans see the market as it exists as a one-size-fits-all economic panacea, when all it can accomplish in its current incomplete state is the continuing externalization of anything and everything important about human, social, and environmental decency. For their part, Democrats do indeed “insist that regulation will somehow fix whatever is broken,” in an ever-expanding socialistic micromanagement of every possible exception to the rules that emerges.

To date, the president’s efforts at a nonpartisan third way amount only to vacillations between these opposing poles. The leadership that is needed, however, is something else altogether. Yes, as The Nation article says, capitalism needs to be made to serve the interests of society, and this will require deep structural change, not just new policies. But none of the contributors of the “bold ideas” presented propose deep structural changes of a kind that actually gets at the soul of capitalism. All of the suggestions are ultimately just new policies tweaking superficial aspects of the economy in mechanical, static, and very limited ways.

The article calls for “Democratizing reforms that will compel business and finance to share decision-making and distribute rewards more fairly.” It says the vision has different names but “the essence is a fundamental redistribution of power and money.” But corporate distortions of liability law, the introduction of boardroom watchdogs, and a tax on financial speculation do not by any stretch of the imagination address the root causes of social and environmental irresponsibility in business. They “sound like obscure technical fixes” because that’s what they are. The same thing goes for low-cost lending from public banks, the double or triple bottom lines of Benefit Corporations, new anti-trust laws, calls for “open information” policies, added personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, new standards for sound investing, new measures of GDP, and government guarantees of full employment.

All of these proposals sound like what ought to be the effects and outcomes of efforts addressing the root causes of capitalism’s shortcomings. Instead, they are Band-Aids applied to scratched fingers and arms when multiple bypass surgery is called for. That is, what we need is to understand how to bring the spirit of capitalism to life in the new domains of human, social, and environmental interests, but what we’re getting is nothing but more of the same piecemeal ways of moving around the deck chairs on the Titanic.

There is some truth in the assertion that what really needs reinventing is our moral and spiritual imagination. As someone (Einstein or Edison?) is supposed to have put it, originality is simply a matter of having a source for an analogy no one else has considered. Ironically, the best model is often the one most taken for granted and nearest to hand. Such is the case with the two-sided scientific and economic effects of standardized units of measurement. The fundamental moral aspect here is nothing other than the Golden Rule, independently derived and offered in cultures throughout history, globally. Individualized social measurement is nothing if not a matter of determining whether others are being treated in the way you yourself would want to be treated.

And so, yes, to stress the major point of agreement with The Nation, “the new politics does not start in Washington.” Historically, at their best, governments work to keep pace with the social and technical innovations introduced by their peoples. Margaret Mead said it well a long time ago when she asserted that small groups of committed citizens are the only sources of real social change.

Not to be just one of many “advocates with bold imaginations” who wind up marginalized by the constraints of status quo politics, I claim my personal role in imagining a new economic future by tapping as deeply as I can into the positive, pre-existing structures needed for a transition into a new democratic capitalism. We learn through what we already know. Standards are well established as essential to commerce and innovation, but 90% of the capital under management in our economy—the human, social, and natural capital—lacks the standards needed for optimal market efficiency and effectiveness. An intangible assets metric system will be a vitally important way in which we extend what is right and good in the world today into new domains.

To conclude, what sets this proposal apart from those offered by The Nation and its readers hinges on our common agreement that “the most threatening challenge to capitalism is arguably the finite carrying capacity of the natural world.” The bold ideas proposed by The Nation’s readers respond to this challenge in ways that share an important feature in common: people have to understand the message and act on it. That fact dooms all of these ideas from the start. If we have to articulate and communicate a message that people then have to act on, we remain a part of the problem and not part of the solution.

As I argue in my “The Problem is the Problem” blog post of some months ago, this way of defining problems is itself the problem. That is, we can no longer think of ourselves as separate from the challenges we face. If we think we are not all implicated through and through as participants in the construction and maintenance of the problem, then we have not understood it. The bold ideas offered to date are all responses to the state of a broken system that seek to reform one or another element in the system when what we need is a whole new system.

What we need is a system that so fully embodies nature’s own ecological wisdom that the medium becomes the message. When the ground rules for economic success are put in place such that it is impossible to earn a profit without increasing stocks of human, social, and natural capital, there will be no need to spell out the details of a microregulatory structure of controlling new anti-trust laws, “open information” policies, personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, etc. What we need is precisely what Greider reported from Innovest in his book: reliable, high quality information that makes human, social, and environmental issues matter financially. Situated in a context like that described by Bernstein in his 2004 The Birth of Plenty, with the relevant property rights, rule of law, scientific rationality, capital markets, and communications networks in place, it will be impossible to stop a new economic expansion of historic proportions.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Reimagining Capitalism Again, Part I: Reflections on Greider’s Soul of Capitalism

September 10, 2011

In his 2003 book, The Soul of Capitalism, William Greider wrote, “If capitalism were someday found to have a soul, it would probably be located in the mystic qualities of capital itself” (p. 94). The recurring theme in the book is that the resolution of capitalism’s deep conflicts must grow out as organic changes from the roots of capitalism itself.

In the book, Greider quotes Innovest’s Michael Kiernan as suggesting that the goal has to be re-engineering the DNA of Wall Street (p. 119). He says the key to doing this is good reliable information that has heretofore been unavailable but which will make social and environmental issues matter financially. The underlying problems of exactly what solid, high quality information looks like, where it comes from, and how it is created are not stated or examined, but the point, as Kiernan says, is that “the markets are pretty good at punishing and rewarding.” The objective is to use “the financial markets as an engine of reform and positive change rather than destruction.”

This objective is, of course, the focus of multiple postings in this blog (see especially this one and this one). From my point of view, capitalism indeed does have a soul and it is actually located in the qualities of capital itself. Think about it: if a soul is a spirit of something that exists independent of its physical manifestation, then the soul of capitalism is the fungibility of capital. Now, this fungibility is complex and ambiguous. It takes its strength and practical value from the way market exchanges are represented in terms of currencies, monetary units that, within some limits, provide an objective basis of comparison useful for rewarding those capable of matching supply with demand.

But the fungibility of capital can also be dangerously misconceived when the rich complexity and diversity of human capital are unjustifiably reduced to labor, when the irreplaceable value of natural capital is unjustifiably reduced to land, and when the trust, loyalty, and commitment of social capital are completely ignored in financial accounting and economic models. As I’ve previously said in this blog, the concept of human capital is inherently immoral so far as it reduces real human beings to interchangeable parts in an economic machine.

So how could it ever be possible to justify any reduction of human, social, and natural value to a mere number? Isn’t this the ultimate in the despicable inhumanity of economic logic, corporate decision making, and, ultimately, the justification of greed? Many among us who profess liberal and progressive perspectives seem to have an automatic and reactionary prejudice of this kind. This makes these well-intentioned souls as much a part of the problem as those among us with sometimes just as well-intentioned perspectives that accept such reductionism as the price of entry into the game.

There is another way. Human, social, and natural value can be measured and made manageable in ways that do not necessitate totalizing reduction to a mere number. The problem is not reduction itself, but unjustified, totalizing reduction. Referring to all people as “man” or “men” is an unjustified reduction dangerous in the way it focuses attention only on males. The tendency to think and act in ways privileging males over females that is fostered by this sense of “man” shortchanges us all, and has happily been largely eliminated from discourse.

Making language more inclusive does not, however, mean that words lose the singular specificity they need to be able to refer to things in the world. Any given word represents an infinite population of possible members of a class of things, actions, and forms of life. Any simple sentence combining words into a coherent utterance then multiplies infinities upon infinities. Discourse inherently reduces multiplicities into texts of limited lengths.

Like any tool, reduction has its uses. Also like any tool, problems arise when the tool is allowed to occupy some hidden and unexamined blind spot from which it can dominate and control the way we think about everything. Critical thinking is most difficult in those instances in which the tools of thinking themselves need to be critically evaluated. To reject reduction uncritically as inherently unjustified is to throw the baby out with the bathwater. Indeed, it is impossible to formulate a statement of the rejection without simultaneously enacting exactly what is supposed to be rejected.

We have numerous ready-to-hand examples of how all reduction has been unjustifiably reduced to one homogenized evil. But one of the results of experiments in communal living in the 1960s and 1970s, as well as of the fall of the Soviet Union, was the realization that the centralized command and control of collectively owned community property cannot compete with the creativity engendered when individuals hold legal title to the fruits of their labors. If individuals cannot own the results of the investments they make, no one makes any investments.

In other words, if everything is owned collectively and is never reduced to individually possessed shares that can be creatively invested for profitable returns, then the system is structured so as to punish innovation and reward doing as little as possible. But there’s another way of thinking about the relation of the collective to the individual. The living soul of capitalism shows itself in the way high quality information makes it possible for markets to efficiently coordinate and align individual producers’ and consumers’ collective behaviors and decisions. What would happen if we could do that for human, social, and natural capital markets? What if “social capitalism” is more than an empty metaphor? What if capital institutions can be configured so that individual profit really does become the driver of socially responsible, sustainable economics?

And here we arrive at the crux of the problem. How do we create the high quality, solid information markets need to punish and reward relative to ethical and sustainable human, social, and environmental values? Well, what can we learn from the way we created that kind of information for property and manufactured capital? These are the questions taken up and explored in the postings in this blog, and in my scientific research publications and meeting presentations. In the near future, I’ll push my reflection on these questions further, and will explore some other possible answers to the questions offered by Greider and his readers in a recent issue of The Nation.

New Opportunities for Job Creation and Prosperity

August 17, 2011

What can be done to create jobs and revive the economy? There is no simple, easy answer to this question. Creating busywork is nonsense. We need fulfilling occupations that meet the world’s demand for products and services. It is not easy to see how meaningful work can be systematically created on a broad scale. New energy efficiencies may lead to the cultivation of significant job growth, but it may be unwise to put all of our eggs in this one basket.

So how are we to solve this puzzle? What other areas in the economy might be ripe for the introduction of a new technology capable of supporting a wave of new productivity, like computers did in the 1980s, or the Internet in the 1990s? In trying to answer this question, simplicity and elegance are key factors in keeping things at a practical level.

For instance, we know we accomplish more working together as a team than as disconnected individuals. New jobs, especially new kinds of jobs, will have to be created via innovation. Innovation in science and industry is a team sport. So the first order of business in teaming up for job creation is to know the rules of the game. The economic game is played according to the rules of law embodied in property rights, scientific rationality, capital markets, and transportation/communications networks (see William Bernstein’s 2004 book, The Birth of Plenty). When these conditions are met, as they were in Europe and North America at the beginning of the nineteenth century, the stage is set for long term innovation and growth on a broad scale.

The second order of business is to identify areas in the economy that lack one or more of these four conditions, and that could reasonably be expected to benefit from their introduction. Education, health care, social services, and environmental management come immediately to mind. These industries are plagued with seemingly interminable inflationary spirals, which, no doubt, are at least in part caused by the inability of investors to distinguish between high and low performers. Money cannot flow to and reward programs producing superior results in these industries because they lack common product definitions and comparable measures of their results.

The problems these industries are experiencing are not specific to each of them in particular. Rather, the problem is a general one applicable across all industries, not just these. Traditionally, economic thinking focuses on three main forms of capital: land, labor, and manufactured products (including everything from machines, roads, and buildings to food, clothing, and appliances). Cash and credit are often thought of as liquid capital, but their economic value stems entirely from the access they provide to land, labor, and manufactured products.

Economic activity is not really, however, restricted to these three forms of capital. Land is far more than a piece of ground. What are actually at stake are the earth’s regenerative ecosystems, with the resources and services they provide. And labor is far more than a pair of skilled hands; people bring a complex mix of abilities, motivations, and health to bear in their work. Finally, this scheme lacks an essential element: the trust, loyalty, and commitment required for even the smallest economic exchange to take place. Without social capital, all the other forms of capital (human, natural, and manufactured, including property) are worthless. Consistent, sustainable, and socially responsible economic growth requires that all four forms of capital be made accountable in financial spreadsheets and economic models.

The third order of business, then, is to ask if the four conditions laying out the rules for the economic game are met in each of the four capital domains. The table below suggests that all four conditions are fully met only for manufactured products. They are partially met for natural resources, such as minerals, timber, fisheries, etc., but not at all for nature’s air and water purification systems or broader genetic ecosystem services.

Table. Existing Conditions Relevant to Conceiving a New Birth of Plenty, by Capital Domains

                                           Human     Social    Natural   Manufactured
Property rights                            No        No        Partial   Yes
Scientific rationality                     Partial   Partial   Partial   Yes
Capital markets                            Partial   Partial   Partial   Yes
Transportation & communication networks    Partial   Partial   Partial   Yes

That is, no provisions exist for individual ownership of shares in the total available stock of air and water, or of forest, watershed, estuary, and other ecosystem service outcomes. Nor do any individuals have free and clear title to their most personal properties, the intangible abilities, motivations, health, and trust most essential to their economic productivity. Aggregate statistics are indeed commonly used to provide a basis for policy and research in human, social, and natural capital markets, but falsifiable models of individually applicable unit quantities are not widely applied. Scientifically rational measures of our individual stocks of intangible asset value will require extensive use of these falsifiable models in calibrating the relevant instrumentation.

Without such measures, we cannot know how many shares of stock in these forms of capital we own, or what they are worth in dollar terms. We lack these measures, even though decades have passed since researchers first established firm theoretical and practical foundations for them. And more importantly, even when scientifically rational individual measures can be obtained, they are never expressed in terms of a unit standardized for use within a given market’s communications network.
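The reading of the table above can be made explicit with a few lines of code. What follows is only a minimal sketch in Python: the ratings are copied from the table, while the names CONDITIONS, domains_fully_met, and gaps are hypothetical labels chosen for illustration and are not part of any existing tool.

```python
# Illustrative encoding of the table of conditions by capital domain.
# The ratings come from the table; all names here are hypothetical.
CONDITIONS = {
    "Property rights": {
        "Human": "No", "Social": "No", "Natural": "Partial", "Manufactured": "Yes",
    },
    "Scientific rationality": {
        "Human": "Partial", "Social": "Partial", "Natural": "Partial", "Manufactured": "Yes",
    },
    "Capital markets": {
        "Human": "Partial", "Social": "Partial", "Natural": "Partial", "Manufactured": "Yes",
    },
    "Transportation & communication networks": {
        "Human": "Partial", "Social": "Partial", "Natural": "Partial", "Manufactured": "Yes",
    },
}


def domains_fully_met(table):
    """Return the capital domains for which every condition is rated 'Yes'."""
    domains = list(next(iter(table.values())).keys())
    return [d for d in domains if all(row[d] == "Yes" for row in table.values())]


def gaps(table, domain):
    """Return the conditions not rated 'Yes' for one domain, with their ratings."""
    return {cond: row[domain] for cond, row in table.items() if row[domain] != "Yes"}


if __name__ == "__main__":
    print(domains_fully_met(CONDITIONS))
    print(gaps(CONDITIONS, "Human"))
```

Listing the gaps domain by domain is one way to see where the second and third orders of business would have to concentrate: property rights stand out as the only condition rated "No", and only for human and social capital.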

So what are the consequences for teams playing the economic game? The individual decisions and behaviors of high-performance teams are harmonized, in ways that cannot otherwise be achieved, only when unit amounts, prices, and costs are universally comparable and publicly available. This is why standard currencies and exchange rates are so important.

And right here we have an insight into what we can do to create jobs. New jobs are likely going to have to be new kinds of jobs resulting from innovations. As has been detailed at length in recent works such as Surowiecki’s 2004 book, The Wisdom of Crowds, innovation in science and industry depends on standards. Standards are common languages that enable us to multiply our individual cognitive powers into new levels of collective productivity. Weights and measures standards are like monetary currencies; they coordinate the exchange of value in laboratories and businesses in the same way that dollars do in the US economy.

Applying Bernstein’s four conditions for economic growth to intangible assets, we see that a long term program for job creation then requires

  1. legislation establishing human, social, and natural capital property rights, and an Intangible Assets Metrology System;
  2. scientific research into consensus standards for measuring human, social, and natural capital;
  3. venture capital educational and marketing programs; and
  4. distributed information networks and computer applications through which investments in human, social, and natural capital can be tracked and traded in accord with the rule of law governing property rights and in accord with established consensus standards.

Of these four conditions, Bernstein (p. 383) points to property rights as being the most difficult to establish, and the most important for prosperity. Scientific results are widely available in online libraries. Capital can be obtained from investors anywhere. Transportation and communications services are available commercially.

But valid and verifiable means of representing legal title to privately owned property is a problem often not yet solved even for real estate in many Third World and former communist countries (see De Soto’s 2000 book, The Mystery of Capital). Creating systems for knowing the quality and quantity of educational, health care, social, and environmental service outcomes is going to be a very difficult process. It will not be impossible, however, and having the problem identified advances us significantly towards new economic possibilities.

We need leaders able and willing to formulate audacious goals for new economic growth from ideas such as these. We need enlightened visionaries able to see our potentials from a new perspective, and who can reflect our new self-image back at us. When these leaders emerge—and they will, somewhere, somehow—the imaginations of millions of entrepreneurial thinkers and actors will be fired, and new possibilities will unfold.

Science, Public Goods, and the Monetization of Commodities

August 13, 2011

Though I haven’t read Philip Mirowski’s new book yet (Science-Mart: Privatizing American Science. Cambridge, MA: Harvard University Press, 2011), a statement in the cover blurb given at Amazon.com got me thinking. I can’t help but wonder if there is another way of interpreting neoliberal ideology’s “radically different view of knowledge and discovery: [that] the fruits of scientific investigation are not a public good that should be freely available to all, but are commodities that could be monetized”?

Corporations and governments are not the only ones investing in research and new product development, and they are not the only ones who could benefit from the monetization of the fruits of scientific investigation. Individuals make these investments as well, and despite ostensible rights to private ownership, no individuals anywhere have access to universally comparable, uniformly expressed, and scientifically valid information on the quantity or quality of the literacy, health, community, or natural capital that is rightfully theirs. Accordingly, they also lack any form of demonstrable legal title to these properties. In the same way that corporations have successfully advanced their economic interests by seeing that patent and intellectual property laws were greatly strengthened, so, too, ought individuals and communities to advance their economic interests, first, by expanding the scope of weights and measures standards to include intangible assets, and second, by strengthening laws related to the ownership of privately held stocks of living capital.

The nationalist and corporatist socialization of research will continue only as long as social capital, human capital, and natural capital are not represented in the universally uniform common currencies and transparent media that could be provided by an intangible assets metric system. When these forms of capital are brought to economic life in fungible measures akin to barrels, bushels, or kilowatts, then they will be monetized commodities in the full capitalist sense of the term, ownable and purchasable products with recognizable standard definitions, uniform quantitative volumes, and discernable variations in quality. Then, and only then, will individuals gain economic control over their most important assets. Then, and only then, will we obtain the information we need to transform education, health care, social services, and human and natural resource management into industries in which quality is appropriately rewarded. Then, and only then, will we have the means for measuring genuine progress and authentic wealth in ways that correct the insufficiencies of the GNP/GDP indexes.

The creation of efficiently functioning markets for all forms of capital is an economic, political, and moral necessity (see Ekins, 1992 and others). We say we manage what we measure, but very little effort has been put into measuring (with scientific validity and precision in universally uniform and accessible aggregate terms) 90% of the capital resources under management: human abilities, motivations, and health; social commitment, loyalty, and trust; and nature’s air and water purification and ecosystem services (see Hawken, Lovins, & Lovins, 1999, among others). All human suffering, sociopolitical discontent, and environmental degradation are rooted in the same common cause: waste (see Hawken, et al., 1999). To apply lean thinking to removing the wasteful destruction of our most valuable resources, we must measure these resources in ways that allow us to coordinate and align our decisions and behaviors virtually, at a distance, with no need for communicating and negotiating the local particulars of the hows and whys of our individual situations. For more information on these ideas, search “living capital metrics” and see works like the following:

Ekins, P. (1992). A four-capital model of wealth creation. In P. Ekins & M. Max-Neef (Eds.), Real-life economics: Understanding wealth creation (pp. 147-15). London: Routledge.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Hawken, P., Lovins, A., & Lovins, H. L. (1999). Natural capitalism: Creating the next industrial revolution. New York: Little, Brown, and Co.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Miller, P., & O’Leary, T. (2007). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.

Debt, Revenue, and Changing the Way Washington Works: The Greatest Entrepreneurial Opportunity of Our Time

July 30, 2011

“Holding the line” on spending and taxes does not make for a fundamental transformation of the way Washington works. Simply doing less of one thing is just a small quantitative change that does nothing to build positive results or set a new direction. What we need is a qualitative metamorphosis akin to a caterpillar becoming a butterfly. In contrast with this beautiful image of natural processes, the arguments and so-called principles being invoked in the sham debate that’s going on are nothing more than fights over where to put deck chairs on the Titanic.

What sort of transformation is possible? What kind of a metamorphosis will start from who and where we are, but redefine us sustainably and responsibly? As I have repeatedly explained in this blog, my conference presentations, and my publications, with numerous citations of authoritative references, we already possess all of the elements of the transformation. We have only to organize and deploy them. Of course, discerning what the resources are and how to put them together is not obvious. And though I believe we will do what needs to be done when we are ready, it never hurts to prepare for that moment. So here’s another take on the situation.

Infrastructure that supports lean thinking is the name of the game. Lean thinking focuses on identifying and removing waste. Anything that consumes resources but does not contribute to the quality of the end product is waste. We have enormous amounts of wasteful inefficiency in many areas of our economy. These inefficiencies are concentrated in areas in which management is hobbled by low quality information, where we lack the infrastructure we need.

Providing and capitalizing on this infrastructure is The Greatest Entrepreneurial Opportunity of Our Time. Changing the way Washington (ha! I just typed “Wastington”!) works is the same thing as mitigating the sources of risk that caused the current economic situation. Making government behave more like a business requires making the human, social, and natural capital markets more efficient. Making those markets more efficient requires reducing the costs of transactions. Those costs are determined in large part by information quality, which is a function of measurement.

It is often said that the best way to reduce the size of government is to move the functions of government into the marketplace. But this proposal has never been associated with any sense of the infrastructural components needed to really make the idea work. Simply reducing government without an alternative way of performing its functions is irresponsible and destructive. And many of those who rail on and on about how bad or inefficient government is fail to recognize that the government is us. We get the government we deserve. The government we get follows directly from the kind of people we are. Government embodies our image of ourselves as a people. In the US, this is what having a representative form of government means. “We the people” participate in our society’s self-governance not just by voting, writing letters to Congress, or demonstrating, but in the way we spend our money, where we choose to live, work, and go to school, and in every decision we make. No one can take a breath of air, a drink of water, or a bite of food without trusting everyone else to not carelessly or maliciously poison them. No one can buy anything or drive down the street without expecting others to behave in predictable ways that ensure order and safety.

But we don’t just trust blindly. We have systems in place to guard against those who would ruthlessly seek to gain at everyone else’s expense. And systems are the point. No individual person or firm, no matter how rich, could afford to set up and maintain the systems needed for checking and enforcing air, water, food, and workplace safety measures. Society as a whole invests in the infrastructure of measures created, maintained, and regulated by the government’s Department of Commerce and the National Institute of Standards and Technology (NIST). The moral importance and the economic value of measurement standards have been stressed historically over many millennia, from the Bible and the Quran to the Magna Carta and the French Revolution to the US Constitution. Uniform weights and measures are universally recognized and accepted as essential to fair trade.

So how is it that we nonetheless apparently expect individuals and local organizations like schools, businesses, and hospitals to measure and monitor students’ abilities; employees’ skills and engagement; patients’ health status, functioning, and quality of care; etc.? Why do we not demand common currencies for the exchange of value in human, social, and natural capital markets? Why don’t we as a society compel our representatives in government to institute the will of the people and create new standards for fair trade in education, health care, social services, and environmental management?

Measuring better is not just a local issue! It is a systemic issue! When measurement is objective and when we all think together in the common language of a shared metric (like hours, volts, inches or centimeters, ounces or grams, degrees Fahrenheit or Celsius, etc.), then and only then do we have the means we need to implement lean strategies and create new efficiencies systematically. We need an Intangible Assets Metric System.

The current recession in large part was caused by failures in measuring and managing trust, responsibility, loyalty, and commitment. Similar problems in measuring and managing human, social, and natural capital have led to endlessly spiraling costs in education, health care, social services, and environmental management. The problems we’re experiencing in these areas are intimately tied up with the way we formulate and implement group level decision making processes and policies based in statistics when what we need is to empower individuals with the tools and information they need to make their own decisions and policies. We will not and cannot metamorphose from caterpillar to butterfly until we create the infrastructure through which we each can take full ownership and control of our individual shares of the human, social, and natural capital stock that is rightfully ours.

We well know that we manage what we measure. What counts gets counted. Attention tends to be focused on what we’re accountable for. But–and this is vitally important–many of the numbers called measures do not provide the information we need for management. And not only are lots of numbers giving us low quality information, there are far too many of them! We could have better and more information from far fewer numbers.

Previous postings in this blog document the fact that we have the intellectual, political, scientific, and economic resources we need to measure and manage human, social, and natural capital for authentic wealth. And the issue is not a matter of marshaling the will. It is hard to imagine how there could be more demand for better management of intangible assets than there is right now. The problem in meeting that demand is a matter of imagining how to start the ball rolling. What configuration of investments and resources will start the process of bursting open the chrysalis? How will the demand for meaningful mediating instruments be met in a way that leads to the spreading of the butterfly’s wings? It is an exciting time to be alive.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Number lines, counting, and measuring in arithmetic education

July 29, 2011

Over the course of two days spent at a meeting on mathematics education, a question started to form in my mind, one I don’t know how to answer, and to which there may be no answer. I’d like to try to formulate what’s on my mind in writing, and see if it’s just nonsense, a curiosity, some old debate that’s been long since resolved, issues too complex to try to use in elementary education, or something we might actually want to try to do something about.

The question stems from my long experience in measurement. It is one of the basic principles of the field that counting and measuring are different things (see the list of publications on this, below). Counts don’t behave like measures unless the things being counted are units of measurement established as equal ratios or intervals that remain invariant independent of the local particulars of the sample and instrument.

Plainly, if you count two groups of small and large rocks or oranges, the two groups can have the same number of things and the group with the larger things will have more rock or orange than the group with the smaller things. But the association of counting numbers and arithmetic operations with number lines insinuates and reinforces to the point of automatic intuition the false idea that numbers always represent quantity. I know that number lines are supposed to represent an abstract continuum but I think it must be nearly impossible for children to not assume that the number line is basically a kind of ruler, a real physical thing that behaves much like a row of same size wooden blocks laid end to end.

This could be completely irrelevant if the distinction between “How many?” and “How much?” is intensively taught and drilled into kids. Somehow I think it isn’t, though. And here’s where I get to the first part of my real question. Might not the universal, early, and continuous reinforcement of this simplistic equating of number and quantity have a lot to do with the equally simplistic assumption that all numeric data and statistical analysis is somehow quantitative? We count rocks or fish or sticks and call the resulting numbers quantities, and so we do the same thing when we count correct answers or ratings of “Strongly Agree.”

Though counting is a natural and obvious point from which to begin studying whether something is quantitatively measurable, there are no defined units of measurement in the ordinal data gathered up from tests and surveys. The difference between any two adjacent scores varies depending on which two adjacent scores are compared. This has profound implications for the inferences we make and for our ability to think together as a field about our objects of investigation.

Over the last 30 years and more, we have become increasingly sensitized to the way our words prefigure our expectations and color our perceptions. This struggle to say what we mean and to not prejudicially exclude others from recognition as full human beings is admirable and good. But if that is so, why is it then that we nonetheless go on unjustifiably reducing the real characteristics of people’s abilities, health, performances, etc. to numbers that do not and cannot stand for quantitative amounts? Why do we keep on referring to counts as quantities? Why do we insist on referring to inconstant and locally dependent scores as measures? And why do we refuse to use the readily available methods we have at our disposal to create universally uniform measures that consistently represent the same unit amount always and everywhere?

It seems to me that the image of the number line as a kind of ruler is so indelibly impressed on us as a habit of thought that it is very difficult to relinquish it in favor of a more abstract model of number. Might it be important for us to begin to plant the seeds for more sophisticated understandings of number early in mathematics education? I’m going to wonder out loud about this to some of my math education colleagues…

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese. DOI 10.1007/s11229-010-9832-1.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1994, Autumn). Measuring and counting. Rasch Measurement Transactions, 8(3), 371 [http://www.rasch.org/rmt/rmt83c.htm].

Wright, B. D., & Linacre, J. M. (1989). Observations are always ordinal; measurements, however, must be interval. Archives of Physical Medicine and Rehabilitation, 70(12), 857-867 [http://www.rasch.org/memo44.htm].

A Second Simple Example of Measurement’s Role in Reducing Transaction Costs, Enhancing Market Efficiency, and Enabling the Pricing of Intangible Assets

March 9, 2011

The prior post here showed why we should not confuse counts of things with measures of amounts, though counts are the natural place to begin constructing measures. That first simple example focused on an analogy between counting oranges and measuring the weight of oranges, versus counting correct answers on tests and measuring amounts of ability. This second example extends the first by, in effect, showing what happens when we want to aggregate value not just across different counts of some one thing but across different counts of different things. The point will be to show how the relative values of apples, oranges, grapes, and bananas can be put into a common frame of reference and compared in a practical and convenient way.

For instance, you may go into a grocery store to buy raspberries and blackberries, and I go in to buy cantaloupe and watermelon. Your cost per individual fruit will be very low, and mine will be very high, but neither of us will find this annoying, confusing, or inconvenient because your fruits are very small, and mine, very large. Conversely, your cost per kilogram will be much higher than mine, but this won’t cause either of us any distress because we both recognize the differences in the labor, handling, nutritional, and culinary value of our purchases.

But what happens when we try to purchase something as complex as a unit of socioeconomic development? The eight UN Millennium Development Goals (MDGs) represent a start at a systematic effort to bring human, social, and natural capital together into the same economic and accountability framework as liquid and manufactured capital, and property. But that effort is stymied by the inefficiency and cost of making and using measures of the goals achieved. The existing MDG databases (http://data.un.org/Browse.aspx?d=MDG) and summary reports present overwhelming numbers of numbers. Individual indicators are presented for each year, each country, each region, and each program, goal by goal, target by target, indicator by indicator, and series by series, in an indigestible volume of data.

Though there are no doubt complex mathematical methods by which a philanthropic, governmental, or NGO investor might determine how much development is gained per million dollars invested, the cost of obtaining impact measures is so high that most funding decisions are made with little information concerning expected returns (Goldberg, 2009). Further, the percentages of various needs met by leading social enterprises typically range from 0.07% to 3.30%, and needs are growing, not diminishing. Progress at current rates means that it would take thousands of years to solve today’s problems of human suffering, social disparity, and environmental quality. The inefficiency of human, social, and natural capital markets is so overwhelming that there is little hope for significant improvements without the introduction of fundamental infrastructural supports, such as an Intangible Assets Metric System.

A basic question that needs to be asked of the MDG system is, how can anyone make any sense out of so much data? Most of the indicators are evaluated in terms of counts of the number of times something happens, the number of people affected, or the number of things observed to be present. These counts are usually then divided by the maximum possible (the count of the total population) and are expressed as percentages or rates.

As previously explained in various posts in this blog, counts and percentages are not measures in any meaningful sense. They are notoriously difficult to interpret, since the quantitative meaning of any given unit difference varies depending on the size of what is counted, or where the percentage falls in the 0-100 continuum. And because counts and percentages are interpreted one at a time, it is very difficult to know if and when any number included in the sheer mass of data is reasonable, all else considered, or if it is inconsistent with other available facts.
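
To make the point about percentages concrete, here is a minimal sketch in Python. It uses the log-odds transformation, the unit in which Rasch-type models are expressed, as one illustrative way of showing that the same five-point difference does not carry the same meaning everywhere on the 0-100 continuum. The function name and the example percentages are assumptions chosen for illustration; they are not values taken from the MDG data.

```python
import math


def log_odds(percent):
    """Convert a percentage (strictly between 0 and 100) to log-odds."""
    p = percent / 100.0
    return math.log(p / (1.0 - p))


# The same five-point difference, taken at two places on the continuum.
mid_step = log_odds(55) - log_odds(50)    # near the middle
high_step = log_odds(95) - log_odds(90)   # near the top

print(f"50% -> 55%: {mid_step:.2f} log-odds units")
print(f"90% -> 95%: {high_step:.2f} log-odds units")
```

Under this transformation the step from 50% to 55% comes to about 0.20 log-odds units, while the step from 90% to 95% comes to about 0.75, more than three times as large.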

A study of the MDG data must focus on these three potential areas of data quality improvement: consistency evaluation, volume reduction, and interpretability. Each builds on the others. With consistent data lending themselves to summarization in sufficient statistics, data volume can be drastically reduced with no loss of information (Andersen, 1977, 1999; Wright, 1977, 1997), data quality can be readily assessed in terms of sufficiency violations (Smith, 2000; Smith & Plackner, 2009), and quantitative measures can be made interpretable in terms of a calibrated ruler’s repeatedly reproducible hierarchy of indicators (Bond & Fox, 2007; Masters, Lokan, & Doig, 1994).

The primary data quality criteria are qualitative relevance and meaningfulness, on the one hand, and mathematical rigor, on the other. The point here is one of following through on the maxim that we manage what we measure, with the goal of measuring in such a way that management is better focused on the program mission and not distracted by accounting irrelevancies.

Method

As written and deployed, each of the MDG indicators has the face and content validity of providing information on each respective substantive area of interest. But, as has been the focus of repeated emphases in this blog, counting something is not the same thing as measuring it.

Counts or rates of literacy or unemployment are not, in and of themselves, measures of development. Their capacity to serve as contributing indications of developmental progress is an empirical question that must be evaluated experimentally against the observable evidence. The measurement of progress toward an overarching developmental goal requires inferences made from a conceptual order of magnitude above and beyond that provided in the individual indicators. The calibration of an instrument for assessing progress toward the realization of the Millennium Development Goals requires, first, a reorganization of the existing data, and then an analysis that tests explicitly the relevant hypotheses as to the potential for quantification, before inferences supporting the comparison of measures can be scientifically supported.

A subset of the MDG data was selected from the MDG database available at http://data.un.org/Browse.aspx?d=MDG, recoded, and analyzed using Winsteps (Linacre, 2011). At least one indicator was selected from each of the eight goals, with 22 in total. All available data from these 22 indicators were recorded for each of 64 countries.

The reorganization of the data is nothing but a way of making the interpretation of the percentages explicit. The meaning of any one country’s percentage or rate of youth unemployment, cell phone users, or literacy has to be kept in context relative to expectations formed from other countries’ experiences. It would be nonsense to interpret any single indicator as good or bad in isolation. Sometimes 30% represents an excellent state of affairs, other times, a terrible one.

Therefore, the distributions of each indicator’s percentages across the 64 countries were divided into ranges and converted to ratings. A lower rating uniformly indicates a status further away from the goal than a higher rating. The ratings were devised by dividing the frequency distribution of each indicator roughly into thirds.

For instance, the youth unemployment rate was found to vary such that the countries furthest from the desired goal had rates of 25% and more (rated 1), and those closest to or exceeding the goal had rates of 0-10% (rated 3), leaving the middle range (10-25%) rated 2. In contrast, percentages of the population that are undernourished were rated 1 for 35% or more, 2 for 15-35%, and 3 for less than 15%.
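
The recoding just described can be sketched in a few lines of Python. The cutoffs are the ones given above for these two indicators; the function and dictionary names are hypothetical, and the handling of a value falling exactly on the lower cutoff (treated here as rating 2) is an assumption, since the ranges as stated overlap at their edges.

```python
# Cutoffs from the text: (rated 3 below this value, rated 1 at or above this value).
# Both indicators are ones where a lower percentage is closer to the goal.
CUTOFFS = {
    "youth_unemployment": (10.0, 25.0),
    "undernourished": (15.0, 35.0),
}


def rate(indicator, percent):
    """Return 1 (furthest from the goal), 2, or 3 (closest to the goal)."""
    near_goal_below, far_from_goal_at = CUTOFFS[indicator]
    if percent >= far_from_goal_at:
        return 1
    if percent < near_goal_below:
        return 3
    return 2


# The same 30% is rated differently depending on the indicator.
print(rate("youth_unemployment", 30.0))  # 1
print(rate("undernourished", 30.0))      # 2
```

The last two lines show why no percentage can be read as good or bad in isolation: 30% falls in the lowest third for youth unemployment but in the middle third for undernourishment.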

Thirds of the distributions were decided upon only on the basis of the investigator’s prior experience with data of this kind. A more thorough approach to the data would begin from a finer-grained rating system, like that structuring the MDG table at http://mdgs.un.org/unsd/mdg/Resources/Static/Products/Progress2008/MDG_Report_2008_Progress_Chart_En.pdf. This greater detail would be sought in order to determine empirically just how many distinctions each indicator can support and contribute to the overall measurement system.

Sixty-four of the available 336 data points were selected for their representativeness, with no duplications of values and with a proportionate distribution along the entire continuum of observed values.

Data from the same 64 countries and the same years were then sought for the subsequent indicators. It turned out that the years in which data were available varied across data sets. Data within one or two years of the target year were sometimes substituted for missing data.

The data were analyzed twice, first with each indicator allowed its own rating scale, parameterizing each of the category difficulties separately for each item, and then with the full rating scale model, as the results of the first analysis showed all indicators shared strong consistency in the rating structure.

Results

Data were 65.2% complete. Countries were assessed on an average of 14.3 of the 22 indicators, and each indicator was applied on average to 41.7 of the 64 country cases. Measurement reliability was .89-.90, depending on how measurement error is estimated. Cronbach’s alpha for the by-country scores was .94. Calibration reliability was .93-.95. The rating scale worked well (see Linacre, 2002, for criteria). The data fit the measurement model reasonably well, with satisfactory data consistency, meaning that the hypothesis of a measurable developmental construct was not falsified.

The main result for our purposes here concerns how satisfactory data consistency makes it possible to dramatically reduce data volume and improve data interpretability. The figure below illustrates how. What does it mean for data volume to be drastically reduced with no loss of information? Let’s see exactly how much the data volume is reduced for the ten item data subset shown in the figure below.

The horizontal continuum from -100 to 1300 in the figure is the metric, the ruler or yardstick. The number of countries at various locations along that ruler is shown across the bottom of the figure. The mean (M), first standard deviation (S), and second standard deviation (T) are shown beneath the numbers of countries. There are ten countries with a measure of just below 400, just to the left of the mean (M).

The MDG indicators are listed on the right of the figure, with the indicator most often found being achieved relative to the goals at the bottom, and the indicator least often being achieved at the top. The ratings in the middle of the figure increase from 1 to 3 left to right as the probability of goal achievement increases as the measures go from low to high. The position of the ratings in the middle of the figure shifts from left to right as one reads up the list of indicators because the difficulty of achieving the goals is increasing.

Because the ratings of the 64 countries relative to these ten goals are internally consistent, nothing but the developmental level of the country and the developmental challenge of the indicator affects the probability that a given rating will be attained. It is this relation that defines fit to a measurement model, the sufficiency of the summed ratings, and the interpretability of the scores. Given sufficient fit and consistency, any country’s measure implies a given rating on each of the ten indicators.

For instance, imagine a vertical line drawn through the figure at a measure of 500, just above the mean (M). This measure is interpreted relative to the places at which the vertical line crosses the ratings in each row associated with each of the ten items. A measure of 500 is read as implying, within a given range of error, uncertainty, or confidence, a rating of

  • 3 on debt service and female-to-male parity in literacy,
  • 2 or 3 on how much of the population is undernourished and how many children under five years of age are moderately or severely underweight,
  • 2 on infant mortality, the percent of the population aged 15 to 49 with HIV, and the youth unemployment rate,
  • 1 or 2 on the poor’s share of the national income, and
  • 1 on CO2 emissions and the rate of personal computers per 100 inhabitants.
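
How a single measure can imply a rating on every indicator may be easier to see in a small sketch. The Python below assumes that the rating scale model named above is the Andrich formulation, in which the probability of each rating depends only on the country's measure, the indicator's difficulty, and a set of thresholds shared by all indicators. Everything numeric here is hypothetical and expressed in logits; these are not the calibrations from the analysis, and the figure's -100 to 1300 metric is a rescaling whose constants are not given in this post.

```python
import math


def rating_probabilities(measure, difficulty, thresholds):
    """Probability of each rating (1, 2, ...) under the Andrich rating scale model."""
    # Log-numerators are cumulative sums of (measure - difficulty - threshold).
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + (measure - difficulty - tau))
    top = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_rating(measure, difficulty, thresholds):
    """Model-expected rating, counting the lowest category as 1."""
    probs = rating_probabilities(measure, difficulty, thresholds)
    return sum((k + 1) * p for k, p in enumerate(probs))


def most_likely_rating(measure, difficulty, thresholds):
    """The single rating with the highest probability."""
    probs = rating_probabilities(measure, difficulty, thresholds)
    return probs.index(max(probs)) + 1


THRESHOLDS = [-1.0, 1.0]   # hypothetical shared thresholds, in logits
EASY_INDICATOR = -2.0      # hypothetical difficulty of an often-achieved goal
HARD_INDICATOR = 2.0       # hypothetical difficulty of a rarely achieved goal
COUNTRY_MEASURE = 0.5      # hypothetical country measure

for name, difficulty in [("easy", EASY_INDICATOR), ("hard", HARD_INDICATOR)]:
    print(name, most_likely_rating(COUNTRY_MEASURE, difficulty, THRESHOLDS))
```

With these made-up values, the same country is most likely rated 3 on the easier indicator and 1 on the harder one, which mirrors the way a vertical line drawn through the figure crosses different ratings in different rows.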

For any one country with a measure of 500 on this scale, ten percentages or rates that appear completely incommensurable and incomparable are found to contribute consistently to a single valued function, developmental goal achievement. Instead of managing each separate indicator as a universe unto itself, this scale makes it possible to manage development itself at its own level of complexity. This ten-to-one ratio of reduced data volume is more than doubled when the total of 22 items included in the scale is taken into account.

This reduction is conceptually and practically important because it focuses attention on the actual object of management, development. When the individual indicators are the focus of attention, the forest is lost for the trees. Those who disparage the validity of the maxim, you manage what you measure, are often discouraged by the feeling of being pulled in too many directions at once. But a measure of the HIV infection rate is not in itself a measure of anything but the HIV infection rate. Interpreting it in terms of broader developmental goals requires evidence that it in fact takes a place in that larger context.

And once a connection with that larger context is established, the consistency of individual data points remains a matter of interest. As the world turns, the order of things may change, but, more likely, data entry errors, temporary data blips, and other factors will alter data quality. Such changes cannot be detected outside of the context defined by an explicit interpretive framework that requires consistent observations.

-100  100     300     500     700     900    1100    1300
|-------+-------+-------+-------+-------+-------+-------|  NUM   INDCTR
1                                 1  :    2    :  3     3    9  PcsPer100
1                         1   :   2    :   3            3    8  CO2Emissions
1                    1  :    2    :   3                 3   10  PoorShareNatInc
1                 1  :    2    :  3                     3   19  YouthUnempRatMF
1              1   :    2   :   3                       3    1  %HIV15-49
1            1   :   2    :   3                         3    7  InfantMortality
1          1  :    2    :  3                            3    4  ChildrenUnder5ModSevUndWgt
1         1   :    2    :  3                            3   12  PopUndernourished
1    1   :    2   :   3                                 3    6  F2MParityLit
1   :    2    :  3                                      3    5  DebtServExpInc
|-------+-------+-------+-------+-------+-------+-------|  NUM   INDCTR
-100  100     300     500     700     900    1100    1300
                   1
       1   1 13445403312323 41 221    2   1   1            COUNTRIES
       T      S       M      S       T

Discussion

A key element in the results obtained here concerns the fact that the data were about 35% missing. Whether or not any given indicator was actually rated for any given country, the measure can still be interpreted as implying the expected rating. This capacity to take missing data into account can be taken advantage of systematically by calibrating a large bank of indicators. With this in hand, it becomes possible to gather only the amount of data needed to make a specific determination, or to adaptively administer the indicators so as to obtain the lowest-error (most reliable) measure at the lowest cost (with the fewest indicators administered). Perhaps most importantly, different collections of indicators can then be equated to measure in the same unit, so that impacts may be compared more efficiently.

Instead of an international developmental aid market that is so inefficient as to preclude any expectation of measured returns on investment, setting up a calibrated bank of indicators to which all measures are traceable opens up numerous desirable possibilities. The cost of assessing and interpreting the data informing aid transactions could be reduced to negligible amounts, and the management of the processes and outcomes in which that aid is invested would be made much more efficient by reduced data volume and enhanced information content. Because capital would flow more efficiently to where supply is meeting demand, nonproducers would be cut out of the market, and the effectiveness of the aid provided would be multiplied many times over.

The capacity to harmonize counts of different but related events into a single measurement system presents the possibility that there may be a bright future for outcomes-based budgeting in education, health care, human resource management, environmental management, housing, corrections, social services, philanthropy, and international development. It may seem wildly unrealistic to imagine such a thing, but the return on the investment would be so monumental that not checking it out would be even crazier.

A full report on the MDG data, with the other references cited, is available on my SSRN page at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1739386.

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

A Simple Example of How Better Measurement Creates New Market Efficiencies, Reduces Transaction Costs, and Enables the Pricing of Intangible Assets

March 4, 2011

One of the ironies of life is that we often overlook the obvious in favor of the obscure. And so one hears of huge resources poured into finding and capitalizing on opportunities that provide infinitesimally small returns, while other opportunities—with equally certain odds of success but far more profitable returns—are completely neglected.

The National Institute of Standards and Technology (NIST) reports returns on investment ranging from 32% to over 400% in 32 metrological improvements made in semiconductors, construction, automation, computers, materials, manufacturing, chemicals, photonics, communications and pharmaceuticals (NIST, 2009). Previous posts in this blog offer more information on the economic value of metrology. The point is that the returns obtained from improvements in the measurement of tangible assets will likely also be achieved in the measurement of intangible assets.

How? With a little bit of imagination, each stage in the development of increasingly meaningful, efficient, and useful measures described in this previous post can be seen as implying a significant return on investment. As those returns are sought, investors will coordinate and align different technologies and resources relative to a roadmap of how these stages are likely to unfold in the future, as described in this previous post. The basic concepts of how efficient and meaningful measurement reduces transaction costs and market frictions, and how it brings capital to life, are explained and documented in my publications (Fisher, 2002-2011), but what would a concrete example of the new value created look like?

The examples I have in mind hinge on the difference between counting and measuring. Counting is a natural and obvious thing to do when we need some indication of how much of something there is. But counting is not measuring (Cooper & Humphry, 2010; Wright, 1989, 1992, 1993, 1999). This is not some minor academic distinction of no practical use or consequence. It is rather the source of the vast majority of the problems we have in comparing outcome and performance measures.

Imagine how things would be if we couldn’t weigh fruit in a grocery store, and all we could do was count pieces. We can tell when eight small oranges possess less overall mass of fruit than four large ones by weighing them; the eight small oranges might weigh .75 kilograms (about 1.6 pounds) while the four large ones come in at 1.0 kilo (2.2 pounds). If oranges were sold by count instead of weight, perceptive traders would buy small oranges and make more money selling them than they could if they bought large ones.
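
A few lines of Python make the trader's arithmetic explicit. The counts and weights are the ones just given; the two prices are hypothetical, picked only so the comparison is easy to follow.

```python
# Counts and weights from the text; prices below are hypothetical.
small_oranges = {"count": 8, "kg": 0.75}
large_oranges = {"count": 4, "kg": 1.00}

PRICE_PER_ORANGE = 0.50   # hypothetical flat price if oranges are sold by count
PRICE_PER_KG = 3.00       # hypothetical price if oranges are sold by weight


def cost_by_count(bag):
    return bag["count"] * PRICE_PER_ORANGE


def cost_by_weight(bag):
    return bag["kg"] * PRICE_PER_KG


def revenue_per_kg_when_sold_by_count(bag):
    """What a kilogram of fruit effectively sells for under per-piece pricing."""
    return cost_by_count(bag) / bag["kg"]


for name, bag in [("small", small_oranges), ("large", large_oranges)]:
    print(name,
          cost_by_count(bag),
          cost_by_weight(bag),
          round(revenue_per_kg_when_sold_by_count(bag), 2))
```

Sold by count at these prices, the eight small oranges cost twice as much as the four large ones even though they contain less fruit, so each kilogram of small oranges brings in far more than a kilogram of large ones. Sold by weight, the price follows the amount of fruit.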

But we can’t currently arrive so easily at the comparisons we need when we’re buying and selling intangible assets, like those produced as the outcomes of educational, health care, or other services. So I want to walk through a couple of very down-to-earth examples to bring the point home. Today we’ll focus on the simplest version of the story, and tomorrow we’ll take up a little more complicated version, dealing with the counts, percentages, and scores used in balanced scorecard and dashboard metrics of various kinds.

What if you score eight on one reading test and I score four on a different reading test? Who has more reading ability? In the same way that we might be able to tell just by looking that eight small oranges are likely to have less actual orange fruit than four big ones, we might also be able to tell just by looking that eight easy (short, common) words can likely be read correctly with less reading ability than four difficult (long, rare) words can be.

So let’s analyze the difference between buying oranges and buying reading ability. We’ll set up three scenarios for buying reading ability. In all three, we’ll imagine we’re comparing how we buy oranges with the way we would have to go about buying reading ability today if teachers were paid for the gains made on the tests they administer at the beginning and end of the school year.

In the first scenario, the teachers make up their own tests. In the second, the teachers each use a different standardized test. In the third, each teacher uses a computer program that draws questions from the same online bank of precalibrated items to construct a unique test custom tailored to each student. Reading ability scenario one is likely the most commonly found in real life. Scenario three is the rarest, but nonetheless describes a situation that has been available to millions of students in the U.S., Australia, and elsewhere for several years. Scenarios one, two and three correspond with developmental levels one, three, and five described in a previous blog entry.

Buying Oranges

When you go into one grocery store and I go into another, we don’t have any oranges with us. When we leave, I have eight and you have four. I have twice as many oranges as you, but yours weigh a kilo, about a third more than mine (.75 kilos).

When we paid for the oranges, the transaction was finished in a few seconds. Neither one of us experienced any confusion, annoyance, or inconvenience in relation to the quality of information we had on the amount of orange fruit we were buying. I did not, however, pay twice as much as you did. In fact, you paid more for yours than I did for mine, in direct proportion to the difference in the measured amounts.
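The arithmetic behind this transaction can be sketched in a few lines. The counts and weights come from the example above; the price per kilogram is a hypothetical figure chosen only for illustration, since the example gives no prices.

```python
# Hypothetical price per kilogram; the example gives weights but no prices.
PRICE_PER_KG = 2.00

purchases = {
    "me": {"count": 8, "weight_kg": 0.75},
    "you": {"count": 4, "weight_kg": 1.00},
}

def cost(weight_kg, price_per_kg=PRICE_PER_KG):
    """Price follows the measured weight, not the number of pieces."""
    return weight_kg * price_per_kg

costs = {who: cost(p["weight_kg"]) for who, p in purchases.items()}

count_ratio = purchases["me"]["count"] / purchases["you"]["count"]
weight_ratio = purchases["you"]["weight_kg"] / purchases["me"]["weight_kg"]
cost_ratio = costs["you"] / costs["me"]

print(f"I have {count_ratio:.0f}x as many oranges as you")
print(f"Your oranges weigh {weight_ratio:.2f}x as much as mine")
print(f"You pay {cost_ratio:.2f}x as much as I do")
```

Whatever price per kilogram is assumed, the ratio of the two bills tracks the ratio of the weights and ignores the counts.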

No negotiations were necessary to consummate the transactions, and there was no need for special inquiries about how much orange we were buying. We knew from experience in this and other stores that the prices we paid were comparable with those offered in other times and places. Our information was cheap, as it was printed on the bag of oranges or could be read off a scale, and it was very high quality, as the measures were directly comparable with measures from any other scale in any other store. So, in buying oranges, the cost of obtaining good information was so small a part of the overall cost of the transaction as to be negligible.

Buying Reading Ability (Scenario 1)

So now you and I go through third grade as eight-year-olds. You’re in one school and I’m in another. We have different teachers. Each teacher makes up his or her own reading tests. When we started the school year, we each took a reading test (different ones), and we took another (again, different ones) as we ended the school year.

For each test, your teacher counted up your correct answers and divided by the total number of questions; so did mine. You got 72% correct on the first one, and 94% correct on the last one. I got 83% correct on the first one, and 86% correct on the last one. Your score went up 22 percentage points, much more than the 3 points mine went up. But did you learn more? It is impossible to tell. What if both of your tests were easier, not just for you or for me but for everyone, than both of mine? What if my second test was a lot harder than my first one? On the other hand, what if your tests were harder than mine? Perhaps you did even better than your scores seem to indicate.
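To see why percent correct cannot settle the question, here is a minimal sketch. It assumes a Rasch-type logistic model of the kind discussed in the Wright references; the item difficulties and the ability value are hypothetical numbers, not data from any real test.

```python
import math

def p_correct(ability, difficulty):
    """Rasch-model probability of a correct answer (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_percent(ability, difficulties):
    """Expected percent correct on a test made up of the given items."""
    probs = [p_correct(ability, d) for d in difficulties]
    return 100.0 * sum(probs) / len(probs)

# Hypothetical item difficulties for two teacher-made tests.
easy_test = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_test = [0.0, 0.5, 1.0, 1.5, 2.0]

ability = 0.5  # the same reader, with no change in ability, takes both tests

easy_score = expected_percent(ability, easy_test)
hard_score = expected_percent(ability, hard_test)

print(f"Expected percent correct on the easy test: {easy_score:.0f}%")
print(f"Expected percent correct on the hard test: {hard_score:.0f}%")
```

Under these assumptions a reader who learns nothing at all would still show a large "gain" if the hard test came first and the easy one last, and a loss if the order were reversed.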

We’ll just exclude from consideration other factors that might come to bear, such as whether your tests were significantly longer or shorter than mine, or if one of us ran out of time and did not answer a lot of questions.

If our parents had to pay the reading teacher at the end of the school year for the gains that were made, how would they tell what they were getting for their money? What if your teacher gave a hard test at the start of the year and an easy one at the end of the year so that you’d have a big gain and your parents would have to pay more? What if my teacher gave an easy test at the start of the year and a hard one at the end, so that a really high price could be put on very small gains? If our parents were to compare their experiences in buying our improved reading ability, they would have a lot of questions about how much improvement was actually obtained. They would be confused and annoyed at how inconvenient the scores are, because they are difficult, if not impossible, to compare. A lot of time and effort might be invested in examining the words and sentences in each of the four reading tests to try to determine how easy or hard they are in relation to each other. Or, more likely, everyone would throw their hands up and pay as little as they possibly could for outcomes they don’t understand.

Buying Reading Ability (Scenario 2)

In this scenario, we are third graders again, in different schools with different reading teachers. Now, instead of our teachers making up their own tests, our reading abilities are measured at the beginning and the end of the school year using two different standardized tests sold by competing testing companies. You’re in a private suburban school that’s part of an independent schools association. I’m in a public school along with dozens of others in an urban school district.

For each test, our parents received a report in the mail showing our scores. As before, we know how many questions we each answered correctly, and, unlike before, we don’t know which particular questions we got right or wrong. Finally, we don’t know how easy or hard your tests were relative to mine, but we know that the two tests you took were equated, and so were the two I took. That means your tests will show how much reading ability you gained, and so will mine.

We have one new bit of information we didn’t have before, and that’s a percentile score. Now we know that at the beginning of the year, with a percentile ranking of 72, you performed better than 72% of the other private school third graders taking this test, and at the end of the year you performed better than 76% of them. In contrast, I had percentiles of 84 and 89.

The question we have to ask now is whether our parents are going to pay for the percentile gain, or for the actual gain in reading ability. You and I each learned more than our peers did on average, since our percentile scores went up, but this would not work out as a satisfactory way to pay teachers. Averages being averages, if you and I learned more and faster, someone else learned less and slower, so that, in the end, it all balances out. Are we to have teachers paying parents when their children learn less, simply redistributing money in a zero-sum game?

And so, additional individualized reports are sent to our parents by the testing companies. Your tests are equated with each other, and they report scores in a common unit on a scale that ranges from 120 to 480. You had a starting score of 235 and finished the year with a score of 420, for a gain of 185.

The tests I took are comparable and measure in the same unit, too, but not the same unit as your tests measure in. Scores on my tests range from 400 to 1200. I started the year with a score of 790, and finished at 1080, for a gain of 290.

Now the confusion in the first scenario is overcome, in part. Our parents can see that we each made real gains in reading ability. The difficulty levels of the two tests you took are the same, as are the difficulties of the two tests I took. But our parents still don’t know what to pay the teacher because they can’t tell if you or I learned more. You had lower percentiles and test scores than I did, but you are being compared with what is likely a higher scoring group of suburban and higher socioeconomic status students than the urban group of disadvantaged students I’m compared against. And your scores aren’t comparable with mine, so you might have started and finished with more reading ability than I did, or maybe I had more than you. There isn’t enough information here to tell.
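The ambiguity can be made concrete with the scores given above. The sketch below compares the two gains in two ways: as raw scale points, and as a share of each scale's range. The second comparison is a naive heuristic introduced here purely for illustration; nothing links the two scales, so neither comparison is actually justified.

```python
# Scaled scores reported above for the two unlinked tests.
you = {"low": 120, "high": 480, "start": 235, "end": 420}
me = {"low": 400, "high": 1200, "start": 790, "end": 1080}

def raw_gain(s):
    """Gain in that test's own scale points."""
    return s["end"] - s["start"]

def range_fraction(s):
    """Naive heuristic (an assumption, not an equating method):
    gain expressed as a share of the scale's full range."""
    return raw_gain(s) / (s["high"] - s["low"])

raw_says_me = raw_gain(me) > raw_gain(you)
fraction_says_you = range_fraction(you) > range_fraction(me)

print(f"Raw gains: you {raw_gain(you)}, me {raw_gain(me)}")
print(f"Share of range: you {range_fraction(you):.2f}, me {range_fraction(me):.2f}")
print(f"Raw points favor me: {raw_says_me}; share of range favors you: {fraction_says_you}")
```

The two ad hoc comparisons point in opposite directions, which is exactly the position our parents are in: without a shared unit there is no principled way to choose between them.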

So, again, the information that is provided is insufficient to the task of settling on a reasonable price for the outcomes obtained. Our parents will again be annoyed and confused by the low quality information that makes it impossible to know what to pay the teacher.

Buying Reading Ability (Scenario 3)

In the third scenario, we are still third graders in different schools with different reading teachers. This time our reading abilities are measured by tests that are completely unique. Every student has a test custom tailored to their particular ability. Unlike the tests in the first and second scenarios, however, now all of the tests have been constructed carefully on the basis of extensive data analysis and experimental tests. Different testing companies are providing the service, but they have gone to the trouble to work together to create consensus standards defining the unit of measurement for any and all reading test items.

For each test, our parents received a report in the mail showing our measures. As before, we know how many questions we each answered correctly. Now, though we don’t know which particular questions we got right or wrong, we can see typical items ordered by difficulty lined up in a way that shows us what kind of items we got wrong, and which kind we got right. And now we also know your tests were equated relative to mine, so we can compare how much reading ability you gained relative to how much I gained. Now our parents can confidently determine how much they should pay the teacher, at least in proportion to their children’s relative measures. If our measured gains are equal, the same payment can be made. If one of us obtained more value, then proportionately more should be paid.
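A minimal sketch of how measures in a shared unit might be computed from different items follows. It assumes the Rasch model and a simple maximum-likelihood estimate by Newton-Raphson iteration; the item calibrations and answer patterns are hypothetical, and real testing programs use more elaborate procedures than this.

```python
import math

def estimate_ability(responses, difficulties, iterations=50):
    """Maximum-likelihood ability estimate (in logits) under the Rasch model.

    responses    -- list of 0/1 answers
    difficulties -- calibrated item difficulties from the shared bank
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("zero and perfect scores have no finite estimate")
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
        residual = raw_score - sum(probs)            # observed minus expected score
        information = sum(p * (1.0 - p) for p in probs)
        theta += residual / information              # Newton-Raphson step
    return theta

# Hypothetical calibrations: two tailored tests drawn from one item bank.
your_items = [-2.0, -1.5, -1.0, -0.5, 0.0]   # easier items
my_items = [0.0, 0.5, 1.0, 1.5, 2.0]         # harder items
answers = [1, 1, 1, 0, 0]                    # the same count of 3 correct on each test

your_measure = estimate_ability(answers, your_items)
my_measure = estimate_ability(answers, my_items)

print(f"Your measure: {your_measure:.2f} logits")
print(f"My measure:   {my_measure:.2f} logits")
```

Because both sets of items are calibrated on the same scale, the same count of three correct answers yields different measures, and the difference between them is expressed in one common unit that can be compared directly.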

In this third scenario, we have a situation directly analogous to buying oranges. You have a measured amount of increased reading ability that is expressed in the same unit as my gain in reading ability, just as the weights of the oranges are comparable. Further, your test items were not identical with mine, and so the difficulties of the items we took surely differed, just as the sizes of the oranges we bought did.

This third scenario could be made yet more efficient by removing the need for creating and maintaining a calibrated item bank, as described by Stenner and Stone (2003) and in the sixth developmental level in a prior blog post here. Also, additional efficiencies could be gained by unifying the interpretation of the reading ability measures, so that progress through high school can be tracked with respect to the reading demands of adult life (Williamson, 2008).

Comparison of the Purchasing Experiences

In contrast with the grocery store experience, paying for increased reading ability in the first scenario is fraught with low quality information that greatly increases the cost of the transactions. The information is of such low quality that, of course, hardly anyone bothers to go to the trouble to try to decipher it. Too much cost is associated with the effort to make it worthwhile. So, no one knows how much gain in reading ability is obtained, or what a unit gain might cost.

When a school district or educational researchers mount studies to try to find out what it costs to improve reading ability in third graders in some standardized unit, they find so much unexplained variation in the costs that they, too, raise more questions than answers.

In grocery stores and other markets, we don’t place the cost of making the value comparison on the consumer or the merchant. Instead, society as a whole picks up the cost by funding the creation and maintenance of consensus standard metrics. Until we take up the task of doing the same thing for intangible assets, we cannot expect human, social, and natural capital markets to obtain the efficiencies we take for granted in markets for tangible assets and property.

References

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese. DOI 10.1007/s11229-010-9832-1.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2003). Measurement and communities of inquiry. Rasch Measurement Transactions, 17(3), 936-8 [http://www.rasch.org/rmt/rmt173.pdf].

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-54.

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-9 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009b). NIST Critical national need idea White Paper: Metrological infrastructure for human, social, and natural capital (Tech. Rep., http://www.livingcapitalmetrics.com/images/FisherNISTWhitePaper2.pdf). New Orleans: LivingCapitalMetrics.com.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), in press.

NIST. (2009, 20 July). Outputs and outcomes of NIST laboratory research. Available: http://www.nist.gov/director/planning/studies.cfm (Accessed 1 March 2011).

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Williamson, G. L. (2008). A text readability continuum for postsecondary readiness. Journal of Advanced Academics, 19(4), 602-632.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1992, Summer). Scores are not measures. Rasch Measurement Transactions, 6(1), 208 [http://www.rasch.org/rmt/rmt61n.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1999). Common sense for measurement. Rasch Measurement Transactions, 13(3), 704-5 [http://www.rasch.org/rmt/rmt133h.htm].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

 

One of the ironies of life is that we often overlook the obvious in favor of the obscure. And so one hears of huge resources poured into finding and capitalizing on opportunities that provide infinitesimally small returns, while other opportunities—with equally certain odds of success but far more profitable returns—are completely neglected.

The National Institute of Standards and Technology (NIST) reports returns on investment ranging from 32% to over 400% in 32 metrological improvements made in semiconductors, construction, automation, computers, materials, manufacturing, chemicals, photonics, communications and pharmaceuticals (NIST, 2009). Previous posts in this blog offer more information on the economic value of metrology. The point is that the returns obtained from improvements in the measurement of tangible assets will likely also be achieved in the measurement of intangible assets.

How? With a little bit of imagination, each stage in the development of increasingly meaningful, efficient, and useful measures described in this previous post can be seen as implying a significant return on investment. As those returns are sought, investors will coordinate and align different technologies and resources relative to a roadmap of how these stages are likely to unfold in the future, as described in this previous post. But what would a concrete example of the new value created look like?

The examples I have in mind hinge on the difference between counting and measuring. Counting is a natural and obvious thing to do when we need some indication of how much of something there is. But counting is not measuring (Cooper & Humphry, 2010; Wright, 1989, 1992, 1993, 1999). This is not some minor academic distinction of no practical use or consequence. It is rather the source of the vast majority of the problems we have in comparing outcome and performance measures.

Imagine how things would be if we couldn’t weigh fruit in a grocery store, and all we could do was count pieces. We can tell when eight small oranges possess less overall mass of fruit than four large ones by weighing them; the eight small oranges might weigh .75 kilograms (about 1.6 pounds) while the four large ones come in at 1.0 kilo (2.2 pounds). If oranges were sold by count instead of weight, perceptive traders would buy small oranges and make more money selling them than they could if they bought large ones.
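The trader's incentive can be sketched as follows. The counts and weights are the ones given above; the prices are hypothetical, and so is the assumption that the trader buys wholesale by weight while selling by the piece.

```python
# Figures from the example: eight small oranges weigh 0.75 kg, four large ones 1.0 kg.
small = {"pieces": 8, "weight_kg": 0.75}
large = {"pieces": 4, "weight_kg": 1.0}

# Hypothetical prices, chosen only for illustration.
WHOLESALE_PER_KG = 1.00   # assumption: the trader buys by weight
RETAIL_PER_PIECE = 0.50   # the trader sells by count

def profit(lot):
    """Revenue from selling by the piece minus the cost of buying by weight."""
    revenue = lot["pieces"] * RETAIL_PER_PIECE
    cost = lot["weight_kg"] * WHOLESALE_PER_KG
    return revenue - cost

small_profit = profit(small)
large_profit = profit(large)

print(f"Profit on the small oranges: {small_profit:.2f}")
print(f"Profit on the large oranges: {large_profit:.2f}")
```

With any positive prices of this kind, the lot with more pieces per kilogram earns more, so selling by count rewards delivering less fruit.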

But we can’t currently arrive so easily at the comparisons we need when we’re buying and selling intangible assets, like those produced as the outcomes of educational, health care, or other services. So I want to walk through a couple of very down-to-earth examples to bring the point home. Today we’ll focus on the simplest version of the story, and tomorrow we’ll take up a little more complicated version, dealing with the counts, percentages, and scores used in balanced scorecard and dashboard metrics of various kinds.

What if you score eight on one reading test and I score four on a different reading test? Who has more reading ability? In the same way that we might be able to tell just by looking that eight small oranges are likely to have less actual orange fruit than four big ones, we might also be able to tell just by looking that eight easy (short, common) words can likely be read correctly with less reading ability than four difficult (long, rare) words can be.

So let’s analyze the difference between buying oranges and buying reading ability. We’ll set up three scenarios for buying reading ability. In all three, we’ll imagine we’re comparing how we buy oranges with the way we would have to go about buying reading ability today if teachers were paid for the gains made on the tests they administer at the beginning and end of the school year.

In the first scenario, the teachers make up their own tests. In the second, the teachers each use a different standardized test. In the third, each teacher uses a computer program that draws questions from the same online bank of precalibrated items to construct a unique test custom tailored to each student. Reading ability scenario one is likely the most commonly found in real life. Scenario three is the rarest, but nonetheless describes a situation that has been available to millions of students in the U.S., Australia, and elsewhere for several years. Scenarios one, two and three correspond with developmental levels one, three, and five described in a previous blog entry.

Buying Oranges

When you go into one grocery store and I go into another, we don’t have any oranges with us. When we leave, I have eight and you have four. I have twice as many oranges as you, but yours weigh a kilo, about a third more than mine (.75 kilos).

When we paid for the oranges, the transaction was finished in a few seconds. Neither one of us experienced any confusion, annoyance, or inconvenience in relation to the quality of information we had on the amount of orange fruit we were buying. I did not, however, pay twice as much as you did. In fact, you paid more for yours than I did for mine, in direct proportion to the difference in the measured amounts.

No negotiations were necessary to consummate the transactions, and there was no need for special inquiries about how much orange we were buying. We knew from experience in this and other stores that the prices we paid were comparable with those offered in other times and places. Our information was cheap, as it was printed on the bag of oranges or could be read off a scale, and it was very high quality, as the measures were directly comparable with measures from any other scale in any other store. So, in buying oranges, the cost of obtaining good information was so small a part of the overall cost of the transaction as to be negligible.

Buying Reading Ability (Scenario 1)

So now you and I go through third grade as eight-year-olds. You’re in one school and I’m in another. We have different teachers. Each teacher makes up his or her own reading tests. When we started the school year, we each took a reading test (different ones), and we took another (again, different ones) as we ended the school year.

For each test, your teacher counted up your correct answers and divided by the total number of questions; so did mine. You got 72% correct on the first one, and 94% correct on the last one. I got 83% correct on the first one, and 86% correct on the last one. Your score went up 22 percentage points, much more than the 3 points mine went up. But did you learn more? It is impossible to tell. What if both of your tests were easier, not just for you or for me but for everyone, than both of mine? What if my second test was a lot harder than my first one? On the other hand, what if your tests were harder than mine? Perhaps you did even better than your scores seem to indicate.

We’ll just exclude from consideration other factors that might come to bear, such as whether your tests were significantly longer or shorter than mine, or if one of us ran out of time and did not answer a lot of questions.

If our parents had to pay the reading teacher at the end of the school year for the gains that were made, how would they tell what they were getting for their money? What if your teacher gave a hard test at the start of the year and an easy one at the end of the year so that you’d have a big gain and your parents would have to pay more? What if my teacher gave an easy test at the start of the year and a hard one at the end, so that a really high price could be put on very small gains? If our parents were to compare their experiences in buying our improved reading ability, they would have a lot of questions about how much improvement was actually obtained. They would be confused and annoyed at how inconvenient the scores are, because they are difficult, if not impossible, to compare. A lot of time and effort might be invested in examining the words and sentences in each of the four reading tests to try to determine how easy or hard they are in relation to each other. Or, more likely, everyone would throw their hands up and pay as little as they possibly could for outcomes they don’t understand.

Buying Reading Ability (Scenario 2)

In this scenario, we are third graders again, in different schools with different reading teachers. Now, instead of our teachers making up their own tests, our reading abilities are measured at the beginning and the end of the school year using two different standardized tests sold by competing testing companies. You’re in a private suburban school that’s part of an independent schools association. I’m in a public school along with dozens of others in an urban school district.

For each test, our parents received a report in the mail showing our scores. As before, we know how many questions we each answered correctly, and, as before, we don’t know which particular questions we got right or wrong. Finally, we don’t know how easy or hard your tests were relative to mine, but we know that the two tests you took were equated, and so were the two I took. That means your tests will show how much reading ability you gained, and so will mine.

But we have one new bit of information we didn’t have before, and that’s a percentile score. Now we know that at the beginning of the year, with a percentile ranking of 72, you performed better than 72% of the other private school third graders taking this test, and at the end of the year you performed better than 76% of them. In contrast, I had percentiles of 84 and 89.

The question we have to ask now is whether our parents are going to pay for the percentile gain, or for the actual gain in reading ability. You and I each learned more than our peers did on average, since our percentile scores went up, but this would not work out as a satisfactory way to pay teachers. Averages being averages, if you and I learned more and faster, someone else learned less and slower, so that, in the end, it all balances out. Are we to have teachers paying parents when their children learn less, simply redistributing money in a zero-sum game?
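The zero-sum character of percentile gains can be illustrated with a small sketch. The five-student norm group and its ability values are hypothetical, and the percentile rank used here (the percent of the other students outscored) is one simple definition among several in use.

```python
def percentile_ranks(scores):
    """Percent of the other students that each student outscored."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / (n - 1) for s in scores]

def mean(values):
    return sum(values) / len(values)

# Hypothetical reading abilities for a five-student norm group.
fall = [1.0, 2.0, 3.0, 4.0, 5.0]
spring_all_gain = [a + 2.0 for a in fall]       # everyone learns the same amount
spring_one_leaps = [6.0, 2.5, 3.5, 4.5, 5.5]    # everyone learns, the first student most

fall_ranks = percentile_ranks(fall)
same_gain_ranks = percentile_ranks(spring_all_gain)
leap_ranks = percentile_ranks(spring_one_leaps)

print("Fall ranks:             ", fall_ranks)
print("Spring, equal gains:    ", same_gain_ranks)
print("Spring, one big gain:   ", leap_ranks)
print("Mean rank, fall / leap: ", mean(fall_ranks), "/", mean(leap_ranks))
```

In this sketch, when every student gains the same amount the percentile ranks do not move at all, and when one student leaps ahead the others lose rank even though each of them learned something. The average rank stays where it was either way.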

And so, additional individualized reports are sent to our parents by the testing companies. Your tests are equated with each other, so they report scores in a common unit on a scale that ranges from 120 to 480. You had a starting score of 235 and finished the year with a score of 420, for a gain of 185.

The tests I took are comparable and measure in the same unit, too, but not the same unit as your tests measure in. Scores on my tests range from 400 to 1200. I started the year with a score of 790, and finished at 1080, for a gain of 290.

Now the confusion in the first scenario is overcome, in part. Our parents can see that we each made real gains in reading ability. The difficulty levels of the two tests you took are the same, as are the difficulties of the two tests I took. But our parents still don’t know what to pay the teacher because they can’t tell if you or I learned more. You had lower percentiles and test scores than I did, but you are being compared with what is likely a higher scoring group of suburban and higher socioeconomic status students than the urban group of disadvantaged students I’m compared against. And your scores aren’t comparable with mine, so you might have started and finished with more reading ability than I did, or maybe I had more than you. There isn’t enough information here to tell.

So, again, the information that is provided is insufficient to the task of settling on a reasonable price for the outcomes obtained. Our parents will again be annoyed and confused by the low quality information that makes it impossible to know what to pay the teacher.

Buying Reading Ability (Scenario 3)

In the third scenario, we are still third graders in different schools with different reading teachers. This time our reading abilities are measured by tests that are completely unique. Every student has a test custom tailored to their particular ability. Unlike the tests in the first and second scenarios, however, now all of the tests have been constructed carefully on the basis of extensive data analysis and experimental tests. Different testing companies are providing the service, but they have gone to the trouble to work together to create consensus standards defining the unit of measurement for any and all reading test items.

For each test, our parents received a report in the mail showing our measures. As before, we know how many questions we each answered correctly. Now, though we don’t know which particular questions we got right or wrong, we can see typical items ordered by difficulty lined up in a way that shows us what kind of items we got wrong, and which kind we got right. And now we also know your tests were equated relative to mine, so we can compare how much reading ability you gained relative to how much I gained. Now our parents can confidently determine how much they should pay the teacher, at least in proportion to their children’s relative measures. If our measured gains are equal, the same payment can be made. If one of us obtained more value, then proportionately more should be paid.

In this third scenario, we have a situation directly analogous to buying oranges. You have a measured amount of increased reading ability that is expressed in the same unit as my gain in reading ability, just as the weights of the oranges are comparable. Further, your test items were not identical with mine, and so the difficulties of the items we took surely differed, just as the sizes of the oranges we bought did.

This third scenario could be made yet more efficient by removing the need for creating and maintaining a calibrated item bank, as described by Stenner and Stone (2003) and in the sixth developmental level in a prior blog post here. Also, additional efficiencies could be gained by unifying the interpretation of the reading ability measures, so that progress through high school can be tracked with respect to the reading demands of adult life (Williamson, 2008).

Comparison of the Purchasing Experiences

In contrast with the grocery store experience, paying for increased reading ability in the first scenario is fraught with low quality information that greatly increases the cost of the transactions. The information is of such low quality that, of course, hardly anyone bothers to go to the trouble to try to decipher it. Too much cost is associated with the effort to make it worthwhile. So, no one knows how much gain in reading ability is obtained, or what a unit gain might cost.

When a school district or educational researchers mount studies to try to find out what it costs to improve reading ability in third graders in some standardized unit, they find so much unexplained variation in the costs that they, too, raise more questions than answers.

But we don’t place the cost of making the value comparison on the consumer or the merchant in the grocery store. Instead, society as a whole picks up the cost by funding the creation and maintenance of consensus standard metrics. Until we take up the task of doing the same thing for intangible assets, we cannot expect human, social, and natural capital markets to obtain the efficiencies we take for granted in markets for tangible assets and property.

References

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese. DOI 10.1007/s11229-010-9832-1.

NIST. (2009, 20 July). Outputs and outcomes of NIST laboratory research. Available: http://www.nist.gov/director/planning/studies.cfm (Accessed 1 March 2011).

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Williamson, G. L. (2008). A text readability continuum for postsecondary readiness. Journal of Advanced Academics, 19(4), 602-632.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1992, Summer). Scores are not measures. Rasch Measurement Transactions, 6(1), 208 [http://www.rasch.org/rmt/rmt61n.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1999). Common sense for measurement. Rasch Measurement Transactions, 13(3), 704-5 [http://www.rasch.org/rmt/rmt133h.htm].

Measurement, Metrology, and the Birth of Self-Organizing, Complex Adaptive Systems

February 28, 2011

On page 145 of his book, The Mathematics of Measurement: A Critical History, John Roche quotes Charles de La Condamine (1701-1774), who, in 1747, wrote:

‘It is quite evident that the diversity of weights and measures of different countries, and frequently in the same province, are a source of embarrassment in commerce, in the study of physics, in history, and even in politics itself; the unknown names of foreign measures, the laziness or difficulty in relating them to our own give rise to confusion in our ideas and leave us in ignorance of facts which could be useful to us.’

Roche (1998, p. 145) then explains what de La Condamine is driving at, saying:

“For reasons of international communication and of civic justice, for reasons of stability over time and for accuracy and reliability, the creation of exact, reproducible and well maintained international standards, especially of length and mass, became an increasing concern of the natural philosophers of the seventeenth and eighteenth centuries. This movement, cooperating with a corresponding impulse in governing circles for the reform of weights and measures for the benefit of society and trade, culminated in late eighteenth century France in the metric system. It established not only an exact, rational and international system of measuring length, area, volume and mass, but introduced a similar standard for temperature within the scientific community. It stimulated a wider concern within science to establish all scientific units with equal rigour, basing them wherever possible on the newly established metric units (and on the older exact units of time and angular measurement), because of their accuracy, stability and international availability. This process gradually brought about a profound change in the notation and interpretation of the mathematical formalism of physics: it brought about, for the first time in the history of the mathematical sciences, a true union of mathematics and measurement.”

As it was in the seventeenth and eighteenth centuries for physics, so it has also been in the twentieth and twenty-first for the psychosocial sciences. The creation of exact, reproducible and well maintained international standards is a matter of increasing concern today for the roles they will play in education, health care, the work place, business intelligence, and the economy at large.

As the economic crises persist and perhaps worsen, demand for common product definitions and for interpretable, meaningful measures of impacts and outcomes in education, health care, social services, environmental management, etc. will reach a crescendo. We need an exact, rational and international system of measuring literacy, numeracy, health, motivations, quality of life, community cohesion, and environmental quality, and we needed it fifty years ago. We need to reinvigorate and revive a wider concern across the sciences to establish all scientific units with equal rigor, and to have all measures used in research and practice based wherever possible on consensus standard metrics valued for their accuracy, stability and availability. We need to replicate in the psychosocial sciences the profound change in the notation and interpretation of the mathematical formalism of physics that occurred in the eighteenth and nineteenth centuries. We need to extend the true union of mathematics and measurement from physics to the psychosocial sciences.

Previous posts in this blog speak to the persistent invariance and objectivity exhibited by many of the constructs measured using ability tests, attitude surveys, performance assessments, etc. A question previously raised in this blog concerning the reproductive logic of living meaning deserves more attention, and can be productively explored in terms of complex adaptive functionality.

In a hierarchy of reasons why mathematically rigorous measurement is valuable, few are closer to the top of the list than facilitating the spontaneous self-organization of networks of agents and actors (Latour, 1987). The conception, gestation, birthing, and nurturing of complex adaptive systems constitute a reproductive logic for sociocultural traditions. Scientific traditions, in particular, form mature self-identities via a mutually implied subject-object relation absorbed into the flow of a dialectical give and take, just as economic systems do.

Complex adaptive systems establish the reproductive viability of their offspring and the coherence of an ecological web of meaningful relationships by means of this dialectic. Taylor (2003, pp. 166-8) describes the five moments in the formation and operation of complex adaptive systems, which must be able

  • to identify regularities and patterns in the flow of matter, energy, and information (MEI) in the environment (business, social, economic, natural, etc.);
  • to produce condensed schematic representations of these regularities so they can be identified as the same if they are repeated;
  • to form reproductively interchangeable variants of these representations;
  • to succeed reproductively by means of the accuracy and reliability of the representations’ predictions of regularities in the MEI data flow; and
  • to adaptively modify and reorganize representations by means of informational feedback from the environment.

All living systems, from bacteria and viruses to plants and animals to languages and cultures, are complex adaptive systems characterized by these five features.

In the history of science, technologically-embodied measurement facilitates complex adaptive systems of various kinds. That history can be used as a basis for a meta-theoretical perspective on what measurement must look like in the social and human sciences. Each of Taylor’s five moments in the formation and operation of complex adaptive systems describes a capacity of measurement systems, in that:

  • data flow regularities are captured in initial, provisional instrument calibrations;
  • condensed local schematic representations are formed when an instrument’s calibrations are anchored at repeatedly observed, invariant values;
  • interchangeable nonlocal versions of these invariances are created by means of instrument equating, item banking, metrological networks, and selective, tailored, adaptive instrument administration;
  • measures read off inaccurate and unreliable instruments will not support successful reproduction of the data flow regularity, but accurate and reliable instruments calibrated in a shared common unit provide a reference standard metric that enhances communication and reproduces the common voice and shared identity of the research community; and
  • consistently inconsistent anomalous observations provide feedback suggesting new possibilities for as yet unrecognized data flow regularities that might be captured in new calibrations.
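
As a minimal sketch of what an invariant calibration and an equated instrument might look like in practice, the following Python assumes the dichotomous form of the Rasch (1960) model cited below. The function names, item names, and calibration values are hypothetical illustrations, not taken from any of the works referenced here.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a success under the dichotomous Rasch model (logit units)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p: float) -> float:
    """Convert a probability into log-odds (logits)."""
    return math.log(p / (1.0 - p))


def item_comparison(ability: float, easier: float, harder: float) -> float:
    """Log-odds gap between two items as experienced by one person."""
    return (log_odds(rasch_probability(ability, easier))
            - log_odds(rasch_probability(ability, harder)))


def equating_shift(form_a: dict, form_b: dict) -> float:
    """Mean calibration difference on shared items (simple common-item linking)."""
    common = sorted(set(form_a) & set(form_b))
    if not common:
        raise ValueError("the two forms share no items")
    return sum(form_a[name] - form_b[name] for name in common) / len(common)


# Invariance: the gap between two items does not depend on who is measured.
gaps = [item_comparison(ability, -1.0, 1.0) for ability in (-2.0, 0.0, 3.0)]

# Hypothetical calibrations of two forms, each centered on its own local origin.
form_a = {"item1": -1.0, "item2": 0.0, "item3": 1.0}
form_b = {"item2": 0.5, "item3": 1.5, "item4": 2.5}

# Anchor form B on form A's scale using the items they have in common.
shift = equating_shift(form_a, form_b)
form_b_on_a = {name: value + shift for name, value in form_b.items()}
```

In this sketch every entry in `gaps` equals the difference between the two item calibrations, whatever the person's ability, which is one way of reading a "condensed schematic representation" of a data flow regularity. The mean-shift linking is only the simplest of several equating methods, chosen here for brevity.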

Measurement in the social sciences is in the process of extending this functionality into practical applications in business, education, health care, government, and elsewhere. Over the course of the last 50 years, measurement research and practice have already iterated many times through these five moments. In the coming years, a new critical mass will be reached in this process, systematically bringing about order-of-magnitude improvements in the efficiency of intangible assets markets.

How? What does a “data flow regularity” look like? How is it condensed into a schematic and used to calibrate an instrument? How are local schematics combined in a pattern used to recognize new instances of themselves? More specifically, how might enterprise resource planning (ERP) software (such as SAP, Oracle, or PeopleSoft) simultaneously provide both the structure needed to support meaningful comparisons and the flexibility needed for good fit with the dynamic complexity of adaptive and generative self-organizing systems?

Prior work in this area proposes a dual-core, loosely coupled organization using ERP software to build social and intellectual capital, instead of using it as an IT solution addressing organizational inefficiencies (Lengnick-Hall, Lengnick-Hall, & Abdinnour-Helm, 2004). The adaptive and generative functionality (Stenner & Stone, 2003) provided by probabilistic measurement models (Rasch, 1960; Andrich, 2002, 2004; Bond & Fox, 2007; Wilson, 2005; Wright, 1977, 1999) makes it possible to model intra- and inter-organizational interoperability (Weichhart, Feiner, & Stary, 2010) at the same time that social and intellectual capital resources are augmented.

Actor/agent network theory has emerged from social and historical studies of the shared and competing moral, economic, political, and mathematical values disseminated by scientists and technicians in a variety of different successful and failed areas of research (Latour, 2005). The resulting sociohistorical descriptions ought to be translated into a practical program for reproducing successful research programs. A metasystem for complex adaptive systems of research is implied in what Roche (1998) calls a “true union of mathematics and measurement.”

Complex adaptive systems are effectively constituted of such a union, even if, in nature, the mathematical character of the data flows and calibrations remains virtual. Probabilistic conjoint models for fundamental measurement are poised to extend this functionality into the human sciences. Though few, if any, have framed the situation in these terms, these and other questions are being explored, explicitly and implicitly, by hundreds of researchers in dozens of fields as they employ unidimensional models for measurement in their investigations.

If so, might we then be on the verge of yet another new reading and writing of Galileo’s “book of nature,” this time restoring the “loss of meaning for life” suffered in Galileo’s “fateful omission” of the means by which nature came to be understood mathematically (Husserl, 1970)? The elements of a comprehensive, mathematical, and experimental design science of living systems appear on the verge of providing a saturated solution (or better, a nonequilibrium thermodynamic solution) to some of the infamous shortcomings of modern, Enlightenment science. The unity of science may yet be a reality, though not via the reductionist program envisioned by the positivists.

Some 50 years ago, Marshall McLuhan popularized the expression, “The medium is the message.” The special value of quantitative measurement in the history of science does not stem from the mere use of number. Instruments are media on which nature, human or other, inscribes legible messages. A renewal of the true union of mathematics and measurement in the context of intangible assets will lead to a new cultural, scientific, and economic renaissance. As Thomas Kuhn (1977, p. 221) wrote,

“The full and intimate quantification of any science is a consummation devoutly to be wished. Nevertheless, it is not a consummation that can effectively be sought by measuring. As in individual development, so in the scientific group, maturity comes most surely to those who know how to wait.”

Given that we have strong indications of how full and intimate quantification consummates a true union of mathematics and measurement, the time for waiting is now past, and the time to act has come. See prior blog posts here for suggestions on an Intangible Assets Metric System, for resources on methods and research, for other philosophical ruminations, and more. This post is based on work presented at Rasch meetings several years ago (Fisher, 2006a, 2006b).

References

Andrich, D. (2002). Understanding resistance to the data-model relationship in Rasch’s paradigm: A reflection for the next generation. Journal of Applied Measurement, 3(3), 325-59.

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2nd edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Fisher, W. P., Jr. (2006a, Friday, April 28). Complex adaptive functionality via measurement. Presented at the Midwest Objective Measurement Seminar, M. Lunz (Organizer), University of Illinois at Chicago.

Fisher, W. P., Jr. (2006b, June 27-9). Measurement and complex adaptive functionality. Presented at the Pacific Rim Objective Measurement Symposium, T. Bond & M. Wu (Organizers), The Hong Kong Institute of Education, Hong Kong.

Husserl, E. (1970). The crisis of European sciences and transcendental phenomenology: An introduction to phenomenological philosophy (D. Carr, Trans.). Evanston, Illinois: Northwestern University Press (Original work published 1954).

Kuhn, T. S. (1977). The function of measurement in modern physical science. In T. S. Kuhn, The essential tension: Selected studies in scientific tradition and change (pp. 178-224). Chicago: University of Chicago Press. [Reprinted from Kuhn, T. S. (1961). Isis, 52(168), 161-193.]

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Lengnick-Hall, C. A., Lengnick-Hall, M. L., & Abdinnour-Helm, S. (2004). The role of social and intellectual capital in achieving competitive advantage through enterprise resource planning (ERP) systems. Journal of Engineering and Technology Management, 21, 307-330.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Roche, J. (1998). The mathematics of measurement: A critical history. London: The Athlone Press.

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Taylor, M. C. (2003). The moment of complexity: Emerging network culture. Chicago: University of Chicago Press.

Weichhart, G., Feiner, T., & Stary, C. (2010). Implementing organisational interoperability–The SUddEN approach. Computers in Industry, 61, 152-160.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

How bad will the financial crises have to get before…?

April 30, 2010

More and more states and nations around the world face the possibility of defaulting on their financial obligations. The financial crises are of epic historical proportions. This is a disaster of the first order. And yet, it is so odd: we have the solutions and preventative measures we need at our fingertips, but no one knows about them or is looking for them.

So, I am persuaded to once again wonder if there might now be some real interest in the possibilities of capitalizing on

  • measurement’s well-known capacity for reducing transaction costs by improving information quality and reducing information volume;
  • instruments calibrated to measure in constant units (not ordinal ones) within known error ranges (not as though the measures are perfectly precise) with known data quality;
  • measures made meaningful by their association with invariant scales defined in terms of the questions asked;
  • adaptive instrument administration methods that make all measures equally precise by targeting the questions asked;
  • judge calibration methods that remove the person rating performances as a factor influencing the measures;
  • the metaphor of transparency, by calibrating instruments so well that we really do look right through them at the thing measured (risk, governance, abilities, health, performance, etc.);
  • efficient markets for human, social, and natural capital by means of the common currencies of uniform metrics, calibrated instrumentation, and metrological networks;
  • the means available for tuning the instruments of the human, social, and environmental sciences to well-tempered scales that enable us to more easily harmonize, orchestrate, arrange, and choreograph relationships;
  • our understandings that universal human rights require universal uniform measures, that fair dealing requires fair measures, and that our measures define who we are and what we value; and, last but very far from least,
  • the power of love–the back and forth of probing questions and honest answers in caring social intercourse plants seminal ideas in fertile minds that can be nurtured to maturity and Socratically midwifed as living meaning born into supportive ecologies of caring relations.
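
To make the points about known error ranges and targeted administration more concrete, here is a minimal Python sketch, again assuming the dichotomous Rasch model. It uses the standard model result that the error of a measure is the inverse square root of the summed item information. The person measure and item difficulties are hypothetical values chosen only for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a success under the dichotomous Rasch model (logit units)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def standard_error(ability: float, difficulties: list) -> float:
    """Model standard error of a measure: 1 / sqrt(sum of p * (1 - p))."""
    information = 0.0
    for difficulty in difficulties:
        p = rasch_probability(ability, difficulty)
        information += p * (1.0 - p)  # each item informs most when p is near 0.5
    return 1.0 / math.sqrt(information)


# Hypothetical person located well above the center of a fixed ten-item form.
person = 2.0
fixed_form = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# Hypothetical adaptive selection: ten items within half a logit of the person.
offsets = [-0.5, -0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4, 0.5]
targeted_form = [person + offset for offset in offsets]

se_fixed = standard_error(person, fixed_form)
se_targeted = standard_error(person, targeted_form)

print(f"fixed form error:    {se_fixed:.2f} logits")
print(f"targeted form error: {se_targeted:.2f} logits")
```

With the same number of questions, the targeted form yields the smaller error, because questions far from a person's location contribute little information. Operational adaptive testing adds item selection rules and stopping criteria that this sketch leaves out.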

How bad do things have to get before we systematically and collectively implement the long-established and proven methods we have at our disposal? It is the most surreal kind of schizophrenia or passive-aggressive avoidance pathology to keep on tormenting ourselves with problems for which we have solutions.

For more information on these issues, see prior blog posts here, the extensive documentation provided, and http://www.livingcapitalmetrics.com.
