© 2008, 2009 KnowledgeToTheMax

 

 

Knowledge through logic

 

_______________________________________________________________

 


Offerings of KnowledgeToTheMax

 

Builder of ultra-optimized scientific models

 

 ______________________________________________________________________________________________________________________

This message targets, and should be of interest to: science policy makers; funders of scientific studies; scientific investigators; scientists; philosophers; educators; laymen.

________________________________________________________________________________________________

 


 

Summary

KnowledgeToTheMax is one of the few firms in the world with the resources needed to construct an ultra-optimized scientific model (aka scientific theory). Under ultra-optimization, each of the many inferences made by the model is optimized. Models built by this methodology consistently excel, sometimes to an astounding degree. Aristotle’s syllogisms, thermodynamics and the theory of communication were, in effect, built by ultra-optimization. The model that revolutionized meteorology was built by ultra-optimization. Applications have been made in medicine, engineering and throughout the sciences. One result of ultra-optimization is that the maximum possible knowledge is created from fixed scientific resources.

Ultra-optimization addresses the fundamental problem of logic. When the builder of a model must select an inference for the model to make, there are many alternatives to choose from. Logic postulates the existence of principles of reasoning that discriminate the one correct alternative from the many incorrect ones. Solid evidence supports the contention that the principles of reasoning are those of ultra-optimization.

The keys to an understanding of ultra-optimization are the ideas of measure, inferences, optimization and missing information. Logic and measure theory jointly imply the existence of a measure that is the unique measure of inferences. The existence and uniqueness imply that the one correct alternative may be discriminated from the many incorrect alternatives by measurement. In the optimization of an inference, that alternative is judged correct whose measure is greatest or least; all other alternatives are judged incorrect.

After the discovery of ultra-optimization four decades ago, it was overlooked, misunderstood or rejected as contrary to self-interest by virtually all scientists, philosophers and educators. As a result, the vast majority of models in use today have been built illogically. The illogical models are susceptible to making incorrect inferences. Bad consequences, including unnecessary wars, financial collapses and deaths of loved ones, follow from the incorrect inferences. Science policy continues to foster this situation by overlooking the illogic.

KnowledgeToTheMax is dedicated to benefiting mankind by eliminating the incorrect inferences and maximizing the knowledge through logic. The firm would welcome the opportunity to discuss how it might use its skills in doing so for the benefit of you, your organization or the people who depend on your organization. Please be in touch.

 

 

Background

Given the premise that “a swan was observed and it was white,” one is unjustified in concluding that “all swans are white,” for unobserved swans may not be white. How can one infer the descriptions of unobserved, real objects from the descriptions of observed, real objects? This is the most basic of questions for science. This question is asked by the problem of induction. Today, few scientists, philosophers, educators or laymen can answer this question.

Induction is the process by which one builds a scientific model. The modifier “scientific” on “model” signifies that such a model may provide “scientia,” the Latin word for “demonstrable knowledge.” The word “scientia” comes to us as the English word “science.” Informally, “science” and “knowledge” are synonymous.

In science, a description of a set of real objects is called a “state” of these objects. Cloudy is an example of a state; it describes a region of the Earth. A complete set of alternate descriptions is called a “state-space” for these objects. The set {cloudy, not cloudy} is an example of a state-space; it provides alternate descriptions for a region of the Earth.

An “inference” is an extrapolation from a state in a so-called “observed state-space” of a set of objects to a state in a so-called “unobserved state-space” of the same set of objects. For example, it is an extrapolation from the state in the observed state-space {cloudy, not cloudy} to the state in the unobserved state-space {rain in the next 24 hours, no rain in the next 24 hours} in reference to a region of the Earth.

At a given time, an object is described by a single state of a particular state-space. For example, a region of the Earth is described by the state cloudy or the state not cloudy in the state-space {cloudy, not cloudy}.

Because it is observed, the state in an observed state-space is certain. Because it is unobserved, the state in an unobserved state-space is uncertain.

A scientific “model” is a procedure for making inferences. It is by making inferences that such a model provides knowledge.

In the construction of a model, the builder faces the problem of selecting the inferences that the model will make from a much larger set of alternatives. Some of these inferences are “correct”; they should be made by the model. The remaining inferences are “incorrect”; they should not be made by the model. How can the builder of a model discriminate the “correct” from the “incorrect” inferences?

Logic postulates the existence of principles that discriminate correct from incorrect inferences. The principles that discriminate are called “the principles of reasoning.” Logic is the science of these principles. The problem of induction is to discover the principles of reasoning.

For the deductive logic, the principles of reasoning are as established 23 centuries ago by Aristotle; they state that an inference is correct if it conforms to a syllogism; otherwise, it is incorrect. However, in building a model, one employs the inductive logic. How to generalize from the principles of reasoning for the deductive logic to the principles of reasoning for logic as a whole is the problem of induction. Does this problem have a solution?

The historical record reveals that work toward a solution begins at about the time of Aristotle; however, a solution proves elusive. Writing 21 centuries after Aristotle, the philosopher David Hume observes that people build models by “habits of mind” rather than by logical principles. In the 20th century, the philosopher C. D. Broad describes induction as the “scandal of philosophy.”

While the problem of induction remains unsolved, there is an ethical issue for scientists. People use inferences made by models in making decisions on issues of importance to them, including life or death issues. Thus, the builder of a model is duty bound to ensure that this model makes no incorrect inferences. However, in lieu of a solution to the problem of induction, the builder lacks logical means for discriminating between correct and incorrect inferences!

In lieu of these means, models of infinite number are consistent with the empirical data. How can the builder of a model select from among them that model which will be published for use in making decisions?

Science has a crying need for a solution to the problem of induction. However, through most of the scientific age, this problem lacks a solution. As a result, models are extremely susceptible to making incorrect inferences. Bad consequences, including unnecessary wars, financial collapses and deaths, follow from the incorrect inferences. Of necessity, science policy makers overlook the illogic.

Then, six decades ago, the engineer-mathematician Claude Shannon discovers an essential element of a solution to the problem of induction. It is the existence of a measure that is the unique measure of the inferences which are made by a model. The measure of an inference is the missing information in this inference for a deductive conclusion.

The existence and uniqueness of Shannon’s measure imply that correct can be discriminated from incorrect inferences by measurement! Under some circumstances, the correct inference is the one of least measure. Under other circumstances, the correct inference is the one of greatest measure. Shannon’s discovery becomes the basis for optimization of the designs of modern electronic communications systems. Fruits from optimization include HDTV and noiseless voice messaging from halfway around the world.

Two decades after Shannon’s discovery, the engineer-lawyer-physicist Ronald Christensen completes this solution to the problem of induction. The solution is the pair of principles of reasoning called “ultra-optimization.” No competing solution arises. Every indication is that this 23 centuries old problem is now solved.

The cybernetics and systems science communities embrace this solution. However, the vast majority of science policy makers, scientists, philosophers, educators and laymen overlook or misunderstand the solution or reject it as contrary to self-interest. Builders of models continue in the tradition of building them illogically. Illogically built models continue to make incorrect inferences. Bad consequences continue to follow from the incorrect inferences. Unnecessarily, science policy makers continue to overlook the illogic. In the consideration of grant applications and of articles submitted for publication in peer reviewed journals, it remains acceptable to overlook illogic in the construction of a model.

However, a tiny fraction of scientific investigators learn of ultra-optimization and employ it in their research. One is the founder of KnowledgeToTheMax, Terry Oldberg. Thus, 4 decades after the discovery of ultra-optimization, it comes to pass that Oldberg’s firm KnowledgeToTheMax is one of the few firms in the world with the necessary skills for teaching about ultra-optimization or building an ultra-optimized model.

 

Theoretical basis

The following introduction to the theoretical basis for ultra-optimization is lengthy and complicated. A person in a hurry or with limited interest could save time by skipping to the empirical basis. A cost from skipping would be to limit one’s understanding of how: a) ultra-optimization solves the problem of induction and b) the alternatives to ultra-optimization fail to solve this problem. If you elect to continue, you may get lost. If this happens, one way to recover would be to skip to the empirical basis and to pick up the theoretical thread later, with the help of a tutor. To climb the learning curve by reading the literature would not be cost-effective. 

Ultra-optimization arises from generalization of the deductive logic. An axiom of the deductive logic states that every proposition has a variable called its “truth-value”; the value of the truth-value is true or it is false. In reality, though, one observes that a proposition may be true in a proportion of instances lying in the interval between 0% and 100%; this proportion is called the “probability” of the proposition. The deductive logic may be generalized for conformity to this reality; in the generalization, the axiom that every proposition has a truth-value is replaced by the axiom that every proposition has a probability. When generalized in this way, the logic is called the “probabilistic logic.” The probabilistic logic reduces to the deductive logic when the values of the probabilities of propositions are restricted to 0% and 100%; here, 0% corresponds to false for the truth-value and 100% to true.

A state is an example of a proposition. That it is an example of a proposition ties the notion of a state to logic, for the relationships among propositions are a topic of logic.

Under the probabilistic logic, as every state is a proposition, every state has a probability. For example, the state rain in the next 24 hours has a probability.

It can be proved that the probabilistic logic implies the existence of a measure that is the unique measure of inferences. This measure is called “Shannon’s measure of information,” after the person who first describes it, Claude Shannon. Shannon’s measure of an inference is the missing information in this inference for a deductive conclusion. Going forward, Shannon’s measure of an inference is called the “missing information.” In the literature, Shannon’s measure of an inference is often called the “entropy” or “conditional entropy” of this inference. The “entropy” of thermodynamics is Shannon’s measure of a particular kind of inference.

Within the domain of validity of the deductive logic, the missing information is nil. Within the domain of validity of the inductive logic, the missing information is not-nil. Thus, whether there is missing information distinguishes the domains of validity of the two branches of logic; this, evidently, is the answer to the age-old question of the essential difference between the inductive and the deductive logic.
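The distinction drawn above can be checked numerically. The following is a minimal sketch, assuming the standard Shannon entropy formula measured in bits; the function name missing_information is chosen here for illustration and does not come from the literature cited on this page.

```python
import math

def missing_information(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    # Terms with zero probability contribute nothing (the limit of p*log p is 0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Deductive case: the probabilities are restricted to 0 and 1.
certain = missing_information([1.0, 0.0])

# Inductive case: the state is genuinely uncertain.
uncertain = missing_information([0.5, 0.5])

print("deductive:", certain, "bits; inductive:", uncertain, "bits")
```

When every probability is 0 or 1 the sum vanishes, which is the sense in which the missing information is nil within the deductive logic.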

The existence and uniqueness of Shannon’s measure imply that the probabilistic logic supports ultra-optimization. In the ultra-optimization of a model, each of the many inferences made by this model is optimized. In the optimization of an inference, the complete set of alternate inferences is formed, by variation of the compositions of the alternatives. That alternative is deemed correct whose measure is (depending upon the type of inference) least or greatest. All other alternatives are deemed incorrect.

This line of thinking yields a pair of principles of reasoning. To gain an understanding of how these principles apply in general requires absorption of many details. We’ll avoid a portion of these details by looking at a simplified example. In the example, we imagine that you, the reader, have been engaged to build a model that predicts whether there will be rain in your locality in the next 24 hours. One of several unobserved state-spaces for your model is {rain in the next 24 hours, no rain in the next 24 hours}. We’ll designate this state-space by O. Each element of O is the outcome of a statistical event.

A model has one or more independent variables. Your model has one independent variable; we’ll designate it by the symbol I.  I is the state-space {D, S, R}, where D signifies the state dropping barometric pressure, S the state stable barometric pressure and R the state rising barometric pressure.

As you may recall, the state-space O is unobserved. Inferences are made to O from an observed state-space; let this state-space be designated by C. If C contains two or more states, it is apt to call these states “conditions,” for they are conditions on the space formed by the model’s independent variables.

The elements of the independent variable I of your model are the ways in which a state in C can occur. Each state in C is either a state in I or is abstracted from several such states by placing these states in a disjunction (logical OR statement). An example of an abstracted state is S OR R. The state S OR R is said to be “abstracted” from the state S and from the state R because the description supplied by S OR R is removed from the differences in the descriptions provided by S and by R.

Under the rules established in the previous paragraph, there are five alternatives for the observed state-space C. They are: {D, S, R}, {D, S OR R}, {S, D OR R}, {R, D OR S} and {D OR S OR R}.
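The count of five can be reproduced by enumerating every way of grouping the states of I into disjunctions. The sketch below is an illustration only; the helper names partitions and describe are hypothetical.

```python
def partitions(items):
    """Yield every way to split `items` into non-empty, disjoint blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + smaller
        # ... or it joins one of the blocks already formed.
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]

def describe(partition):
    """Render a partition in the notation used in the text."""
    return "{" + ", ".join(" OR ".join(block) for block in partition) + "}"

alternatives = [describe(p) for p in partitions(["D", "S", "R"])]
for alternative in alternatives:
    print(alternative)
```

The order of states inside a disjunction may differ from the text (for example, {D OR S, R} rather than {R, D OR S}), but the five alternatives are the same.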

Each of the five alternatives for C is associated with a different inference from the observed state-space C to the unobserved state-space O. Thus, there are five alternatives for the inference that will be made by the model. Under the postulate of logic, one of these possibilities is correct. In building your model, how will you discriminate the one correct alternative from the four incorrect alternatives?

Under ultra-optimization, the correct alternative is the one for which Shannon’s measure is minimal; this alternative minimizes the missing information about the state in O, given the state in C. This is the first of the two principles of reasoning under ultra-optimization. In making this a principle of reasoning, one takes the position that “logic” and the “probabilistic logic” are synonymous. We’ll take a look at the empirical basis for this contention later.

To identify the one correct alternative, one needs to compute Shannon’s measure of each of the alternatives. To compute it, one needs first to assign a numerical value to the probability of each state in C and to the probability of each state in O, given each state in C; this value is a real number lying in the interval between 0 and 1. In making this assignment, one needs a solution to the so-called “inverse problem.” This problem results from the fact that experimental science gives rise to frequency ratios. The builder of a model needs to invert this relationship such that values are assigned to probabilities, given frequency ratios.

For the sake of illustration, we’ll focus on the problem of how to assign a value to the probability of the state S OR R in the observed state-space C. Let us suppose that an experiment is run in support of your model. In this experiment, 500 statistical events are observed. In 200 of these events the state S OR R is observed. By definition, the frequency ratio of S OR R is 200 : 500. The related quantity called the “relative frequency” is the quotient of the two frequencies, namely 0.4.

The most popular solution to the inverse problem is the so-called “straight rule.” Under this rule, one assigns the relative frequency of a state to the probability of this state. Thus, in our example, one assigns 0.4 to the probability of S OR R.

The straight rule is quite unsuitable for service as the second principle of reasoning, for when it is used in conjunction with the first principle of reasoning, the automatic result is for the model to fail! The cause of failure is the neglect, under the straight rule, of missing information. Neglect of missing information is the mistake made by the person who, having observed a white swan, concludes that “all swans are white.”
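One way to see the difficulty is to score the five alternatives for C with probabilities assigned by the straight rule. The sketch below uses invented event counts (the only figures taken from the text are the 500 events and the 200 in which S OR R is observed) and the standard conditional entropy formula. Under these assumptions, the alternative with the most conditions can never score worse than a coarser one, however few events support each condition.

```python
import math

# Hypothetical counts, invented for illustration: (pressure state, outcome) -> events.
# They total 500 events, of which 200 have the state S OR R, as in the text.
counts = {
    ("D", "rain"): 180, ("D", "no rain"): 120,
    ("S", "rain"): 15,  ("S", "no rain"): 85,
    ("R", "rain"): 10,  ("R", "no rain"): 90,
}
total = sum(counts.values())

def conditional_missing_information(partition):
    """Estimate of the missing information about O given C, in bits, under the straight rule."""
    h = 0.0
    for block in partition:
        rain = sum(counts[(s, "rain")] for s in block)
        dry = sum(counts[(s, "no rain")] for s in block)
        n = rain + dry
        for k in (rain, dry):
            if k > 0:
                # The straight rule: probability := relative frequency.
                h -= (n / total) * (k / n) * math.log2(k / n)
    return h

candidates = {
    "{D, S, R}": [["D"], ["S"], ["R"]],
    "{D, S OR R}": [["D"], ["S", "R"]],
    "{S, D OR R}": [["S"], ["D", "R"]],
    "{R, D OR S}": [["R"], ["D", "S"]],
    "{D OR S OR R}": [["D", "S", "R"]],
}
scores = {name: conditional_missing_information(p) for name, p in candidates.items()}
best = min(scores, key=scores.get)

p_s_or_r = sum(n for (state, _), n in counts.items() if state in ("S", "R")) / total
print("straight-rule probability of S OR R:", p_s_or_r)
print("alternative of least measure:", best)
```

Because the estimate ignores how few events lie behind each relative frequency, minimizing it favors ever finer conditions; this is the neglect of missing information described above.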

The next most popular solution to the inverse problem is called the “Bayes-Laplace inverse probability theorem.” As it is derived from probability theory, one subscribes to this theorem in subscribing to the probabilistic logic. However, while the theorem itself is logical, the manner in which the theorem is conventionally implemented is objectionable for being illogical. It is illogical for identifying more than one inference, among a number of alternatives, as the correct inference. As you may recall, logic postulates that a single inference among the alternatives is the correct inference.

The second principle of reasoning employs the Bayes-Laplace theorem, but provides a logical implementation for it. This implementation employs maximization of the missing information as the principle that discriminates the one correct inference from the many incorrect inferences; the maximization is under constraints, implemented mathematically, that express the available information. Usually, the net effect of the constraints is to push the missing information downward toward its minimum value of 0.

That the missing information is maximized implies that the missing information possesses a maximum. It possesses a maximum if and only if the states in the unobserved state-space are “irreducible” in the sense that these states are at the least level of abstraction. The irreducible states are called “the ways in which a state can occur.”

Now, we’ll conduct a thought experiment in which a number of independent statistical events are observed. In our experiment, separate counts are kept of events in which the state is S and in which the state is R. The results from this experiment remain hidden from you, the model builder. While you’ll lack detailed results from the experiment, you’ll nonetheless discover facts in reference to these results that will prove quite useful.

In one observed event, it is a fact that the relative frequency of the state S, given that this state is S OR R, will be 0 OR 1. In two observed events, the relative frequency will be 0 OR ½ OR 1. In three observed events, the relative frequency will be 0 OR 1/3 OR 2/3 OR 1. In N observed events, the relative frequency will be 0 OR 1/N OR 2/N OR …OR 1.
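The possibilities listed above can be generated mechanically. This is a small sketch using exact fractions; the function name is hypothetical.

```python
from fractions import Fraction

def relative_frequency_possibilities(n_events):
    """Return the values 0, 1/N, 2/N, ..., 1 that a relative frequency can take in N events."""
    return [Fraction(k, n_events) for k in range(n_events + 1)]

for n in (1, 2, 3):
    print(n, [str(f) for f in relative_frequency_possibilities(n)])
```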

Now, let N increase without limit. The relative frequency becomes known as the “limiting relative frequency.” The limiting relative frequency will be 0 OR 1/N OR 2/N OR …OR 1.

The limiting relative frequency may be any one of the numbers in the sequence 0, 1/N, 2/N, …, 1. Each number in the sequence is a “limiting relative frequency possibility.” The set {0, 1/N, 2/N, …, 1} of limiting relative frequency possibilities is an example of an unobserved state-space. Your model makes an inference to this unobserved state-space from the observed state-space {0 OR 1/N OR 2/N OR … OR 1}. Note the close relationship between the unobserved and the observed state-spaces. The single element of the observed state-space is abstracted from the elements of the unobserved state-space.

Let the unobserved state-space {0, 1/N, 2/N,…,1} be designated by F. Each element of F is irreducible and the corresponding observed state-space contains a single element. It follows that a maximum exists in the missing information about F. Thus, values may be assigned to the set P(F) of probabilities of the elements of F by maximizing the missing information, under constraints. It is apt to reference P(F) as a spectral density function, where P(.) represents the spectral density and F represents the associated frequency. P(.) is the proportion in a statistical ensemble while F is the proportion in a statistical population.

Now let us imagine that an experiment is conducted in which the results become known to you, the model builder. In this experiment, the count n(S) is kept of events in which the state is S and the count n(R) is kept of events in which the state is R. The frequency ratio of the state S, given that the state is S OR R, is n(S) : [n(S) + n(R)]. The relative frequency of S given S OR R is n(S) / [n(S) + n(R)].

With the availability to you of the frequency ratio data, there are two constraints on maximization of the missing information about the limiting relative frequency. They are: 1) the frequency ratio data and 2) noise. Acting through the Bayes-Laplace theorem, the frequency ratio data reduce the missing information about the limiting relative frequency, with the result that the function P(F) peaks about a particular limiting relative frequency. The noise increases the missing information about the limiting relative frequency, making the function P(F) more ambiguous about the limiting relative frequency.

The objection to the manner in which the Bayes-Laplace theorem is conventionally implemented relates to an input to the theorem. It is the spectral density function which we’ll designate by Pʹ(F). Traditionally, but misleadingly, Pʹ(F) is called the “prior distribution” while P(F) is called the “posterior distribution.” Like P(F), Pʹ(F) maps the set of limiting relative frequency possibilities F to a set of probabilities. In the absence of frequency ratio data, P(F) is identical to Pʹ(F). Otherwise, the two functions differ.

The objection arises when Pʹ(F) is taken to be independent of the frequency ratio data. If Pʹ(F) is not grounded in frequency ratio data, critics argue, the selection of this function must be arbitrary. If it is arbitrary, though, more than one such function exists. As a result of the arbitrariness, several different assignments of numerical values can be made to P(F). Each assignment makes a different inference. Each of the several inferences is implied to be correct. That more than one inference is implied to be correct violates the postulate of logic.

Under ultra-optimization, Pʹ(F) is determined by frequency ratio data. Maximization of the missing information, under the constraints, assigns unique sets of values to Pʹ(F) and P(F); that these assignments are unique conforms the methods of assignment to the postulate of logic, thus eliminating the usual objection to use of the Bayes-Laplace theorem. Using the terminology of communications engineering, it is apt to call the assignment to Pʹ(F), the spectral density function of the “noise” and the assignment to P(F), the spectral density function of the “signal plus noise.”

As has been shown, maximization of the missing information about the limiting relative frequency of S OR R, under the constraints, yields the function P(F), where F is the set of limiting relative frequency possibilities for the state S given that the state is S OR R and P is the associated set of probabilities. What we need, though, is the assignment of a value to the probability that the state is S given that the state is S OR R. By a theorem from probability theory, this value is the expected (average) value of F.
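The last step, taking the expected value of F under P(F), can be sketched as follows. This is not the ultra-optimization procedure itself: as a loudly labeled assumption, Pʹ(F) is replaced here by a uniform distribution over the grid {0, 1/N, …, 1}, which is what maximizing the missing information gives when no constraints at all are imposed. The binomial likelihood and the function names are likewise assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def posterior_over_frequencies(n_s, n_r, grid_size):
    """Return (F, P(F)) on the grid {0, 1/N, ..., 1} given counts n(S) and n(R)."""
    grid = [Fraction(k, grid_size) for k in range(grid_size + 1)]
    # ASSUMPTION: a uniform stand-in for P'(F), not the noise spectrum of the text.
    prior = [Fraction(1, len(grid))] * len(grid)
    # Binomial likelihood of the counts for each candidate limiting relative frequency.
    likelihood = [comb(n_s + n_r, n_s) * f**n_s * (1 - f)**n_r for f in grid]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    norm = sum(unnormalized)
    return grid, [u / norm for u in unnormalized]

def expected_frequency(grid, posterior):
    """Expected value of F: the value assigned to the probability of S given S OR R."""
    return sum(f * p for f, p in zip(grid, posterior))

# Hypothetical counts: S observed 3 times and R once among the S OR R events.
grid, posterior = posterior_over_frequencies(n_s=3, n_r=1, grid_size=100)
print(float(expected_frequency(grid, posterior)))
```

With no data (n_s = n_r = 0) the posterior equals the stand-in prior, matching the statement that P(F) is identical to Pʹ(F) in the absence of frequency ratio data. With data, the expected value lies between the relative frequency and 1/2 rather than on the relative frequency itself, which is how this approach departs from the straight rule.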

We have completed our derivation of a procedure for assignment of a value to the probability that the state is S, given that the state is S OR R. A similar derivation yields a procedure for assignment of a value to the probability that the state is S, given that the state is S OR D.

In the next step under the second principle of reasoning, a value is assigned to the probability that the state is S. It can be shown that this value is:

1 / [ 1/P(S given S OR R) + 1/P(S given S OR D) – 1 ]

where P(S given S OR R) designates the probability that the state is S, given that it is S OR R, and P(S given S OR D) designates the probability that the state is S, given that it is S OR D. A similar derivation yields a procedure for assignment of a value to the probability that the state is R and to the probability that the state is D.

In the final step, under the second principle of reasoning, one assigns a value to the probability that the state is S OR R. As S and R are mutually exclusive, this value is the sum of the values assigned to the probability of S and to the probability of R.
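The two steps above can be checked with exact arithmetic. The sketch below assumes that D, S and R are mutually exclusive and exhaustive, so that their probabilities sum to 1; the numerical values are hypothetical and the function name is chosen for illustration.

```python
from fractions import Fraction

def probability_of_s(p_s_given_s_or_r, p_s_given_s_or_d):
    """Apply the formula 1 / [1/P(S given S OR R) + 1/P(S given S OR D) - 1]."""
    return 1 / (1 / p_s_given_s_or_r + 1 / p_s_given_s_or_d - 1)

# Hypothetical probabilities for the three states of I; they sum to 1.
p_d, p_s, p_r = Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)

# Conditional probabilities implied by those values.
p_s_given_s_or_r = p_s / (p_s + p_r)
p_s_given_s_or_d = p_s / (p_s + p_d)

# The formula recovers P(S) from the two conditional probabilities.
recovered = probability_of_s(p_s_given_s_or_r, p_s_given_s_or_d)

# Final step: S and R are mutually exclusive, so their probabilities add.
p_s_or_r = recovered + p_r
print(recovered, p_s_or_r)
```

The formula works because 1/P(S given S OR R) + 1/P(S given S OR D) - 1 equals [P(S) + P(R) + P(D)] / P(S), which is 1/P(S) when the three probabilities sum to 1.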

 

Remembering

An aid to remembering what you’ve learned is to translate key mathematical concepts into English prose. It is apt to call the information about the state in O, given the state in C, the “knowledge”; in minimizing the missing information about the state in O, given the state in C, the first principle of reasoning maximizes the knowledge. In maximizing the missing information about the limiting relative frequency, under the constraints, the second principle of reasoning satisfies the principle that has been called “honesty in inferences,” for the effect is to avoid presumption of information not possessed about the limiting relative frequency. Thus, one can translate the first principle of reasoning to:

Maximize the knowledge!

and the second principle to:

Keep the inferences honest!

 

When a model is built under ultra-optimization, the elements of the state-space C are abstractions whose definitions satisfy the principles of reasoning. Under the theory of knowledge of Ronald Christensen, such an abstraction is what one means by the word “pattern.” Thus, an effect of ultra-optimization is “pattern discovery” and the maximum possible knowledge is created by pattern discovery. To put this more poetically:

Pattern discovery is motivated by the quest for knowledge.

The latter statement is an English translation of Christensen’s theory of knowledge. As we’ve seen, Christensen’s theory has solid theoretical support. As we’ll see later, this theory has solid empirical support.

 

Conflict

A conclusion reached earlier conflicts with a belief said typically to be held by scientists. According to people who have studied the matter, scientists typically believe that the probability of a state is the limiting relative frequency of this state; this is the belief called “frequentism.” However, the conclusion was reached that the probability of a state is the expected value of the limiting relative frequency of this state. The expected value of the limiting relative frequency is the equivalent of the limiting relative frequency if and only if the missing information about the limiting relative frequency is nil.

Frequentism arose in response to the illogic of the conventional implementation of the Bayes-Laplace theorem. Believers in frequentism avoided this illogic by various stratagems for eliminating the influence of the so-called “prior distribution” Pʹ(F) upon the assignment of a value to the probability of a state. This was accomplished via solutions to the inverse problem that made Pʹ(F) superfluous; one of these solutions was the straight rule.

From the perspective offered by ultra-optimization, it can be seen that frequentism assumed the noise out of existence, for the so-called “prior distribution” Pʹ(F) is more aptly described as “the spectral density function of the noise.” With the noise assumed away, information that was in fact missing was presumed to be possessed. As a result, models built under frequentism were prolific generators of incorrect inferences.

After its founding in the 19th century, frequentism went on to dominate the field of mathematical statistics in the 20th century. That it remains highly influential may well account for the observation that scientists typically believe in frequentism.

With the perspective offered by ultra-optimization, though, it becomes clear that scientific research is a battle against noise. Ultra-optimized models consistently excel because they wage this battle in the most efficient possible manner. By assuming noise out of existence, frequentism sets up a situation in which it is impossible to wage this battle in any coherent fashion.

 

Battle!

If one believes a myth that inhabits the literature of scientific methodology, a battle rages between believers in frequentism, the so-called “frequentists,” and believers in the alternate methodology called “Bayesianism,” the so-called “Bayesians.” According to the myth, this battle never ends, for frequentists endlessly point out the undoubted flaws in Bayesianism while Bayesians endlessly point out the undoubted flaws in frequentism. Underlying the myth is the assumption that there is not an alternative to Bayesianism or frequentism.

Is this assumption correct? In demystifying this issue, it is helpful to identify terminology that may be misleading. As the theorem that was left to us by Thomas Bayes is rooted in probability theory, there is nothing wrong with this theorem so long as the premises of probability theory hold true. The parties to the mythological battle do not dispute the truthfulness of these premises. The flaws of Bayesianism are associated not with Bayes’ theorem but with the necessity for a so-called “prior distribution.” In the mythological battle, frequentists accuse Bayesians of selecting this distribution arbitrarily. To the extent that the accusation is accurate, Bayesians do select this distribution arbitrarily. However, it is misleading to call all opponents of frequentism “Bayesians,” for it is possible to employ Bayes’ theorem without selecting the prior distribution arbitrarily. In fact, selection without arbitrariness is a feature of ultra-optimization.

The mythological battle between the frequentists and the Bayesians pits one flawed methodology against another. The assumption supporting this mythology, that there is no alternative to frequentism or Bayesianism, is false. The methodology called “ultra-optimization” exists as an alternative and it contains none of the shortcomings of the mythological combatants.  

 

Empirical basis

A scientific theory called “Christensen’s theory of knowledge” is associated with ultra-optimization. This theory predicts that models built under ultra-optimization consistently excel.

Models built under ultra-optimization include these:

o   the syllogisms of the deductive logic,

o   the theory of fair gambling devices,

o   the theory of heat called thermodynamics and,

o   the theory of communication.

Tests of these models against alternatives have frequently been conducted, over periods ranging from decades to millennia. Data from tests of additional models are presented in the journal article entitled “Entropy Minimax Multivariate Statistical Modeling – II Applications” (Int. J. General Systems, 1986, Vol 12, 227-305). All of the empirical evidence is consistent with Christensen’s theory.

 

Implications for science

The strengths of the theoretical and empirical bases for ultra-optimization imply that:

o   the problem of induction is solved,

o   “logic” is synonymous with the “probabilistic logic,”

o   Shannon’s measure is the measure of inferences,

o   the principles of reasoning are ultra-optimization,

o   the theory of knowledge is Christensen’s,

o   scientific research is a battle against noise and,

o   by assuming noise out of existence, the widely held belief called “frequentism” obviates the possibility of conducting research in a coherent fashion.

 

Applications

Questions addressed by construction of models under ultra-optimization include:

o   whether a drug will be found to retard lymphoid leukemia, lymphocytic leukemia or melanocarcinoma in mice, based on this drug’s physical, chemical and biological features,

 

o   whether a patient will be found to have heart disease subsequent to his/her electrocardiogram, ECG,

 

o   whether an ECG waveform indicates a normal or an abnormal heartbeat,

 

o   which features of an ECG or other waveform contain the most information about outcomes,

 

o   whether a biopsy will reveal prostate cancer, conditioned on a patient’s level of prostate specific antigen, PSA, plus the values of other independent variables,

 

o   whether a biopsy will reveal cervical cancer, based on spectral analysis of data from tissue fluorescence,

 

o   whether a biopsy will reveal breast cancer, based on electrical potentials produced by the patient’s heartbeats,

 

o   whether patients with lymphoma, chronic granulocytic leukemia or prostate cancer have high or low survival risks,

 

o   whether patients surgically treated for coronary artery disease have high or low survival risks, based on catheterization and clinical data,

 

o   whether a paroled prison inmate will return to prison,

 

o   whether depression and related psychological states are related to early childhood memories,

 

o   whether nuclear reactor fuel will be sufficiently deformed under accident conditions to obstruct coolant flow,

 

o   whether nuclear reactor fuel will be found to be leaking radioactive substances if removed from a reactor and tested,

 

o   whether a gasoline storage tank will be found to be leaking a carcinogen into an aquifer or an explosive into adjacent basements, if dug up and tested and,

 

o   how a photographic or video image should be classified as to type.

 

Decisions that have been supported by models built under ultra-optimization include:

o   the course of treatment for non-Hodgkin’s lymphoma,

 

o   the course of treatment for disorders of the cervical spine,

 

o   which factors, in addition to PSA, improve the reliability of prostate cancer diagnosis,

 

o   which factors (now referenced in medicine as International Prognostic Indices, IPIs) indicate high risk for patients with lymphoma, chronic granulocytic leukemia or prostate cancer, for consideration in treatment selection,

 

o   whether to submit a request for approval of a diagnostic technique for breast cancer to the U.S. Food and Drug Administration, FDA,

 

o   whether to submit a request for approval of a diagnostic technique for cervical cancer to the FDA,

 

o   whether the U.S. Nuclear Regulatory Commission should require further research before certifying that nuclear reactors are adequately safe from loss of coolant accidents,

 

o   whether to restart a nuclear reactor containing parts that might fail in service,

 

o   whether to suspend licensing of nuclear reactors,

 

o   when to replace a leakage-prone gasoline storage tank,

 

o   the level of water that should be kept behind a dam, in light of the long range forecast for precipitation,

 

o   how an electric utility should plan for demand for air conditioning and,

 

o   whether an electric utility’s rate should be adjusted, in light of the long range forecast for precipitation.

 

Factors discovered by ultra-optimization are embedded in the medical standard for the classification of patients with non-Hodgkin’s lymphoma.

 

Entropy minimax

In the literature, ultra-optimization by minimization or maximization of the missing information in each inference made by a model is usually referenced by the phrase “entropy minimax”; this is the descriptor given to ultra-optimization by its developer, Ronald Christensen. This document employs “ultra-optimization” in preference to “entropy minimax” and “missing information” in preference to “entropy.” It does this on the basis of evidence that this usage communicates more effectively to a wide audience. There is evidence of confusion among non-scientists and many scientists about what is meant by “entropy.”
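
As a minimal sketch of the quantity at issue, assuming that “missing information” refers to Shannon’s entropy measured in bits, the following computes it for a few probability distributions. The function name and the example distributions are illustrative and are not taken from the literature; the sketch covers only the measure, not the entropy minimax procedure itself.

```python
import math


def missing_information(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution.

    Terms with zero probability contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Illustrative distributions over two outcomes (assumed values).
fair_coin = missing_information([0.5, 0.5])
biased_coin = missing_information([0.9, 0.1])
certain = missing_information([1.0, 0.0])

print(fair_coin, biased_coin, certain)
```

Under this assumption, the missing information is greatest when the outcomes are equally probable and falls to zero when the outcome is certain.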

 

Offerings

In coming up to speed on ultra-optimization and in building models, it is highly cost-effective to get outside help. KnowledgeToTheMax supplies this market with services that include:

o   conduct of seminars,

o   consultancy on science policy,

o   consultancy on curriculum reform in higher education,

o   management of theoretical aspects of scientific studies and,

o   construction of ultra-optimized models.

In cases in which there is a degree of mechanistic understanding of the phenomenon being modeled, KnowledgeToTheMax can augment the supply of information with inputs from a mechanistic model.

A portion of the technology used in building an ultra-optimized model is proprietary. In constructing a model, KnowledgeToTheMax operates under a technology sharing agreement with the developer of ultra-optimization, Dr. Ronald Christensen. In doing so, KnowledgeToTheMax brings Christensen’s four decades of experience to bear on the problem of how to ultra-optimize the construction of a model. Under this agreement, the ability of KnowledgeToTheMax to build models for firms in the for-profit sector is subject to Christensen’s approval. KnowledgeToTheMax has an unrestricted license to build models for firms in the not-for-profit sector.

 

News

On Oct. 23, 2008, Terry Oldberg of KnowledgeToTheMax presented a lecture entitled “Information Theory: Maximizing Knowledge” to a meeting of the American Nuclear Society in San Francisco, California.

On Nov. 20, 2008, Oldberg presented a lecture entitled “Maximizing Knowledge” to a meeting of the American Chemical Society in Santa Clara, California. The announcement for the meeting is posted here.

On Feb. 11, 2009, Oldberg presented a lecture entitled “Maximizing Knowledge” to a meeting of the American Society for Quality in Santa Clara, California.

On May 7, 2009, Oldberg will present a lecture entitled “Maximizing Knowledge” to a meeting of the American Institute of Chemical Engineers in Berkeley, California.

 

Free lecture

KnowledgeToTheMax offers a free one-hour lecture on the topic of “Maximizing Knowledge” to groups of 10 or more people in the San Francisco Bay Area of California. It offers the same lecture to groups outside the Bay Area for a fee.

 

Bibliography

A bibliography is available by clicking here. The literature is large and not user-friendly. Proofs of key theorems are absent. Hence, it would be far more cost-effective to engage a tutor than to attempt to climb the learning curve unaided.

 

Contacting us

For further information, please contact the owner-operator of KnowledgeToTheMax, Terry Oldberg. He may be reached at terry@KnowledgeToTheMax.com or 1-650-947-0811 (Los Altos Hills, California).
