© 2008, 2009 KnowledgeToTheMax

Shannon’s measure of information
One can
optimize an inference because there exists a unique measure of
inferences; this measure is called “Shannon’s measure of information.” The
existence of Shannon’s measure follows from the precepts of probability theory.
Its uniqueness follows from the precepts of measure theory.
Under the
precepts of measure theory, associated with every
measure is the collection of sets which are measurable by it. The collection of
sets which are measurable by Shannon’s measure contains the pair of state-spaces that participate in
making an inference. The state-space from which this inference is made is
called the “observed state-space.” The state-space
to which this inference is made is called the “unobserved
state-space.”
Let one of
the two state-spaces be designated by the variable X and the other by the variable Y.
By the precepts of measure theory, the collection also contains the set
difference X – Y, the set difference Y – X and the intersection of the two
state-spaces.
The set
difference X – Y is an inference from a state in the observed state-space Y to a state in the unobserved
state-space X. Similarly, the set
difference Y – X is an inference from a state in the observed state-space X to a state in the unobserved
state-space Y. Shannon’s measure of
either inference is the information that this inference
is missing for a deductive conclusion.
Shannon’s
measure of the intersection of X with
Y is the information about the state
in the unobserved state-space Y,
given the state in the observed state-space X. Conversely, it is the
information about the state in the unobserved state-space X, given the state in the observed state-space Y.
If Shannon’s measure of the intersection of X with Y is nil, then X and
Y are statistically independent.
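
The three measures described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names entropy and measures, the representation of the two state-spaces as the rows and columns of a joint probability table, and the choice of base-2 logarithms (bits) are not taken from the text. The labels H(X|Y), H(Y|X) and I(X;Y) are the customary textbook symbols for the measure of X – Y, the measure of Y – X and the measure of the intersection.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def measures(joint):
    """Return Shannon's measure of X - Y, Y - X and the intersection.

    joint[i][j] is assumed to hold P(X = i, Y = j); this table layout
    is an illustrative choice, not something the text prescribes.
    """
    p_x = [sum(row) for row in joint]          # marginal over Y
    p_y = [sum(col) for col in zip(*joint)]    # marginal over X
    h_x = entropy(p_x)
    h_y = entropy(p_y)
    h_xy = entropy([p for row in joint for p in row])
    return {
        "H(X|Y)": h_xy - h_y,        # measure of the set difference X - Y
        "H(Y|X)": h_xy - h_x,        # measure of the set difference Y - X
        "I(X;Y)": h_x + h_y - h_xy,  # measure of the intersection
    }


# Two fair binary state-spaces that are statistically independent.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Two fair binary state-spaces whose states always coincide.
dependent = [[0.5, 0.0], [0.0, 0.5]]

print(measures(independent))
print(measures(dependent))
```

In the independent case the measure of the intersection is nil and one bit of information is missing in either direction. In the dependent case the intersection measures one bit and nothing is missing, so the inference reaches a deductive conclusion.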
Shannon’s measure
of either inference is called the “conditional entropy” or the “entropy.” It is
called the “conditional entropy” if X
and Y intersect, that is, if Shannon’s measure of their intersection is not nil; otherwise, it is
called the “entropy.” Shannon’s measure of the intersection of X with Y is called the “mutual information.”
The
conditional entropy, entropy and mutual information are examples of
mathematical functions. Formulae for these functions are readily available via
a Web search.
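
For reference, the standard textbook forms of these functions for discrete state-spaces can be written as below. The notation is an assumption of this sketch rather than the text's own: p(x, y) denotes the joint probability of a state x in X and a state y in Y, p(x) and p(y) the marginal probabilities, and p(x | y) the conditional probability. The base of the logarithm fixes the unit of information (base 2 gives bits).

```latex
\begin{align}
  H(X)          &= -\sum_{x} p(x) \log p(x) \\
  H(X \mid Y)   &= -\sum_{x,y} p(x,y) \log p(x \mid y) \\
  I(X;Y)        &= \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \\
  I(X;Y)        &= H(X) - H(X \mid Y) = H(Y) - H(Y \mid X)
\end{align}
```

The last line restates the set picture: the measure of the intersection is what remains of the measure of a state-space after the measure of the set difference is removed. When the mutual information is nil, the conditional entropy H(X | Y) reduces to the entropy H(X).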