QUANTITATIVE MEASURE OF INTELLIGENCE

 

 

CONTENTS:

 

1 ESTABLISHMENT OF SCIENTIFIC METHOD

2 MEASUREMENT THEORY

3 INFORMATION THEORY

4 INTELLIGENCE THEORY

5 USING INTELLIGENCE THEORY

 

1 ESTABLISHMENT OF SCIENTIFIC METHOD

Quoted from the book Principles of Physics by Marion & Hornyak

1 Scientific method is to base conclusions on the results of observations and experiments, analyzed by logic and reason.

2 Scientific method is not a prescription for learning scientific truth.

3 The ultimate answer to any question concerning natural phenomena must be the result of experimental measurement.

4 A theory attempts to explain why nature behaves as it does.

5 To construct a theory, we introduce certain UNEXPLAINED fundamental concepts e.g. energy, time, space and electric charge.

6 Laws of physics tell how things behave in terms of the theory.

7 Theories are judged based on predictive power, comprehensiveness and simplicity.

Engineering choice of theory: in addition to criterion 7, the theory must also allow an engineer to minimise the cost (in dollars) of solving a problem.

 

2 MEASUREMENT THEORY

An example unit of measurement:

 

One unit of length, metre, is defined as the length of the path travelled by light in a vacuum in 1/299792458 second.

 

Error of measurement is defined as (perceived measured quantity) - (true quantity).

|Error| > 0, i.e. no measurement is exact; the true quantity is never known perfectly.
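A small worked example with hypothetical numbers: if the true length of a rod is 1.0000 m and an instrument reads 1.0003 m, then

    Error = 1.0003 m - 1.0000 m = +0.0003 m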

 

3 INFORMATION THEORY

The amount of information, in bits, I(x_i), of an event x_i is defined as I(x_i) = -log2 P(x_i).

P(x_i) is the probability of x_i.

This definition is the foundation of information theory.
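As an illustrative sketch (not part of the original text; the function name information_bits is an arbitrary choice), the definition can be evaluated directly:

    import math

    def information_bits(p):
        # Self-information I(x) = -log2 P(x) = log2(1/P(x)), in bits, of an
        # event with probability p.
        if not 0 < p <= 1:
            raise ValueError("probability must be in (0, 1]")
        return math.log2(1.0 / p)

    print(information_bits(0.5))    # 1.0 bit: a fair coin toss
    print(information_bits(1.0))    # 0.0 bits: a certain event conveys no information
    print(information_bits(0.125))  # 3.0 bits: a rarer event conveys more information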

 

A unit symbol is defined as a unique pattern representing an event.

 

The information content of the occurrence of a sequence of symbols S = {x_0, ..., x_N} is

I(S) = I(x_0) + I(x_1) + ... + I(x_N) = -[log2 P(x_0) + log2 P(x_1) + ... + log2 P(x_N)]

assuming the symbols occur independently.

The statement above is the application of information theory to coding.
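A minimal coding-oriented sketch, under the assumption (made for this example only) that each symbol's probability is estimated from its relative frequency in the message itself:

    import math
    from collections import Counter

    def message_information_bits(message):
        # Information content of a message: the sum of -log2 P(x_i) over its
        # symbols, with P estimated from each symbol's relative frequency.
        counts = Counter(message)
        total = len(message)
        return sum(-math.log2(counts[sym] / total) for sym in message)

    print(message_information_bits("aaaa"))  # 0.0 bits: completely predictable
    print(message_information_bits("abcd"))  # 8.0 bits: 4 symbols x 2 bits each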

 

4 INTELLIGENCE THEORY

PREPARATORY DEFINITIONS:

Definition (1):

The information I(x_i) of an event X = x_i with probability P(x_i) is

I(x_i) = -log2 P(x_i)        (1)

Definition (2):

In order to have autonomous operation, the machine must have memory to store instructions. The number of instructions in such a machine is its program size, Z.

Definition (3):

A unit instruction at time t, a_t, is an indivisible event in time, and can be stored in one unit of memory (the memory unit must be large enough to hold any possible instruction). The number of unit instructions which can be stored is Z.

 

Definition (4):

The intelligence content, H, of a sequence of instructions from time t = 0 to T is the total information content of that occurrence of instructions:

H = I(a_0) + I(a_1) + ... + I(a_T) = -log2 P(a_0, a_1, ..., a_T)        (2)

where

P(a_0, a_1, ..., a_T) = P(a_0) P(a_1) ... P(a_T)        (3)

Equation (3) represents the joint probability of the events as the product of the probability of each event.
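A minimal sketch of Definition (4), assuming the probability of each executed instruction is known and the joint probability factorises as in equation (3); the function and variable names below are illustrative, not taken from the original:

    import math

    def intelligence_content_bits(instruction_probabilities):
        # Intelligence content H of an instruction trace: the sum of the
        # self-information -log2 P(a_t) of each executed instruction (equation 2),
        # assuming the joint probability factorises as in equation (3).
        return sum(-math.log2(p) for p in instruction_probabilities)

    # A trace of four fully predictable instructions (P = 1 each) followed by one
    # two-way conditional branch whose outcome has probability 0.5:
    trace = [1.0, 1.0, 1.0, 1.0, 0.5]
    print(intelligence_content_bits(trace))  # 1.0 bit: only the branch contributes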

 

IMPLICATIONS:

A truth table (or any highly parallel network structure, such as a neural network) which gives a definite output for a fixed input pattern has zero information value.

An autonomous machine which executes instructions in a predictable manner (instructions such as unconditional branches) has zero intelligence, because the information content of every event (the execution of each instruction) is zero.
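In terms of equations (1) and (2) this follows directly: if every executed instruction is certain, then for every t

    P(a_t) = 1  =>  I(a_t) = -log2 1 = 0,  and hence  H = I(a_0) + ... + I(a_T) = 0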

 

Definitions 1-4 imply that every instruction carries equal weight regardless of what it does. Therefore an instruction that performs a complex pattern match in one cycle has the same weight as an instruction that does nothing (a NOP). This result comes from equation (2). A program (a sequence of instructions) may therefore have a large intelligence content but zero information-processing ability.

Please note that we are only measuring the intelligence content of a machine, not its throughput. A stupid machine can be very productive indeed.

Intelligence can be thought of as a resource to be used in the right circumstances, much like energy and time. There must be a time, space and intelligence relationship: we can increase intelligence only by increasing the processing time (the number of sequential instructions) while decreasing the space requirement (less hardware). For the same information-processing rate, increasing parallelism in the program (by using more hardware) reduces the number of sequential instructions, and hence reduces intelligence. Of course we can design circumstances in which all these relationships are optimised, obtaining maximum intelligence while reducing time by maximising the number of random conditional branches.

 

An analogy is a student sitting his examination. He has two options open to him. One is to remember as much as possible so that he can simply regurgitate facts with little intelligence. The other is to concentrate on deductive reasoning, remembering only key concepts from which he can recover information or even invent alternative solutions. The second method can be called the more intelligent one, whereas the first is more of a reflex action. The throughput is the number of correct answers, which may be the same in both cases.

A stupid machine may look intelligent if it is controlled by pseudo-random sequence generators, as is common in spread-spectrum systems. However, a pseudo-random sequence is actually predictable. The degree of randomness (or its inverse, the predictability) depends on the period of the sequence. The pattern of the sequence can only be detected if we are able to sample for at least twice the period of one complete cycle. The information content that can be measured therefore depends on the sampling period available to the observer.

If we had full access to the code-generation algorithm, we would discover that the intelligence of the pseudo-random pattern generator is virtually zero.

If the observer has no such access, he must try to obtain as many samples as he can. The information content he measures is his perceived intelligence of the pseudo-random generator, which must be wrong if he has not broken the code-generation algorithm. Definitions 1-4 do not fail to quantify the intelligence of the pseudo-random generator; it is simply that the sampling process is insufficient.
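An illustrative sketch (the generator and parameters below are standard textbook choices, not taken from the original): a linear-feedback shift register looks random to an observer who sees only its output, yet once the algorithm and state are known every output bit is certain and therefore carries zero information.

    def lfsr_stream(seed=0xACE1):
        # 16-bit Fibonacci LFSR with taps 16, 14, 13, 11.  Every output bit is
        # fully determined by the current state, so to an observer who knows the
        # algorithm P(next bit) = 1 and the intelligence per step is zero.
        state = seed
        while True:
            bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            yield bit

    gen = lfsr_stream()
    print([next(gen) for _ in range(32)])  # looks random, yet repeats with period 2**16 - 1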

 

5 USING INTELLIGENCE THEORY

The development of this measure of intelligence grew out of efforts to identify the critical instructions for microprocessors. Let us apply these definitions to a typical general-purpose program running on a typical microprocessor.

Let us assume that only, and all, conditional branches are truly random. There are bP of them, where P is the number of instructions which have been executed and b is the fraction of executed instructions that are conditional branches, and each branch may choose between B ≤ Z equally likely addresses. For a DLX microprocessor [3], B = 2. For man, B is very large. B determines the capacity for intelligence. For the same DLX microprocessor, b is 0.20 for the Free Software Foundation's GNU C Compiler and 0.05 for Spice.

 

 

E = H / P = b log2 B        (4)

E in equation (4) is the average intelligence (entropy) of each instruction, measured at each instruction execution. For simple problems (requiring a smaller B), the rate of intelligence of a machine is higher than that of man.

We can conclude that a C compiler uses more intelligence per step (a larger b) than Spice, which is mainly a numerical program, and that the total intelligence depends only on b, B and P.
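A quick numerical check using the figures quoted above and equation (4) as reconstructed here (the function name is illustrative):

    import math

    def intelligence_per_instruction(b, B):
        # Average intelligence per executed instruction, E = b * log2(B) bits,
        # where b is the fraction of truly random conditional branches and B is
        # the number of equally likely branch targets.
        return b * math.log2(B)

    # Figures quoted in the text for a DLX-class machine (B = 2):
    print(intelligence_per_instruction(0.20, 2))  # 0.2 bits/instruction (GNU C Compiler)
    print(intelligence_per_instruction(0.05, 2))  # 0.05 bits/instruction (Spice)
    # Total intelligence over P executed instructions is then E * P.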

The intelligence observed is that of the programs being executed. A machine may be capable of executing conditional branches (intelligence-bearing instructions), but if the program it executes does not utilise that capability, the machine may look stupid.

The amount of intelligence that can be used per unit time depends on the speed of the machine.

 

If a microprocessor is to be designed for highly intelligent programs, such as expert systems, it must be optimised for minimal pipeline flushes on conditional jumps and must be capable of conditionally jumping to any instruction. We now have a concrete guideline for designing microprocessors.

This measure should reinforce our intuition about our intelligence versus reflex action.

Pattern recognition is just a reflex action after a period of training. Initially a lot of intelligence is required to incorporate knowledge into our memory; after the initial training period, we require less intelligence. Parallel algorithms are not the ultimate solution. They still need intelligent pre- and post-processors, which must therefore be sequential.

Although a human being has many organs that exploit parallelism, our consciousness is sequential. It seems as though there is a sequential master computer (a von Neumann machine) which controls other distributed processors of varying degrees of intelligence. This argues the case for SIMD supercomputers which may have slave MIMD machines.

 

References

1 M. W. SHIELDS, 'An Introduction to Automata Theory', Blackwell Scientific Publications (1987)

2 G. F. LUGER, W. A. STUBBLEFIELD, 'Artificial Intelligence and the Design of Expert Systems', The Benjamin/Cummings Publishing Company, Inc. (1989)

3 J. L. HENNESSY, D. A. PATTERSON, 'Computer Architecture: A Quantitative Approach', Morgan Kaufmann Publishers, Inc. (1990)

 

INFORMATION THEORY OF PROCESSOR

 

OBJECT (Vocabulary) → symbol

RULE (Grammar) → relationship between symbols

MEANING (Semantic) → symbol

 

Every symbol representing either OBJECT, RULE or MEANING must be unique.

 

 

 

MEANING EXTRACTION PROCESS

 

OBJECT (Symbol) → MEANING (Symbol)

 

Compiler Terms          Operations on symbols

lexical analysis        convert text to unique objects by giving unique symbols
        ↓
syntax analysis         find relationships between objects and assign a unique symbol to each relationship
        ↓
semantic analysis       convert relationship symbols to predefined actions
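A minimal sketch of the meaning-extraction pipeline above, under the assumption that the predefined actions are simple arithmetic; the token names, toy grammar and action table are illustrative, not taken from the original:

    import re

    # Lexical analysis: convert text to unique objects (tokens) with unique symbols.
    TOKEN_SPEC = [("NUM", r"\d+"), ("PLUS", r"\+"), ("TIMES", r"\*"), ("SKIP", r"\s+")]

    def lex(text):
        tokens = []
        pos = 0
        while pos < len(text):
            for name, pattern in TOKEN_SPEC:
                m = re.match(pattern, text[pos:])
                if m:
                    if name != "SKIP":
                        tokens.append((name, m.group()))
                    pos += m.end()
                    break
            else:
                raise SyntaxError(f"unexpected character at position {pos}")
        return tokens

    # Syntax analysis: find the relationships between objects and give each
    # relationship a unique symbol ("3 + 4" becomes a PLUS node).
    def parse(tokens):
        # Toy grammar: expr := NUM (op NUM)*, evaluated left to right (no precedence).
        node = tokens[0]
        for op, operand in zip(tokens[1::2], tokens[2::2]):
            node = (op[0], node, operand)
        return node

    # Semantic analysis: convert relationship symbols to predefined actions.
    ACTIONS = {"PLUS": lambda a, b: a + b, "TIMES": lambda a, b: a * b}

    def evaluate(node):
        if node[0] == "NUM":
            return int(node[1])
        op, left, right = node
        return ACTIONS[op](evaluate(left), evaluate(right))

    print(evaluate(parse(lex("3 + 4 * 2"))))  # 14: left-to-right, no operator precedence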