Neural Networks are a different paradigm for computing:
· Von Neumann machines are based on the processing/memory abstraction of human information processing.
· Neural networks are based on the parallel architecture of animal brains.
Neural networks are a form of multiprocessor computer system, with
· Simple processing elements
· A high degree of interconnection
· Simple scalar messages
· Adaptive interaction between elements
A biological neuron may have as many as 10,000 different inputs, and may send its output (the presence or absence of a short-duration spike) to many other neurons. Neurons are wired up in a 3-dimensional pattern. Real brains, however, are orders of magnitude more complex than any artificial neural network so far considered.
A neural network is composed of a series of processing elements, or nodes, each having only one output signal. Each output signal fans out to form links, called interconnects, between it and the other nodes in the system. Each node's processing depends only on the current input signals being received by that node and the values of some constants held in the small amount of memory local to that node. Each node can include feedback signals that help to correct its processing as the system as a whole encounters new information that alters the way pattern recognition should proceed.
Biological research on how our brain works has inspired the design of artificial neural networks. The brain is a network consisting of tens of billions of simple processors called neurons, connected to one another through branch-like structures called axons and dendrites. Synapses connect the dendrites and axons of one neuron to those of another. These synapses may stimulate or inhibit signals travelling between neurons. Each neuron is connected to a number of other neurons; some of the connecting synapses may contribute to an excitatory response while other connecting synapses may inhibit such a response. If the total excitatory contribution to a particular neuron exceeds a certain threshold level, the neuron fires, producing an output signal. If the excitatory contribution is below the threshold level, the neuron produces no output. The output signal is sent to other neurons via the synapses, and these neurons produce their own firing actions. In this way external stimuli travel through the brain.
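The threshold behaviour just described can be sketched as a simple McCulloch-Pitts-style neuron. The weights and threshold below are illustrative values, not taken from the text; positive weights model excitatory synapses and negative weights inhibitory ones.

```python
def neuron_fires(inputs, weights, threshold):
    """Return 1 (the neuron fires) if the net contribution of the
    weighted inputs exceeds the threshold, else 0 (no output)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# Two excitatory synapses (0.6, 0.7) and one inhibitory synapse (-0.4):
print(neuron_fires([1, 1, 1], [0.6, 0.7, -0.4], threshold=0.5))  # 1: 0.9 > 0.5
print(neuron_fires([1, 0, 1], [0.6, 0.7, -0.4], threshold=0.5))  # 0: 0.2 < 0.5
```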
Fundamental concepts of neural networks
A neural network is a computational model that is a directed graph composed of nodes (sometimes referred to as units or neurons) and connections between nodes. With each node is associated a number referred to as the node's activation. Similarly, with each connection in the network is associated a number called its weight. There are some special nodes whose activations are set externally, called input nodes, and there may be some nodes designated as output nodes. Each node's activation is based on the activations of the nodes that have connections directed at it and the weights on those connections. A rule that updates the activations is called an update rule. Typically, all the node activations should be updated simultaneously; thus a neural network is a parallel model. A neural network has a parallel-distributed architecture with a large number of nodes and connections. Each connection points from one node to another and is associated with a weight.
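A minimal sketch of this graph model, assuming a simple weighted-sum update rule (the text does not prescribe one): activations live on nodes, weights on connections, and one update step recomputes every activation from the old activations only, so all nodes are effectively updated simultaneously.

```python
def synchronous_update(activations, weights):
    """One update step. weights[j][i] is the weight on the connection
    directed from node i to node j; each node's new activation is the
    weighted sum over the nodes with connections directed at it."""
    n = len(activations)
    # Built entirely from the old activations -> a parallel update.
    return [sum(weights[j][i] * activations[i] for i in range(n))
            for j in range(n)]

acts = [1.0, 0.0, 0.0]            # node 0 is an input node, set externally
w = [[0.0, 0.0, 0.0],             # no connections into node 0
     [0.5, 0.0, 0.0],             # connection node 0 -> node 1, weight 0.5
     [0.0, 2.0, 0.0]]             # connection node 1 -> node 2, weight 2.0
print(synchronous_update(acts, w))  # [0.0, 0.5, 0.0]
```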
Building a neural network involves the following tasks:
· Determine the network properties: Network topology (connectivity), type of connections, the order of connections and weight range.
· Determine the node properties: The activation range and the activation (transfer) function.
· Determine the system dynamics: The weight initialisation scheme, the activation calculation formula and learning rule.
Network properties: The topology of a neural network refers to its framework as well as its interconnection scheme. The framework is often specified by the number of layers (or slabs) and the number of nodes per layer. The types of layers include:
Input layer: The nodes in it are called input units, which encode the instance presented to the network processing.
Hidden layer: The nodes in it are called hidden units, which are not directly observable and hence hidden. They provide non-linearity for the network.
Output layer: The nodes in it are called output units, which encode the possible concepts (value) to be assigned to the instance under consideration. For example, each output unit represents a class of objects.
Input units do not process information; they simply distribute information to other units. Schematically, input units are drawn as circles, to distinguish them from processing elements such as hidden units and output units, which are drawn as squares in the diagram. According to its interconnection scheme, a network can be either feedforward or recurrent, and its connections either symmetrical or asymmetrical. Their definitions are given below:
Feedforward networks - all connections point in one direction, i.e. from input towards output
Recurrent networks - networks with feedback connections or loops, e.g. from the output layer back to the input layer.
Symmetrical connections - if there is a connection pointing from node i to node j, then there is also a connection from node j to node i, and the weights associated with the two connections are equal, or notationally
Wji = Wij
Asymmetrical connections - if connections are not symmetrical as defined above, they are called asymmetrical connections.
Interlayer connection - a connection between nodes in different layers is called an interlayer connection.
Intralayer connection - a connection between nodes within same layers is called an intralayer connection.
Self connection - a connection pointing from a node to itself is called a self connection.
Supralayer connection - a connection between nodes in distant (non-adjacent) layers is called a supralayer connection.
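The symmetry condition Wji = Wij can be checked directly on a weight matrix. A small illustrative sketch (the matrices are made-up examples):

```python
def is_symmetrical(w):
    """True if, for every pair of nodes i and j, the weight on the
    connection from i to j equals the weight from j to i."""
    n = len(w)
    return all(w[i][j] == w[j][i] for i in range(n) for j in range(n))

print(is_symmetrical([[0, 0.3], [0.3, 0]]))  # True: Wji == Wij
print(is_symmetrical([[0, 0.3], [0.8, 0]]))  # False: asymmetrical
```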
The term connectivity refers to how nodes are connected. For example, full connectivity means that every node in one layer is connected to every node in the adjacent layer. A high-order connection is a connection that combines inputs from more than one node, often by multiplication. The order of a NN is the order of its highest-order connection; the number of inputs to a connection determines the order of the connection. NNs are assumed to be first-order unless mentioned otherwise.
Node properties: The activation levels of nodes can be discrete (e.g. 0 and 1), continuous across a range (e.g. the interval (0, 1)), or unrestricted. This depends on the activation (transfer) function. If it is a hard-limiting function, the activation levels are 0 and 1. For the sigmoid function, the activation levels are limited to the continuous range of reals (0, 1). Figure 3 shows the sigmoid function: F(x) = 1 / (1 + e^-x)
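The two activation functions mentioned above can be sketched as follows; the hard-limiting function yields the discrete levels 0 and 1, while the sigmoid yields continuous values in (0, 1).

```python
import math

def hard_limit(x):
    """Hard-limiting transfer function: discrete activation levels 0 and 1."""
    return 1 if x >= 0 else 0

def sigmoid(x):
    """Sigmoid transfer function F(x) = 1 / (1 + e^-x):
    continuous activation levels in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(hard_limit(0.3), hard_limit(-0.3))  # 1 0
print(sigmoid(0.0))                       # 0.5
```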
System dynamics: The weight initialisation scheme is specific to the particular neural network model chosen. However, in many cases, initial weights are simply randomised to small real numbers.
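A minimal sketch of such an initialisation scheme; the range (-0.1, 0.1) and the fixed seed are illustrative choices, not prescribed by the text.

```python
import random

def init_weights(n_out, n_in, scale=0.1, seed=0):
    """Randomise initial weights to small real numbers in (-scale, scale).
    Returns an n_out x n_in weight matrix."""
    rng = random.Random(seed)  # seeded only for reproducibility
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]

w = init_weights(3, 4)
print(all(-0.1 < wij < 0.1 for row in w for wij in row))  # True
```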
The learning rule is one of the most important attributes to specify for a NN. The learning rule determines how to adapt connection weights in order to optimise network performance. It specifies how weight adjustments are calculated during each training cycle; the rule is suspended after training is complete. When a NN is used to solve a problem, the solution lies in the activation levels of the output units. For example, suppose a NN is implemented for classifying fruits into lemons, oranges and apples. The network has three output units, representing the three kinds respectively.
Given an unknown fruit to classify, we present the characteristics of the fruit to the network. The information is received by the input layer and propagated forward. If the output unit corresponding to the class apple reaches the maximal activation, then the class assigned to the fruit is apple. As this example shows, the inference behaviour of a NN involves computing the activation levels across the network. However, one should note that training a NN involves the same activation computation, since we need to know the actual activation levels and desired activation levels so as to calculate the errors, which are used as the basis of weight adjustment.
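The fruit example can be sketched as a single forward pass followed by picking the output unit with maximal activation. The feature encoding and all weight values below are made-up for illustration; the text does not specify them.

```python
import math

CLASSES = ["lemon", "orange", "apple"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(features, weights, thresholds):
    """Propagate the input features forward to the three output units
    and return the class whose unit reaches maximal activation."""
    outputs = [sigmoid(sum(w * x for w, x in zip(weights[j], features))
                       - thresholds[j])
               for j in range(len(CLASSES))]
    return CLASSES[outputs.index(max(outputs))]

# Hypothetical features: (redness, size, roundness), one weight row per class.
weights = [[-1.0, -0.5, 0.2],   # lemon
           [0.8, 0.1, 0.5],     # orange
           [1.2, 0.6, 0.9]]     # apple
print(classify([0.9, 0.8, 0.7], weights, thresholds=[0.0, 0.0, 0.0]))  # apple
```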
The activation levels of input units need not be calculated, since they are given. Those of hidden and output units are calculated according to the activation function used; assuming it is a sigmoid function, the activation level Oj of output unit j is calculated by
Oj = 1 / (1 + e^-(Σi WjiXi - Qj))
where Xi is the input from unit i, Wji is the weight on the connection from unit i to unit j, and Qj is the threshold on unit j.
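A worked instance of this formula, with made-up numbers: two inputs X1 = 1.0 and X2 = 0.5, weights Wj1 = 0.4 and Wj2 = 0.8, and threshold Qj = 0.2.

```python
import math

# Net input: sum_i Wji * Xi - Qj = 0.4*1.0 + 0.8*0.5 - 0.2 = 0.6
net = 0.4 * 1.0 + 0.8 * 0.5 - 0.2
# Activation of output unit j: Oj = 1 / (1 + e^-net)
Oj = 1.0 / (1.0 + math.exp(-net))
print(round(Oj, 3))  # 0.646
```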
Inference and learning