Research Works (Kunihiko Fukushima)

Visual Pattern Recognition by the Neocognitron

The neocognitron is a neural network model proposed by Fukushima in 1979. Its architecture was suggested by neurophysiological findings on the visual system (the classical hierarchy hypothesis of Hubel and Wiesel). Through learning, it acquires the ability to recognize visual patterns robustly.

To improve the recognition rate of the neocognitron and to make it closer to the biological visual system, several modifications have been applied. An improved version of the neocognitron, for example, shows a recognition rate of more than 99.7% for unlearned blind test patterns sampled randomly from a large database (ETL1) of handwritten digits.

**Handwritten digit recognition by a neocognitron.**

References

K. Fukushima: "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position", Biological Cybernetics, 36[4], pp. 193-202 (April 1980).
http://dx.doi.org/10.1007/BF00344251
K. Fukushima: "Artificial vision by multi-layered neural networks: neocognitron and its advances", Neural Networks, 37, pp. 103-119 (Jan. 2013).
http://dx.doi.org/10.1016/j.neunet.2012.09.016

Recognition and Restoration of Partly Occluded Patterns

Even when a pattern is partly occluded by other objects, it can easily be recognized if the occluding objects are visible. If the occluding objects are not visible, however, recognition becomes very difficult.

We have proposed a hypothesis explaining why a pattern is easier to recognize when it is occluded by visible objects than by invisible opaque objects. A neural network model is constructed based on this hypothesis.

We then added backward signal paths to the model, which gave the network the ability to restore the missing portions of occluded patterns as well as to recognize them.

A neural network model that can recognize partly occluded patterns and restore missing portions of the occluded patterns.
Examples of restored patterns (W₀) from partly occluded stimulus patterns (U₀). Restoration can be performed successfully even from patterns that have not been learned before.

References

K. Fukushima: "Recognition of partly occluded patterns: a neural network model", Biological Cybernetics, 84[4], pp. 251-259 (2001).
http://dx.doi.org/10.1007/s004220000210
K. Fukushima: "Restoring partly occluded patterns: a neural network model", Neural Networks, 18[1], pp. 33-43 (2005).
http://dx.doi.org/10.1016/j.neunet.2004.05.001

Neural Network Model for Extracting Optic Flow

In the MST area of the monkey brain, there are cells that respond selectively to specific motions of a large area of the visual field, such as rotation or expansion/contraction. They respond steadily even when the location of the center of optic flow shifts on the retina. They are thought to analyze optic flows of the retinal images. We have constructed a neural network model for these cells.

The model performs processing similar to the mathematical operations rot (curl) and div (divergence) of vector field analysis. It is a hierarchical multilayered network consisting of a retina and layers V1, MT and MST. Each MT cell extracts the relative velocity between two adjoining small visual fields, and an MST cell sums the responses of many MT cells to extract a specific optic flow.

The types of optic flow extracted by MST cells differ simply because the relative locations of the inhibitory and excitatory areas differ in the receptive fields of the preceding MT cells.
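The relation to rot and div can be illustrated with a small finite-difference sketch. This is not the published model: the function names (`flow_field`, `mt_layer`, `mst_response`), the grid size, and the use of plain differences in place of excitatory and inhibitory receptive-field areas are all assumptions made for illustration, and the y axis is assumed to point upward. Each "MT" entry compares the velocities at two adjoining grid points, and each "MST" response sums those comparisons over the whole field. The two flow types differ only in the direction along which each velocity component is compared.

```python
def flow_field(kind, size, center):
    """Velocity (vx, vy) at each grid point for an ideal flow around `center`."""
    cx, cy = center
    field = {}
    for y in range(size):
        for x in range(size):
            dx, dy = x - cx, y - cy
            if kind == "rotation":        # counter-clockwise, y axis pointing up
                field[(x, y)] = (-dy, dx)
            elif kind == "expansion":
                field[(x, y)] = (dx, dy)
            else:
                raise ValueError(f"unknown flow kind: {kind}")
    return field


def mt_layer(field, size):
    """Hypothetical MT stage: relative velocity between adjoining grid points."""
    responses = {}
    for y in range(size - 1):
        for x in range(size - 1):
            vx, vy = field[(x, y)]
            vx_right, vy_right = field[(x + 1, y)]   # neighbor along x
            vx_up, vy_up = field[(x, y + 1)]         # neighbor along y
            responses[(x, y)] = {
                # div: each component compared along its own axis
                "div": (vx_right - vx) + (vy_up - vy),
                # rot: each component compared along the perpendicular axis
                "rot": (vy_right - vy) - (vx_up - vx),
            }
    return responses


def mst_response(mt, flow_type):
    """Hypothetical MST stage: sum one type of MT response over the field."""
    return sum(cell[flow_type] for cell in mt.values())


if __name__ == "__main__":
    size = 9
    for kind in ("rotation", "expansion"):
        for center in ((4, 4), (2, 6)):
            mt = mt_layer(flow_field(kind, size, center), size)
            print(kind, center,
                  "rot =", mst_response(mt, "rot"),
                  "div =", mst_response(mt, "div"))
```

In this toy setting the rotating field drives only the "rot" sum and the expanding field drives only the "div" sum, and moving the center of the flow from (4, 4) to (2, 6) leaves both sums unchanged, which mirrors the shift tolerance described for the MST cells.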

A response of the model to a random-dot pattern rotating counter-clockwise.
Large responses are elicited from rotation-selective MST-cells in the far right of the figure.

Reference

K. Tohyama, K. Fukushima: "Neural network model for extracting optic flow", Neural Networks, 18[5/6], pp. 549-556 (2005).
http://dx.doi.org/10.1016/j.neunet.2005.06.039

Neural Network for Extracting Symmetry Axes

This is an artificial neural network that extracts axes of symmetry from visual patterns. The input patterns can be plane figures, complicated line drawings or gray-scaled natural images taken by CCD cameras.

The network can extract symmetry axes without being affected by small amounts of asymmetry caused by slight deformation, variation in brightness or various kinds of noise. Even if an input pattern has two or more symmetry axes, all of them can be extracted.

Axes of symmetry extracted from various input patterns.
Symmetry axes are extracted correctly even from gray-scaled images that have complicated textures, without being affected by small amounts of deformation.

In various kinds of visual information processing, a blurring operation, if appropriately controlled, greatly increases robustness against deformations and various kinds of noise, and reduces computational cost.

Our network checks the conditions of symmetry not directly from the oriented edges, but from a blurred version of them. The use of blur not only reduces the computational cost greatly, but also largely increases tolerance to deformation of input patterns. It is important, however, to obtain the blurred signals not directly from the input image, but from the oriented edges. Although the locations of edges become ambiguous after the blurring operation, most of the important features of the original image remain stable. If the input image itself is blurred, however, most of its important features are lost.
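The effect of blurring the edge signals can be seen in a one-dimensional toy example. The sketch below is an illustration under stated assumptions, not the network from the paper: the helper names (`oriented_edges`, `blur`, `symmetry_score`), the uniform box blur, the normalized-correlation score, and the single candidate axis at the center are all hypothetical choices (the cited paper uses a non-uniform blur). Rising and falling edges play the role of two edge orientations, and mirroring a pattern swaps them.

```python
import math


def oriented_edges(signal):
    """Split a 1-D signal into rising-edge and falling-edge channels."""
    pairs = list(zip(signal, signal[1:]))
    rising = [max(b - a, 0) for a, b in pairs]
    falling = [max(a - b, 0) for a, b in pairs]
    return rising, falling


def blur(values, radius):
    """Uniform box blur with zero padding; radius 0 leaves values unchanged."""
    n = len(values)
    width = 2 * radius + 1
    return [sum(values[max(0, i - radius):min(n, i + radius + 1)]) / width
            for i in range(n)]


def symmetry_score(signal, radius):
    """Normalized correlation between blurred edges and their mirror image.

    The candidate axis is the center of the signal. Mirroring turns falling
    edges into rising edges, so the rising channel is compared with the
    reversed falling channel.
    """
    rising, falling = oriented_edges(signal)
    a = blur(rising, radius)
    b = blur(falling, radius)[::-1]
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    symmetric = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
    deformed = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]   # left edge moved by one sample
    for name, pattern in (("symmetric", symmetric), ("deformed", deformed)):
        for radius in (0, 2):
            print(name, "radius", radius, symmetry_score(pattern, radius))
```

Without blur (radius 0), moving one edge by a single sample drops the score of the deformed bar from 1 to 0, because the sharp edge signals no longer coincide. With a blur radius of 2, the same deformed bar scores about 0.8, so the symmetry about the center is still detected.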

The robustness of the neocognitron for visual pattern recognition is also obtained by a blurring operation.

Reference

K. Fukushima: "Use of non-uniform spatial blur for image comparison: symmetry axis extraction", Neural Networks, 18[1], pp. 23-32 (2005).
http://dx.doi.org/10.1016/j.neunet.2004.08.001