A Performance Analysis of the CNS-1 on Large, Dense Backpropagation Networks

Title: A Performance Analysis of the CNS-1 on Large, Dense Backpropagation Networks
Publication Type: Technical Report
Year of Publication: 1993
Authors: Müller SM
Other Numbers: 834
Keywords: backpropagation, CNS, parallelization, performance analysis, run time model

In this study we determine the sustained performance of the CNS-1 during training and evaluation (recall) of large multilayered feedforward neural networks. With carefully optimized code, the 128-node machine would achieve up to 111 GCPS (giga connections per second) and 22 GCUPS (giga connection updates per second). During recall the machine would achieve 87% of its peak multiply-accumulate performance. Training large nets is less efficient than recall, but only by a factor of 1.5 to 2.

Before analyzing the performance, we parallelize the benchmark and optimize the machine code. Even starting from an optimal parallel algorithm, CNS-specific optimizations reduce the run time by a further factor of 4 for recall and a factor of 3 for training. Our analysis also yields some strategies for code optimization.

The CNS-1 is still in the design stage, so we have to model the run time behavior of the memory system and the interconnection network. This modeling also lets us vary some parameters of the CNS-1 system and analyze their impact on performance.
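The figures quoted in the abstract can be related to each other with a little arithmetic. The sketch below is only a back-of-the-envelope reading, not the report's run time model. It assumes that the 111 GCPS figure is the recall rate that corresponds to 87% of peak, and that one connection evaluated during recall costs one multiply-accumulate. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# ASSUMPTIONS (not stated in the abstract):
#   * 111 GCPS is the recall rate that corresponds to 87% of peak.
#   * One connection evaluated in recall costs one multiply-accumulate.

NODES = 128               # machine size quoted in the abstract
RECALL_GCPS = 111.0       # giga connections per second
TRAIN_GCUPS = 22.0        # giga connection updates per second
RECALL_EFFICIENCY = 0.87  # fraction of peak multiply-accumulate rate


def implied_peak_gcps(recall_gcps: float, efficiency: float) -> float:
    """Peak rate implied by a sustained rate and its efficiency."""
    return recall_gcps / efficiency


def per_node_gcps(total_gcps: float, nodes: int) -> float:
    """Even split of a machine-wide rate across all nodes."""
    return total_gcps / nodes


peak = implied_peak_gcps(RECALL_GCPS, RECALL_EFFICIENCY)
print(f"implied peak:          {peak:.1f} G multiply-accumulates/s")
print(f"implied peak per node: {per_node_gcps(peak, NODES):.3f} G/s")
print(f"recall/training rate:  {RECALL_GCPS / TRAIN_GCUPS:.2f}x")
```

Under these assumptions the raw rate ratio between recall and training is larger than the quoted efficiency gap of 1.5 to 2. That would be expected if a connection update involves more arithmetic than a single recall connection, but the abstract does not give the operation counts, so the sketch does not attempt to derive them.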

Bibliographic Notes

ICSI Technical Report TR-93-046

Abbreviated Authors

S. M. Müller

ICSI Publication Type

Technical Report