A Performance Analysis of the CNS-1 on Large, Dense Backpropagation Networks
Title | A Performance Analysis of the CNS-1 on Large, Dense Backpropagation Networks |
Publication Type | Technical Report |
Year of Publication | 1993 |
Authors | Müller, S. M. |
Other Numbers | 834 |
Keywords | backpropagation, CNS, parallelization, performance analysis, run time model |
Abstract | In this study we determine the sustained performance of the CNS-1 during the training and evaluation of large multilayered feedforward neural networks. With sophisticated coding, the 128-node machine would achieve up to 111 GCPS and 22 GCUPS. During recall the machine would achieve 87% of the peak multiply-accumulate performance. The training of large nets is less efficient than recall, but only by a factor of 1.5 to 2. The benchmark is parallelized and the machine code is optimized before the performance is analyzed. Starting from an optimal parallel algorithm, CNS-specific optimizations still reduce the run time by a factor of 4 for recall and by a factor of 3 for training. Our analysis also yields some strategies for code optimization. The CNS-1 is still in design, and we therefore have to model the run-time behavior of the memory system and the interconnection network. This gives us the option of changing some parameters of the CNS-1 system in order to analyze their performance impact. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/techreports/1993/tr-93-046.pdf |
Bibliographic Notes | ICSI Technical Report TR-93-046 |
Abbreviated Authors | S. M. Müller |
ICSI Publication Type | Technical Report |