Parallel Optimizations: Advanced Constructs and Compiler Optimizations for a Parallel, Object Oriented, Shared Memory Language Running on a Distributed System

Title: Parallel Optimizations: Advanced Constructs and Compiler Optimizations for a Parallel, Object Oriented, Shared Memory Language Running on a Distributed System
Publication Type: Technical Report
Year of Publication: 1997
Authors: Fleiner, C.
Other Numbers: 1081
Abstract

Today's processors provide more and more processing power, yet there are many applications whose processing demands cannot be met by a single processor in the near future. Moreover, the demand for processing power seems to increase at least as fast as the speed of new processors, and the only way to complete such calculation-intensive programs is to execute them on many processors at once. The history of parallel computers over the last several years suggests that the distributed, parallel computer model will gain widespread acceptance as the most important one. In this model a computer consists of several nodes, each with its own processors and memory. As such a computer does not offer one global memory space, but rather a separate memory per node (distributed memory), it is no longer possible to use the shared memory programming paradigm directly. However, as it is generally easier to program with shared memory than with message-based communication, several new languages and language extensions that simulate shared memory have been suggested. Such a parallel, distributed language not only has to provide special support for managing parallelism and synchronization; its specification and implementation also have to address the issue of distributed memory. One of the most important issues is the selection of the memory consistency model, which defines when writes by one node are observed by the other nodes of the distributed computer. Many vital optimizations used by compilers for serial languages are not possible if the memory model is too restrictive, but a weaker memory model makes the language harder to use. This thesis discusses several problems and solutions for such languages. It uses pSather, an object-oriented, parallel language developed at the International Computer Science Institute in Berkeley, as an example.
A very flexible synchronization construct, together with several implementations of it, is introduced; it allows the user to define new synchronization primitives and avoids deadlock and starvation in many common cases. Several memory consistency models and their implications for programmers and the compiler, especially regarding optimizations, are discussed. The implementation and effect of several optimizations (adaptations of optimizations used in serial compilers as well as special parallel optimizations) are shown, and their effect is measured using test programs written in pSather. The results clearly indicate that a weaker memory model is necessary to achieve the desired efficiency and speedup, even though the language becomes less convenient to use. However, pSather offers constructs that solve some of these problems.

URL: http://www.icsi.berkeley.edu/ftp/global/pub/techreports/1997/tr-97-014.pdf
Bibliographic Notes

ICSI Technical Report TR-97-014

Abbreviated Authors

C. Fleiner

ICSI Publication Type

Technical Report