In the process of porting the program, I found that Sather iterators have a lot of slick possibilities; I might have been able to collapse the loops a little more if I had studied the code longer, but I didn't have time.
The Sather version performed a little worse than the C++ version (about 8%), but it was competitive.
-sheldon white- :^/
sheldon@amc.com