Report on the BSP Worldwide Meeting, HPCN 1996, Brussels, 17 April


Introduction to BSP

A small but select band of BSP developers and users gathered in the Mercator Room of the Palais de Congres, Brussels at 2 pm on Wednesday 17 April 1996. There had not been time to get the meeting fully integrated into the programme of the HPCN conference. The regulations controlling posting of notices within the building - official notice boards only, nothing on doors, walls or free-standing signposts - limited the impact to those who came to Brussels actively looking for the meeting site. We must learn from this how to do it better in Pittsburgh ....

The first part of the meeting was designed to introduce BSP to newcomers. I gave a short introduction to BSP to an audience that, with a few welcome exceptions, didn't need it. The talk was entitled "Frequently Asked Questions about BSP" but did not follow the question/answer format of a normal FAQ. Maybe one day we'll get round to making a proper FAQ for the BSP WW web pages - it is certainly needed.


How I use BSP

This was followed by a fascinating set of talks from BSP users.

Tim Lanfear (British Aerospace, Sowerby Research Centre, UK) extolled the simplicity of BSP and gave a powerful argument for the cost modelling approach. He described the development of an application in CFD for the Caesar Project. When the actual speed-ups on a Cray T3D were compared with the forecast produced by the BSP cost model, there was a considerable discrepancy: the performance did scale with the number of processors, but not nearly as well as expected. As Tim said, in the absence of a cost model, he would have had little option but to accept the results as a fact of life. He knew better. Using the performance profiler being developed by Jon Hill in Oxford, he was able to identify the reason for the poor performance: an inefficient reduce function. When this was re-coded, the observed speed-ups lay smack on top of the cost model predictions - indeed the figures were so close that Tim was sure that no-one would believe the story!
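As an illustration of the kind of cost modelling Tim described: in the standard BSP model the cost of a superstep is usually written w + h*g + l, where w is the largest amount of local work on any processor, h is the largest number of words sent or received by any processor, g is the per-word communication cost and l is the cost of the barrier synchronisation. The Python sketch below evaluates this formula for a sequence of supersteps. It is a minimal sketch only: the names Superstep, bsp_cost and predicted_speedup and all the numbers are invented for illustration and are not taken from Tim's application or from any Cray T3D measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Superstep:
    """One superstep, described by its two BSP cost inputs (hypothetical type)."""
    w: float  # largest local work on any processor, in flop units
    h: float  # largest number of words sent or received by any processor


def bsp_cost(supersteps, g, l):
    """Total predicted cost: sum over supersteps of w + h*g + l."""
    return sum(s.w + s.h * g + l for s in supersteps)


def predicted_speedup(sequential_work, supersteps, g, l):
    """Ratio of sequential work to the predicted parallel cost."""
    return sequential_work / bsp_cost(supersteps, g, l)


# Illustrative figures only (assumptions, not measurements).
steps = [Superstep(w=1000.0, h=10.0), Superstep(w=500.0, h=20.0)]
total = bsp_cost(steps, g=2.0, l=100.0)
speedup = predicted_speedup(3520.0, steps, g=2.0, l=100.0)
```

Comparing a prediction of this kind with measured timings is what lets a discrepancy, such as an unexpectedly expensive reduce, show up as something to investigate rather than something to accept.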

Antoine le Hyaric (Oxford Parallel, UK) described his work on implementing the NAS benchmarks in BSP. The new generation of benchmarks from NASA is to be made available in two flavours: MPI and HPF (the High Performance Fortran version has yet to appear). Previous NAS benchmarks "prescribed" the calculations in a "paper and pencil" form, leaving it to the implementor to interpret the algorithms in whatever way was best for the system being benchmarked - a freedom which was exploited in quite excessive ways. By contrast, the new set is issued in source code form and the degree of change to the code is limited: no more than 5% of the code lines can be changed if the benchmark is to be recognised. This limit is monitored automatically by the test harness and the degree of change is reported with the results. Antoine described the key communication features in each benchmark and explained how he had created a BSP version. He then showed some results comparing the performance of the BSP versions with the MPI versions on an 8-processor IBM SP2. The MPI implementation on the IBM is known to be especially efficient, so it was pleasing to see that the BSP versions performed just as well as the MPI ones: slightly better in some cases, slightly worse in others. Antoine will publish full details, including results for other architectures, in a paper which he is writing with Jon Hill ("Benchmarking BSP: predictable high-performance communication"). Bill McColl said that the conclusion we should draw from this work was that there was no performance penalty to be paid for using BSP; the gains in simplicity, avoidance of deadlock and predictability were obtained at zero cost. Rob Bisseling suggested that Antoine should develop BSP cost models of the NAS Benchmarks to assist in the interpretation of reported results. This was agreed.

Richard Miller (Miller Research Limited, UK) then gave an interesting introduction to the subject of population genetics in general and to the topic of linkage analysis, in which susceptibilities to given diseases could be traced in families. The particular program he had been working on (VITESSE: O'Connell JR; Weeks DE. The VITESSE algorithm for rapid exact multilocus linkage analysis via genotype set-recoding and fuzzy inheritance. Nature Genetics 1995 Dec;11:402-8 - Dept. of Human Genetics, Univ. of Pittsburgh) involved the calculation of sums of large numbers of products of probabilities in non-overlapping segments of the population. The calculation was embarrassingly parallel, but the natural sub-units of calculation varied considerably in size, which led to problems with load balancing. Richard described a number of approaches to the problem. In one, he randomised the distribution of work and was able to show linear speed-ups to 64 Cray T3D processors. He considered the introduction of a "BSP Superstep Timer" where the clock tick synchronisation was taken as an opportunity to balance load. He finally reached the conclusion that the best way to organise what is essentially a farming operation is to use synchronous message passing on distributed memory systems and, in the case of shared memory multiprocessors, to use locks on shared variables.
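The randomised distribution of work mentioned above can be sketched as follows. This is a generic illustration in Python, not Richard's code: the function names, the task sizes and the seed are all assumptions made for the example. Tasks of very different sizes are shuffled and then dealt out round-robin, so that no processor is systematically given the large ones.

```python
import random


def randomised_assignment(task_sizes, p, seed=0):
    """Shuffle task indices, then deal them round-robin to p processors."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    order = list(range(len(task_sizes)))
    rng.shuffle(order)
    buckets = [[] for _ in range(p)]
    for position, task in enumerate(order):
        buckets[position % p].append(task)
    return buckets


def imbalance(task_sizes, buckets):
    """Heaviest processor load divided by the mean load (1.0 is perfect)."""
    loads = [sum(task_sizes[t] for t in bucket) for bucket in buckets]
    mean = sum(loads) / len(loads)
    return max(loads) / mean


# Hypothetical task sizes with the wide spread described in the talk.
sizes = [1, 50, 3, 7, 200, 2, 9, 4, 120, 6]
buckets = randomised_assignment(sizes, p=4, seed=1)
ratio = imbalance(sizes, buckets)
```

With only a handful of tasks the imbalance can still be large; randomisation of this kind tends to work well when there are many more tasks than processors, so that the large tasks are spread out on average.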

After the refreshment break, Martin van Gijzen (Dept. of Maths, University of Utrecht, NL) described the application of BSP programming to the simulation of ocean flows using triangular finite elements on grids that were either regular (land and ocean grid points) or irregular (only ocean grid points). The heart of the calculation is a pre-conditioned conjugate gradient solution of a Poisson equation describing the propagation of pollution from a small source (e.g. in the centre of the Pacific Ocean). Martin originally programmed this calculation for a Cray T3D using the SHMEM instructions to get highly efficient communications. At the time he thought this approach, with some help from Cray Research staff, was relatively straightforward and very effective. He had subsequently been persuaded to try the Oxford BSP Library and reported that this was even easier and it had allowed him to develop the parallelisation for networks of SUN workstations on Ethernet connections. The ability to use the same simple programming interface on different systems was a major benefit.
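For readers unfamiliar with the method at the heart of this calculation, the sketch below shows a pre-conditioned conjugate gradient iteration in plain Python. It is not Martin's code and makes several simplifying assumptions: a one-dimensional Poisson matrix tridiag(-1, 2, -1) stands in for the finite element system, the pre-conditioner is a simple diagonal (Jacobi) one, and everything runs sequentially. The function names are invented for the example.

```python
def matvec_poisson_1d(x):
    """Apply the 1D Poisson matrix tridiag(-1, 2, -1) to the vector x."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(a, b):
    """Inner product; in a parallel code this is a global reduction."""
    return sum(u * v for u, v in zip(a, b))


def pcg(matvec, b, diag, tol=1e-10, max_iter=1000):
    """Conjugate gradients with a diagonal (Jacobi) pre-conditioner."""
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A*x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # pre-conditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:               # stop once the residual is small
            break
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


n = 20
rhs = [1.0] * n
solution = pcg(matvec_poisson_1d, rhs, diag=[2.0] * n)
```

In a parallel version, the matrix-vector product needs neighbour communication and each inner product needs a global reduction, which is where a library such as SHMEM or BSP would come in.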


The Development of BSP Standards

Richard Miller said there was a need to establish a more formal structure for BSP Worldwide if it was, in any sense, to act as the ratifying, regulating body for standardisation. The active debate on the merits of the proposal put up by Hill & McColl had been healthy and had led to improvements, but it was unclear how decisions on a standard were to be taken. The existence of only one proposal was not, in any case, a satisfactory basis for standardisation.

Bill McColl said that it was open to anyone to put forward alternative proposals. The debate had allowed other views to be published. The second round proposal incorporated a considered response, by an augmented team of authors, to the suggestions made in the debate. Not all of the ideas expressed in the debate had been adopted but a rationale for the choices made had been included. The current e-mail group contained the names of people with very different levels of interest in the development of standards. Many were not in a position, for example, to commit their employers to support of any given proposal but could only offer a personal view. Given that there had not, to date, been any widespread dissent to the second round proposal, perhaps the appropriate course of action was for the current proposal to be implemented in a stable and supported library and offered for trial on the widest possible basis. This would enable all interested parties, and practitioners in particular, to make more informed judgements as to its merits. This is what the BSPlib authors planned to do.

After considerable discussion it was agreed that:

1. The current BSPlib proposal will be modified in the light of any comments received by the end of April 1996 that, in the judgement of the proposal authors, improve the quality of the proposal. The revised version will then be implemented by the authors and made available to the community.

2. A proposal for a management structure for BSP Worldwide will be prepared and issued to members of bsp-all for comment and agreement. (This proposal should ensure that responsibility for, and control of, the affairs of the organisation and its publications (including web pages) do not rest by default with Oxford Parallel. The management structure should also provide the basis for taking formal decisions on behalf of the organisation (e.g. formal approval of standards for BSP).)


Next Meeting and Other Matters

Arrangements for future meetings were briefly discussed. A Tutorial Session in BSP Programming would be offered for Supercomputing'96. Members were urged to submit papers on BSP for the main conference. Oxford Parallel was attempting to set up demonstrations of BSP applications. A Birds of a Feather style meeting of BSP Worldwide would be organised.

The major objective should be to get the highest possible profile for BSP and BSP Worldwide at the conference.

Arrangements for 1997 were deferred to the next meeting.

It was agreed that a News Group of the form comp.parallel.bsp should be established and that news of BSP activities should be posted on general HPC and parallel News Groups such as comp.sys.super and comp.parallel.

Bob McLatchie

20 May 1996


For further information, please contact Bob McLatchie.


Last updated May 22, 1996
