In 1997, we started the GRAPE-6 project. It is a five-year project funded by JSPS (Japan Society for the Promotion of Science), with a planned total budget of about 500 million yen.
GRAPE-6 is essentially a scaled-up version of GRAPE-4 (Makino et al. ), with a peak speed exceeding 100 Tflops. It will consist of around 3000 pipeline chips, each with a peak speed of 40 Gflops. In comparison, GRAPE-4 consists of 1700 pipeline chips, each with a peak speed of 600 Mflops. The factor-of-100 increase in speed is achieved by integrating six pipelines into one chip (the GRAPE-4 chip has a single pipeline, which needs three cycles to calculate the force from one particle) and by using a clock frequency 3 to 4 times higher. Advances in device technology (from to ) made these improvements possible. Figure 4 shows the first sample of the processor chip, delivered in early 1999. The six pipeline units are visible.
Figure 4: The GRAPE-6 processor chip.
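The peak-speed figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original design documentation: the function and variable names are hypothetical, and the per-chip estimate assumes that each GRAPE-6 pipeline evaluates one interaction per clock cycle, which the text implies but does not state explicitly.

```python
# Figures quoted in the text. The clock ratio is only given as a range (3 to 4).
GRAPE6_CHIPS = 3000          # "around 3000 pipeline chips"
GRAPE6_CHIP_GFLOPS = 40.0
GRAPE4_CHIPS = 1700
GRAPE4_CHIP_GFLOPS = 0.6     # 600 Mflops


def total_tflops(chips, gflops_per_chip):
    """Aggregate peak speed in Tflops for a given chip count."""
    return chips * gflops_per_chip / 1000.0


def per_chip_gain(pipelines, old_cycles_per_interaction, clock_ratio):
    """Estimated per-chip speedup over GRAPE-4.

    Assumption (not stated in the text): each new pipeline handles one
    interaction per cycle, so removing the three-cycle cost is a factor of 3.
    """
    return pipelines * old_cycles_per_interaction * clock_ratio


grape6_peak = total_tflops(GRAPE6_CHIPS, GRAPE6_CHIP_GFLOPS)
grape4_peak = total_tflops(GRAPE4_CHIPS, GRAPE4_CHIP_GFLOPS)
system_ratio = grape6_peak / grape4_peak

gain_low = per_chip_gain(6, 3, 3.0)
gain_high = per_chip_gain(6, 3, 4.0)
quoted_chip_ratio = GRAPE6_CHIP_GFLOPS / GRAPE4_CHIP_GFLOPS

print(f"GRAPE-6 peak: {grape6_peak:.0f} Tflops, GRAPE-4 peak: {grape4_peak:.2f} Tflops")
print(f"System-level ratio: {system_ratio:.0f}")
print(f"Per-chip gain estimate: {gain_low:.0f} to {gain_high:.0f}, quoted ratio: {quoted_chip_ratio:.0f}")
```

Under these assumptions the quoted per-chip ratio falls inside the estimated range, and the larger chip count accounts for the rest of the roughly hundredfold system-level increase.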
Figures 5 and 6 show the processor board with 16 processor chips and the prototype four-board system, respectively. This four-board system has a theoretical peak speed of 2.1 Tflops and has achieved a sustained speed of 1.1 Tflops for the simulation of a one-million-body system.
Figure 5: The GRAPE-6 processor board with 16 processor chips. Each module carries two processor chips and four memory chips, and one board houses eight modules.
Figure 6: The prototype system with four processor boards.
GRAPE-6 will be completed by the year 2001. We plan to make a small version of GRAPE-6 (with a peak speed of around one Tflops) commercially available by that time. We have found that the commercial availability of small machines is essential to maximize the scientific outcome from GRAPE hardware.
Compared to GRAPE-4, GRAPE-6 will give us 100 times more computing power. For simulations of star clusters over a relaxation timescale, this means a factor of five increase in the number of particles we can handle. For short simulations, the increase would be a factor of 10. In cases where we can use tree algorithms, a factor of 100 increase is possible in principle, provided the host computer has sufficiently large memory. Table 1 gives a rough idea of what is currently possible with GRAPE-4 and what will soon be possible with GRAPE-6.
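The three factors quoted above follow from how the computing cost grows with the particle number N. The sketch below is an illustration under stated assumptions rather than a calculation taken from the text: it assumes the cost scales roughly as N^3 for runs over a relaxation timescale, as N^2 for short direct-summation runs, and as N for tree algorithms, with logarithmic factors ignored. The function and dictionary names are hypothetical.

```python
def particle_gain(speedup, cost_exponent):
    """Factor by which N can grow at fixed wall-clock time.

    Assumes cost ~ N**cost_exponent, so a machine `speedup` times faster
    can handle N larger by speedup**(1/cost_exponent).
    """
    return speedup ** (1.0 / cost_exponent)


SPEEDUP = 100.0  # GRAPE-6 relative to GRAPE-4, as quoted in the text

# Assumed cost exponents (not given in the text; logarithmic factors dropped).
ASSUMED_EXPONENTS = {
    "relaxation timescale": 3.0,  # N^2 per crossing time, ~N crossing times
    "short simulation": 2.0,      # direct summation over a fixed time span
    "tree algorithm": 1.0,        # N log N, treated here as linear
}

gains = {name: particle_gain(SPEEDUP, exponent)
         for name, exponent in ASSUMED_EXPONENTS.items()}

for name, gain in gains.items():
    print(f"{name}: N can grow by a factor of about {gain:.1f}")
```

With these exponents the gains come out at about 4.6, 10, and 100, in line with the quoted factors of five, 10, and 100. Restoring the logarithmic factors would change the numbers slightly.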
Table 1: Number of particles feasible in simulations on GRAPE-4 and GRAPE-6