Novel Supercomputing Approaches for High Performance Linear Algebra Using FPGAs Project

Published By National Aeronautics and Space Administration

Issued almost 10 years ago

US
beta

Summary

Type of release
a service or API for accessing open data

Data Licence
Not Applicable

Content Licence
Creative Commons CCZero

Verification
automatically awarded

Description

We propose to develop novel FPGA-based algorithmic technology that will enable unprecedented computational power for the solution of large sparse systems of linear equations. In Phase I, we will develop a prototype of a non-von-Neumann linear equation solver equipped with our technology, and demonstrate an intermediate milestone for its speedup and performance gains using at least two of the CFD problems in the NAS benchmark. Phase I will also deliver a clear technology roadmap of the algorithmic and architectural innovations needed to bring the project to success by the end of Phase II.

We identify four areas that are critical to the success of an FPGA-based non-von-Neumann system within a von-Neumann-based supercomputing environment: (1) portability; (2) ease of use; (3) algorithmic speed balance between von-Neumann and non-von-Neumann components; and (4) communication speed. We propose innovative architectural and algorithmic methods aimed at boosting system effectiveness in each of the four areas. In particular, we propose "portability wrappers" to enable wide portability at both the hardware and software levels; software drivers in the form of an API for ease of use from a C and/or Fortran environment; innovative reconfigurable computing algorithms and bit structure optimizations suited to the LU factorization problem for speed; and a novel algorithmic technique within the reconfigurable computing paradigm that effectively eliminates, for the LU factorization problem, the communication bottleneck typical of multi-system distributed algorithms.

The performance attainable with a single FPGA will be comparable to that of a 1,000-node commodity cluster, while exhibiting reductions of one to two orders of magnitude in both cost and power consumption.
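For readers unfamiliar with the LU factorization problem named in the description, the following is a minimal software sketch of how a linear system Ax = b is solved by factoring A and then substituting. It is a dense, pure-Python illustration with partial pivoting and is not the project's FPGA algorithm, which the description does not detail; the function names `lu_factor` and `lu_solve` are hypothetical and chosen only for this example.

```python
def lu_factor(a):
    """Factor a square matrix (list of row lists) in place as P*A = L*U.

    Returns (lu, perm): lu holds U on and above the diagonal and the
    multipliers of L below it (L has an implicit unit diagonal);
    perm[i] is the original index of the row now at position i.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: use the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate column k below the pivot, storing the multipliers.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A*x = b given the factorization returned by lu_factor."""
    n = len(lu)
    # Forward substitution: L*y = P*b.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Back substitution: U*x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x


if __name__ == "__main__":
    A = [[4.0, 3.0], [6.0, 3.0]]
    b = [10.0, 12.0]
    lu, perm = lu_factor(A)
    print(lu_solve(lu, perm, b))
```

The triple loop in `lu_factor` costs on the order of n^3 operations for a dense n-by-n matrix. Sparse solvers of the kind the description targets avoid most of that work by exploiting the zero structure of A, which this sketch does not attempt.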