Performance evaluation of parallel computing and Big Data processing with Java and PCJ library
Abstract (EN)
In this paper, we present PCJ (Parallel Computing in Java), a novel tool for scalable high-performance computing and Big Data processing in Java. PCJ is a Java library that implements the PGAS (Partitioned Global Address Space) programming paradigm. It simplifies the development of computational applications as well as Big Data processing. The use of Java brings HPC and Big Data processing together and enables running on different types of hardware. In particular, the high scalability and good performance of PCJ applications have been demonstrated on Cray XC40 systems. We present the performance and scalability of the PCJ library measured on Cray XC40 systems with standard benchmarks such as ping-pong, broadcast, and random access. We describe the parallelization of example applications with different characteristics, including FFT and 2D stencil. Results for standard Big Data benchmarks such as word count are also presented. In all cases, the measured performance and scalability confirm that PCJ is a good tool for developing parallel applications of different types.