PCJ--Java Library for Highly Scalable HPC and Big Data Processing
Abstract
PCJ is a Java library for scalable high-performance computing and Big Data processing. The library implements the partitioned global address space (PGAS) model. A PCJ application runs as a multi-threaded application with its threads distributed over multiple Java Virtual Machines. Each task has its own local memory in which it stores and accesses variables locally. Selected variables can be shared between tasks and can be read and modified by other tasks. The library provides methods for basic operations such as synchronizing tasks and getting and putting values in an asynchronous, one-sided way. Additionally, PCJ offers methods for creating groups of tasks, broadcasting, and monitoring variables. The library hides the details of inter- and intra-node communication, making programming easy and feasible. The PCJ library allows for easy development of highly scalable (up to 200k cores) applications running on large resources. PCJ applications can also be run on systems designed for data analytics, such as Hadoop clusters. In this case, performance is higher than that of native applications. The PCJ library fully complies with Java standards; therefore, the programmer does not have to use additional libraries that are not part of the standard Java distribution. In this paper, we present details of the PCJ library, its API, and example applications. The results show good performance and scalability. Owing to its performance and the simplicity of the resulting code, the PCJ library shows great promise for the parallelization of HPC and Big Data applications.
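The programming model summarized in the abstract (task-local memory, shared variables, one-sided asynchronous put and get, and task synchronization) can be illustrated with a minimal sketch. This sketch does not use the PCJ API. It emulates the model inside a single JVM with standard Java threads, and every name in it (`PgasSketch`, `run`, `shared`) is hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLongArray;

// Illustration of the PGAS idea only; this is NOT the PCJ API.
public class PgasSketch {

    // Starts `tasks` threads. Each task puts a local value into its right
    // neighbour's shared slot, waits at a barrier, then gets task 0's slot.
    static long[] run(int tasks) throws Exception {
        AtomicLongArray shared = new AtomicLongArray(tasks); // one shared slot per task
        CyclicBarrier barrier = new CyclicBarrier(tasks);    // stands in for task synchronization
        long[] result = new long[tasks];
        Thread[] threads = new Thread[tasks];

        for (int id = 0; id < tasks; id++) {
            final int myId = id;
            threads[id] = new Thread(() -> {
                try {
                    long local = myId * 10L;              // task-local variable
                    int neighbour = (myId + 1) % tasks;

                    // One-sided asynchronous put: the neighbour takes no action.
                    CompletableFuture<Void> put =
                            CompletableFuture.runAsync(() -> shared.set(neighbour, local));
                    put.join();                           // wait until the put has completed

                    barrier.await();                      // all puts are done past this point

                    // One-sided asynchronous get of the slot owned by task 0.
                    CompletableFuture<Long> get =
                            CompletableFuture.supplyAsync(() -> shared.get(0));
                    result[myId] = shared.get(myId) + get.join();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            threads[id].start();
        }
        for (Thread t : threads) {
            t.join();                                     // also makes result[] visible here
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(run(4)));      // prints [60, 30, 40, 50]
    }
}
```

In this sketch the barrier separates the put phase from the get phase, so every task reads a value that has already been written. In a real PGAS library the tasks may live in different JVMs on different nodes, and the library carries out the communication that the shared array stands in for here.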