Chapter in conference proceedings
License

Closed access

Big Data Analytics in Java with PCJ Library: Performance Comparison with Hadoop

Authors
Nowicki, Marek
Ryczkowska, Magdalena
Górski, Łukasz
Bała, Piotr
Publication date
2018-03-23
Abstract (EN)

This article presents Big Data analytics using Java and the PCJ library. PCJ is an award-winning library for developing parallel codes in the PGAS programming paradigm. It can be used for easy implementation of different algorithms, including those used for Big Data processing. In this paper, we present performance results for standard benchmarks covering different types of applications, from computationally intensive, through traditional map-reduce, up to communication-intensive. The performance is compared to that achieved on the same hardware using Hadoop. The PCJ implementation has been used with both the local file system and HDFS. Code written with PCJ can be developed much faster, as it requires fewer libraries. Our results show that applications developed with the PCJ library are much faster than the Hadoop implementations.

Keywords (EN)
Big Data
Java
Parallel computing
Hadoop
PBN discipline
computer science
Series title
Lecture Notes in Computer Science
Monograph title
Parallel Processing and Applied Mathematics: 12th International Conference, PPAM 2017, Lublin, Poland, September 10-13, 2017, Revised Selected Papers, Part II / ed. Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski
Conference edition name
12th International Conference on Parallel Processing and Applied Mathematics
Pages (from-to)
318-327
Publisher (ministerial list)
Springer
ISBN
978-3-319-78053-5
Open access license
Closed access