If you prefer to download a single file with all LARCC references, you can find it at this link. You can also follow new publications via RSS.
Additionally, you can find the publications on the LARCC profile on Google Scholar.
2023
Fim, Gabriel Rustick; Griebler, Dalvan. Implementação e Avaliação do Paralelismo de Flink nas Aplicações de Processamento de Log e Análise de Cliques. In: Anais da XXIII Escola Regional de Alto Desempenho da Região Sul, pp. 69-72, Sociedade Brasileira de Computação, Porto Alegre, Brazil, 2023. DOI: 10.5753/eradrs.2023.229290. Tags: Parallel programming, Stream processing.
Abstract: This work implemented and evaluated the parallelism of the Log Processing and Click Analysis applications on Apache Flink, comparing their performance with Apache Storm in a distributed computing environment. The results show that the Flink execution consumes relatively fewer resources than the Storm execution, but it exhibits a high standard deviation, exposing load imbalance in executions where some component of the application is replicated.
Andrade, Gabriella; Griebler, Dalvan; Santos, Rodrigo; Fernandes, Luiz Gustavo. A parallel programming assessment for stream processing applications on multi-core systems. Computer Standards & Interfaces, 84, pp. 103691, Elsevier, 2023. DOI: 10.1016/j.csi.2022.103691. Tags: Parallel programming, Stream processing.
Abstract: Multi-core systems are present in virtually every computing device nowadays, and stream processing applications are becoming recurrent workloads, demanding parallelism to achieve the desired quality of service. As soon as data, tasks, or requests arrive, they must be computed, analyzed, or processed. Since building such applications is not a trivial task, the software industry must adopt parallel APIs (Application Programming Interfaces) that simplify the exploitation of parallelism in hardware to accelerate time-to-market. In recent years, research efforts in academia and industry have provided a set of parallel APIs, increasing the productivity of software developers. However, few studies seek to prove the usability of these interfaces. In this work, we present a parallel programming assessment of the usability of parallel APIs for expressing parallelism in the stream processing application domain on multi-core systems. To this end, we conducted an empirical study with beginners in parallel application development. The study covered three parallel APIs, reporting several quantitative and qualitative indicators involving the developers. Our contribution also comprises a parallel programming assessment methodology, which can be replicated in future assessments. This study revealed important insights, such as recurrent compile-time and programming-logic errors made by beginners in parallel programming, as well as the programming effort, challenges, and learning curve. Moreover, we collected the participants' opinions about their experience in this study to understand the results in depth.
Vogel, Adriano; Danelutto, Marco; Griebler, Dalvan; Fernandes, Luiz Gustavo. Revisiting self-adaptation for efficient decision-making at run-time in parallel executions. In: 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP'23), pp. 43-50, IEEE, Naples, Italy, 2023. DOI: 10.1109/PDP59025.2023.00015. Tags: Parallel programming, Self-adaptation.
Abstract: Self-adaptation is a potential alternative for providing a higher level of autonomic abstractions and run-time responsiveness in parallel executions. However, the recurrent problem is that self-adaptation is still limited in flexibility and efficiency. For instance, there is a lack of mechanisms for applying adaptation actions and of efficient decision-making strategies to decide which configurations should be enforced at run-time. In this work, we are interested in providing and evaluating the potential abstractions achievable with self-adaptation transparently managing parallel executions. Therefore, we provide a new mechanism to support self-adaptation in applications with multiple parallel stages executed on multi-cores. Moreover, we reproduce, reimplement, and evaluate an existing decision-making strategy in our scenario. The results show that the proposed mechanism for self-adaptation can provide new parallelism abstractions and autonomous responsiveness at run-time. On the other hand, more accurate decision-making strategies are needed to enable efficient executions of applications in resource-constrained scenarios such as multi-cores.
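The monitor-analyze-act cycle that such self-adaptive mechanisms implement can be pictured with a minimal C++ sketch. Everything below is illustrative: the Stage type, set_replicas(), and the throughput metric are hypothetical stand-ins, not the mechanism or API proposed in the paper.

    #include <atomic>
    #include <chrono>
    #include <thread>

    // Hypothetical parallel stage whose degree of parallelism can be changed
    // at run-time; measured_throughput() is a stub standing in for real metrics.
    struct Stage {
        std::atomic<int> replicas{1};
        double measured_throughput() const { return 1000.0 * replicas; }
        void set_replicas(int n) { replicas = n; }
    };

    // MAPE-style loop: monitor the metric, compare it against a target,
    // and scale the number of stage replicas up or down accordingly.
    void self_adaptation_loop(Stage& stage, double target, std::atomic<bool>& running) {
        while (running) {
            double current = stage.measured_throughput();          // monitor
            if (current < 0.9 * target)                            // analyze/plan
                stage.set_replicas(stage.replicas + 1);            // act: scale up
            else if (current > 1.2 * target && stage.replicas > 1)
                stage.set_replicas(stage.replicas - 1);            // act: scale down
            std::this_thread::sleep_for(std::chrono::seconds(1));  // sampling period
        }
    }

    int main() {
        Stage stage;
        std::atomic<bool> running{true};
        std::thread adapter(self_adaptation_loop, std::ref(stage), 5000.0, std::ref(running));
        std::this_thread::sleep_for(std::chrono::seconds(5));
        running = false;
        adapter.join();
    }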
Araujo, Gabriell; Griebler, Dalvan; Rockenbach, Dinei A.; Danelutto, Marco; Fernandes, Luiz Gustavo. NAS Parallel Benchmarks with CUDA and Beyond. Software: Practice and Experience, 53(1), pp. 53-80, Wiley, 2023. DOI: 10.1002/spe.3056. Tags: Benchmark, GPGPU, Parallel programming.
Abstract: NAS Parallel Benchmarks (NPB) is a standard benchmark suite used in the evaluation of parallel hardware and software. Several research efforts from academia have made these benchmarks available with different parallel programming models beyond the original versions with OpenMP and MPI. This work joins these research efforts by providing a new CUDA implementation for NPB. Our contribution covers different aspects beyond the implementation. First, we define design principles based on the best programming practices for GPUs and apply them to each benchmark using CUDA. Second, we provide easy-to-use parametrization support for configuring the number of threads per block in our version. Third, we conduct a broad study on the impact of the number of threads per block in the benchmarks. Fourth, we propose and evaluate five strategies for helping to find a better threads-per-block configuration. The results have revealed relevant performance improvements solely by changing the number of threads per block, ranging from 8% up to 717% among the benchmarks. Fifth, we conduct a comparative analysis with the literature, evaluating performance, memory consumption, the code refactoring required, and the parallelism implementations. The performance results have shown up to 267% improvement over the best benchmark versions available. We also observe the best and worst design choices concerning code size and the performance trade-off. Lastly, we highlight the challenges of implementing parallel CFD applications for GPUs and how the computations impact the GPU's behavior.
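To make the threads-per-block knob concrete, here is a minimal sketch of the standard CUDA launch-configuration arithmetic, written in plain C++ (the actual kernels are CUDA; this shows only the ceiling-division logic, and the problem size and candidate values are illustrative, not taken from NPB):

    #include <cstdio>

    // Given a problem size n and a candidate threads-per-block value, compute
    // the number of blocks needed so every element is covered (ceiling division).
    unsigned blocks_for(unsigned n, unsigned threads_per_block) {
        return (n + threads_per_block - 1) / threads_per_block;
    }

    int main() {
        unsigned n = 1u << 20; // 1M elements, illustrative size
        for (unsigned tpb : {32u, 64u, 128u, 256u, 512u, 1024u})
            std::printf("threads/block=%4u -> blocks=%u\n", tpb, blocks_for(n, tpb));
    }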
2022
Löff, Júnior; Hoffmann, Renato Barreto; Griebler, Dalvan; Fernandes, Luiz Gustavo. Combining stream with data parallelism abstractions for multi-cores. Journal of Computer Languages, 73, pp. 101160, Elsevier, 2022. DOI: 10.1016/j.cola.2022.101160. Tags: Parallel programming, Stream processing.
Abstract: Stream processing applications have seen an increasing demand with the raised availability of sensors, IoT devices, and user data. Modern systems can generate millions of data items per day that must be processed in a timely manner. To deal with this demand, application programmers must consider parallelism to exploit the maximum performance of the underlying hardware resources. In this work, we introduce improvements to stream processing applications by exploiting fine-grained data parallelism (via Map and MapReduce) inside coarse-grained stream parallelism stages. The improvements include techniques for identifying data parallelism in sequential codes, a new language, semantic analysis, and a set of definition and transformation rules to perform source-to-source parallel code generation. Moreover, we investigate the feasibility of employing higher-level programming abstractions to support the proposed optimizations. To that end, we elect the SPar programming model as a use case and extend it by adding two new attributes to its language and implementing our optimizations as a new algorithm in the SPar compiler. We conducted a set of experiments on representative stream processing and data-parallel applications. The results showed that our new compiler algorithm is efficient and that performance improved by up to 108.4x in data-parallel applications. Furthermore, experiments evaluating stream processing applications towards the composition of stream and data parallelism revealed new insights. The results showed that such composition may improve latencies by up to an order of magnitude. Also, it enables programmers to exploit different degrees of stream and data parallelism to strike a balance between throughput and latency according to their needs.
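For context, SPar expresses stream parallelism through standard C++ attributes. Below is a minimal sketch in the syntax from the published SPar papers (ToStream, Stage, Input, Output, Replicate); the two new data-parallelism attributes this paper introduces are not reproduced here, and compute() is a placeholder. Without the SPar compiler, a regular C++ compiler simply ignores these attributes.

    #include <vector>

    int compute(int x) { return x * 2; }   // placeholder stage logic

    int main() {
        std::vector<int> results;
        int item = 0;
        [[spar::ToStream]] while (item < 100) {   // stream region: generator code
            item++;
            int value = item;
            [[spar::Stage, spar::Input(value), spar::Output(value), spar::Replicate(4)]]
            { value = compute(value); }           // replicated middle stage
            [[spar::Stage, spar::Input(value)]]
            { results.push_back(value); }         // sink stage
        }
        return results.size() == 100 ? 0 : 1;
    }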
Andrade, Gabriella; Griebler, Dalvan; Santos, Rodrigo; Fernandes, Luiz Gustavo. Opinião de Brasileiros Sobre a Produtividade no Desenvolvimento de Aplicações Paralelas. In: Anais do XXII Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD), pp. 276-287, SBC, Florianópolis, Brazil, 2022. DOI: 10.5753/wscad.2022.226392. Tags: Parallel programming.
Abstract: With the popularization of parallel architectures, several programming interfaces have emerged to ease the exploitation of such architectures and increase developer productivity. However, developing parallel applications is still a complex task for developers with little experience. In this work, we conducted a survey to discover the opinion of parallel application developers about the factors that hinder productivity. Our results showed that the developers' experience is one of the main reasons for low productivity. Furthermore, the results indicated ways to work around this problem, such as improving and encouraging the teaching of parallel programming in undergraduate courses.
Rockenbach, Dinei A.; Löff, Júnior; Araujo, Gabriell; Griebler, Dalvan; Fernandes, Luiz G. High-Level Stream and Data Parallelism in C++ for GPUs. In: XXVI Brazilian Symposium on Programming Languages (SBLP'22), pp. 41-49, ACM, Uberlândia, Brazil, 2022. DOI: 10.1145/3561320.3561327. Tags: GPGPU, Parallel programming, Stream processing.
Abstract: GPUs are massively parallel processors that allow solving problems that are not viable on traditional processors such as CPUs. However, implementing applications for GPUs is challenging for programmers, as it requires parallel programming to efficiently exploit the GPU resources. In this sense, parallel programming abstractions, notably domain-specific languages, are fundamental for improving programmability. SPar is a high-level Domain-Specific Language (DSL) that allows expressing stream and data parallelism in serial code through annotations using C++ attributes. This work elaborates on a methodology and tool for GPU code generation by introducing new attributes to the SPar language and transformation rules to the SPar compiler. These new contributions, besides the gains in simplicity and code reduction compared to CUDA and OpenCL, enabled SPar to achieve higher throughput when exploiting combined CPU and GPU parallelism and when using batching.
Andrade, Gabriella; Griebler, Dalvan; Santos, Rodrigo; Kessler, Christoph; Ernstsson, August; Fernandes, Luiz Gustavo. Analyzing Programming Effort Model Accuracy of High-Level Parallel Programs for Stream Processing. In: 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA'22), pp. 229-232, IEEE, Gran Canaria, Spain, 2022. DOI: 10.1109/SEAA56994.2022.00043. Tags: Parallel programming, Stream processing.
Abstract: Over the years, several Parallel Programming Models (PPMs) have supported the abstraction of programming complexity for parallel computer systems. However, few studies aim to evaluate the productivity reached by such abstractions, since this is a complex task that involves human beings. In software engineering, there have been several studies developing predictive methods to estimate the effort required to program applications. In order to evaluate the reliability of such metrics, it is necessary to assess their accuracy in different programming domains. In this work, we used data from an experiment conducted with beginners in parallel programming to determine the effort required to implement stream parallelism using FastFlow, SPar, and TBB. Our results show that some traditional software effort estimation models, such as COCOMO II, fall short, while Putnam's model could be an alternative for evaluating high-level PPMs. To overcome the limitations of existing models, we plan to create a parallelism-aware model to evaluate applications in this domain in future work.
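For reference, the textbook forms of the two estimation models compared above are given below (standard formulations from the software engineering literature with the usual calibrated default constants, not reproduced from the paper):

    % COCOMO II effort estimate, in person-months, for Size in KSLOC,
    % with effort multipliers EM_i and scale factors SF_j
    PM = A \cdot \mathrm{Size}^{E} \cdot \prod_{i} EM_i, \qquad E = B + 0.01 \sum_{j} SF_j
    % calibrated defaults: A \approx 2.94, B \approx 0.91

    % Putnam's software equation, relating size to total effort K (person-years)
    % and development time t_d (years) through a productivity parameter C_k
    \mathrm{Size} = C_k \cdot K^{1/3} \cdot t_d^{4/3}

COCOMO II scales effort super-linearly with size through the scale factors and effort multipliers, while Putnam's model trades effort against schedule, which is one reason the two models can rank high-level parallel programming abstractions differently.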
Fim, Gabriel; Welter, Greice; Löff, Júnior; Griebler, Dalvan. Compressão de Dados em Clusters HPC com Flink, MPI e SPar. In: Anais da XXII Escola Regional de Alto Desempenho da Região Sul, pp. 29-32, Sociedade Brasileira de Computação, Curitiba, Brazil, 2022. DOI: 10.5753/eradrs.2022.19153. Tags: Parallel programming, Stream processing.
Abstract: This work evaluates the performance of the Bzip2 data compression algorithm with the stream processing tools Apache Flink, MPI, and SPar on a Beowulf cluster. The results show that the best-performing versions relative to the sequential time are MPI and SPar, with speed-ups of 7.6 and 7.2, respectively.
Hoffmann, Renato Barreto; Löff, Júnior; Griebler, Dalvan; Fernandes, Luiz Gustavo. OpenMP as runtime for providing high-level stream parallelism on multi-cores. The Journal of Supercomputing, 78(1), pp. 7655-7676, Springer, 2022. DOI: 10.1007/s11227-021-04182-9. Tags: Parallel programming, Stream processing.
Abstract: OpenMP is an industry and academic standard for parallel programming. However, using it to develop parallel stream processing applications is complex and challenging, since OpenMP lacks key programming mechanisms and abstractions for this particular domain. To tackle this problem, we used a high-level parallel programming framework (named SPar) for automatically generating parallel OpenMP code. We achieved this by leveraging SPar's language and its domain-specific code annotations to simplify the complexity and verbosity that OpenMP adds in this application domain. Consequently, we implemented a new compiler algorithm in SPar for automatically generating parallel code targeting the OpenMP runtime using source-to-source code transformations. Experiments on four different stream processing applications demonstrated that the execution time of SPar was improved by up to 25.42% when using the OpenMP runtime. Additionally, our abstraction over OpenMP introduced at most 1.72% execution time overhead compared to handwritten parallel codes. Furthermore, SPar significantly reduces the total source lines of code required to express parallelism with respect to plain OpenMP parallel codes.
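To give a flavor of the target runtime, below is a minimal sketch of stream items being processed as plain OpenMP tasks. It is illustrative only: stage1() and stage2() are placeholders, and this is not the code the SPar compiler actually generates, which also handles item ordering and stage separation.

    #include <cstdio>

    int stage1(int x) { return x + 1; }  // placeholder pipeline stage
    int stage2(int x) { return x * 2; }  // placeholder pipeline stage

    int main() {
        // One thread generates stream items; each item becomes an OpenMP task
        // executed by the thread team (compile with -fopenmp).
        #pragma omp parallel
        #pragma omp single
        for (int item = 0; item < 8; item++) {
            #pragma omp task firstprivate(item)
            {
                int result = stage2(stage1(item));
                std::printf("item %d -> %d\n", item, result);
            }
        }
        return 0;
    }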
Löff, Júnior; Hoffmann, Renato Barreto; Pieper, Ricardo; Griebler, Dalvan; Fernandes, Luiz Gustavo. DSParLib: A C++ Template Library for Distributed Stream Parallelism. International Journal of Parallel Programming, 50(5), pp. 454-485, Springer, 2022. DOI: 10.1007/s10766-022-00737-2. Tags: Distributed computing, Parallel programming.
Abstract: Stream processing applications deal with millions of data items continuously generated over time. Often, they must be processed in real time with scalable performance, which requires the use of distributed parallel computing resources. In C/C++, the current state-of-the-art for distributed architectures and High-Performance Computing is the Message Passing Interface (MPI). However, exploiting stream parallelism using MPI is complex and error-prone because it exposes many low-level details to the programmer. In this work, we introduce a new parallel programming abstraction for implementing distributed stream parallelism named DSParLib. Our abstraction of MPI simplifies parallel programming by providing pattern-based and building-block-oriented development to interconnect, model, and parallelize the data streams found in modern applications. Experiments conducted with five different stream processing applications and the representative PARSEC Ferret benchmark revealed that DSParLib is efficient and flexible. Also, DSParLib achieved similar or better performance, required less coding, and provided simpler abstractions to express parallelism with respect to handwritten MPI programs.
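To illustrate the kind of low-level MPI code DSParLib abstracts away, here is a minimal two-process emitter/worker sketch (illustrative only; real stream applications also need serialization, demand-based scheduling, and richer end-of-stream handling, which a library layer would provide):

    #include <mpi.h>
    #include <cstdio>

    // Run with: mpirun -np 2 ./a.out
    // Rank 0 emits stream items; rank 1 processes them until a stop marker.
    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        const int STOP = -1;
        if (rank == 0) {                              // emitter stage
            for (int item = 0; item < 10; item++)
                MPI_Send(&item, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Send(&STOP, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {                       // worker stage
            int item;
            while (true) {
                MPI_Recv(&item, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                if (item == STOP) break;
                std::printf("processed %d\n", item * item);
            }
        }
        MPI_Finalize();
    }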
2021
Löff, Júnior; Griebler, Dalvan; Mencagli, Gabriele; de Araujo, Gabriell; Torquati, Massimo; Danelutto, Marco; Fernandes, Luiz Gustavo. The NAS parallel benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures. Future Generation Computer Systems, 125, pp. 743-757, Elsevier, 2021. DOI: 10.1016/j.future.2021.07.021. Tags: Benchmark, Parallel programming.
Abstract: The NAS Parallel Benchmarks (NPB), originally implemented mostly in Fortran, is a consolidated suite containing several benchmarks extracted from Computational Fluid Dynamics (CFD) models. The benchmark suite has important characteristics such as intensive memory communication, complex data dependencies, different memory access patterns, and hardware component/sub-system overload. Parallel programming APIs, libraries, and frameworks written in C++, as well as new optimizations and parallel processing techniques, can benefit if NPB is made fully available in this programming language. In this paper we present NPB-CPP, a fully C++ translated version of NPB consisting of all the NPB kernels and pseudo-applications developed using the OpenMP, Intel TBB, and FastFlow parallel frameworks for multicores. The design of NPB-CPP leverages the Structured Parallel Programming methodology (essentially based on parallel design patterns). We show the structure of each benchmark application in terms of the composition of a few patterns (notably Map and MapReduce constructs) provided by the selected C++ frameworks. The experimental evaluation shows the accuracy of NPB-CPP with respect to the original NPB source code. Furthermore, we carefully evaluate the parallel performance on three multi-core systems (Intel, IBM Power, and AMD) with different C++ compilers (gcc, icc, and clang), discussing the performance differences to give researchers useful insights for choosing the best parallel programming framework for a given type of problem.
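As an illustration of the MapReduce construct mentioned above, here is a minimal Intel TBB sketch computing a sum of squares (illustrative code, not an excerpt from NPB-CPP):

    #include <tbb/parallel_reduce.h>
    #include <tbb/blocked_range.h>
    #include <functional>
    #include <vector>
    #include <cstdio>

    int main() {
        std::vector<double> v(1000, 1.5);
        // Map: square each element; Reduce: sum the partial results.
        double sum = tbb::parallel_reduce(
            tbb::blocked_range<size_t>(0, v.size()), 0.0,
            [&](const tbb::blocked_range<size_t>& r, double acc) {
                for (size_t i = r.begin(); i != r.end(); ++i)
                    acc += v[i] * v[i];        // map + local reduction
                return acc;
            },
            std::plus<double>());              // combine partial sums
        std::printf("sum = %f\n", sum);
    }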
Pieper, Ricardo; Löff, Júnior; Hoffmann, Renato Barreto; Griebler, Dalvan; Fernandes, Luiz Gustavo. High-level and Efficient Structured Stream Parallelism for Rust on Multi-cores. Journal of Computer Languages, Elsevier, 2021. Tags: Parallel programming, Stream processing.
Abstract: This work aims at contributing a structured parallel programming abstraction for Rust in order to provide ready-to-use parallel patterns that abstract low-level and architecture-dependent details from application programmers. We focus on stream processing applications running on shared-memory multi-core architectures (e.g., video processing, compression, and others). Therefore, we provide a new high-level and efficient parallel programming abstraction for expressing stream parallelism, named Rust-SSP. We also created a new stream benchmark suite for Rust that represents real-world scenarios and has different application characteristics and workloads. Our benchmark suite is an initiative to assess existing parallelism abstractions for this domain, as parallel implementations using these abstractions are provided. The results revealed that Rust-SSP achieved up to 41.1% better performance than other solutions. In terms of programmability, the results revealed that Rust-SSP requires the smallest number of extra lines of code to enable stream parallelism.
Maliszewski, Anderson M.; Vogel, Adriano; Griebler, Dalvan; Schepke, Claudio; Navaux, Philippe. Ambiente de Nuvem Computacional Privada para Teste e Desenvolvimento de Programas Paralelos. In: Charão, Andrea; Serpa, Matheus (Eds.): Minicursos da XXI Escola Regional de Alto Desempenho da Região Sul, chapter 5, pp. 104-128, Sociedade Brasileira de Computação (SBC), Porto Alegre, 2021. DOI: 10.5753/sbc.6150.4. Tags: Cloud computing, Parallel programming.
Abstract: High-performance computing typically relies on computer clusters to run parallel applications. Alternatively, cloud computing offers distributed computational resources for processing at a level of abstraction beyond the traditional one, dynamically and on demand. This chapter introduces basic concepts, presents the essentials for deploying a private cloud, and demonstrates the benefits for developing and testing parallel programs in the cloud.
Vogel, Adriano; Mencagli, Gabriele; Griebler, Dalvan; Danelutto, Marco; Fernandes, Luiz Gustavo. Online and Transparent Self-adaptation of Stream Parallel Patterns. Computing, 105(5), pp. 1039-1057, Springer, 2021. DOI: 10.1007/s00607-021-00998-8. Tags: Parallel programming, Self-adaptation, Stream processing.
Abstract: Several real-world parallel applications are becoming more dynamic and long-running, demanding online (run-time) adaptations. Stream processing is a representative scenario that computes data items arriving in real time and where parallel executions are necessary. However, it is challenging for humans to continuously monitor and manually optimize complex and long-running parallel executions. Moreover, although high-level and structured parallel programming aims to facilitate parallelism, several issues still need to be addressed to improve the existing abstractions. In this paper, we extend self-adaptiveness to support autonomous and online changes of parallel pattern compositions. Online self-adaptation is achieved with an online profiler that characterizes the applications, combined with a new self-adaptive strategy and a model for smooth transitions on reconfigurations. The solution provides a new abstraction layer that enables application programmers to define non-functional requirements instead of hand-tuning complex configurations. Hence, we contribute additional abstractions and flexible self-adaptation for responsiveness at run-time. The proposed solution is evaluated with applications having different processing characteristics, workloads, and configurations. The results show that it is possible to provide additional abstractions, flexibility, and responsiveness while achieving performance comparable to the best static-configuration executions.
Vanzan, Anthony; Fim, Gabriel; Welter, Greice; Griebler, Dalvan. Aceleração da Classificação de Lavouras de Milho com MPI e Estratégias de Paralelismo. In: 21st Escola Regional de Alto Desempenho da Região Sul (ERAD-RS), pp. 49-52, Sociedade Brasileira de Computação, Joinville, Brazil, 2021. DOI: 10.5753/eradrs.2021.14772. Tags: Agriculture, Distributed computing, Parallel programming, Stream processing.
Abstract: This work aimed to accelerate the execution of a crop classification algorithm on aerial images. To this end, different parallel versions were implemented using the MPI library in Python. The evaluation was conducted in two computing environments. We conclude that it is possible to reduce the execution time as more parallel resources are used, and that the dynamic work distribution strategy is more efficient.
Leonarczyk, Ricardo; Griebler, Dalvan. Implementação MPIC++ e HPX dos Kernels NPB. In: 21st Escola Regional de Alto Desempenho da Região Sul (ERAD-RS), pp. 81-84, Sociedade Brasileira de Computação, Joinville, Brazil, 2021. DOI: 10.5753/eradrs.2021.14780. Tags: Benchmark, Parallel programming.
Abstract: This paper presents the parallel implementation of the five kernels from the NAS Parallel Benchmarks (NPB) with MPI C++ and HPX for execution on cluster architectures. The results showed that the HPX programming model can be more efficient than MPI C++ in algorithms such as the Fast Fourier Transform, sorting, and Conjugate Gradient.
2019
Maliszewski, Anderson; Roloff, Eduardo; Griebler, Dalvan; Navaux, Philippe. O Impacto da Interconexão de Rede no Desempenho de Programas Paralelos. In: Anais do XX Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD'19), pp. 73-84, Sociedade Brasileira de Computação, Campo Grande, Brazil, 2019. DOI: 10.5753/wscad.2019.8658. Tags: Cloud computing, Parallel programming.
Abstract: The performance of parallel applications depends on two main components of the environment: processing power and the network interconnection. In this work, we evaluated the impact of a high-performance interconnection on parallel programs in a homogeneous cluster of servers interconnected by 1 Gbps Gigabit Ethernet and 56 Gbps InfiniBand FDR. We characterized the NAS Parallel Benchmarks with respect to computation, communication, and execution cost on Microsoft Azure instances. The results showed that, in highly network-dependent applications, performance can be significantly improved by using InfiniBand at a better execution cost, even with the higher instance price.
2018
Griebler, Dalvan; De Sensi, Daniele; Vogel, Adriano; Danelutto, Marco; Fernandes, Luiz Gustavo. Service Level Objectives via C++11 Attributes. In: Euro-Par 2018: Parallel Processing Workshops, Lecture Notes in Computer Science, pp. 745-756, Springer, Turin, Italy, 2018. DOI: 10.1007/978-3-030-10549-5_58. Tags: Parallel programming.
Abstract: In recent years, increasing attention has been given to the possibility of guaranteeing Service Level Objectives (SLOs) to users about their applications, regarding either performance or power consumption. SLOs can be enforced for parallel applications, since they provide many control knobs (e.g., the number of threads to use, the clock frequency of the cores, etc.) to tune performance and power consumption. Unlike most existing approaches, we target sequential stream processing applications, proposing a solution based on C++ annotations. The user specifies which parts of the code to parallelize and what type of requirements should be enforced on that part of the code. Our solution first automatically parallelizes the annotated code and then applies self-adaptation approaches at run-time to enforce the user-expressed objectives. We ran experiments on different real-world applications, showing its simplicity and effectiveness.
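To give a feel for the approach, below is a sketch of annotating a code region with an SLO via C++11 attributes. The attribute names (slo::throughput, slo::max_power_watts) are invented for illustration and are not the paper's actual syntax; a standard compiler warns about and ignores unknown attributes, while a tool such as the one described would parse them and drive the runtime accordingly.

    #include <cstdio>

    int main() {
        int sum = 0;
        // Hypothetical SLO annotations on a stream-like loop: the names below
        // are illustrative only and carry no meaning for a standard compiler.
        [[slo::throughput(100)]]        // hypothetical target: items per second
        [[slo::max_power_watts(50)]]    // hypothetical power cap for this region
        for (int item = 0; item < 1000; item++) {
            sum += item;                // stand-in for per-item stream computation
        }
        std::printf("%d\n", sum);
    }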