A list of my most relevant publications is available here. In most cases, you can also access the bibliography information in LaTeX style and the publishers’ sites to download the papers. You can also find my most relevant publications on Google Scholar, DBLP and Scopus. Please visit my Lattes CV (in Portuguese) for a full list of publications and projects.
João F. Uller, João V. Souto, Pedro H. Penna, Márcio Castro, Henrique Freitas, Jean-François Méhaut. LWMPI: An MPI Library for NoC-Based Lightweight Manycore Processors with On-Chip Memory Constraints. In: Concurrency and Computation: Practice and Experience (CCPE), 2021. pdf bib
Pedro H. Penna, João V. Souto, João F. Uller, Márcio Castro, Henrique Freitas, Jean-François Méhaut. Inter-Kernel Communication Facility of a Distributed Operating System for NoC-Based Lightweight Manycores. In: Journal of Parallel and Distributed Computing (JPDC), 2021. pdf bib
Alexandre Santana, Vinicius Freitas, Márcio Castro, Laércio L. Pilla, Jean-François Méhaut. ARTful: A Model for User-defined Schedulers Targeting Multiple HPC Runtime Systems. In: Software: Practice and Experience (SPE), 2021. pdf bib
Lais Borin, George Lima, Márcio Castro, Patrícia D. M. Plentz. Dynamic Power Management Under the RUN Scheduling Algorithm: A Slack Filling Approach. In: Real-Time Systems (RTS), 2021. pdf bib
Vinicius Freitas, Laércio L. Pilla, Alexandre de L. Santana, Márcio Castro, Johanne Cohen. PackStealLB: A Scalable Distributed Load Balancer based on Work Stealing and Workload Discretization. In: Journal of Parallel and Distributed Computing (JPDC), v. 150, 2021. pdf bib
Sérgio G. Pfleger, Patrícia D. M. Plentz, Rodrigo C. O. Rocha, Alyson D. Pereira, Márcio Castro. Real-Time Video Denoising on Multicores and GPUs with Kalman-based and Bilateral Filters Fusion. In: Journal of Real-Time Image Processing (JRTIP), v. 16, no. 5, 2017. pdf bib
Matheus A. Souza, Pedro H. Penna, Matheus M. Queiroz, Luís F. W. Góes, Henrique C. Freitas, Márcio Castro, Philippe O. A. Navaux, Jean-François Méhaut. CAP Bench: A Benchmark Suite for Performance and Energy Evaluation of Low-Power Many-Core Processors. In: Concurrency and Computation: Practice and Experience (CCPE), v. 29, no. 4, 2017. pdf bib
Pedro H. Penna, Márcio Castro, Henrique C. Freitas, François Broquedis, Jean-François Méhaut. Design Methodology for Workload-Aware Loop Scheduling Strategies Based on Genetic Algorithm and Simulation. In: Concurrency and Computation: Practice and Experience (CCPE), v. 29, no. 22, 2016. pdf bib
Márcio Castro, Emilio Francesquini, Fabrice Dupros, Hideo Aochi, Philippe O. A. Navaux, Jean-François Méhaut. Seismic Wave Propagation Simulations on Low-power and Performance-centric Manycores. In: Parallel Computing (PARCO), v. 54, 2016. pdf bib
Laércio L. Pilla, Tiago C. Bozzetti, Márcio Castro, Philippe O. A. Navaux, Jean-François Méhaut. ComprehensiveBench: A Benchmark for the Extensive Evaluation of Global Scheduling Algorithms. In: Journal of Physics: Conference Series (JPCS), v. 649, 2015. pdf bib
Emilio Francesquini, Márcio Castro, Pedro H. Penna, Fabrice Dupros, Henrique C. Freitas, Philippe O. A. Navaux, Jean-François Méhaut. On the Energy Efficiency and Performance of Irregular Application Executions on Multicore, NUMA and Manycore Platforms. In: Journal of Parallel and Distributed Computing (JPDC), v. 76, 2015. pdf bib
Edson L. Padoin, Laércio L. Pilla, Márcio Castro, Francieli Z. Boito, Philippe O. A. Navaux, Jean-François Méhaut. Performance/Energy Trade-off in Scientific Computing: The Case of ARM big.LITTLE and Intel Sandy Bridge. In: IET Computers & Digital Techniques (IET-CDT), v. 9, 2015. pdf bib
Márcio Castro, Luís F. W. Góes, Jean-François Méhaut. Adaptive Thread Mapping Strategies for Transactional Memory Applications. In: Journal of Parallel and Distributed Computing (JPDC), v. 74, 2014. pdf bib
Luís F. W. Góes, Christiane P. Ribeiro, Márcio Castro, Jean-François Méhaut, Murray Cole, Marcelo Cintra. Automatic Skeleton-Driven Memory Affinity for Transactional Worklist Applications. In: International Journal of Parallel Programming (IJPP), v. 42, 2014. pdf bib
Vanderlei Munhoz, Antoine Bonfils, Márcio Castro, Odorico Mendizabal. A Performance Comparison of HPC Workloads on Traditional and Cloud-based HPC Clusters. Workshop on Cloud Computing (WCC) - International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Porto Alegre, Brazil: IEEE Computer Society, 2023. pdf bib
Accepted Vanderlei Munhoz, Márcio Castro, Luis G. C. Rego. Evaluating the Parallel Simulation of Dynamics of Electrons in Molecules on AWS Spot Instances. Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Porto Alegre, Brazil: SBC, 2023.
Accepted Pedro M. C. Neto, Márcio Castro, Frank Siqueira. Balanceamento de Carga Dinâmico em Ambientes Kubernetes com o Kubernetes Scheduling Extension (KSE). Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Porto Alegre, Brazil: SBC, 2023.
Accepted Nicolas Vanz, João V. Souto, Márcio Castro. Virtualização e Migração de Processos em um Sistema Operacional Distribuído para Lightweight Manycores. Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Porto Alegre, Brazil: SBC, 2023.
Vanderlei Munhoz, Márcio Castro, Odorico Mendizabal. Strategies for Fault-Tolerant Tightly-coupled HPC Workloads Running on Low-Budget Spot Cloud Infrastructures. International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Bordeaux, France: IEEE Computer Society, 2022. pdf bib
Vanderlei Munhoz, Márcio Castro. HPC@Cloud: A Provider-Agnostic Software Framework for Enabling HPC in Public Cloud Platforms. Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Florianópolis, Brazil: SBC, 2022. pdf bib
Best Paper Samuel Fidelis, Márcio Castro, Frank Siqueira. Distributed Learning using Consensus on Edge AI. Brazilian Symposium on Computing Systems Engineering (SBESC). Fortaleza, Brazil: IEEE Computer Society, 2022. pdf bib
João V. Souto, Pedro H. Penna, Márcio Castro. A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory. IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Belo Horizonte, Brazil: IEEE Computer Society, 2021. pdf bib
Emmanuel Podestá Jr., Pedro H. Penna, João F. Uller, Márcio Castro. A Trace-driven Methodology to Evaluate and Optimize Memory Management Services of Distributed Operating Systems for Lightweight Manycores. ACM/SIGAPP Symposium On Applied Computing (SAC). Gwangju, South Korea: ACM, 2021. pdf bib
João F. Uller, João V. Souto, Pedro H. Penna, Márcio Castro, Henrique Freitas, Jean-François Méhaut. Enhancing Programmability in NoC-Based Lightweight Manycore Processors with a Portable MPI Library. Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Santo André, Brazil: SBC, 2020. pdf bib
Anna V. Oikawa, Vinicius Freitas, Márcio Castro, Laércio L. Pilla. Adaptive Load Balancing based on Machine Learning for Iterative Parallel Applications. Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP). Västerås, Sweden: IEEE Computer Society, 2020. pdf bib
Pedro H. Penna, João V. Souto, Davidson F. Lima, Márcio Castro, François Broquedis, Henrique Freitas, Jean-François Méhaut. On the Performance and Isolation of Asymmetric Microkernel Design for Lightweight Manycores. Brazilian Symposium on Computing Systems Engineering (SBESC). Natal, Brazil: IEEE Computer Society, 2019. pdf bib
Vinicius Freitas, Alexandre de L. Santana, Márcio Castro, Laércio L. Pilla. Distributed Memory Graph Representation for Load Balancing Data: Accelerating Data Structure Generation for Decentralized Scheduling. International Conference on High Performance Computing & Simulation (HPCS). Dublin, Ireland: IEEE Computer Society, 2019. pdf bib
Pedro H. Penna, Matheus Souza, Emmanuel Podestá Jr., João V. Souto, Márcio Castro, François Broquedis, Henrique C. Freitas, Jean-François Méhaut. RMem: An OS Service for Transparent Remote Memory Access in Lightweight Manycores. In: International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG). Valencia, Spain, 2019. pdf bib
Vinicius Freitas, Alexandre de L. Santana, Márcio Castro, Laércio L. Pilla. A Batch Task Migration Approach for Decentralized Global Rescheduling. In: International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Lyon, France: IEEE Computer Society, 2018. pdf bib
Alexandre de L. Santana, Vinicius Freitas, Márcio Castro, Laércio L. Pilla, Jean-François Méhaut. Reducing Global Schedulers’ Complexity Through Runtime System Decoupling. In: Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). São Paulo, Brazil: SBC, 2018. pdf bib
Emmanuel Podestá Jr., Bruno M. do Nascimento, Márcio Castro. Energy Efficient Stencil Computations on the Low-Power Manycore MPPA-256 Processor. In: International European Conference on Parallel and Distributed Computing (Euro-Par). Turin, Italy: Springer, 2018. pdf bib
Alyson D. Pereira, Rodrigo Caetano O. Rocha, Márcio Castro, Luís Fabrício W. Góes, Mario A. R. Dantas. Extending OpenACC for Efficient Stencil Code Generation and Execution by Skeleton Frameworks. In: International Conference on High Performance Computing & Simulation (HPCS). Genoa, Italy: IEEE Computer Society, 2017. pdf bib
Alyson D. Pereira, Rodrigo Caetano O. Rocha, Luiz Ramos, Márcio Castro, Luís Fabrício W. Góes. Automatic Partitioning of Stencil Computations on Heterogeneous Systems. In: Workshop on Applications for Multi-core Architectures (WAMCA) - International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Campinas, Brazil: IEEE Computer Society, 2017. pdf bib
2nd Best Paper Pedro H. Penna, Márcio Castro, Patricia D. M. Plentz, Henrique C. Freitas, François Broquedis, Jean-François Méhaut. BinLPT: A Novel Workload-Aware Loop Scheduler for Irregular Parallel Loops. In: Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD). Campinas, Brazil: Brazilian Computer Society, 2017. pdf bib
Lais Borin, Márcio Castro, Patricia D. M. Plentz. Towards the Use of LITMUS-RT as a Testbed for Multiprocessor Scheduling in Energy Harvesting Real-time Systems. In: Brazilian Symposium on Computing Systems Engineering (SBESC). Curitiba, Brazil: IEEE Computer Society, 2017. pdf bib
Pedro H. Penna, Henrique C. de Freitas, João Caram, Márcio Castro, Jean-François Méhaut. Using The Nanvix Operating System in Undergraduate Operating System Courses. In: Brazilian Symposium on Computing Systems Engineering (SBESC). Curitiba, Brazil: IEEE Computer Society, 2017. pdf bib
Victor Martinez, Fabrice Dupros, Márcio Castro, Philippe O. A. Navaux. Performance Improvement of Stencil Computations for Multi-core Architectures based on Machine Learning. In: International Conference on Computational Science (ICCS). Zürich, Switzerland: Elsevier, 2017. pdf bib
Pedro H. Penna, Eduardo C. Inacio, Márcio Castro, Patricia D. M. Plentz, Henrique C. Freitas, François Broquedis, Jean-François Méhaut. Assessing the Performance of the SRR Loop Scheduler with Irregular Workloads. In: International Conference on Computational Science (ICCS). Zürich, Switzerland: Elsevier, 2017. pdf bib
Alyson Deives Pereira, Rodrigo Caetano O. Rocha, Márcio Castro, Luís Fabrício W. Góes, Mario Antonio Ribeiro Dantas. Enabling Efficient Stencil Code Generation in OpenACC. In: International Conference on Computational Science (ICCS). Zürich, Switzerland: Elsevier, 2017. pdf bib
Felipe Volpato, Madalena P. da Silva, Alexandre L. Gonçalves, Márcio Castro, Mario A. R. Dantas. Provisioning and Delivering Sepsis Data Supported by an Enhanced SDN Environment. In: IEEE International Symposium on Computer-Based Medical Systems (CBMS). Thessaloniki, Greece: IEEE Computer Society, 2017. pdf bib
Edson L. Padoin, Laércio L. Pilla, Márcio Castro, Philippe O. A. Navaux, Jean-François Méhaut. Exploration of Load Balancing Thresholds to Save Energy on Iterative Applications. In: Latin American High Performance Computing Conference (CARLA). Mexico City, Mexico: Springer, 2017. pdf bib
Edson L. Padoin, Márcio Castro, Laércio L. Pilla, Philippe O. A. Navaux, Jean-François Méhaut. Saving Energy by Exploiting Residual Imbalances on Iterative Applications. In: High Performance Computing Conference (HiPC). Goa, India: IEEE Computer Society, 2014. pdf bib
Best Paper Márcio Castro, Fabrice Dupros, Emilio Francesquini, Jean-François Méhaut, Philippe O. A. Navaux. Energy Efficient Seismic Wave Propagation Simulation on a Low-power Manycore Processor. In: International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Paris, France: IEEE Computer Society, 2014. pdf bib
Fernando Rui, Márcio Castro, Dalvan Griebler, Luiz Gustavo Fernandes. Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications. In: Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP). Turin, Italy: IEEE Computer Society, 2014. pdf bib
Márcio Castro, Emilio Francesquini, Thomas M. Nguélé, Jean-François Méhaut. Analysis of Computing and Energy Performance of Multicore, NUMA, and Manycore Platforms for an Irregular Application. In: Workshop on Irregular Applications: Architectures & Algorithms (IA3) - Supercomputing Conference (SC). Denver, USA: ACM, 2013. pdf bib
Márcio Castro, Pedro Velho, Luiz Gustavo Fernandes, Jean-François Méhaut. A Parallel Approach to Fine-tune Field Emission Displays Using a Genetic Algorithm. In: Latin American Conference on High-Performance Computing (CLCAR). San José, Costa Rica, 2013. bib
Márcio Castro, Luís Fabrício W. Góes, Christiane P. Ribeiro, Murray Cole, Marcelo Cintra, Jean-François Méhaut. A Machine Learning-Based Approach for Thread Mapping on Transactional Memory Applications. In: High Performance Computing Conference (HiPC). Bangalore, India: IEEE Computer Society, 2011. pdf bib
Mateus Raeder, Dalvan Griebler, Neumar Ribeiro, Luiz Gustavo Fernandes, Márcio Castro. A Hybrid Parallel Version of ICTM for Cluster of NUMA Machines. In: IADIS International Conference on Applied Computing (AC). Rio de Janeiro, Brazil: IADIS Press, 2011. pdf bib
Best Paper Poliana Oliveira, Henrique C. Freitas, Christiane P. Ribeiro, Márcio Castro, Vania Marangonzova-Martin, Jean-François Méhaut. Performance Evaluation of WiNoCs for Parallel Workloads Based on Collective Communications. In: IADIS International Conference on Applied Computing (AC). Rio de Janeiro, Brazil: IADIS Press, 2011. pdf bib
Best Paper Christiane P. Ribeiro, Márcio Castro, Jean-François Méhaut, Vania Marangonzova-Martin, Henrique Freitas, Carlos A. P. S. Martins. Investigating the Impact of CPU and Memory Affinity on Multi-core Platforms: A Case Study of Numerical Scientific Multithreaded Applications. In: IADIS International Conference on Applied Computing (AC). Rio de Janeiro, Brazil: IADIS Press, 2011. pdf bib
Márcio Castro, Kiril Georgiev, Vania Marangonzova-Martin, Jean-François Méhaut, Luiz Gustavo Fernandes, Miguel Santana. Analysis and Tracing of Applications Based on Software Transactional Memory on Multicore Architectures. In: Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP). Ayia Napa, Cyprus: IEEE Computer Society, 2011. pdf bib
Christiane P. Ribeiro, Jean-François Méhaut, Alexandre Carissimi, Márcio Castro, Luiz Gustavo Fernandes. Memory Affinity for Hierarchical Shared Memory Multiprocessors. In: International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). São Paulo, Brazil: IEEE Computer Society, 2009. pdf bib
2nd Best Paper Christiane P. Ribeiro, Márcio Castro, Luiz Gustavo Fernandes, Jean-François Méhaut, Alexandre Carissimi, Fabrice Dupros. High Performance Applications on Hierarchical Shared Memory Multiprocessors. In: Colloque d’Informatique: Brésil / INRIA, Coopérations, Avancées et Défis (COLIBRI). Bento Gonçalves, Brazil: Brazilian Computer Society, 2009. pdf bib
Márcio Castro, Luiz Gustavo Fernandes, Christiane P. Ribeiro, Jean-François Méhaut, Marilton S. de Aguiar. NUMA-ICTM: A Parallel Version of ICTM Exploiting Memory Placement Strategies for NUMA Machines. In: International Parallel and Distributed Processing Symposium (IPDPS). Rome, Italy: IEEE Computer Society, 2009. pdf bib
Márcio Castro. Improving the Performance of Transactional Memory Applications on Multicores: A Machine Learning-based Approach. Ph.D. Thesis at Université de Grenoble Alpes (UGA), 2012. pdf bib
Márcio Castro. NUMA-ICTM: Uma Versão Paralela do ICTM Explorando Estratégias de Alocação de Memória para Máquinas NUMA. M.Sc. Dissertation at Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), 2009. pdf bib
Márcio Castro, Gustavo Serra. Paralelização da Simulação da Trajetória de Elétrons em um Dispositivo FED. Final project presented for the B.Sc. Degree at Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), 2006.