🌟 Overview
- Event: eScience 2025
- Presenter: Didier Barradas Bautista
🧑‍💻 Introduction
This tutorial offers a practical exploration of advanced data science techniques on GPUs. It focuses on NVIDIA's RAPIDS libraries (cuDF, cuML, cuGraph), distributed computing with Dask, and Skorch, a scikit-learn-compatible wrapper for PyTorch. Attendees will learn to use these tools to speed up data manipulation, model training, and hyperparameter optimization.
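As a rough illustration of the drop-in approach behind accelerated pandas, the sketch below runs an ordinary pandas group-by and, if cuDF happens to be installed, routes it through the `cudf.pandas` accelerator. It is not taken from the tutorial notebooks: the column names and values are made up, and the CPU fallback is an assumption added so the snippet also runs on machines without a GPU.

```python
# Illustrative sketch, not part of the official tutorial materials.
# Try to enable the cuDF pandas accelerator; fall back to plain pandas on CPU.
try:
    import cudf.pandas

    cudf.pandas.install()  # must be called before pandas is imported
    backend = "cudf.pandas (GPU)"
except Exception:  # cuDF is missing or no usable GPU was found
    backend = "pandas (CPU)"

import pandas as pd

# Hypothetical toy data: readings from two sensors.
df = pd.DataFrame(
    {
        "sensor": ["a", "b", "a", "b", "a"],
        "reading": [1.0, 2.0, 3.0, 4.0, 8.0],
    }
)

# The same pandas code runs unchanged on either backend.
mean_by_sensor = df.groupby("sensor")["reading"].mean()

print(f"backend: {backend}")
print(mean_by_sensor)
```

On a GPU runtime with RAPIDS installed, the first printed line should report the GPU backend. Elsewhere the code falls back to CPU pandas and produces the same group means.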
⚙️ Setup Instructions
Coming soon: Instructions for setting up your environment (Google Colab, RAPIDS, Dask, etc.).
📂 Tutorial Materials
- Hands-on notebooks (coming soon)
- Example datasets
- Code samples for GPU-accelerated workflows
🔗 Resources
📬 Contact
For questions about the tutorial or networking, contact me on GitHub or LinkedIn.
For questions about the conference, contact the Tutorial Chairs.
📝 Brief Outline of Topics
- Introduction to GPU-accelerated data science
- Overview of the RAPIDS libraries (cuDF, cuML, cuGraph), Dask for distributed computing, and Skorch for scikit-learn-style PyTorch training
- Hands-on examples and use cases
📅 Detailed Agenda
- Introduction and Motivation (40 minutes)
  - (20 min.) – Theory: Importance of GPU acceleration in data science
  - (20 min.) – Theory: Overview of the RAPIDS libraries
- Hands-on with RAPIDS Libraries (1 hour)
  - (30 min.) – Hands-On: Data manipulation with cuDF, accelerated pandas, and Dask
  - (30 min.) – Hands-On: GPU-accelerated machine learning with cuML
- Break (10 minutes)
- Advanced Use Cases and Applications (1 hour)
  - (10 min.) – Theory: Introduction to hyperparameter optimization (HPO)
  - (25 min.) – Hands-On: Hyperparameter tuning with Dask-ML
  - (25 min.) – Hands-On: GPU-accelerated HPO with Skorch
- Q&A and Wrap-up (10 minutes)