Summary and Schedule
This workshop teaches core workflows for building, training, and tuning ML/AI models in Google Cloud’s Vertex AI platform. Participants learn to set up data, configure Vertex AI Workbench notebooks, launch training and tuning jobs, and optimize resource costs effectively within GCP. The workshop also includes a section on building retrieval-augmented generation (RAG) pipelines using Gemini models.
| Time | Episode | Key Questions |
| --- | --- | --- |
| — | Setup Instructions | Download files required for the lesson |
| 00h 00m | 1. Overview of Google Cloud for Machine Learning | What problem does GCP aim to solve for ML researchers? How does using a notebook as a controller help organize ML workflows in the cloud? How does GCP compare to AWS for ML workflows? |
| 00h 11m | 2. Data Storage: Setting up GCS | How can I store and manage data effectively in GCP for Vertex AI workflows? What are the advantages of Google Cloud Storage (GCS) compared to local or VM storage for machine learning projects? |
| 00h 31m | 3. Notebooks as Controllers | How do you set up and use Vertex AI Workbench notebooks for machine learning tasks? How can you manage compute resources efficiently using a “controller” notebook approach in GCP? |
| 01h 01m | 4. Accessing and Managing Data in GCS with Vertex AI Notebooks | How can I load data from GCS into a Vertex AI Workbench notebook? How do I monitor storage usage and costs for my GCS bucket? What steps are involved in pushing new data back to GCS from a notebook? |
| 01h 31m | 5. Using a GitHub Personal Access Token (PAT) to Push/Pull from a Vertex AI Notebook | How can I securely push/pull code to and from GitHub within a Vertex AI Workbench notebook? What steps are necessary to set up a GitHub PAT for authentication in GCP? How can I convert notebooks to .py files and ignore .ipynb files in version control? |
| 02h 06m | 6. Training Models in Vertex AI: Intro | What are the differences between training locally in a Vertex AI notebook and using Vertex AI-managed training jobs? How do custom training jobs in Vertex AI streamline the training process for various frameworks? How does Vertex AI handle scaling across CPUs, GPUs, and TPUs? |
| 02h 38m | 7. Training Models in Vertex AI: PyTorch Example | When should you consider a GPU (or TPU) instance for PyTorch training in Vertex AI, and what are the trade‑offs for small vs. large workloads? How do you launch a script‑based training job and write all artifacts (model, metrics, logs) next to each other in GCS without deploying a managed model? |
| 03h 08m | 8. Hyperparameter Tuning in Vertex AI: Neural Network Example | How can we efficiently manage hyperparameter tuning in Vertex AI? How can we parallelize tuning jobs to optimize time without increasing costs? |
| 04h 08m | 9. Resource Management & Monitoring on Vertex AI (GCP) | How do I monitor and control Vertex AI, Workbench, and GCS costs day‑to‑day? What specifically should I stop, delete, or schedule to avoid surprise charges? How can I automate cleanup and set alerting so leaks get caught quickly? |
| 05h 13m | 10. Retrieval-Augmented Generation (RAG) with Vertex AI | How do we go from “a pile of PDFs” to “ask a question and get a cited answer” using Google Cloud tools? What are the key parts of a RAG system (chunking, embedding, retrieval, generation), and how do they map onto Vertex AI services? How much does each part of this pipeline cost (VM time, embeddings, LLM calls), and where can we keep it cheap? Can we use open models / Hugging Face instead of Google models, and what does that change? |
| 05h 43m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Setup (Complete Before the Workshop)
Before attending this workshop, you’ll need to complete a few setup steps to ensure you can follow along smoothly. The main requirements are:
- GitHub Account – Create an account and be ready to fork a repository.
- GCP Access – Use a shared Google Cloud project (if attending the Machine Learning Marathon or Research Bazaar) or sign up for a personal GCP Free Tier account.
- Titanic Dataset – Download the required CSV files in advance.
- (Optional) Google Cloud Skills Boost — For a broader overview of GCP, visit the Getting Started with Google Cloud Fundamentals course.
Details on each step are outlined below.
1. GitHub Account
You will need a GitHub account to access the code provided during
this lesson. If you don’t already have a GitHub account, please sign up for GitHub to create a free
account.
Don’t worry if you’re a little rusty on using GitHub or git; we will
only use a couple of git commands during the lesson, and the instructor
will guide you through them.
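If you want to knock the rust off beforehand, the same handful of git commands the lesson uses (add, commit, log) can be practiced in a throwaway local repository. This is just a warm-up sketch; the repository name, file, and identity values below are placeholders, and pushing to GitHub with a PAT comes later in the lesson.

```shell
# Throwaway practice repo: exercises the git commands used in the lesson.
tmp=$(mktemp -d)
cd "$tmp"
git init -q practice
cd practice
git config user.email "you@example.com"   # placeholder identity
git config user.name "Workshop Attendee"  # placeholder identity
echo "workshop notes" > README.md
git add README.md
git commit -qm "first commit"
git log --oneline   # shows the commit you just made
```

You can delete the temporary directory afterwards; nothing here touches GitHub.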
2. GCP Access
There are two ways to get access to GCP for this lesson. Please wait for a pre-workshop email from the instructor to confirm which option to choose.
Option 1) Shared Google Cloud Project
If you are attending this lesson as part of the Machine Learning Marathon or Research Bazaar, the instructors will add you to a shared GCP project for all attendees; you do not need to set up your own account.
What to expect:
- During the lesson, you will log in with your Google account
credentials.
- This setup ensures that all participants have a consistent
environment and avoids unexpected billing for attendees.
- Please use shared credits responsibly — they are limited and reused
for future training events.
- Stay within the provided exercises and avoid launching additional
compute-heavy workloads (e.g., training large language models).
- Do not enable additional APIs or services unless instructed.
Option 2) GCP Free Tier — Skip If Using Shared Project
If you are attending this lesson as part of the Machine Learning Marathon or Research Bazaar, you can skip this step. Otherwise, please follow these instructions:
- Go to the GCP Free Tier
page and click Get started for free.
- Complete the signup process. The Free Tier includes a $300 credit
valid for 90 days and ongoing free usage for some smaller
services.
- Once your account is ready, log in to the Google Cloud Console.
- During the lesson, we will enable only a few APIs (Compute Engine, Cloud Storage, and Notebooks).
Following the lesson should cost well under $15 total if you are using your own credits.
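For reference, the three APIs mentioned above correspond to specific service names in gcloud. The sketch below only prints the enable commands rather than running them, since actually enabling services requires an authenticated gcloud CLI and an active project; we will do the real step together during the lesson.

```shell
# Print (do not run) the gcloud commands that enable the APIs used in
# this lesson; execution requires gcloud auth and a selected project.
services="compute.googleapis.com storage.googleapis.com notebooks.googleapis.com"
for api in $services; do
  echo "gcloud services enable $api"
done
```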
3. Download the Data
For this workshop, you will need the Titanic dataset, which can be used to train a classifier predicting survival.
Please download the following zip file (Right-click → Save as): data.zip

- Extract the zip contents (Right-click → Extract All on Windows; double-click on macOS).
- Save the two data files (train and test) somewhere easy to access, for example:
  - ~/Downloads/data/titanic_train.csv
  - ~/Downloads/data/titanic_test.csv
- In the first episode, you will create a Cloud Storage bucket and upload this data to use with your notebook.
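If you would like to confirm the files extracted correctly, a quick check like the one below works. The DATA_DIR path is just the example location from the step above; adjust it if you saved the CSVs somewhere else.

```shell
# Sanity-check the two extracted CSVs; reports found/missing for each.
DATA_DIR="$HOME/Downloads/data"   # example path -- adjust to your location
checked=0
for f in titanic_train.csv titanic_test.csv; do
  if [ -f "$DATA_DIR/$f" ]; then
    echo "found: $f ($(wc -l < "$DATA_DIR/$f") lines)"
  else
    echo "missing: $f (re-check the download and extract steps)"
  fi
  checked=$((checked + 1))
done
```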
4. (Optional) Google Cloud Skills Boost — Getting Started with Google Cloud Fundamentals
If you want a broader introduction to GCP before the workshop, consider exploring the Getting Started with Google Cloud self-paced learning path. It covers the basics of the Google Cloud environment, including project structure, billing, IAM (Identity and Access Management), and common services like Compute Engine, Cloud Storage, and BigQuery. This step is optional but recommended for those who want a broader overview of GCP before diving into ML/AI use cases.