Machine Learning for Environmental Engineering

Explore and test state-of-the-art machine learning methods applied to environmental sciences and engineering challenges.


Modules/Weeks

Weekly Effort

2–5 hours

Discipline

AI & Data Science

School

Columbia Engineering

Format

Self-Paced Online

Cost

Free

$20 certificate (optional)


This course aims to develop a solid understanding of state-of-the-art machine learning methods and their application to problems in environmental science and engineering. Potential areas of application include, but are not limited to, remote sensing, environmental modeling, and geophysical fluid dynamics.

The first part of the course will focus on applying "vanilla" machine learning algorithms to simple problems, while introducing key tools such as PyTorch and Jupyter notebooks. We will cover feedforward neural networks, shallow versus deep architectures, regression trees, random forests, and XGBoost through hands-on examples. In parallel, we will discuss essential machine learning concepts, including hyperparameter tuning, batch sizes, optimization techniques, and assumptions about data distributions.
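As a rough illustration of how batch size and learning rate enter an optimization loop, here is a minimal sketch of mini-batch gradient descent on a one-variable linear regression. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not taken from the course notebooks; the function name `sgd_linear_fit`, the default hyperparameter values, and the synthetic data are all assumptions made for this example.

```python
import random

def sgd_linear_fit(xs, ys, learning_rate=0.05, batch_size=4, epochs=500, seed=0):
    """Fit y ~ w * x + b by mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # reshuffle so batches differ between epochs
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the batch-mean squared error with respect to w and b.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Synthetic, noise-free data on the line y = 2x + 1 (hypothetical example).
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(round(w, 2), round(b, 2))
```

Changing `batch_size` or `learning_rate` in the call and watching how quickly `w` and `b` approach the true slope and intercept is a small-scale version of hyperparameter tuning.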

Next, the course will explore more advanced neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).

The course will then cover probabilistic models and uncertainty quantification using Gaussian Processes, Bayesian Neural Networks, and ensemble methods, emphasizing the distinction between aleatoric and epistemic uncertainties.
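To make the aleatoric/epistemic distinction concrete, one common convention for ensembles applies the law of total variance: the average of the members' predicted variances is read as aleatoric uncertainty (noise in the data), and the spread of the members' predicted means is read as epistemic uncertainty (disagreement between models). The sketch below assumes that convention and is not course material; the function name `decompose_uncertainty` and the numbers are hypothetical.

```python
from statistics import fmean, pvariance

def decompose_uncertainty(means, variances):
    """Split ensemble predictive variance at a single input point.

    means[i] and variances[i] are the predictive mean and variance
    reported by ensemble member i (hypothetical inputs).
    """
    aleatoric = fmean(variances)   # average noise level the members predict
    epistemic = pvariance(means)   # disagreement between the members' means
    return aleatoric, epistemic, aleatoric + epistemic

# Hypothetical predictions from a three-member ensemble at one input.
member_means = [1.0, 2.0, 3.0]
member_variances = [0.5, 0.5, 0.5]
aleatoric, epistemic, total = decompose_uncertainty(member_means, member_variances)
print(aleatoric, epistemic, total)
```

Under this reading, collecting more training data should mainly shrink the epistemic term, while the aleatoric term reflects noise that more data cannot remove.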

Finally, we will transition to cutting-edge topics in generative AI, with a focus on variational autoencoders and diffusion models, as well as transfer learning and metalearning.

Free Enrollment with Optional Certificate

This course is available at no cost and includes full access to all instructional materials, videos, and assessments. Learners who successfully complete all course requirements will have the option to purchase a verified certificate of completion for $20.


Course Prerequisites

Programming language: Python
College-level linear algebra

What You Will Learn

By the end of this course, learners will be able to:

Experiment with basic machine learning algorithms using PyTorch and Jupyter notebooks, focusing on "vanilla" models applied to simple environmental problems.
Gain practical experience with feedforward neural networks, shallow vs. deep networks, regression trees, random forests, and XGBoost through hands-on examples.
Understand core machine learning techniques such as hyperparameter tuning, batch sizing, optimization techniques, and distributional assumptions underlying different algorithms.
Explore intermediate deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), and their applications to environmental datasets.
Analyze uncertainty in machine learning using Gaussian Processes, Bayesian Neural Networks, and ensemble methods, distinguishing aleatoric from epistemic uncertainty.
Explore advanced machine learning topics, including generative models (e.g., variational autoencoders, diffusion models), as well as transfer learning and metalearning techniques.

Module 1: Regression Trees

Bagging, boosting, random forests
Gradient boosting
Limiting overfitting

Module 2: Neural Networks and Shallow vs Deep Networks

Shallow feedforward networks
Backpropagation
Deep networks
Training and overfitting

Module 3: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)

Images and Convolutional Neural Networks
Pooling, convolutions
Recurrent neural networks, Long Short-Term Memory (LSTM)

Module 4: Gaussian Processes

Bayesian formulation
Families of Gaussian processes

Module 5: Generative AI

Autoencoder
Variational autoencoder
Generative adversarial networks (GANs)
Diffusion models

Please note: Lecture videos will be released weekly through December 3, 2025, so please check back each week for new content. No new lecture videos will be posted the week of November 24 due to the Thanksgiving holiday. Each module includes only one quiz.

Some course materials are not available for public viewing due to licensing or privacy considerations. These items may appear as unavailable (e.g., "404 Not Found"). If you encounter unavailable content, you will need to find alternative resources independently to support your learning. Thank you for your understanding.

Instructors

Pierre Gentine

Maurice Ewing and J. Lamar Worzel Professor of Geophysics; Professor of Earth and Environmental Engineering; Professor at the Climate School; LEAP Director

Pierre Gentine is a Professor in the Department of Earth and Environmental Engineering and in the Department of Earth and Environmental Sciences. He is director of the National Science Foundation Science and Technology Center "Learning the Earth with Artificial Intelligence and Physics" (LEAP) and a director of the Graduate Program in Earth and Environmental Engineering. Dr. Gentine and his group investigate the multiscale nature of the continental hydrologic and carbon cycles using observations (remote sensing and in situ), models, and machine learning.

Dr. Gentine received his undergraduate degree from SupAéro in France and earned his PhD in Civil and Environmental Engineering at MIT in 2010. He joined the faculty at Columbia in 2009 as an instructor in applied mathematics and became a tenure-track assistant professor in Earth and Environmental Engineering in 2011.