Memory Method Compare

Source directory:

  • examples/reducingmemory/method_compare/

This example group compares memory-saving methods that preserve the standard shot-based objective while changing how wavefields are stored or recomputed.

Main Methods

  • PyTorch eager + compile
  • PyTorch eager + checkpointing (ckpt)
  • PyTorch eager + compile + checkpointing
  • CUDA boundary saving
  • CUDA checkpointing

Benchmark Scripts

  • common_benchmark.py: shared benchmark loop, summary plotting, and gradient plotting
  • acoustic2d_memory_benchmark.py: 2D acoustic benchmark setup
  • acoustic3d_memory_benchmark.py: 3D acoustic benchmark setup
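
A benchmark loop of this kind usually records wall time and peak GPU memory for each method. The sketch below is not the contents of common_benchmark.py: `benchmark` and its arguments are hypothetical names, and it assumes PyTorch is installed. On a machine without CUDA it reports only the runtime.

```python
import time

import torch


def benchmark(fn, device="cpu", repeats=3):
    """Time a callable and, on CUDA, record its peak allocated memory.

    `fn` is any zero-argument callable that runs one forward + backward pass
    (a hypothetical stand-in for a single benchmark configuration).
    """
    use_cuda = device.startswith("cuda") and torch.cuda.is_available()
    times = []
    peak_bytes = None
    for _ in range(repeats):
        if use_cuda:
            # Finish pending kernels so they are not charged to this run.
            torch.cuda.synchronize(device)
            torch.cuda.reset_peak_memory_stats(device)
        start = time.perf_counter()
        fn()
        if use_cuda:
            # CUDA kernels are asynchronous; wait before stopping the clock.
            torch.cuda.synchronize(device)
        times.append(time.perf_counter() - start)
        if use_cuda:
            peak = torch.cuda.max_memory_allocated(device)
            peak_bytes = peak if peak_bytes is None else max(peak_bytes, peak)
    return {"seconds": min(times), "peak_bytes": peak_bytes}
```

Calling `benchmark(run_method, device="cuda")` once per method gives one runtime and one peak-memory number per row of a summary plot; `run_method` is again a placeholder.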

What It Compares

  • eager full
  • eager compile full
  • eager checkpointing with different chunk sizes
  • eager compile plus checkpointing
  • cuda full
  • cuda boundary saving on GPU
  • cuda boundary saving on CPU with transfer tuning
  • cuda chunk checkpointing
  • cuda recursive checkpointing
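
To make the "eager full" versus "eager checkpointing with different chunk sizes" comparison concrete, here is a minimal sketch using `torch.utils.checkpoint`. It assumes a toy periodic 1D wave update in place of the real propagator; `run_chunk`, `simulate`, and `gradient` are hypothetical names and not part of the benchmark scripts.

```python
import torch
from torch.utils.checkpoint import checkpoint


def run_chunk(u_prev, u_curr, vel2, n_steps):
    """Advance a toy periodic 1D wavefield by n_steps leapfrog steps."""
    for _ in range(n_steps):
        lap = torch.roll(u_curr, 1) + torch.roll(u_curr, -1) - 2.0 * u_curr
        u_prev, u_curr = u_curr, 2.0 * u_curr - u_prev + vel2 * lap
    return u_prev, u_curr


def simulate(vel2, u0, total_steps, chunk_size=None):
    """Return a scalar loss; chunk_size=None keeps every step (the full case)."""
    u_prev, u_curr = u0, u0
    if chunk_size is None:
        u_prev, u_curr = run_chunk(u_prev, u_curr, vel2, total_steps)
    else:
        for start in range(0, total_steps, chunk_size):
            n = min(chunk_size, total_steps - start)
            # Only the chunk inputs are kept for backward; the steps inside
            # the chunk are recomputed when backward() reaches it.
            u_prev, u_curr = checkpoint(
                run_chunk, u_prev, u_curr, vel2, n, use_reentrant=False
            )
    return (u_curr ** 2).sum()


def gradient(chunk_size, total_steps=32, n=64):
    """Gradient of the toy loss with respect to the squared-velocity model."""
    torch.manual_seed(0)
    u0 = torch.randn(n, dtype=torch.float64)
    vel2 = torch.full((n,), 0.25, dtype=torch.float64, requires_grad=True)
    simulate(vel2, u0, total_steps, chunk_size).backward()
    return vel2.grad
```

Smaller chunks keep more chunk inputs alive but recompute fewer steps at a time, while larger chunks do the opposite, which is why sweeping the chunk size is worthwhile. In this sketch `gradient(None)` and `gradient(8)` should agree to floating-point tolerance.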

Practical Guidance

  • if you need the simplest memory reduction on the eager side, start with PyTorch checkpointing
  • if you are already on the CUDA backend and want the best memory/runtime tradeoff, test CUDA boundary saving first
  • if boundary saving still uses too much memory, try CUDA checkpointing
  • CUDA boundary saving on CPU reduces GPU memory the most, but often costs the most wall time
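
Boundary saving, as the term is commonly used in wave-equation inversion, relies on the interior update being reversible in time: if the boundary values of every step are saved, the interior wavefield can be rebuilt backward from the last two snapshots. The pure-Python sketch below illustrates that idea on a 1D grid. It is not the CUDA backend's implementation; the one-way boundary rule and all function names are assumptions made for illustration.

```python
import math


def gaussian_pulse(n):
    """Initial wavefield: a smooth bump in the middle of the grid."""
    return [math.exp(-((i - n / 2) / 3.0) ** 2) for i in range(n)]


def step(u_prev, u_curr, c):
    """One leapfrog step; c is the Courant number (c <= 1 for stability)."""
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(1, n - 1):
        lap = u_curr[i + 1] - 2.0 * u_curr[i] + u_curr[i - 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c * c * lap
    # Toy one-way boundary rule: not reversible, so these values are saved.
    u_next[0] = u_curr[0] + c * (u_curr[1] - u_curr[0])
    u_next[-1] = u_curr[-1] + c * (u_curr[-2] - u_curr[-1])
    return u_next


def forward(u0, c, n_steps, keep_full=False):
    """Propagate forward, saving two boundary values per time level."""
    u_prev, u_curr = list(u0), list(u0)
    boundaries = [(u_prev[0], u_prev[-1]), (u_curr[0], u_curr[-1])]
    full = [u_prev, u_curr] if keep_full else None  # reference only
    for _ in range(n_steps):
        u_prev, u_curr = u_curr, step(u_prev, u_curr, c)
        boundaries.append((u_curr[0], u_curr[-1]))
        if keep_full:
            full.append(u_curr)
    return u_prev, u_curr, boundaries, full


def reconstruct(u_prev, u_curr, boundaries, c):
    """Yield wavefields backward in time, from the last level to the first."""
    u_next, u_now = u_curr, u_prev
    yield u_next
    yield u_now
    for k in range(len(boundaries) - 2, 0, -1):
        n = len(u_now)
        u_old = [0.0] * n
        for i in range(1, n - 1):
            lap = u_now[i + 1] - 2.0 * u_now[i] + u_now[i - 1]
            # The leapfrog update solved for the older time level.
            u_old[i] = 2.0 * u_now[i] - u_next[i] + c * c * lap
        u_old[0], u_old[-1] = boundaries[k - 1]  # inject saved boundary values
        u_next, u_now = u_now, u_old
        yield u_now
```

Here the saved state is two values per time level instead of the whole grid, at the price of one extra propagation during the backward pass. Where the saved boundaries are stored (GPU or CPU) then becomes a separate memory versus transfer-time choice.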

Example Figures

The following figures come from the 2D and 3D acoustic memory benchmarks.

summary.png: the 2D acoustic benchmark summary figure comparing methods by runtime and memory-related metrics.

[Figure: Memory benchmark summary]

gradients.png: the 2D side-by-side gradient comparison, useful for checking whether different memory-saving methods still produce consistent inversion gradients.

[Figure: Memory benchmark gradient comparison]

summary.png from the 3D benchmark: a higher-cost comparison on the 3D acoustic setup, showing how the same ideas behave when model size and wavefield state are much larger.

[Figure: 3D memory benchmark summary]

gradients.png from the 3D benchmark: a 3D gradient comparison used to check whether the reduced-memory methods still match the reference gradient pattern.

[Figure: 3D memory benchmark gradient comparison]
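
A visual gradient comparison can be backed by a number. The sketch below computes two common agreement metrics between a reduced-memory gradient and a reference gradient. The function names are hypothetical, and the inputs are assumed to be flat sequences of floats (for a PyTorch tensor, `grad.flatten().tolist()` would produce one).

```python
import math


def relative_l2_error(grad, reference):
    """||grad - reference|| / ||reference||, or the absolute error if the reference is zero."""
    if len(grad) != len(reference):
        raise ValueError("gradients must have the same number of elements")
    diff = math.sqrt(sum((g - r) ** 2 for g, r in zip(grad, reference)))
    norm = math.sqrt(sum(r * r for r in reference))
    return diff / norm if norm > 0.0 else diff


def cosine_similarity(grad, reference):
    """Cosine of the angle between the two gradients (1.0 means same direction)."""
    if len(grad) != len(reference):
        raise ValueError("gradients must have the same number of elements")
    dot = sum(g * r for g, r in zip(grad, reference))
    norms = math.sqrt(sum(g * g for g in grad)) * math.sqrt(sum(r * r for r in reference))
    if norms == 0.0:
        raise ValueError("cosine similarity is undefined for a zero gradient")
    return dot / norms
```

A small relative error indicates that a method reproduces the reference gradient; a cosine similarity near 1.0 with a larger relative error would instead point to a scaling difference rather than a different gradient pattern.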