PropCUDA

class PropCUDA(
    equation,
    shape,
    source_type=[],
    receiver_type=[],
    abcn=50,
    free_surface=False,
    dh=10.0,
    dt=0.002,
    dev=None,
    use_ckpt=False,
    ckpt_chunks=100,
    ckpt_mode="chunk",
    ckpt_num=0,
    pml_type="spml",
    nt=-1,
    B=1,
    allow_growth=True,
    full_mode="full",
    boundary_saving_config=None,
)

Implementation:

  • src/sweep/propagator/cuda.py

Compiled CUDA propagator backed by equation-specific bindings from sweep._C.

In current Torch-side examples, this backend is often reached through PropTorch(..., backend="cuda"). PropCUDA remains the lower-level class for working with CUDA-specific runtime details directly.

Note

PropCUDA is the backend with the most runtime-specific behavior: anisotropic dh, reusable buffers, boundary saving, recursive checkpointing, and RTM all live here.
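Two of the behaviors named in the Note, tuple-valued dh and reusable buffers sized by B and allow_growth, can be illustrated with a small pure-Python sketch. The helper normalize_dh and the class BatchCapacity are hypothetical names introduced here for illustration and are not part of sweep. The sketch assumes that a scalar dh applies to every axis and that a growing buffer is resized to exactly the new batch size; the actual backend may differ on both points.

```python
from typing import Sequence, Tuple, Union


def normalize_dh(dh: Union[float, Sequence[float]], ndim: int) -> Tuple[float, ...]:
    """Return per-axis spacing: (dz, dx) in 2D or (dz, dy, dx) in 3D."""
    if isinstance(dh, (int, float)):
        # Assumption: a scalar means the same spacing on every axis.
        return (float(dh),) * ndim
    spacing = tuple(float(v) for v in dh)
    if len(spacing) != ndim:
        raise ValueError(f"dh has {len(spacing)} entries but the model is {ndim}D")
    return spacing


class BatchCapacity:
    """Hypothetical tracker for how many batch elements the buffers can hold."""

    def __init__(self, B: int = 1, allow_growth: bool = True) -> None:
        self.capacity = B
        self.allow_growth = allow_growth

    def ensure(self, batch: int) -> int:
        """Return the capacity after seeing a batch of the given size."""
        if batch > self.capacity:
            if not self.allow_growth:
                raise RuntimeError(
                    f"batch {batch} exceeds preallocated capacity {self.capacity}"
                )
            # Assumption: grow to exactly the new batch size.
            self.capacity = batch
        return self.capacity
```

With these helpers, normalize_dh(10.0, 2) yields an isotropic 2D spacing, while BatchCapacity(B=2, allow_growth=False).ensure(3) raises, mirroring the documented error for batches larger than the preallocated capacity.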

Parameters

  • equation (equation instance): Equation instance whose compiled CUDA binding will be used. PropCUDA expects the equation to expose _C(), and optionally _C_rtm() for RTM. It also expects a cuda_layout spec for CUDA runtime buffer allocation.
  • shape (tuple[int, ...]): Physical model shape before absorbing boundaries are added. Use (nz, nx) in 2D and (nz, ny, nx) in 3D.
  • source_type (list[str], optional): Wavefield names used for source injection. These names are resolved through the equation field metadata, so aliases may also be accepted. If omitted, PropCUDA uses equation.default_source_fields.
  • receiver_type (list[str], optional): Wavefield names sampled at receiver locations. These are also resolved through the equation field metadata and default to equation.default_receiver_fields when omitted.
  • abcn (int, optional): Absorbing boundary width.
  • free_surface (bool, optional): Whether the top boundary is treated as a free surface. This affects coordinate shifts before entering the CUDA kernels.
  • dh (float or tuple[float, ...], optional): Grid spacing. Unlike the other propagators, PropCUDA supports anisotropic tuple-valued spacing: (dz, dx) in 2D and (dz, dy, dx) in 3D.
  • dt (float, optional): Time step in seconds.
  • dev (device, optional): Execution device for tensors and reusable CUDA buffers.
  • use_ckpt (bool, optional): Enables checkpoint-based memory reduction in the CUDA path.
  • ckpt_chunks (int, optional): Checkpoint interval used in chunk checkpointing.
  • ckpt_mode (str, optional): Checkpoint strategy. Supported values here are "chunk" and "recursive".
  • ckpt_num (int, optional): Number of persistent checkpoints used by recursive checkpointing.
  • pml_type (str, optional): PML implementation passed into equation setup.
  • nt (int, optional): Stored time-step count. The actual working value is normally inferred from the runtime wavelet.
  • B (int, optional): Initial batch capacity for reusable runtime buffers.
  • allow_growth (bool, optional): If True, runtime buffers may grow when a larger batch is seen later. If False, larger batches than the preallocated capacity raise an error.
  • full_mode (str, optional): Stored on the base class. It is not currently the main runtime switch for this backend.
  • boundary_saving_config (dict, optional): Configuration for saving forward boundary values instead of storing all wavefields. Normalized form: {"enabled": False, "storage": "gpu", "transfer_interval": 1, "pinned_memory": False}.

Supported keys are:

  • enabled (bool): Whether boundary saving is enabled.
  • storage (str): Where saved boundary values live. Supported values are "gpu" and "cpu".
  • transfer_interval (int): How often boundary values are transferred when CPU storage is used. This must be at least 1. When storage="gpu", the effective interval is forced to 1.
  • pinned_memory (bool): Whether to use pinned host memory when storage="cpu". When storage="gpu", this is effectively disabled.
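The normalization rules above can be sketched in plain Python. The function normalize_boundary_saving_config is a hypothetical helper written for illustration, not the routine PropCUDA uses internally. Rejecting unknown keys is an assumption of this sketch, and the real implementation may ignore them instead.

```python
from typing import Any, Dict, Optional

# Normalized defaults as listed in the parameter description.
DEFAULTS: Dict[str, Any] = {
    "enabled": False,
    "storage": "gpu",
    "transfer_interval": 1,
    "pinned_memory": False,
}


def normalize_boundary_saving_config(
    config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge a user config with the defaults and apply the documented rules."""
    merged = dict(DEFAULTS)
    if config:
        unknown = set(config) - set(DEFAULTS)
        if unknown:
            # Assumption: unknown keys are treated as an error.
            raise KeyError(f"unsupported keys: {sorted(unknown)}")
        merged.update(config)

    if merged["storage"] not in ("gpu", "cpu"):
        raise ValueError('storage must be "gpu" or "cpu"')
    if int(merged["transfer_interval"]) < 1:
        raise ValueError("transfer_interval must be at least 1")
    merged["transfer_interval"] = int(merged["transfer_interval"])

    if merged["storage"] == "gpu":
        # GPU storage forces the interval to 1 and disables pinned memory.
        merged["transfer_interval"] = 1
        merged["pinned_memory"] = False
    return merged
```

Under these rules, a CPU configuration keeps its transfer_interval and pinned_memory settings, while the same settings combined with storage="gpu" collapse to an interval of 1 with pinned memory disabled.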

Equation Requirements

For the CUDA path, an equation should expose:

  • _C(): compiled forward/backward CUDA entry points
  • _C_rtm() when RTM support exists
  • cuda_layout: a CUDALayoutSpec instance describing CUDA buffer layout

For field discovery, equations may also expose:

  • available_fields()
  • describe_field(name)
  • default_source_fields
  • default_receiver_fields
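A quick way to see whether an equation object meets these requirements is to probe it by attribute name. The sketch below is a hypothetical pre-flight check, not something PropCUDA is documented to run. The function describe_cuda_support and the ToyEquation class are illustrative names, and the check only tests for attribute presence; it does not verify that cuda_layout is a CUDALayoutSpec.

```python
from typing import Any, Dict

REQUIRED = ("_C", "cuda_layout")
FIELD_DISCOVERY = (
    "available_fields",
    "describe_field",
    "default_source_fields",
    "default_receiver_fields",
)


def describe_cuda_support(equation: Any) -> Dict[str, Any]:
    """Report which CUDA-path attributes an equation object exposes."""
    return {
        "missing_required": [n for n in REQUIRED if not hasattr(equation, n)],
        # RTM is optional, so it is reported rather than required.
        "supports_rtm": callable(getattr(equation, "_C_rtm", None)),
        "missing_field_discovery": [
            n for n in FIELD_DISCOVERY if not hasattr(equation, n)
        ],
    }


class ToyEquation:
    """Stand-in object used only to exercise the check."""

    cuda_layout = object()  # placeholder, not a real CUDALayoutSpec
    default_source_fields = ["p"]
    default_receiver_fields = ["p"]

    def _C(self) -> None:
        return None


report = describe_cuda_support(ToyEquation())
```

For ToyEquation, the report lists no missing required attributes, marks RTM as unsupported, and flags available_fields and describe_field as absent from the optional field-discovery set.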

Forward Parameters

forward(
    wavelet,
    sources,
    receivers,
    models=None,
    source_encoding=False,
    adj=False,
    return_wavefield=False,
    use_boundary_saving=None,
    boundary_saving_config=None,
    **kwargs,
)
  • wavelet (array-like): Source time function. Accepted layouts are (nt,), (B, nt), (B, nsrc, nt), and the source-encoding super-shot layout (1, nsrc, nt).
  • sources (array-like): Source coordinates. Accepted layouts are (B, dim) and (B, nsrc, dim), including (1, nsrc, dim) for a source-encoding super-shot.
  • receivers (array-like): Receiver coordinates. This path also expects batched receiver coordinates: typically (B, nreceivers, dim), or (1, nreceivers, dim) for a source-encoding super-shot.
  • models (list[torch.Tensor], optional): List of model tensors in the exact order required by equation.models. They are padded and expanded across the active batch before being passed into the binding.
  • source_encoding (bool, optional): If True, runs with a single encoded batch instead of one batch element per shot. PropCUDA also auto-detects source encoding when the runtime inputs use (1, nsrc, nt), (1, nsrc, dim), and (1, nreceivers, dim).
  • adj (bool, optional): Adjoint-style forward switch.
  • return_wavefield (bool, optional): Present in the signature, but the current main CUDA forward path still returns only the synthetic data.
  • use_boundary_saving (bool, optional): Runtime override for enabling boundary saving.
  • boundary_saving_config (dict, optional): Runtime override for the boundary-saving policy.

In the shape descriptions above:

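  • B is the runtime batch size
  • nt is the number of time samples
  • nsrc is the number of sources inside one batch element
  • nreceivers is the number of receivers inside one batch element
  • dim is 2 in 2D and 3 in 3D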
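The accepted layouts can be checked from shapes alone before calling forward. The function check_forward_shapes below is a hypothetical helper for illustration and is not part of PropCUDA. It assumes that the batch size is read from the sources, that a (nt,) wavelet is shared across the batch, and that a (B, dim) source array means one source per batch element. It only reports whether the inputs match the documented super-shot layout; the detection rule PropCUDA actually applies may be stricter.

```python
from typing import Dict, Tuple, Union

Shape = Tuple[int, ...]


def check_forward_shapes(
    wavelet: Shape, sources: Shape, receivers: Shape, dim: int
) -> Dict[str, Union[int, bool]]:
    """Validate the documented layouts and report the sizes they imply."""
    if len(wavelet) not in (1, 2, 3):
        raise ValueError("wavelet must be (nt,), (B, nt), or (B, nsrc, nt)")
    if len(sources) not in (2, 3) or sources[-1] != dim:
        raise ValueError("sources must be (B, dim) or (B, nsrc, dim)")
    if len(receivers) != 3 or receivers[-1] != dim:
        raise ValueError("receivers must be (B, nreceivers, dim)")

    batch = sources[0]  # assumption: batch size comes from the sources
    nsrc = sources[1] if len(sources) == 3 else 1
    if receivers[0] != batch:
        raise ValueError("sources and receivers disagree on the batch size")
    if len(wavelet) >= 2 and wavelet[0] != batch:
        raise ValueError("wavelet and sources disagree on the batch size")
    if len(wavelet) == 3 and wavelet[1] != nsrc:
        raise ValueError("wavelet and sources disagree on nsrc")

    # Super-shot layout: (1, nsrc, nt), (1, nsrc, dim), (1, nreceivers, dim).
    super_shot = batch == 1 and len(wavelet) == 3 and len(sources) == 3
    return {
        "B": batch,
        "nsrc": nsrc,
        "nt": wavelet[-1],
        "nreceivers": receivers[1],
        "super_shot_layout": super_shot,
    }
```

For example, shapes (1, 8, 500), (1, 8, 2), and (1, 64, 2) in 2D match the super-shot layout, while a shared (500,) wavelet with (4, 2) sources and (4, 64, 2) receivers describes four ordinary single-source shots.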

RTM Parameters

  • adjoint_source (array-like): Input data for reverse-time migration. Accepted layouts are (B, nt, nrec[, 1]) and (B, nrec, nt), where nrec is the number of receivers (written nreceivers in the forward layouts).
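Because both adjoint_source layouts are three-dimensional in the common case, it can help to confirm which one an array uses before passing it in. The function adjoint_source_layout below is a hypothetical helper, not part of PropCUDA, and it assumes the time-step count nt is already known. When the receiver count equals nt, the two 3D layouts cannot be told apart by shape, and the sketch reports that case rather than guessing. How PropCUDA resolves it is not stated here.

```python
from typing import Tuple


def adjoint_source_layout(shape: Tuple[int, ...], nt: int) -> str:
    """Classify an adjoint_source shape as time-major or receiver-major."""
    if len(shape) == 4:
        # Only (B, nt, nrec, 1) is a documented 4D layout.
        if shape[3] != 1 or shape[1] != nt:
            raise ValueError("4D input must be (B, nt, nrec, 1)")
        return "time_major"
    if len(shape) != 3:
        raise ValueError("expected (B, nt, nrec[, 1]) or (B, nrec, nt)")

    _, second, third = shape
    if second == nt and third == nt:
        return "ambiguous"  # nrec == nt, so shape alone cannot decide
    if second == nt:
        return "time_major"  # (B, nt, nrec)
    if third == nt:
        return "receiver_major"  # (B, nrec, nt)
    raise ValueError("no axis matches nt")
```

With nt = 500, a shape of (4, 500, 64) is classified as time-major and (4, 64, 500) as receiver-major.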

Return Value

  • forward(...): synthetic data record
  • rtm(...): (syn, image, source_illumination, receiver_illumination)