Cone Beam FDK Reconstruction

This example demonstrates 3D cone-beam FDK (Feldkamp-Davis-Kress) reconstruction on a 3D Shepp-Logan phantom using the public call chain from diffct:

ConeProjectorFunction.apply    -- differentiable cone forward projection
cone_cosine_weights            -- 1/r^2 cosine pre-weighting
ramp_filter_1d                 -- row-wise ramp filter along detector-u
angular_integration_weights    -- per-view integration weights
cone_weighted_backproject      -- voxel-driven FDK backprojection gather

The accompanying source file is examples/fdk_cone.py.

Overview

The FDK algorithm is the standard analytical method for 3D cone-beam CT reconstruction on a circular source orbit. This example shows how to:

  • Configure the 3D cone-beam geometry (source, detector, orbit).

  • Generate cone-beam projections of a 3D Shepp-Logan phantom.

  • Apply FDK cosine pre-weighting, ramp filtering (with optional zero-padding and frequency-domain windowing), and angular-integration weights.

  • Run the dedicated voxel-driven FDK gather kernel through cone_weighted_backproject to reconstruct the volume, already amplitude-calibrated.

Mathematical Background

Cone-Beam Geometry

Cone-beam CT uses a point X-ray source and a 2D detector. Key parameters:

  • \(D\) (sid): Source-to-Isocenter Distance. Source position at angle \(\phi\) is \(\vec{r}_s(\phi) = (-D \sin\phi, D \cos\phi, 0)\).

  • \(D_{sd}\) (sdd): Source-to-Detector Distance. The detector plane is perpendicular to the central ray, sdd away from the source. The magnification at isocenter is \(D_{sd}/D\).

  • \((u, v)\): detector in-plane and axial coordinates.

  • \((du, dv)\): detector cell pitch.

3D Forward Projection

The cone-beam projection at source angle \(\phi\) and detector position \((u, v)\) is

\[p(\phi, u, v) = \int_0^{\infty} f\!\left(\vec{r}_s(\phi) + t \cdot \vec{d}(\phi, u, v)\right) \, dt\]

where \(\vec{d}\) is the unit ray direction from source to detector cell. ConeProjectorFunction implements this integral via a Siddon ray-march with trilinear interpolation.

FDK Algorithm

The Feldkamp-Davis-Kress formula reconstructs an approximation of the 3D volume in three steps:

  1. Cosine pre-weighting of the raw projection:

    \[p_w(\phi, u, v) = \frac{D_{sd}}{\sqrt{D_{sd}^2 + u^2 + v^2}}\, p(\phi, u, v).\]
  2. Row-wise ramp filtering along the detector-u direction:

    \[p_f(\phi, u, v) = \mathcal{F}_u^{-1}\bigl\{\,|\omega_u|\, \mathcal{F}_u\{p_w(\phi, u, v)\}\bigr\}.\]

    Each detector row is filtered independently. ramp_filter_1d lets you choose between Ram-Lak, Hann, Hamming, Cosine and Shepp-Logan shapes (see below).

  3. Voxel-driven backprojection with the classical FDK distance weight:

    \[f(x, y, z) = \frac{1}{2} \int_0^{2\pi} \frac{D^2}{U(\phi, x, y)^2}\, p_f\!\left(\phi, u_{xyz}, v_{xyz}\right) d\phi,\]

    where \(U(\phi, x, y) = D + x \sin\phi - y \cos\phi\) is the distance from the source to the voxel along the central ray direction, and the voxel’s projected detector coordinates are

    \[u_{xyz} = \frac{D_{sd}}{U}\,(x \cos\phi + y \sin\phi), \qquad v_{xyz} = \frac{D_{sd}}{U}\, z.\]

    cone_weighted_backproject dispatches to a dedicated voxel-driven gather kernel for this step. For every voxel it computes the projected \((u, v)\), bilinearly samples the filtered sinogram, multiplies by \((D/U)^2\), and accumulates across views. The Siddon-based autograd cone backprojector is reserved for the differentiable path used by iterative reconstruction and is not touched by this analytical pipeline.

Implementation Steps

The main() function in examples/fdk_cone.py goes through eight stages:

  1. Volume geometry – set Nx, Ny, Nz, voxel_spacing.

  2. Source trajectory – sample num_views angles on \([0, 2\pi)\).

  3. Detector geometry – set det_u, det_v, du, dv, detector_offset_u, detector_offset_v, sdd, sid.

  4. CUDA transfer – move phantom and angles to the GPU.

  5. Forward projection via ConeProjectorFunction.apply.

  6. FDK pipeline:

    1. cone_cosine_weights -> pre-weighted sinogram.

    2. ramp_filter_1d with sample_spacing=du, pad_factor=2, window="hann".

    3. angular_integration_weights with redundant_full_scan=True.

    4. cone_weighted_backproject -> raw FDK volume.

    5. F.relu -> optional non-negativity clamp.

  7. Quantitative summary – print raw MSE, clamped MSE, raw / clamped reconstruction ranges.

  8. Visualization – mid-slice phantom, mid-view sinogram, mid-slice reconstruction.

Ramp Filter Options

ramp_filter_1d accepts several high-precision parameters (all optional and fully backward compatible with the historical ramp_filter_1d(sino, dim=1) call):

  • sample_spacing – physical detector pitch along dim. Set this to du for cone-beam FDK so the filter output is in physical units and the reconstructed amplitude is detector-pitch independent.

  • pad_factor – zero-pad the detector dimension to pad_factor * N before the FFT. 2 is recommended for FDK.

  • window – frequency-domain apodization. Options: None / "ram-lak" (unwindowed), "hann" (used in this example), "hamming", "cosine", "shepp-logan".

  • use_rfft – use torch.fft.rfft for real inputs (default True, ~2x faster than the complex path).

3D Shepp-Logan Phantom

The 3D phantom extends the 2D version with 10 ellipsoids:

  • A large outer skull ellipsoid.

  • Mid-sized ellipsoids for brain tissue contrast.

  • Small ellipsoids representing ventricles and high-contrast lesions.

Each ellipsoid is described by a center \((x_0, y_0, z_0)\), semi-axes \((a, b, c)\), a rotation angle \(\phi\) around the z-axis, and an attenuation amplitude.

FDK Approximations and Limitations

FDK uses a few approximations on circular orbits:

  • Circular source trajectory (no helical motion).

  • Row-wise 1D ramp filtering rather than a 2D filter.

  • Small-cone approximation – reconstruction quality degrades as the cone angle grows.

These introduce cone-beam artifacts at large cone angles, but FDK remains the default analytical method due to its simplicity and efficiency.

Source

3D Cone Beam FDK Example
  1"""Circular-orbit cone-beam FDK reconstruction example.
  2
  3This script demonstrates the full analytical FDK pipeline on a 3D
  4Shepp-Logan phantom using ``diffct``. The call chain is:
  5
  6    ConeProjectorFunction.apply    # forward projection (sinogram)
  7    cone_cosine_weights            # 1/r^2 cosine pre-weight
  8    ramp_filter_1d                 # row-wise ramp filter along u
  9    angular_integration_weights    # per-view integration weights
 10    cone_weighted_backproject      # voxel-driven FDK gather
 11
 12Every geometry and reconstruction parameter below is annotated with
 13its meaning, units, and available options so the script can be used
 14as a reference for configuring the other cone-beam entry points in
 15the library.
 16"""
 17
 18import math
 19
 20import matplotlib.pyplot as plt
 21import numpy as np
 22import torch
 23import torch.nn.functional as F
 24
 25from diffct.differentiable import (
 26    ConeProjectorFunction,
 27    angular_integration_weights,
 28    cone_cosine_weights,
 29    cone_weighted_backproject,
 30    parker_weights,
 31    ramp_filter_1d,
 32)
 33
 34
 35def shepp_logan_3d(shape):
 36    """Build a 3D Shepp-Logan phantom.
 37
 38    Parameters
 39    ----------
 40    shape : tuple of int
 41        Volume shape as ``(Nz, Ny, Nx)`` in the same order the rest of the
 42        library uses. Values are clipped to ``[0, 1]``.
 43    """
 44    # Voxel grid normalized to [-1, 1] on each axis so the standard
 45    # Shepp-Logan ellipsoid parameters can be reused directly.
 46    zz, yy, xx = np.mgrid[: shape[0], : shape[1], : shape[2]]
 47    xx = (xx - (shape[2] - 1) / 2) / ((shape[2] - 1) / 2)
 48    yy = (yy - (shape[1] - 1) / 2) / ((shape[1] - 1) / 2)
 49    zz = (zz - (shape[0] - 1) / 2) / ((shape[0] - 1) / 2)
 50
 51    # Columns: x0, y0, z0, a, b, c, phi, _, _, amplitude.
 52    el_params = np.array(
 53        [
 54            [0,     0,       0,     0.69,  0.92,  0.81, 0,             0, 0,  1.0],
 55            [0,    -0.0184,  0,     0.6624, 0.874, 0.78, 0,            0, 0, -0.8],
 56            [0.22,  0,       0,     0.11,  0.31,  0.22, -np.pi / 10.0, 0, 0, -0.2],
 57            [-0.22, 0,       0,     0.16,  0.41,  0.28,  np.pi / 10.0, 0, 0, -0.2],
 58            [0,     0.35,   -0.15,  0.21,  0.25,  0.41, 0,             0, 0,  0.1],
 59            [0,     0.10,    0.25,  0.046, 0.046, 0.05, 0,             0, 0,  0.1],
 60            [0,    -0.10,    0.25,  0.046, 0.046, 0.05, 0,             0, 0,  0.1],
 61            [-0.08,-0.605,   0,     0.046, 0.023, 0.05, 0,             0, 0,  0.1],
 62            [0,    -0.605,   0,     0.023, 0.023, 0.02, 0,             0, 0,  0.1],
 63            [0.06, -0.605,   0,     0.023, 0.046, 0.02, 0,             0, 0,  0.1],
 64        ],
 65        dtype=np.float32,
 66    )
 67
 68    x_pos = el_params[:, 0][:, None, None, None]
 69    y_pos = el_params[:, 1][:, None, None, None]
 70    z_pos = el_params[:, 2][:, None, None, None]
 71    a_axis = el_params[:, 3][:, None, None, None]
 72    b_axis = el_params[:, 4][:, None, None, None]
 73    c_axis = el_params[:, 5][:, None, None, None]
 74    phi = el_params[:, 6][:, None, None, None]
 75    val = el_params[:, 9][:, None, None, None]
 76
 77    xc = xx[None, ...] - x_pos
 78    yc = yy[None, ...] - y_pos
 79    zc = zz[None, ...] - z_pos
 80
 81    c = np.cos(phi)
 82    s = np.sin(phi)
 83    # Each ellipsoid only rotates around the z-axis here.
 84    xp = c * xc - s * yc
 85    yp = s * xc + c * yc
 86    zp = zc
 87
 88    mask = (
 89        (xp ** 2) / (a_axis ** 2)
 90        + (yp ** 2) / (b_axis ** 2)
 91        + (zp ** 2) / (c_axis ** 2)
 92        <= 1.0
 93    )
 94
 95    phantom = np.sum(mask * val, axis=0)
 96    return np.clip(phantom, 0.0, 1.0)
 97
 98
 99def main():
100    # ------------------------------------------------------------------
101    # 1. Volume geometry
102    # ------------------------------------------------------------------
103    # ``Nx / Ny / Nz`` are the number of voxels along each axis. The
104    # phantom tensor has shape ``(Nz, Ny, Nx)`` which matches the
105    # ``(D, H, W)`` layout that every cone-beam routine in diffct
106    # expects. Making the volume larger increases the reconstruction
107    # grid but also increases memory and runtime roughly ``O(N^3)``.
108    Nx, Ny, Nz = 128, 128, 128
109    phantom_cpu = shepp_logan_3d((Nz, Ny, Nx))
110
111    # ``voxel_spacing`` is the physical size of one voxel in the same
112    # length unit used by ``du``, ``dv``, ``sdd`` and ``sid`` below
113    # (commonly millimeters). All geometry math inside the CUDA kernels
114    # is done in voxel units, and physical spacings are divided by
115    # ``voxel_spacing`` internally, so only the *ratios* matter.
116    voxel_spacing = 1.0
117
118    # ------------------------------------------------------------------
119    # 2. Detector geometry
120    # ------------------------------------------------------------------
121    # (Listed before the trajectory so the short-scan coverage below can
122    # use the detector fan angle to compute ``pi + 2*gamma_max``.)
123    #
124    # ``det_u`` / ``det_v`` are the number of detector cells along the
125    # u (in-plane / horizontal) and v (axial / vertical) directions.
126    # ``du`` / ``dv`` are their physical spacings. Together they define
127    # the detector size in physical units:
128    #     detector_width_u  = det_u * du
129    #     detector_height_v = det_v * dv
130    # Make sure the detector is large enough that no ray that intersects
131    # the reconstructed field-of-view ever projects outside it; rays
132    # that miss the detector are zero-filled and create truncation
133    # artifacts.
134    det_u, det_v = 256, 256
135    du, dv = 1.0, 1.0
136
137    # Principal-ray offsets. A nonzero ``detector_offset_u`` models a
138    # detector that is shifted sideways relative to the ideal principal
139    # ray (useful for half-fan / offset-detector acquisitions). These
140    # values are in the same physical unit as ``du`` / ``dv``.
141    detector_offset_u = 0.0
142    detector_offset_v = 0.0
143
144    # ``sdd`` = source-to-detector distance, ``sid`` = source-to-
145    # isocenter distance, both in physical units. Their ratio
146    # ``sdd / sid`` is the geometric magnification at the detector.
147    # Typical clinical cone-beam systems have a magnification around
148    # 1.3 - 2.0; here we pick 1.5.
149    sdd = 900.0
150    sid = 600.0
151
152    # ------------------------------------------------------------------
153    # 3. Source trajectory (circular orbit)
154    # ------------------------------------------------------------------
155    # ``apply_parker`` selects between two supported trajectories:
156    #
157    #   False -> full 2*pi circular scan. Each ray is sampled twice
158    #            (once going one way, once the opposite), so the FDK
159    #            formula carries a 1/2 redundancy factor which is
160    #            absorbed by ``redundant_full_scan=True`` inside the
161    #            angular integration weights.
162    #
163    #   True  -> minimal short scan of length ``pi + 2*gamma_max``
164    #            where ``gamma_max = atan(u_max / sdd)`` is the maximum
165    #            fan angle. Every ray is sampled *at least once*, some
166    #            rays twice; the standard Parker window smoothly tapers
167    #            the duplicate regions so each ray's total contribution
168    #            integrates to pi (the same effective weight as the
169    #            full-scan case after the 1/2 redundancy factor), and
170    #            the angular weights use plain trapezoidal rule
171    #            (``redundant_full_scan=False``).
172    #
173    # Short scans are common on clinical C-arm and cone-beam systems
174    # where a full 2*pi rotation is mechanically impossible.
175    apply_parker = False
176
177    if apply_parker:
178        u_max = ((det_u - 1) * 0.5) * du + abs(detector_offset_u)
179        gamma_max = math.atan(u_max / sdd)
180        scan_range = math.pi + 2.0 * gamma_max
181    else:
182        scan_range = 2.0 * math.pi
183
184    # ``num_views`` is the number of projection angles sampled on the
185    # orbit. More views generally mean smaller angular aliasing and a
186    # cleaner FDK reconstruction, at the cost of memory and runtime
187    # linear in ``num_views``.
188    num_views = 360
189
190    # Angles in radians. ``endpoint=False`` avoids duplicating the start
191    # angle at the end of the sweep.
192    angles_np = np.linspace(
193        0.0, scan_range, num_views, endpoint=False
194    ).astype(np.float32)
195
196    # ------------------------------------------------------------------
197    # 4. Move everything to CUDA
198    # ------------------------------------------------------------------
199    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
200    phantom_torch = torch.tensor(
201        phantom_cpu, device=device, dtype=torch.float32
202    ).contiguous()
203    angles_torch = torch.tensor(angles_np, device=device, dtype=torch.float32)
204
205    # ------------------------------------------------------------------
206    # 4.5  Pick a forward projector backend
207    # ------------------------------------------------------------------
208    # ``ConeProjectorFunction`` / ``ConeBackprojectorFunction`` both accept
209    # a ``backend`` keyword that selects the underlying CUDA kernel family.
210    # The choice applies to both forward and adjoint, and each option ships
211    # with a matched scatter/gather pair so autograd and the standalone
212    # Backprojector Function work unchanged.
213    #
214    #   "siddon"           - 3D ray-driven Siddon traversal with
215    #                        trilinear interpolation. One thread per
216    #                        (view, det_u, det_v). Fastest forward.
217    #                        Pick this when you only need a forward
218    #                        projection and don't need a matched cell-
219    #                        integrated model.
220    #
221    #   "sf_tr"            - 3D voxel-driven separable-footprint with a
222    #                        trapezoidal transaxial (u) footprint and a
223    #                        rectangular axial (v) footprint evaluated
224    #                        at the voxel-centre magnification. **Mass-
225    #                        conserving per voxel**, closed-form
226    #                        integrated over each detector cell in
227    #                        both u and v. About 2x slower than siddon
228    #                        in our benchmarks. On analytical FDK with
229    #                        the matched SF gather backprojector
230    #                        (``cone_weighted_backproject(backend=
231    #                        "sf_tr")``) SF and Siddon VD produce
232    #                        visually equivalent edge profiles on
233    #                        Shepp-Logan and the walnut example at
234    #                        typical CBCT geometries; the "SF is sharper
235    #                        at sub-nominal" claim you see in the SF /
236    #                        LEAP literature only shows up in an extreme
237    #                        sub-nominal regime that isn't hit by the
238    #                        default geometries shipped here. The real
239    #                        reason to pick "sf_tr" is that you want a
240    #                        **cell-integrated forward model** for
241    #                        iterative reco / learned priors / sinogram
242    #                        losses.
243    #
244    #   "sf_tt"            - Same transaxial trapezoid as SF-TR but the
245    #                        axial footprint is ALSO a trapezoid, built
246    #                        from four ``(U_near, U_far) x (z_bot, z_top)``
247    #                        corner projections. Captures the variation
248    #                        of axial magnification inside a single voxel
249    #                        at large cone angles. In practice this
250    #                        refines the SF-TR result marginally at the
251    #                        cost of ~40% more runtime.
252    #
253    # All three backends are byte-accurate matched adjoint pairs on the
254    # autograd path (verified by ``tests/test_adjoint_inner_product.py``).
255    # The default is kept at "sf_tr" so the reader sees the SF path run
256    # end-to-end; switching to "siddon" gives a visually equivalent FDK
257    # reconstruction at this geometry.
258    projector_backend = "sf_tr"
259
260    # ------------------------------------------------------------------
261    # 5. Forward projection: volume -> sinogram
262    # ------------------------------------------------------------------
263    # ``ConeProjectorFunction`` is the differentiable cone-beam forward
264    # projector. It returns a (num_views, det_u, det_v) sinogram. This
265    # call is autograd-aware, so using the same function inside an
266    # iterative reconstruction loop is supported (see
267    # ``iterative_reco_cone.py``). ``backend`` selects the CUDA kernel
268    # family used for both the forward and its adjoint - see step 4.5
269    # above for the trade-offs.
270    sinogram = ConeProjectorFunction.apply(
271        phantom_torch,
272        angles_torch,
273        det_u,
274        det_v,
275        du,
276        dv,
277        sdd,
278        sid,
279        voxel_spacing,
280        detector_offset_u,
281        detector_offset_v,
282        0.0,                # center_offset_x
283        0.0,                # center_offset_y
284        0.0,                # center_offset_z
285        projector_backend,
286    )
287
288    # ==================================================================
289    # 6. FDK analytical reconstruction
290    # ==================================================================
291
292    # --- 6.0  Optional Parker redundancy weighting -------------------
293    # ``parker_weights`` returns a ``(num_views, det_u)`` tensor that
294    # tapers rays in the redundantly-sampled regions of a short scan.
295    # For a full 2*pi scan it detects the coverage and returns all
296    # ones (no-op), so this branch is safe to run unconditionally; we
297    # gate it on ``apply_parker`` just for clarity.
298    #
299    # The returned weight depends on the in-plane fan angle only, so
300    # we broadcast it across the v (axial) direction via ``unsqueeze(-1)``.
301    if apply_parker:
302        parker = parker_weights(
303            angles_torch,
304            det_u,
305            du,
306            sdd,
307            detector_offset=detector_offset_u,
308        )
309        sinogram = sinogram * parker.unsqueeze(-1)
310
311    # --- 6.1  Cosine pre-weight --------------------------------------
312    # Multiplies each detector pixel by ``sdd / sqrt(sdd^2 + u^2 + v^2)``,
313    # i.e. the cosine of the cone angle. This compensates for the
314    # extra path length that off-center rays traverse relative to the
315    # principal ray. The ``unsqueeze(0)`` broadcasts the 2D weight over
316    # every view.
317    weights = cone_cosine_weights(
318        det_u,
319        det_v,
320        du,
321        dv,
322        sdd,
323        detector_offset_u=detector_offset_u,
324        detector_offset_v=detector_offset_v,
325        device=device,
326        dtype=phantom_torch.dtype,
327    ).unsqueeze(0)
328    sino_weighted = sinogram * weights
329
330    # --- 6.2  1D ramp filter along the detector-u direction ----------
331    # ``ramp_filter_1d`` is a generic building block. For high-quality
332    # FDK reconstruction the recommended arguments are:
333    #
334    #   dim : int
335    #       The detector-u axis. For sinograms of shape
336    #       ``(views, u, v)`` that axis is ``1``.
337    #
338    #   sample_spacing : float
339    #       Physical detector-u spacing ``du``. The ramp filter is
340    #       rescaled by ``1 / sample_spacing`` so that the output is
341    #       in physical units, and the amplitude of the reconstruction
342    #       stays calibrated across different detector pitches.
343    #
344    #   pad_factor : int
345    #       Zero-pad the signal to ``pad_factor * N`` along ``dim``
346    #       before the FFT. Options:
347    #           1 (default)  - no padding, fastest, but circular
348    #                          convolution wrap-around can contaminate
349    #                          the edges of the reconstruction.
350    #           2 (typical)  - good balance for cone-beam FDK.
351    #           4 (strict)   - used when the object is very close to
352    #                          the detector edges.
353    #
354    #   window : str or None
355    #       Frequency-domain apodization applied on top of the bare
356    #       ramp. Available options are:
357    #           None / "ram-lak" - sharp Ram-Lak ramp, no smoothing.
358    #           "hann"           - Hann window, strongly suppresses
359    #                              high-frequency noise. Used below.
360    #           "hamming"        - slightly milder than Hann.
361    #           "cosine"         - equivalent to a half-cosine rolloff.
362    #           "shepp-logan"    - sinc rolloff, classical choice.
363    #
364    #   use_rfft : bool
365    #       Use ``torch.fft.rfft`` instead of ``fft`` when the input is
366    #       real. Default ``True``; set to ``False`` only if you need
367    #       the complex path (e.g. to reuse an already-complex buffer).
368    sinogram_filt = ramp_filter_1d(
369        sino_weighted,
370        dim=1,
371        sample_spacing=du,
372        pad_factor=2,
373        window="hann",
374    ).contiguous()
375
376    # --- 6.3  Per-view angular integration weights -------------------
377    # For a full 2*pi scan each view contributes ``pi / num_views`` to
378    # the FDK integral: the ``1/2`` redundancy factor of the FDK
379    # formula is absorbed inside ``redundant_full_scan=True``. For a
380    # Parker short scan the redundancy has already been handled by the
381    # Parker window above, so we fall back to a plain trapezoidal rule
382    # with ``redundant_full_scan=False`` (the 1/2 factor would then be
383    # incorrect).
384    d_beta = angular_integration_weights(
385        angles_torch, redundant_full_scan=(not apply_parker)
386    ).view(-1, 1, 1)
387    sinogram_filt = sinogram_filt * d_beta
388
389    # --- 6.4  Voxel-driven FDK backprojection -------------------------
390    # ``cone_weighted_backproject`` dispatches to one of three voxel-
391    # driven gather kernels based on ``backend``:
392    #
393    #   "siddon" (default) - bilinear gather: for each voxel compute
394    #                        its projected detector ``(u, v)``,
395    #                        bilinearly sample the filtered sinogram
396    #                        and accumulate ``(sid/U)^2 * sample``.
397    #   "sf_tr"            - LEAP-style chord-weighted separable-
398    #                        footprint gather: integrate the filtered
399    #                        sinogram over each voxel's transaxial
400    #                        trapezoidal footprint and axial
401    #                        rectangular footprint, weighted by the
402    #                        in-plane chord through the voxel and the
403    #                        ``sqrt(1+(v/sdd)^2)`` axial correction
404    #                        (matches the tilt==0 branch of LEAP's
405    #                        ``coneBeamBackprojectorKernel_SF`` in
406    #                        ``projectors_SF.cu``). Matches the SF-TR
407    #                        forward projector.
408    #   "sf_tt"            - same as "sf_tr" but the axial footprint
409    #                        is also a trapezoid built from four
410    #                        z-corner projections. Matches SF-TT.
411    #
412    # On Shepp-Logan and the walnut example all three backends give
413    # visually indistinguishable edge profiles and amplitude matches
414    # within ~1%; the SF backends are worth the ~2-3x forward cost
415    # only if you also want a cell-integrated *forward* model
416    # (iterative reco, learned priors, sinogram losses). Here we pass
417    # the same ``projector_backend`` we picked at step 4.5 so forward
418    # and backward stay consistent. Amplitude is calibrated inside
419    # the wrapper so the returned volume is ready to compare against
420    # ``phantom_torch`` directly.
421    reconstruction_raw = cone_weighted_backproject(
422        sinogram_filt,
423        angles_torch,
424        Nz,
425        Ny,
426        Nx,
427        du,
428        dv,
429        sdd,
430        sid,
431        voxel_spacing=voxel_spacing,
432        detector_offset_u=detector_offset_u,
433        detector_offset_v=detector_offset_v,
434        backend=projector_backend,
435    )
436
437    # Optional non-negativity clamp. FDK can produce small negative
438    # values near sharp edges because the ramp filter has negative
439    # lobes in the spatial domain; for visualization and quantitative
440    # comparison against the non-negative phantom we clamp with a ReLU.
441    reconstruction = F.relu(reconstruction_raw)
442
443    # ------------------------------------------------------------------
444    # 7. Quantitative summary
445    # ------------------------------------------------------------------
446    raw_loss = torch.mean((reconstruction_raw - phantom_torch) ** 2)
447    clamped_loss = torch.mean((reconstruction - phantom_torch) ** 2)
448
449    scan_label = "Parker short scan" if apply_parker else "full 2*pi scan"
450    print(f"Cone Beam FDK example ({scan_label}):")
451    print(f"  Raw MSE:             {raw_loss.item():.6f}")
452    print(f"  Clamped MSE:         {clamped_loss.item():.6f}")
453    print(f"  Reconstruction shape: {tuple(reconstruction.shape)}")
454    print(
455        "  Raw reco data range:  "
456        f"[{reconstruction_raw.min().item():.4f}, {reconstruction_raw.max().item():.4f}]"
457    )
458    print(
459        "  Clamped reco range:   "
460        f"[{reconstruction.min().item():.4f}, {reconstruction.max().item():.4f}]"
461    )
462    print(
463        "  Phantom data range:   "
464        f"[{float(phantom_cpu.min()):.4f}, {float(phantom_cpu.max()):.4f}]"
465    )
466
467    # ------------------------------------------------------------------
468    # 8. Visualization
469    # ------------------------------------------------------------------
470    reconstruction_cpu = reconstruction.detach().cpu().numpy()
471    sinogram_cpu = sinogram.detach().cpu().numpy()
472    mid_slice = Nz // 2
473
474    plt.figure(figsize=(12, 4))
475    plt.subplot(1, 3, 1)
476    plt.imshow(phantom_cpu[mid_slice, :, :], cmap="gray")
477    plt.title("Phantom mid-slice")
478    plt.axis("off")
479    plt.subplot(1, 3, 2)
480    # Transpose so the v axis is vertical in the figure.
481    plt.imshow(sinogram_cpu[num_views // 2].T, cmap="gray", origin="lower")
482    plt.title("Sinogram mid-view")
483    plt.axis("off")
484    plt.subplot(1, 3, 3)
485    plt.imshow(reconstruction_cpu[mid_slice, :, :], cmap="gray")
486    plt.title("Recon mid-slice")
487    plt.axis("off")
488    plt.tight_layout()
489    plt.show()
490
491
492if __name__ == "__main__":
493    main()