SD/CG Optimization

Overview

The SD/CG optimizer is a hybrid steepest-descent / conjugate-gradient method with optional DIIS acceleration. It begins with a steepest descent phase to rapidly reduce large forces, then automatically switches to conjugate gradient for more efficient convergence near a minimum. This is the recommended choice when a starting geometry may be significantly distorted. It is invoked with method=sdcg (or simply with the keyword sdcg, which defaults to this mode).

The switch from SD to CG is controlled by sd_max_iter and cg_switch_fmax: the optimizer moves to CG once either the SD iteration limit is reached or the maximum force drops below the specified threshold. The parameters sd_enabled and cg_enabled are derived internally from the method keyword and are not intended to be set directly.
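The switching rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the optimizer's actual code: the function name should_switch_to_cg and the exact comparison operators (>= for the iteration limit, < for the force test) are assumptions.

```python
def should_switch_to_cg(sd_iter, fmax, sd_max_iter=50, cg_switch_fmax=0.0):
    """Return True once the SD phase should hand over to CG.

    sd_iter: number of SD iterations completed so far.
    fmax:    current maximum force (eV/Å).
    Hypothetical helper; defaults mirror the parameter table.
    """
    # Criterion 1: the SD iteration budget is used up.
    hit_iter_limit = sd_iter >= sd_max_iter
    # Criterion 2: forces are already small. A threshold of 0 disables
    # this test, so the switch then depends on sd_max_iter alone.
    below_force = cg_switch_fmax > 0.0 and fmax < cg_switch_fmax
    return hit_iter_limit or below_force
```

With the defaults, the switch happens after 50 SD iterations regardless of the forces. With a non-zero threshold, for example should_switch_to_cg(10, 0.05, cg_switch_fmax=0.1), the switch would already happen at iteration 10 because the maximum force is below the threshold.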

Parameters

Parameter             Type   Default  Description
max_step              float  0.2      Maximum step length in Å.
max_iter              int    256      Maximum number of optimization cycles.
sd_max_iter           int    50       Maximum iterations in the initial SD phase before switching to CG.
cg_switch_fmax        float  0.0      Switch from SD to CG when the maximum force drops below this value (eV/Å). Set to 0 to switch purely based on sd_max_iter.
cg_restart_threshold  float  0.2      Restart CG directions when the gradient change ratio exceeds this value.
cg_beta_method        str    "prp+"   Conjugate gradient beta formula.
diis_enabled          bool   True     Enable DIIS extrapolation acceleration.
diis_store_every      int    5        Store a DIIS snapshot every N steps.
diis_min_snapshots    int    3        Minimum snapshots required before DIIS extrapolation is attempted.
diis_memory           int    6        Maximum number of DIIS snapshots stored.
write_traj            bool   False    Write intermediate geometries to a trajectory file.
traj_every            int    1        Write trajectory frame every N steps.
verbose               int    1        Output verbosity level.
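To make the CG and DIIS parameters more concrete, the following Python sketch shows how they might enter a single iteration. It is not the optimizer's implementation. The PRP+ formula is the standard Polak-Ribière-Polyak coefficient clipped at zero, but everything else is an assumption made for illustration: the "gradient change ratio" is taken to be a Powell-style test |g_new·g_old| / |g_new|², max_step is taken to cap the norm of the whole step vector, the step counter for DIIS starts at 0, and all function and class names are hypothetical. Here g denotes the energy gradient (the negative of the forces), and the DIIS part covers snapshot bookkeeping only, not the extrapolation.

```python
import math
from collections import deque


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def prp_plus_beta(g_new, g_old):
    """PRP+ coefficient: max(0, g_new.(g_new - g_old) / g_old.g_old)."""
    denom = dot(g_old, g_old)
    if denom == 0.0:
        return 0.0
    diff = [n - o for n, o in zip(g_new, g_old)]
    return max(0.0, dot(g_new, diff) / denom)


def needs_restart(g_new, g_old, cg_restart_threshold=0.2):
    # ASSUMPTION: Powell-style test; successive gradients that are far
    # from orthogonal trigger a restart.
    gg = dot(g_new, g_new)
    return gg > 0.0 and abs(dot(g_new, g_old)) / gg > cg_restart_threshold


def cg_direction(g_new, g_old, d_old, cg_restart_threshold=0.2):
    """New search direction; a restart falls back to steepest descent."""
    if needs_restart(g_new, g_old, cg_restart_threshold):
        beta = 0.0
    else:
        beta = prp_plus_beta(g_new, g_old)
    return [-g + beta * d for g, d in zip(g_new, d_old)]


def cap_step(step, max_step=0.2):
    # ASSUMPTION: max_step limits the norm of the whole step vector (Å).
    norm = math.sqrt(dot(step, step))
    if norm <= max_step:
        return list(step)
    return [s * max_step / norm for s in step]


class DiisHistory:
    """Snapshot bookkeeping only; the extrapolation itself is not shown."""

    def __init__(self, diis_store_every=5, diis_min_snapshots=3, diis_memory=6):
        self.store_every = diis_store_every
        self.min_snapshots = diis_min_snapshots
        # A bounded deque drops the oldest snapshot once memory is full.
        self.snapshots = deque(maxlen=diis_memory)

    def maybe_store(self, step, coords, gradient):
        if step % self.store_every == 0:
            self.snapshots.append((list(coords), list(gradient)))

    def ready(self):
        return len(self.snapshots) >= self.min_snapshots
```

Under these assumptions and the default values, a snapshot is stored at steps 0, 5, 10, and so on, extrapolation becomes possible once three snapshots exist (step 10), and at most the six most recent snapshots are kept.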

Input Example

#model=uma
#opt(method=sdcg)
#device=gpu0

C      -0.77812600     -1.06756100      0.32105900
C       1.30255300      0.05212000     -0.02829900
C      -0.97199300      1.45624900      0.82365700
C       1.98122900     -0.41843300      1.25017500
C       2.26403300      0.43516800     -1.13310700
N       0.29116400     -0.87682100     -0.50039900
N      -2.01916300     -1.37200600     -0.23311200
O      -1.66473400      1.63022300     -0.40183700
H      -2.25863200      0.87013900     -0.47589900
H       0.02784900     -0.67831200     -1.45849500
H      -0.60975900     -1.36785600      1.34117800
H       0.68126400      0.91830000      0.28694100
H      -2.57326300     -2.05736100      0.25741200
H      -2.05242800     -1.50454200     -1.23382300
H      -0.36899300      2.35139200      0.99174700
H      -1.63960900      1.29421500      1.67465300
H       2.76444900      0.29146200      1.52284600
H       1.72511400      0.73069500     -2.03779400
H       2.43559300     -1.40609800      1.13421500
H       1.27300500     -0.44722400      2.08057900
H       2.85034500      1.29869700     -0.81432200
H       2.96000900     -0.36880600     -1.38891000

When to Use

  • Distorted starting geometries: The SD phase rapidly reduces large forces without requiring any history, then hands off to CG for efficient final convergence — combining the robustness of SD with the speed of CG.
  • When L-BFGS oscillates: If L-BFGS enters a cycle of large oscillating steps, SD/CG can provide a more stable convergence path.
  • General-purpose fallback: SD/CG is a reliable choice whenever L-BFGS or RFO fails to converge, without requiring manual tuning of which phase to use.

Convergence Behaviour

The figures below show a typical SD/CG run on a 22-atom organic molecule (UMA model). The vertical dotted line marks the automatic switch from the SD phase to the CG phase. Energy drops steeply during the SD phase, after which the CG phase refines the geometry to convergence.

Fig. 1: Energy convergence (SD/CG), energy vs. iteration
Fig. 2: Force convergence (SD/CG), force vs. iteration

Note

For production-quality geometry optimizations, L-BFGS (the default) is generally preferred due to its superior convergence near minima. SD/CG is most useful as a pre-optimizer for severely distorted structures or when L-BFGS has difficulty converging.