SD/CG Optimization

Overview

The SD/CG optimizer is a hybrid steepest-descent / conjugate-gradient method with optional DIIS acceleration. It begins with a steepest-descent (SD) phase to rapidly reduce large forces, then automatically switches to conjugate gradient (CG) for more efficient convergence near a minimum. This is the recommended choice when a starting geometry may be significantly distorted. It is invoked with method=sdcg (or simply with the keyword sdcg, which defaults to this mode).

The switch from SD to CG is controlled by sd_max_iter and cg_switch_fmax: the optimizer moves to CG once either the SD iteration limit is reached or the maximum force drops below the specified threshold. With the default cg_switch_fmax=0.0, MAPLE sets an automatic threshold equal to 0.5 × the initial maximum force. The parameters sd_enabled and cg_enabled are derived internally from the method keyword and are not intended to be set directly.
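The switching rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MAPLE's actual code: the function names `resolve_switch_threshold` and `should_switch_to_cg` are hypothetical, and only the parameter names `sd_max_iter` and `cg_switch_fmax` come from the documentation.

```python
def resolve_switch_threshold(cg_switch_fmax, initial_fmax):
    """Return the force threshold that ends the SD phase.

    Assumption: any positive cg_switch_fmax is used as given, while the
    default 0.0 selects the auto-threshold of 0.5 x the initial maximum force.
    """
    if cg_switch_fmax > 0.0:
        return cg_switch_fmax
    return 0.5 * initial_fmax


def should_switch_to_cg(sd_iter, fmax, threshold, sd_max_iter=50):
    """Switch once either the SD iteration limit is reached or the
    maximum force has dropped below the threshold."""
    return sd_iter >= sd_max_iter or fmax < threshold


# Hypothetical usage: initial maximum force of 0.8, default cg_switch_fmax.
threshold = resolve_switch_threshold(0.0, 0.8)
print(threshold)
print(should_switch_to_cg(sd_iter=10, fmax=0.3, threshold=threshold))
```

Under these assumptions, a run that starts with a maximum force of 0.8 leaves the SD phase as soon as the maximum force falls below 0.4, or after 50 SD iterations, whichever comes first.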

Parameters

Parameter	Type	Default	Description
`max_step`	float	`0.2`	Maximum step length in Angstrom.
`max_iter`	int	`256`	Maximum number of optimization cycles.
`sd_max_iter`	int	`50`	Maximum iterations in the initial SD phase before switching to CG.
`cg_switch_fmax`	float	`0.0`	Switch from SD to CG when the maximum force drops below this value (Hartree/Angstrom). The default `0.0` enables the runtime auto-threshold: 0.5 × the initial maximum force.
`cg_restart_threshold`	float	`0.2`	Restart CG directions when the gradient change ratio exceeds this value.
`cg_beta_method`	str	`"prp+"`	Conjugate gradient beta formula.
`diis_enabled`	bool	`True`	Enable DIIS extrapolation acceleration.
`diis_store_every`	int	`5`	Store a DIIS snapshot every N steps.
`diis_min_snapshots`	int	`3`	Minimum snapshots required before DIIS extrapolation is attempted.
`diis_memory`	int	`6`	Maximum number of DIIS snapshots stored.
`verbose`	int	`1`	Output verbosity level.
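To make the CG and DIIS parameters more concrete, the sketch below shows the textbook PRP+ formula (Polak-Ribiere-Polyak, clipped at zero) and one plausible way the three DIIS bookkeeping parameters could interact. It is an illustration only: `prp_plus_beta` and `SnapshotHistory` are hypothetical names, the assumption that `"prp+"` refers to the standard clipped formula is not confirmed by the table, and the convention that a snapshot is stored whenever the step number is a multiple of `diis_store_every` is likewise assumed. The DIIS extrapolation itself is not shown.

```python
from collections import deque


def prp_plus_beta(g_new, g_old):
    """Textbook PRP+ coefficient: max(0, g_new.(g_new - g_old) / g_old.g_old).

    Gradients are flat sequences of floats. Clipping at zero makes the
    search direction fall back to steepest descent when beta would be negative.
    """
    denom = sum(b * b for b in g_old)
    if denom == 0.0:
        return 0.0  # previous gradient vanished; no conjugate correction
    numer = sum(a * (a - b) for a, b in zip(g_new, g_old))
    return max(0.0, numer / denom)


class SnapshotHistory:
    """Hypothetical bookkeeping for DIIS snapshots (not MAPLE's API)."""

    def __init__(self, store_every=5, min_snapshots=3, memory=6):
        self.store_every = store_every
        self.min_snapshots = min_snapshots
        # deque(maxlen=...) silently drops the oldest snapshot when full
        self.snapshots = deque(maxlen=memory)

    def maybe_store(self, step, coords, grad):
        # Assumed convention: store when step is a multiple of store_every
        if step % self.store_every == 0:
            self.snapshots.append((list(coords), list(grad)))

    def ready(self):
        # Extrapolation is attempted only with enough stored snapshots
        return len(self.snapshots) >= self.min_snapshots


history = SnapshotHistory()
for step in range(1, 16):
    history.maybe_store(step, [0.0], [0.0])
print(len(history.snapshots), history.ready())
```

With the default values and this counting convention, the first extrapolation could be attempted no earlier than step 15, and at most the six most recent snapshots are kept.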

Input Example

#model=uma
#opt(method=sdcg)
#device=gpu0

C      -0.77812600     -1.06756100      0.32105900
C       1.30255300      0.05212000     -0.02829900
C      -0.97199300      1.45624900      0.82365700
C       1.98122900     -0.41843300      1.25017500
C       2.26403300      0.43516800     -1.13310700
N       0.29116400     -0.87682100     -0.50039900
N      -2.01916300     -1.37200600     -0.23311200
O      -1.66473400      1.63022300     -0.40183700
H      -2.25863200      0.87013900     -0.47589900
H       0.02784900     -0.67831200     -1.45849500
H      -0.60975900     -1.36785600      1.34117800
H       0.68126400      0.91830000      0.28694100
H      -2.57326300     -2.05736100      0.25741200
H      -2.05242800     -1.50454200     -1.23382300
H      -0.36899300      2.35139200      0.99174700
H      -1.63960900      1.29421500      1.67465300
H       2.76444900      0.29146200      1.52284600
H       1.72511400      0.73069500     -2.03779400
H       2.43559300     -1.40609800      1.13421500
H       1.27300500     -0.44722400      2.08057900
H       2.85034500      1.29869700     -0.81432200
H       2.96000900     -0.36880600     -1.38891000

When to Use

Distorted starting geometries: The SD phase rapidly reduces large forces without requiring any history, then hands off to CG for efficient final convergence, combining the robustness of SD with the speed of CG.
When L-BFGS oscillates: If L-BFGS enters a cycle of large oscillating steps, SD/CG can provide a more stable convergence path.
General-purpose fallback: SD/CG is a reliable choice whenever L-BFGS or RFO fails to converge, without requiring manual tuning of which phase to use.

Convergence Behaviour

The figures below show a typical SD/CG run on a 22-atom organic molecule (UMA model). The vertical dotted line marks the automatic switch from the SD phase to the CG phase. Energy drops steeply during SD, then the CG phase refines to convergence.

Fig. 1. Energy convergence (SD/CG): energy vs. iteration.

Fig. 2. Force convergence (SD/CG): force vs. iteration.

Note

For production-quality geometry optimizations, L-BFGS (the default) is generally preferred due to its superior convergence near minima. SD/CG is most useful as a pre-optimizer for severely distorted structures or when L-BFGS has difficulty converging.