Skip to main content

Automatic differentiation

Automatic differentiation (AD) is a core functionality of gradgen. Both forward-mode and reverse-mode are implemented. Gradgen also supports all common scalar functions such as exp\exp, sin\sin, cos\cos, etc, and lots of special functions such as erf\mathrm{erf}, erfc\mathrm{erfc}. For vector variables, several functions are supported, such as x1\|x\|_1, x2\|x\|_2, x\|x\|_\infty, xp\|x\|_p, and more.

Gradgen supports the following AD operations:

  • derivatives of scalar functions f:RRf:\mathbb{R}\to \mathbb{R} by forward AD
  • Jacobian-vector products, JF(x)vJF(x)v, for functions F:RnRmF:\mathbb{R}^n\to \mathbb{R}^m by forward AD
  • gradients of functions f:RnRf:\mathbb{R}^n\to \mathbb{R} by reverse AD
  • vector-Jacobian products, uJF(x)uJF(x), by reverse AD
  • Jacobian matrices for functions F:RnRmF:\mathbb{R}^n\to \mathbb{R}^m
  • Hessians of functions f:RnRf:\mathbb{R}^n\to \mathbb{R}

Both symbolic expressions and functions can be differentiated.

Forward mode

Scalar derivative

We can use derivative for the derivative of scalar functions, f:RRf:\mathbb{R}\to \mathbb{R}; this implements a forward-mode AD.

As an example, for the function f(x)=sin(x)+x2f(x) = \sin(x) + x^2 we can compute the derivative as follows

from gradgen import Function, SX, derivative

x = SX.sym("x")
expr = x.sin() + x * x
dexpr = derivative(expr, x)

df = Function("df", [x], [dexpr])
print(df(2.0))

Note that the generated expression, dexpr, can be simplified. This is done by default when you generate Rust code.

Vector directional derivative

from gradgen import Function, SXVector, jvp

x = SXVector.sym("x", 2)
expr = x.dot(x)
directional = jvp(expr, x, [1.0, 0.0])

df = Function("df", [x], [directional])
print(df([3.0, 4.0])) # 6.0

Differentiating a Function

You can also build a new function representing a forward-mode directional derivative of an existing Function:

from gradgen import Function, SX

x = SX.sym("x")
y = SX.sym("y")
f = Function("f", [x, y], [x * y + x.sin()])

df_dx = f.jvp(1.0, 0.0)
df_dy = f.jvp(0.0, 1.0)

The returned object is another Function with the same primal inputs and differentiated outputs.

Reverse mode

Reverse mode is especially useful for scalar-output expressions with many inputs.

Scalar gradient

from gradgen import Function, SX, gradient

x = SX.sym("x")
y = SX.sym("y")
expr = x * y + x.sin()

grad_x = gradient(expr, x)
grad_y = gradient(expr, y)

g = Function("g", [x, y], [grad_x, grad_y])
print(g(2.0, 5.0))

Vector-Jacobian product

from gradgen import Function, SX, SXVector, vjp

x = SX.sym("x")
outputs = SXVector((x.sin(), x * x))
sensitivity = vjp(outputs, x, [2.0, 3.0])

g = Function("g", [x], [sensitivity])
print(g(2.0))

Reverse-mode differentiation of a Function

You can also build a reverse-mode differentiated function from cotangent seeds on the outputs:

from gradgen import Function, SX

x = SX.sym("x")
y = SX.sym("y")
f = Function("f", [x, y], [x * y + x.sin()])

reverse = f.vjp(1.0)
print(reverse(2.0, 5.0))

Function.vjp(...) returns a new Function with:

  • the same primal inputs as the original function
  • outputs ordered like the original inputs
  • values equal to the vector-Jacobian product for the supplied cotangent direction

If you want a runtime-seeded reverse-mode function instead, provide wrt_index:

from gradgen import Function, SXVector

x = SXVector.sym("x", 2)
G = Function(
"G",
[x],
[SXVector((x[0] + x[1], x[0] * x[1], x[1].sin()))],
input_names=["x"],
output_names=["y"],
)

reverse_x = G.vjp(wrt_index=0)
print(reverse_x([3.0, 4.0], [2.0, -1.0, 5.0]))

This returns a Function with:

  • the original primal inputs
  • one appended cotangent input per declared output
  • a single output block equal to JG(x)vJ_G(x)^\top v for the selected input block

For scalar-output functions, there is also a high-level Function.gradient(...) helper:

grad_f = some_scalar_function.gradient(0)

This returns a new Function whose outputs represent the gradient with respect to the selected input block.

Jacobians

The library also supports symbolic Jacobian construction.

from gradgen import Function, SXVector, jacobian

x = SXVector.sym("x", 2)
expr = x.dot(x)

jac = jacobian(expr, x)
f = Function("jac", [x], [jac])
print(f([3.0, 4.0])) # (6.0, 8.0)

For vector-output by vector-input cases, Jacobians are currently represented as flat row-major vectors because full matrix types are not implemented yet.

For example:

from gradgen import Function, SXVector

x = SXVector.sym("x", 2)
G = Function(
"G",
[x],
[SXVector((x[0] + x[1], x[0] * x[1], x[1].sin()))],
input_names=["x"],
output_names=["y"],
)

JG = G.jacobian(0)
print(JG([3.0, 4.0])) # (1.0, 1.0, 4.0, 3.0, 0.0, cos(4.0))

This corresponds to the 3×23 \times 2 matrix

[11430cos(4)]\begin{bmatrix} 1 & 1 \\ 4 & 3 \\ 0 & \cos(4) \end{bmatrix}

stored row by row.

You can also derive a Jacobian block from a function:

jac_f = some_function.jacobian(0)

If you want multiple input blocks at once, use:

blocks = some_function.jacobian_blocks()

For Rust code generation workflows that want one kernel to compute several artifacts together, you can also build a combined symbolic function:

joint = some_function.joint(("f", "jf"), 0, simplify_joint="high")

This low-level Function.joint(...) API operates on one selected wrt block at a time. Supported component names are:

  • "f" for the primal outputs
  • "grad" for the gradient block
  • "jf" for the Jacobian block
  • "hessian" for the Hessian block
  • "hvp" for the Hessian-vector product

The output order follows the requested component tuple. If "hvp" is included, the returned function appends a tangent input block.

Hessians

Hessians are supported for scalar-output expressions.

from gradgen import Function, SXVector, hessian

x = SXVector.sym("x", 2)
expr = x[0] * x[0] + x[0] * x[1] + x[1] * x[1]

hes = hessian(expr, x)
f = Function("hes", [x], list(hes))
print(f([3.0, 4.0])) # ((2.0, 1.0), (1.0, 2.0))

At the function level:

hes_f = some_scalar_function.hessian(0)

For multiple input blocks:

blocks = some_scalar_function.hessian_blocks()

Hessian-vector products

For scalar-output functions, you can also build Hessian-vector product kernels. The resulting function keeps the original inputs and appends one extra tangent input block:

hvp_f = some_scalar_function.hvp(0)

To build several blocks at once:

blocks = some_scalar_function.hvp_blocks()

List of supported functions

The following scalar functions, f:RRf:\mathbb{R}\to \mathbb{R} are supported. Note that all these functions act on vectors elementwise, e.g., for xRnx\in \mathbb{R}^n, sin(x)=(sin(x1),,sin(xn))\sin(x) = (\sin(x_1), \ldots, \sin(x_n)).

  • Binary operators: +, -, *, /
  • Common: x**a (xax^a), sqrt (x\sqrt{x})
  • Trigonometric: sin, cos, tan
  • Inverse trigonometric: asin, acos, atan, atan2
  • Hyperbolic: sinh, cosh, tanh
  • Inverse hyperbolic: asinh, acosh, atanh
  • Logarithmic: log (natural log), log10, logp1
  • Special functions: erf, erfc,
  • Other: abs, expm1 (ex1e^x - 1), log1p (log(1+x)\log(1+x)), cbrt (cube root), hypot (hypot(a,b)=a2+b2\mathrm{hypot}(a, b) = \sqrt{a^2 + b^2}), maximum (max{a,b}\max\{a, b\}), minimum (min{a,b}\min\{a, b\}),

The following functions are available for functions f:RnRf:\mathbb{R}^n\to \mathbb{R}

  • Quadratic form: quadform(x, P)
  • Bilinear form: bilinear_form(x, P, y)
  • Norms: norm1
  • Other: sum, mean, dot

Simplification

TODO

Move this section to Symbolic Framework

The library includes a bounded symbolic simplifier for expressions and functions.

from gradgen import SX, derivative, simplify

x = SX.sym("x")
expr = derivative(x * x, x)

print(expr) # unsimplified symbolic derivative
print(simplify(expr, "medium")) # cleaner symbolic expression

Supported simplification controls:

  • integer pass counts like max_effort=5
  • named effort presets:
    • "none"
    • "basic"
    • "medium"
    • "high"
    • "max"

Functions can also be simplified directly:

f_simplified = f.simplify(max_effort="high")

This works for ordinary functions and derived ones such as Jacobians and Hessians.