Evaluation & Derivatives
Evaluation
Given an expression tree specified with a Node
type, you may evaluate the expression over an array of data with the following command:
DynamicExpressions.EvaluateModule.eval_tree_array
— Methodeval_tree_array(
tree::AbstractExpressionNode{T},
cX::AbstractMatrix{T},
operators::OperatorEnum;
eval_options::Union{EvalOptions,Nothing}=nothing,
) where {T}
Evaluate a binary tree (equation) over a given input data matrix. The operators contain all of the operators used. This function fuses doublets and triplets of operations for lower memory usage.
Arguments
tree::AbstractExpressionNode
: The root node of the tree to evaluate.cX::AbstractMatrix{T}
: The input data to evaluate the tree on, with shape[num_features, num_rows]
.operators::OperatorEnum
: The operators used in the tree.eval_options::Union{EvalOptions,Nothing}
: SeeEvalOptions
for documentation on the different evaluation modes.
Returns
(output, complete)::Tuple{AbstractVector{T}, Bool}
: the result, which is a 1D array, as well as if the evaluation completed successfully (true/false). Afalse
complete means an infinity or nan was encountered, and a large loss should be assigned to the equation.
Notes
This function can be represented by the following pseudocode:
def eval(current_node)
if current_node is leaf
return current_node.value
elif current_node is degree 1
return current_node.operator(eval(current_node.left_child))
else
return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))
The bulk of the code is for optimizations and pre-emptive NaN/Inf checks, which speed up evaluation significantly.
You can also use the following shorthand by using the expression as a function:
(tree::AbstractExpressionNode)(X, operators::OperatorEnum; kws...)
Evaluate a binary tree (equation) over a given input data matrix. The
operators contain all of the operators used. This function fuses doublets
and triplets of operations for lower memory usage.
# Arguments
- `tree::AbstractExpressionNode`: The root node of the tree to evaluate.
- `cX::AbstractMatrix{T}`: The input data to evaluate the tree on.
- `operators::OperatorEnum`: The operators used in the tree.
- `kws...`: Passed to [`eval_tree_array`](@ref).
# Returns
- `output::AbstractVector{T}`: the result, which is a 1D array.
Any NaN, Inf, or other failure during the evaluation will result in the entire
output array being set to NaN.
For example,
using DynamicExpressions
operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos])
tree = Node(; feature=1) * cos(Node(; feature=2) - 3.2)
tree([1 2 3; 4 5 6.], operators)
3-element Vector{Float64}:
0.6967067435533686
-0.4544040965128262
-2.8266669740855317
This is possible because when you call OperatorEnum
, it automatically re-defines (::Node)(X)
to call the evaluation operation with the given operators loaded. It also re-defines
print,
show, and the various operators, to work with the
Node` type.
The Node
type does not know about which OperatorEnum
you used to create it. Thus, if you define an expression with one OperatorEnum
, and then try to evaluate it or print it with a different OperatorEnum
, you will get undefined behavior!
For safer behavior, you should use Expression
objects.
Evaluation options are specified using EvalOptions
:
DynamicExpressions.EvaluateModule.EvalOptions
— TypeEvalOptions
This holds options for expression evaluation, such as evaluation backend.
Fields
turbo::Val{T}=Val(false)
: IfVal{true}
, use LoopVectorization.jl for faster evaluation.bumper::Val{B}=Val(false)
: IfVal{true}
, use Bumper.jl for faster evaluation.early_exit::Val{E}=Val(true)
: IfVal{true}
, any element of any step becomingNaN
orInf
will terminate the computation. Foreval_tree_array
, this will result in the second return value, the completion flag, beingfalse
. For calling an expression usingtree(X)
, this will result inNaN
s filling the entire buffer. This early exit is performed to avoid wasting compute cycles. SettingVal{false}
will continue the computation as usual and thus result inNaN
s only in the elements that actually haveNaN
s.buffer::Union{ArrayBuffer,Nothing}
: If notnothing
, use this buffer for evaluation. This should be an instance ofArrayBuffer
which has anarray
field and anindex
field used to iterate which buffer slot to use.
You can also work with arbitrary types, by defining a GenericOperatorEnum
instead. The notation is the same for eval_tree_array
, though it will return nothing
when it can't find a method, and not do any NaN checks:
DynamicExpressions.EvaluateModule.eval_tree_array
— Methodeval_tree_array(tree::AbstractExpressionNode, cX::AbstractMatrix, operators::GenericOperatorEnum; throw_errors::Bool=true)
Evaluate a generic binary tree (equation) over a given input data, whatever that input data may be. The operators
enum contains all of the operators used. Unlike eval_tree_array
with the normal OperatorEnum
, the array cX
is sliced only along the first dimension. i.e., if cX
is a vector, then the output of a feature node will be a scalar. If cX
is a 3D tensor, then the output of a feature node will be a 2D tensor. Note also that tree.feature
will index along the first axis of cX
.
However, there is no requirement about input and output types in general. You may set up your tree such that some operator nodes work on tensors, while other operator nodes work on scalars. eval_tree_array
will simply return nothing
if a given operator is not defined for the given input type.
This function can be represented by the following pseudocode:
function eval(current_node)
if current_node is leaf
return current_node.value
elif current_node is degree 1
return current_node.operator(eval(current_node.left_child))
else
return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))
Arguments
tree::AbstractExpressionNode
: The root node of the tree to evaluate.cX::AbstractArray
: The input data to evaluate the tree on.operators::GenericOperatorEnum
: The operators used in the tree.throw_errors::Bool=true
: Whether to throw errors if they occur during evaluation. Otherwise, MethodErrors will be caught before they happen and evaluation will returnnothing
, rather than throwing an error. This is useful in cases where you are unsure if a particular tree is valid or not, and would prefer to work withnothing
as an output.
Returns
(output, complete)::Tuple{Any, Bool}
: the result, as well as if the evaluation completed successfully (true/false). If evaluation failed,nothing
will be returned for the first argument. Afalse
complete means an operator was called on input types that it was not defined for.
Likewise for the shorthand notation:
(tree::Node)(X::AbstractMatrix, operators::GenericOperatorEnum; throw_errors::Bool=true)
# Arguments
- `X::AbstractArray`: The input data to evaluate the tree on.
- `operators::GenericOperatorEnum`: The operators used in the tree.
- `throw_errors::Bool=true`: Whether to throw errors
if they occur during evaluation. Otherwise,
MethodErrors will be caught before they happen and
evaluation will return `nothing`,
rather than throwing an error. This is useful in cases
where you are unsure if a particular tree is valid or not,
and would prefer to work with `nothing` as an output.
# Returns
- `output`: the result of the evaluation.
If evaluation failed, `nothing` will be returned for the first argument.
A `false` complete means an operator was called on input types
that it was not defined for. You can change this behavior by
setting `throw_errors=false`.
Derivatives
DynamicExpressions.jl
can efficiently compute first-order derivatives of expressions with respect to variables or constants. This is done using either eval_diff_tree_array
, to compute derivative with respect to a single variable, or with eval_grad_tree_array
, to compute the gradient with respect all variables (or, all constants). Both use forward-mode automatic, but use Zygote.jl
to compute derivatives of each operator, so this is very efficient.
DynamicExpressions.EvaluateDerivativeModule.eval_diff_tree_array
— Methodeval_diff_tree_array(
tree::AbstractExpressionNode{T},
cX::AbstractMatrix{T},
operators::OperatorEnum,
direction::Integer;
turbo::Union{Bool,Val}=Val(false)
) where {T<:Number}
Compute the forward derivative of an expression, using a similar structure and optimization to evaltreearray. direction
is the index of a particular variable in the expression. e.g., direction=1
would indicate derivative with respect to x1
.
Arguments
tree::AbstractExpressionNode
: The expression tree to evaluate.cX::AbstractMatrix{T}
: The data matrix, with shape[num_features, num_rows]
.operators::OperatorEnum
: The operators used to create thetree
.direction::Integer
: The index of the variable to take the derivative with respect to.turbo::Union{Bool,Val}
: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.
Returns
(evaluation, derivative, complete)::Tuple{AbstractVector{T}, AbstractVector{T}, Bool}
: the normal evaluation, the derivative, and whether the evaluation completed as normal (or encountered a nan or inf).
DynamicExpressions.EvaluateDerivativeModule.eval_grad_tree_array
— Methodeval_grad_tree_array(tree::AbstractExpressionNode{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; variable::Union{Bool,Val}=Val(false), turbo::Union{Bool,Val}=Val(false))
Compute the forward-mode derivative of an expression, using a similar structure and optimization to evaltreearray. variable
specifies whether we should take derivatives with respect to features (i.e., cX), or with respect to every constant in the expression.
Arguments
tree::AbstractExpressionNode{T}
: The expression tree to evaluate.cX::AbstractMatrix{T}
: The data matrix, with each column being a data point.operators::OperatorEnum
: The operators used to create thetree
.variable::Union{Bool,Val}
: Whether to take derivatives with respect to features (i.e.,cX
- withvariable=true
), or with respect to every constant in the expression (variable=false
).turbo::Union{Bool,Val}
: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.
Returns
(evaluation, gradient, complete)::Tuple{AbstractVector{T}, AbstractMatrix{T}, Bool}
: the normal evaluation, the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).
You can compute gradients this with shorthand notation as well (which by default computes gradients with respect to input matrix, rather than constants).
(tree::Node{T})'(X::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false, variable::Bool=true)
Compute the forward-mode derivative of an expression, using a similar
structure and optimization to eval_tree_array. `variable` specifies whether
we should take derivatives with respect to features (i.e., X), or with respect
to every constant in the expression.
# Arguments
- `X::AbstractMatrix{T}`: The data matrix, with each column being a data point.
- `operators::OperatorEnum`: The operators used to create the `tree`.
- `variable::Bool`: Whether to take derivatives with respect to features (i.e., `X` - with `variable=true`),
or with respect to every constant in the expression (`variable=false`).
- `turbo::Bool`: Use `LoopVectorization.@turbo` for faster evaluation.
# Returns
- `(evaluation, gradient, complete)::Tuple{AbstractVector{T}, AbstractMatrix{T}, Bool}`: the normal evaluation,
the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).
Alternatively, you can compute higher-order derivatives by using ForwardDiff
on the function differentiable_eval_tree_array
, although this will be slower.
DynamicExpressions.EvaluateModule.differentiable_eval_tree_array
— Methoddifferentiable_eval_tree_array(tree::AbstractExpressionNode, cX::AbstractMatrix, operators::OperatorEnum)
Evaluate an expression tree in a way that can be auto-differentiated.
Enzyme
DynamicExpressions.jl
also supports automatic differentiation with Enzyme.jl
. Note that this is extremely experimental. You should expect to see occasional incorrect gradients. Be sure to explicitly verify gradients are correct for a particular space of operators (e.g., with finite differences).
Let's look at an example. First, let's create a tree:
using DynamicExpressions
operators = OperatorEnum(binary_operators=(+, -, *, /), unary_operators=(cos, sin))
x1 = Node{Float64}(feature=1)
x2 = Node{Float64}(feature=2)
tree = 0.5 * x1 + cos(x2 - 0.2)
Now, say we want to take the derivative of this expression with respect to x1 and x2. First, let's evaluate it normally:
X = [1.0 2.0 3.0; 4.0 5.0 6.0] # 2x3 matrix (2 features, 3 rows)
tree(X, operators)
Now, let's use Enzyme.jl
to compute the derivative of the outputs with respect to x1 and x2, using reverse-mode autodiff:
using Enzyme
function my_loss_function(tree, X, operators)
# Get the outputs
y = tree(X, operators)
# Sum them (so we can take a gradient, rather than a jacobian)
return sum(y)
end
dX = begin
storage=zero(X)
autodiff(
Reverse,
my_loss_function,
Active,
## Actual arguments to function:
Const(tree),
Duplicated(X, storage),
Const(operators),
)
storage
end
This will get returned as
2×3 Matrix{Float64}:
0.5 0.5 0.5
0.611858 0.996165 0.464602
which one can confirm is the correct gradient!
This will take a while the first time you run it, as Enzyme needs to take the gradients of the actual LLVM IR code. Subsequent runs won't spend any time compiling and be much faster.
Some general notes about this:
- We want to take a reverse-mode gradient, so we pass
Reverse
toautodiff
. - Since we want to take the gradient of the output of
my_loss_function
, we declareActive
as the third argument. - Following this, we pass our actual arguments to the function.
- Objects which we don't want to take gradients with respect to, and also don't temporarily store any data during the computation (such as
tree
andoperators
here) should be wrapped withConst
. - Objects which we wish to take derivatives with respect to, we need to use
Duplicated
, and explicitly create a copy of it, with all numerical values set to zero. Enzyme will then store the derivatives in this object.
- Objects which we don't want to take gradients with respect to, and also don't temporarily store any data during the computation (such as
Note that you should never use anything other than turbo=Val(false)
with Enzyme, as Enzyme and LoopVectorization are not compatible, and will cause a segfault. Even using turbo=false
will not work, because it would cause Enzyme to trace the (unused) LoopVectorization code!