
Verification of Neural Networks


A problem in two parts

Netron, an ONNX visualizer

; test_small.vnnlib

(declare-const X_0 Real)
(declare-const Y_0 Real)
(assert (or
    (and
        (>= X_0 -1)
        (<= X_0 1)
        (>= Y_0 100)
    )
))

vnnlib file format, an SMT-LIB2 dialect

Taken from the VNNCOMP 2021 benchmarks
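
Written as a formula, the property asks whether some input in the given range can drive the output to at least 100:

-1 \le X_0 \le 1 \;\land\; Y_0 \ge 100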

The verification


The goal is to linearize the output layer of the neural network and verify the constraints

Common activation functions


R(x) = \begin{cases} 0 & x < 0 \newline x & x \ge 0 \end{cases}


LR(x) = \begin{cases} ax & x < 0 \newline x & x \ge 0 \end{cases}

Other notable activation functions


S(x) = \frac{1}{1 + e^{-x}}


T(x) = \frac{e^{2x} - 1}{e^{2x} + 1}

Linearize the activation functions

The approach used by \alpha\text{-}\beta\text{-crown}, winner of VNNCOMP 2021, 2022 and 2023, is to linearize the activation functions, reducing the error introduced by the approximation with tunable parameters.

ReLU linearization

Paper

Convex relaxation of Neural Networks

To deal with intractability, convex relaxations can be used. For instance, Planet relaxation substitutes the ReLU function y = \max(0, x) with

y \ge 0 \newline y \ge x \newline y \le \frac{u}{u - l}x - \frac{ul}{u - l}

where l and u are the lower and upper bounds of the input x. They can be obtained directly or estimated.
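
As a concrete illustration (the bounds are picked here purely as an example), for l = -2 and u = 3 the relaxation becomes

y \ge 0 \newline y \ge x \newline y \le \frac{3}{5}x + \frac{6}{5}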

Propagate bounds

Bound propagation is a key step in guiding the search for a solution. State-of-the-art tools include LiRPA and PRIMA.

dlinear's current naive implementation only propagates explicit bounds and equality constraints.

These functionalities can be extended by considering the interval bounds of the input variables and propagating them through the graph of constraints.
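
As a sketch of the rule behind such a propagation, for a constraint y = \sum_i a_i x_i + c with each x_i \in [l_i, u_i], interval arithmetic gives

c + \sum_{a_i > 0} a_i l_i + \sum_{a_i < 0} a_i u_i \;\le\; y \;\le\; c + \sum_{a_i > 0} a_i u_i + \sum_{a_i < 0} a_i l_i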

Sum of infeasibilities

Comparison

| Tool | Strict constraints | Output eq y | Correlated inputs | Unbounded x | Or | Precision | Efficient |
| --- | --- | --- | --- | --- | --- | --- | --- |
| dlinear | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | × |
| alpha-beta-crown | ✓* | × | × | × | × | 10^{-10} | ✓ |
| neurosat | × | ✓ | × | × | ✓ | ✓ | ✓ |
| nnenum | × | ✓ | × | × | × | 10^{-10} | ✓ |
| Marabou | × | ✓ | ✓ | × | ✓ | ✓ | ✓ |

SMT components: Terms and formulae

  • f \in V_B: propositional variables (Formula)
  • x \in V_R: real-valued variables (Term)
  • a, c \in \mathbb{R}: constants (Term)
  • \sim \in \\{=, \neq, <, \leq, >, \geq\\}: comparison operators
\underbrace{\underbrace{a_{11} x_1 + \dots + a_{1n} x_n + c_1}_{\text{Term}} \sim \underbrace{a_{21} x_{n + 1} + \dots + a_{2m} x_m + c_2}_{\text{Term}}}_{\text{Formula}}
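
For instance, such a formula could be written in SMT-LIB as follows (an illustrative snippet; the variables and coefficients are invented here):

(declare-const x1 Real)
(declare-const x2 Real)
; Term ~ Term, with ~ being <=
(assert (<= (- (+ (* 2 x1) (* 3 x2)) 1) (+ (* 4 x2) 2)))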

Conjunctive normal form

Most SMT solvers expect the input in CNF, where the l_{ij} are literals

(l_{00} \lor \dots \lor l_{0m_0}) \land (l_{10} \lor \dots \lor l_{1m_1}) \land \dots \land (l_{n0} \lor \dots \lor l_{nm_n})
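
As a small SMT-LIB illustration (the literals are invented here), each assert below is a clause, i.e. a disjunction of linear literals:

(declare-const x Real)
(declare-const y Real)
; (x <= 0 or y >= 1) and (x > 0 or y = x)
(assert (or (<= x 0) (>= y 1)))
(assert (or (> x 0) (= y x)))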

If-then-else terms

If f, f_1, f_2 are formulas, \text{ite}(f, f_1, f_2) is a formula equivalent to (f \land f_1) \lor (\neg f \land f_2)

If t_1, t_2 are terms and f is a formula, \text{term-ite}(f, t_1, t_2) is a term

Piecewise linear functions to ITE

Piecewise linear functions can be represented using if-then-else terms

; ReLU
(declare-const x Real)
(declare-const y Real)

(assert (= y (ite (<= x 0) 0 x)))
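
The same idea extends to the other piecewise linear functions above; for example, a Leaky ReLU could be encoded as follows (the slope 0.01 is just an illustrative value):

; Leaky ReLU
(declare-const x Real)
(declare-const y Real)

(assert (= y (ite (< x 0) (* 0.01 x) x)))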

ITE to CNF

The if-then-else term can be converted to CNF by introducing a fresh variable c with the following equisatisfiability relation

f(\text{term-ite}(g, t_1, t_2)) \equiv f(c) \land \text{ite}(g, t_1 = c, t_2 = c)

e.g.

\text{term-ite}(g, 1, 2) = \text{term-ite}(h, 3, 4) \newline \text{becomes} \newline \text{ite}(g, c = 1, c = 2) \land \text{ite}(h, c = 3, c = 4)
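
A sketch of the same rewriting in SMT-LIB (g, h and the fresh variable c are illustrative names):

(declare-const g Bool)
(declare-const h Bool)
(declare-const c Real)

; original: (assert (= (ite g 1 2) (ite h 3 4)))
; after introducing the fresh variable c:
(assert (ite g (= c 1) (= c 2)))
(assert (ite h (= c 3) (= c 4)))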

Max encoding

The max function can be seen as a special case of an ITE term. Exploiting its characteristics, introducing two fresh variables a_1, a_2 makes it possible to encode it directly in CNF:

\begin{array}{lcr} y = \max(x_1, x_2) & \implies & (y - x_1 = a_1) \land (a_1 \ge 0) \land \newline & & (y - x_2 = a_2) \land (a_2 \ge 0) \land \newline & & (a_1 \le 0 \lor a_2 \le 0) \end{array}
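
A direct SMT-LIB transcription of this encoding could look as follows (a sketch; the variable names are illustrative):

; y = max(x1, x2) via two fresh slack variables a1, a2
(declare-const x1 Real)
(declare-const x2 Real)
(declare-const y Real)
(declare-const a1 Real)
(declare-const a2 Real)

(assert (and (= (- y x1) a1) (>= a1 0)))
(assert (and (= (- y x2) a2) (>= a2 0)))
(assert (or (<= a1 0) (<= a2 0)))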

Linear layers, non-linear activation layers

Given a neural network with L layers, we can divide them into two categories (their composition is written out below):

  • Linear layers: f_i(x) = W_i x + b_i
    • Input: x \in \R^m, weights: W_i \in \R^{n \times m}, bias: b_i \in \R^n
  • Activation layers: non-linear f
    • Piece-wise linear functions
    • General non-linear functions
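
Their composition is the function computed by the whole network (written here just to fix notation, with each f_i either a linear layer or an activation layer):

N(x) = (f_L \circ f_{L-1} \circ \dots \circ f_1)(x)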

Neural Network

Tightening the bounds

Converting all the layers of a neural network up to an activation layer into linear constraints makes it possible to compute the bounds of the output of the activation layer, as long as the input is bounded.

\begin{array}{lcr} -1 \le x_1 \le 1 \newline -4 \le x_2 \le 7 \newline r_1 = 2x_1 + 3x_2 - 1 & \implies & -15 \le r_1 \le 22 \newline r_2 = 4x_1 - 2x_2 + 3 & \implies & -15 \le r_2 \le 15 \end{array}

Fixing the piecewise linear functions

If the bounds on the output of the activation layer are tight enough, it may be possible to fix the piecewise linear term.

\begin{array}{lc} 0 \le x_1 \le 1 \newline 4 \le x_2 \le 7 \newline r_1 = & \begin{cases} 2x_1 + 3x_2 - 1 & \text{if } 2x_1 + 3x_2 - 1 > 0 \newline 0 & \text{otherwise} \end{cases} \newline & \implies r_1 = 2x_1 + 3x_2 - 1 \end{array}

Sum of Infeasibilities

Instead of adding the non-fixed activation layers to the constraints of the LP problem, their violations can be collected into an objective, the sum of infeasibilities, to be minimized.

Sorting by the violation they introduce gives us a way to prioritize the search for the solution.

\begin{array}{lcr} \min & r_1 - (2x_1 + 3x_2 - 1) + r_2 \newline \text{s.t.} & -1 \le x_1 \le 1 \newline & -4 \le x_2 \le 7 \newline & r_1, r_2 \ge 0 \newline & r_1 \ge 2x_1 + 3x_2 - 1 \newline & r_2 \ge 4x_1 - 2x_2 + 3 \end{array}
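
Purely as an illustration, the same LP can be written using the (non-standard) optimization extension supported by solvers such as Z3 and OptiMathSAT; dlinear's own interface may differ:

(declare-const x1 Real)
(declare-const x2 Real)
(declare-const r1 Real)
(declare-const r2 Real)

(assert (and (>= x1 -1) (<= x1 1) (>= x2 -4) (<= x2 7)))
(assert (and (>= r1 0) (>= r2 0)))
(assert (>= r1 (- (+ (* 2 x1) (* 3 x2)) 1)))
(assert (>= r2 (+ (- (* 4 x1) (* 2 x2)) 3)))

; objective: r1 - (2*x1 + 3*x2 - 1) + r2
(minimize (+ (- r1 (- (+ (* 2 x1) (* 3 x2)) 1)) r2))
(check-sat)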

Completeness vs Real world

SMT solvers aim for a complete approach, a mathematical solution of the problem, employing a symbolic representation of the inputs and exact arithmetic (when possible).
In the real world, however, computation speed is usually the main concern, hence floating-point arithmetic is almost always used.

As a result, the solution found by the SMT solver may differ from the one computed by the neural network at inference time (e.g. with ONNX Runtime).

Future work

  • Benchmarks
  • Other heuristics to optimize the search for the solution
  • Use overapproximation of bounds to reduce the search space
    • How much completeness are we sacrificing?

References