q
LENS Manual: Group Output Types
Special Topics: Group Output Types
The group output types form a pipeline of functions which compute the
units' outputs in the forward direction and backpropagate the
outputDerivs in the backward direction. The basic types
determine the output as a function of the input. The clamping types set
or alter the output based on the externalInput. The other types
modify an already-computed output value. There shouldn't be more than
one basic type. There may be no basic type if there is a clamping type.
Basic Output Types
- LINEAR
- This simply copies the input to the output.
- LOGISTIC
- Computes the traditional sigmoid function:
O = 1 / (1 + exp(-i * gain))
Gain is the inverse of the temperature. It is used to avoid division.
Ordinarily the gain is taken from the network's gain field or
the group's gain field if that is set. If ADAPTIVE_GAIN is
used, each unit will have its own trainable gain.
- TERNARY
- This is essentially a normal sigmoid shifted to the right added to a
negated sigmoid of -i shifted to the left. Alternately, you can think
of it as a [-1,1] sigmoid that has a flat place at 0. It is designed
to give the unit stable outputs at -1, 1, and 0. You could think of
such units as coding whether a feature is present, absent, or
unknown. The gain affects the slope of each of the two
sigmoids. The ternaryShift sets the distance between their
centers. Increasing the ternaryShift will make the central
plateau wider. Increasing the gain will make the transitions between
plateaus sharper.
- TANH
- This is equivalent to 1 - 2S(2 i), where S is the ordinary
sigmoid function and I is the input. Note that its slope is actually
twice what the slope would be if you just stretched a sigmoid to the
range [-1,1]. So you may want to use half the normal gain to
compensate. If ADAPTIVE_GAIN is used, each unit will have its own
trainable gain.
- EXPONENTIAL
- This is just exp(i). There is a big potential for overflow
with this, so you may want to be careful how you use it.
- GAUSSIAN
- This computes a gaussian radial basis function: exp(-i^2 *
gain^2). This is often as effective as LOGISTIC, although it
can become a bit unstable at the end of training. It can also be used
in conjunction with ADAPTIVE_GAIN for individual, trainable gains for
each unit.
- SOFT_MAX
- This is equivalent to an exponential followed by a normalization.
However, SOFT_MAX scales the values before
computing the exponential. This doesn't affect the end result but it
avoids overflow. A SOFT_MAX OUTPUT group will get DIVERGENCE error by
default.
- KOHONEN
- This is used for the map layer in a KOHONEN network. It should be
combined with a DISTANCE input function. It finds the unit whose
weight vector is most similar to the input vector. Any unit in the
map whose squared Euclidean distance from the best unit is greater
than neighborhood will be silent. Groups in the neighborhood
will have output equal to 1.0 minus the ratio between the unit's input
and the largest input of any unit. The output will therefore fall in
the range [0.0, 1.0].
In the backward pass, the inputDeriv of units in the neighborhood
will be set to 1.0 and that of the others to 0.0. Only units in the
neighborhood will be able to alter their incoming weights. The
DISTANCE procedure, in the backward pass, will cause the incoming
weights to drift towards the input vector.
- OUT_BOLTZ
- This is used for groups in a Boltzmann network. If the unit has an
externalInput, the output will be clamped to that value.
Otherwise, if it has a target and the network is in the gracePeriod
(the positive phase of the Boltzmann algorithm), the output will be
clamped to the target. Otherwise, it is computed as a time-averaged
sigmoid of the input.
- OUT_COPY
- The units in a group with an OUT_COPY output function simply copy
their outputs from some field in the corresponding units of another
group. The copyConnect
command must be used to specify which group and which field will be
the source of the copying.
- INTERACT_INTEGR
- This implements the interactive-activation output rule. For a
traditional interactive-activation model, it should be used with
DOT_PRODUCT inputs, and an INCR_CLAMP input function. It contains a
decay term and time-averages the activations. The decay is fixed at
1.0. This version crops the unit outputs to the range [0, maxOutput]
because negative outputs are not normally used in an IA model.
Clamping Output Types
- HARD_CLAMP
- If the externalInput is a real number, this sets the output
to the externalInput. Otherwise it does nothing.
- BIAS_CLAMP
- This sets the output to the initOutput (defaults to 1.0 for BIAS groups).
- ELMAN_CLAMP
- In order for an ELMAN_CLAMP function to work, you must first use elmanConnect to associate a
source group with the context group. This simply copies the
(cached) output from each source unit and adds it to the output of the
corresponding context unit.
It is possible to have more that one ELMAN_CLAMP function. In this
case, the output will simply sum the outputs from each of the source
units. If a group has multiple ELMAN_CLAMP functions, each call to
elmanConnect will define the source group for the first function that
has not yet been assigned a group.
- WEAK_CLAMP
- This shifts the output a certain fraction of the way towards the
externalInput. The fraction is determined by the
clampStrength. Specifically, the function is:
o = o + clampStrength * (externalInput - o)
Output Modifying Types
- OUT_INTEGR
- This is just like IN_INTEGR but it integrates the output rather than
the input. This is put on by default in a CONTINUOUS network unless
IN_INTEGR is specified.
- OUT_NORM
- This normalizes the outputs of the units in the group to sum to
1.0. It probably should not be used unless the un-normalized values
are constrained to be positive. The SOFT_MAX function should be used
rather than an EXPONENTIAL followed by OUT_NORM because SOFT_MAX will
avoid numerical overflow.
- OUT_NOISE
- This makes the output noisy. The type of noise is determined by
the group's noiseProc and noiseRange parameters.
- OUT_DERIV_NOISE
- This injects noise into the outputDerivs on the backward
pass. The type of noise is determined by
the group's noiseProc and noiseRange parameters.
- OUT_CROPPED
- This crops the output to within the range [minOutput,
maxOutput]. You may want to use this after OUT_NOISE to
prevent outputs outside of this range.
- OUT_WINNER
- This is a winner-take-all filter. The most active unit retains its
activation and the other units are set to the minimum output value
for the group. In the backward phase, the original outputs are
restored to enable error to be backpropagated across the transfer
function.
Douglas Rohde
Last modified: Tue Nov 21 03:03:49 EST 2000