LENS

Special Topics: Building and Initializing Networks


The primary commands for building networks are addNet, addGroup, and connectGroups. addNet takes the name of the network and optionally the maximum number of time intervals that an example may run, the number of ticks per interval (equal to 1/dt), and flags specifying the network type. For standard feed-forward and simple recurrent networks, you only need to specify the name of the network and the number of time intervals. The special network types at the moment are SRBPTT, CONTINUOUS and BOLTZMANN. The ticks per interval should only be set to something other than 1 for continuous or Boltzmann networks.

addNet can also be used to create many feed-forward or simple-recurrent networks with a single command. Combined with a few additional connectGroups commands, it can create nearly any network with a minimum number of commands.

There is no way to save the structure of the network or generate any sort of project files in Lens. You must write scripts by hand if you would like to reuse an architecture. The only parts of a network that can be saved are the raw weights and the parameters.

Groups

When a group is constructed, you must specify its name, number of units and type. The number of units in a group is fixed when the group is created. There is no way to add or delete units short of destroying the group and building a new one. All of the units in a group have basically the same type (although it is possible to have FROZEN and non-FROZEN units and LESIONED and non-LESIONED units in the same group). Therefore most interesting types in the network are at the group level.

The group order is the order in which the groups appear in the Network's group array. This is the order in which groups will be traversed during forward passes. The default group order will be the order in which groups are created. However, it can be changed with the orderGroups command.

A "bias" group is automatically created with the network.

Connections

Projections are normally formed between groups using connectGroups, although finer control over connectivity can be obtained using connectGroupToUnit or connectUnits. connectGroups builds full connectivity by default, but it can create one-to-one connections or several patterns of randomized sparse connectivity.

In Lens, connections, or links, are owned by the receiving unit. Units only keep an array of incoming connections, so it is a bit difficult to locate the links coming out of a particular unit. Links are organized into blocks. A block is a set of links that come from consecutive units in the same sending group. When using full connectivity, a typical hidden unit might have two blocks: one containing the link from the bias unit and one containing the links from the input units. The links themselves are stored in a single array, the unit's incoming array, with no physical partitioning into blocks. The unit also has a block array containing the information for each block. This includes a number of parameters specific to the links in the block. It also includes the number of links in the block, which is used to determine which links belong to which blocks.
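The way a flat incoming array can be divided into blocks using only per-block link counts can be sketched as follows. This is a conceptual Python model, not Lens source code: the Block record, its field names, and the block_of_link helper are hypothetical and are used only to illustrate the lookup.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    # Hypothetical record; the field names are illustrative, not Lens's own.
    source: str      # sending group the links in this block come from
    num_links: int   # how many consecutive entries of `incoming` belong here


def block_of_link(blocks: List[Block], link_index: int) -> int:
    """Return the index of the block that owns incoming[link_index]."""
    if link_index < 0:
        raise IndexError("link index must be non-negative")
    start = 0
    for b, block in enumerate(blocks):
        if link_index < start + block.num_links:
            return b
        start += block.num_links
    raise IndexError("link index is past the end of the incoming array")


# A hidden unit with one bias link followed by four links from an input group.
blocks = [Block("bias", 1), Block("input", 4)]
incoming = [0.0, 0.1, -0.2, 0.3, 0.05]  # flat weight array, no partitioning
owners = [block_of_link(blocks, i) for i in range(len(incoming))]
```

Here owners assigns the first link to block 0 and the remaining four to block 1, which matches the bias-plus-input example in the text.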

Each block of links also has a link type associated with it. Link types are handles that allow operations to be performed on selected links. From the user's point of view, the link type is a string label. If no type is specified, the type of a new set of links will be the name of the sending group. For example:

    connectGroups input1 hidden
would create blocks of links with type "input1". On the other hand,
    connectGroups input1 hidden -t studly
would create links with type "studly".

A given pair of sending and receiving units may have more than one connection between them, but the connections must be of different types or they would be indistinguishable. Most users need not be concerned with link types as the automatically created types will usually suffice.

To connect a source group to an ELMAN context group, the elmanConnect command must be used. When a projection of type ONE_TO_ONE is created, the link weights are by default frozen at 1.0.

Link Weights

Each block of links has associated with it a randMean and randRange. These determine the mean and range of the link weights when the block is randomized. Weights will be selected uniformly over the range [randMean-randRange, randMean+randRange]. If either value is equal to NaN, which is the default, then the corresponding value in the group structure will be used. If that too is NaN, the network's value will be used. Most users will just want to randomize all weights with the same parameters and need only be concerned with the network-wide values.
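The fallback from block to group to network, and the uniform draw over the resulting range, can be sketched in Python as below. This is only a conceptual model: the function names and the numeric values are hypothetical and are not Lens defaults.

```python
import math
import random

NAN = float("nan")


def resolve(block_value, group_value, net_value):
    """Return the first value that is not NaN, checking block, group, network."""
    for value in (block_value, group_value, net_value):
        if not math.isnan(value):
            return value
    raise ValueError("no level supplies a value")


def random_weight(rng, rand_mean, rand_range):
    """Draw uniformly from [rand_mean - rand_range, rand_mean + rand_range]."""
    return rng.uniform(rand_mean - rand_range, rand_mean + rand_range)


# Illustrative values only: the block sets nothing, the group sets randRange,
# and the network supplies randMean.
rand_mean = resolve(NAN, NAN, 0.0)    # falls through to the network value
rand_range = resolve(NAN, 0.5, 1.0)   # the group value overrides the network

rng = random.Random(0)
weights = [random_weight(rng, rand_mean, rand_range) for _ in range(1000)]
```

With these values every sampled weight lies in [-0.5, 0.5].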

The block's min and max fields can be used to set bounds on the link weights. By default, these are NaN, which means there is no bound. If min is set, any link weight less than min is set to min following each weight update. Likewise, weights cannot be larger than max. If min or max is 0 and a link hits the bound, the weight will be set to zero and, in the absence of noise, will remain at zero because it will have no effect on the error. If you wish to avoid this, you might want to use a small positive or negative value rather than 0. Also be sure to set the randMean and randRange so that the initial weights will fall within the bounds.
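The bounding rule applied after each weight update can be sketched in Python as follows. The clamp_weight function is a hypothetical helper, not part of Lens. It treats NaN as "no bound", as described above.

```python
import math

NAN = float("nan")


def clamp_weight(w, w_min=NAN, w_max=NAN):
    """Clip w to [w_min, w_max]; a NaN bound means that side is unbounded."""
    if not math.isnan(w_min) and w < w_min:
        return w_min
    if not math.isnan(w_max) and w > w_max:
        return w_max
    return w


unbounded = clamp_weight(2.0)                  # no bounds set, unchanged
capped = clamp_weight(2.0, w_max=1.0)          # pulled down to max
floored = clamp_weight(-0.3, w_min=0.0)        # pinned at exactly zero
nudged = clamp_weight(-0.3, w_min=0.001)       # small positive floor instead
```

The last two calls show the choice discussed above: a floor of 0 pins the weight at zero, while a small positive floor keeps the link contributing.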

The setLinkValues command can be used to change the randMean and randRange or other block parameters for selected types of blocks in the entire network or part of the network. The randWeights command will randomize the weights of all or selected link types in the whole network or just part of it. Normally the user need only use the resetNet procedure, which randomizes all non-frozen weights in the network and performs other initializations in preparation for training.

Freezing

Often you want a set of links in the network to be fixed. For example, you might pre-train half the network and then fix that part while you train the rest. This is accomplished by freezing the links. When links are frozen, their weights will not be changed during weight updates. Frozen links will also not be randomized by the resetNet command, but they will be randomized by randWeights. Furthermore, frozen links will not contribute to the weightCost or gradientLinearity.
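The effect of freezing on weight updates and on the two kinds of randomization can be sketched in Python as below. This is a conceptual model under stated assumptions: the Block record and both functions are hypothetical, the [-1, 1] range is arbitrary, and randomize with include_frozen=False only stands in for the behavior described for resetNet, while include_frozen=True stands in for randWeights.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    # Hypothetical record; freezing applies to a whole block at once.
    link_type: str
    weights: List[float]
    frozen: bool = False


def update_weights(blocks, deltas):
    """Apply per-block weight changes, skipping frozen blocks."""
    for block, delta in zip(blocks, deltas):
        if block.frozen:
            continue
        block.weights = [w + d for w, d in zip(block.weights, delta)]


def randomize(blocks, rng, include_frozen):
    """Re-draw weights; frozen blocks are touched only if include_frozen."""
    for block in blocks:
        if block.frozen and not include_frozen:
            continue
        block.weights = [rng.uniform(-1.0, 1.0) for _ in block.weights]


pretrained = Block("input", [5.0, -5.0], frozen=True)
fresh = Block("hidden", [0.0, 0.0])
net = [pretrained, fresh]

update_weights(net, [[0.1, 0.1], [0.1, 0.1]])
after_update = (list(pretrained.weights), list(fresh.weights))

rng = random.Random(0)
randomize(net, rng, include_frozen=False)   # frozen block is left alone
after_reset = list(pretrained.weights)

randomize(net, rng, include_frozen=True)    # frozen block is re-drawn too
after_rand = list(pretrained.weights)
```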

Another important use of freezing is in assembling a network from one or more pre-trained parts. The loadWeights command can selectively load weights into frozen or thawed parts of a network. Therefore, temporary freezing can be used to load weights into selective parts of a network.

The smallest grain at which freezing can occur is at the block level. Individual links within a block cannot be selectively frozen. In order to freeze some links in a projection and not others, the links must be given different link types so that they will not become part of the same block.

The freezeWeights command is used to freeze links in all or part of the network. It accepts an optional link type if only certain blocks are to be frozen. If all the links into a unit or group are frozen, the entire unit or group will be skipped during weight updates, which is a bit faster. The effects of freezing are reversed with thawWeights.

It is possible to selectively save and load only certain links in the network by freezing the links you do not want saved or changed and calling saveWeights or loadWeights with the "-nofrozen" parameter. If part of your network has been pre-trained and is stored in a weight file, you may need to freeze the other links in the network, load the file, thaw the links and then freeze the ones you just loaded.

Lesioning

Lesioning is a form of instantaneous, relatively permanent damage to a set of links or a unit. Links are lesioned with the lesionLinks command. This will affect the incoming links of a particular type to a specified group. You may specify the percentage of links that will be affected. There are three forms of link lesions: the links could be permanently removed, they could have their weights set to a particular value (normally 0.0), or noise could be added to their weights. These changes are permanent in that the old link weights are not stored. It is advisable to save the link weights before lesioning the network. Deleted links cannot be restored short of creating them again.
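The three forms of link lesion can be sketched in Python as follows. This is not the lesionLinks implementation. The function name, its parameters, the independent per-link selection with probability `proportion`, and the Gaussian noise are all assumptions made for illustration.

```python
import random


def lesion_links(weights, proportion, mode, rng, value=0.0, noise_sd=0.1):
    """Return a lesioned copy of `weights`.

    mode is "remove" (drop the link), "set" (overwrite with `value`),
    or "noise" (add Gaussian noise with standard deviation `noise_sd`).
    Each link is selected independently with probability `proportion`.
    """
    if mode not in ("remove", "set", "noise"):
        raise ValueError("unknown lesion mode: " + mode)
    result = []
    for w in weights:
        if rng.random() >= proportion:
            result.append(w)            # link not selected, left intact
        elif mode == "set":
            result.append(value)
        elif mode == "noise":
            result.append(w + rng.gauss(0.0, noise_sd))
        # mode == "remove": the link is simply not copied over
    return result


rng = random.Random(0)
original = [0.4, -0.7, 1.2, 0.05]
saved = list(original)                  # keep a copy, as advised above

zeroed = lesion_links(original, 1.0, "set", rng)
removed = lesion_links(original, 1.0, "remove", rng)
untouched = lesion_links(original, 0.0, "noise", rng)
```

Because the function returns a copy, the original weights survive here. In Lens the old weights are not stored, which is why saving them first is advisable.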

The second major form of lesioning is at the unit level and relies on the lesionUnits command. Lesioning is a temporary state for a unit. A lesioned unit has 0.0 output and output derivative. It will not compute its input or output or their reverses during forward and backward passes. Lesioned output units will not contribute to the error. Units can be unlesioned with healUnits.
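The temporary nature of a unit lesion can be sketched in Python as below. The Unit record, the logistic activation, and the function names are assumptions for illustration, not Lens internals. The only behavior taken from the text is that a lesioned unit outputs 0.0, skips its computation, and can be healed.

```python
import math
from dataclasses import dataclass


@dataclass
class Unit:
    # Hypothetical unit record with a single lesion flag.
    output: float = 0.0
    lesioned: bool = False


def forward(unit, net_input):
    """Compute the unit's output; a lesioned unit skips the work and emits 0."""
    if unit.lesioned:
        unit.output = 0.0
        return unit.output
    unit.output = 1.0 / (1.0 + math.exp(-net_input))  # assumed logistic unit
    return unit.output


def lesion(unit):
    unit.lesioned = True


def heal(unit):
    unit.lesioned = False


u = Unit()
before = forward(u, 0.0)   # normal unit
lesion(u)
during = forward(u, 0.0)   # lesioned unit is silent
heal(u)
after = forward(u, 0.0)    # behavior is restored, nothing was lost
```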


Douglas Rohde
Last modified: Thu Feb 12 23:18:36 EST 1998