Although Lens is primarily object-oriented in its design, in that networks and example sets are natural represented as object hierarchies, the code does not conform religiously to object-oriented principles. Both for expediency and efficiency, the internal fields of an object are visible and modifiable by any module that includes the header file defining the object.
One of the overriding principles that has been found to be useful in designing Lens is that of layered structure. The default behavior should be that desired by most users, but users also ought to be able to modify that behavior at any of a number of levels.
One example of this is the way the mean and range of the distribution used for weight randomizations is specified. Fields for these parameters exist at the network, group, and block level. When a weight is randomized, the values at the block level are first checked. If those don't exist (have the value NaN), those at the group level are checked. If those don't exist, the network's values are used. Most users want the same values for the entire network, so initially all but the network's values are set to NaN and all of the weights can be affected by simply changing the network's values. More advanced users can specify the values at the level of a group or a link type (the block level) to get increasingly fine-grained control.
A similar example occurs in weight freezing. Weights can be frozen at the level of the network, a group, unit, or block. By freezing the network, all weights can be immediately frozen and need not even be traversed during weight updates. Otherwise, the user may freeze or thaw weights at any level down to the block.
The most important use of levels is in the design of the network updating and backpropagation commands. The network structure contains pointers to functions used for training the network at the level of an entire session, a batch, an example, or a tick. Each function tends to repeatedly use the function pointer for the function at the next level down the hierarchy. Therefore, a function can be replaced at any level to modify the call stack.
When altering the behavior of the network, the simplest change is to replace the functions that operate on each tick. The more complicated functions at higher levels can be retained intact. In general, the more complex a user's modification, the higher up the default call stack must be intercepted and the more code the user will need to copy and modify.
So the key in making modifications is to determine the level at which the modification will be the easiest and will fit into the existing code most seamlessly. The use of function pointers, rather than hard-coded functions, means that Lens can often be modified without making changes to any of the source code files other than extension.c.
The basic types most commonly used in Lens are the char
,
int
, real
, flag
, and
mask
. A "real" is a floating point number which is usually
a 4-byte float, but could be an 8-byte double if compiled with the
-DDOUBLE_REAL flag. To work properly in that case, real
should be used for all floating point numbers.
A flag
is a boolean value. It is actually equivalent to an
unsigned int. Chars are not used because of the need to link these
values to Tcl variables and Tcl doesn't recognize the char. A
flag
should be used rather than another type if the value
really is supposed to be either true (non-zero) or false (zero), or
TCL_OK (0) or TCL_ERROR (1). The macros TRUE and FALSE, rather than 1
and 0, should be used for boolean constants.
A mask
is an unsigned int
used to hold a bit
mask, where each bit is a distinct boolean value. Typically
mask
s are used for the type field of networks, groups, and
other structures. Most of the type bits used in the mask
s
are defined in type.h.
If you are comparing two real numbers and there is some chance that they both are NaN (and you would like them to be considered equal in that case), use the SAME(x,y) macro rather than (x == y).
All C functions that are to be registered as Tcl commands must return either TCL_OK, or TCL_ERROR. TCL_OK evaluates to false and TCL_ERROR to true. Accompanying the return value is a return message, which is stored in the Tcl interpreter. More about this is said in the section on creating a new shell command.
Any function that has some chance of incurring an error should return an
error code and have a return type of flag
. The first
function to detect the error should fill in the error message. Ideally,
the TCL_ERROR return value will be passed up the call stack until the
initial C function returns and the shell prints out the error message,
or whatever is appropriate at the time. Therefore, it is important for
user-defined functions to catch the error values generated by other
functions they call and to immediately return TCL_ERROR if an error was
detected. Otherwise, errors may occur silently, which will be very
frustrating for later users.
New error messages are typically generated with the warning() command, as in:
return warning("%s: index %d is out of range", argv[0], i);warning() will set the return message in the interpreter and return TCL_ERROR.
To exit cleanly with a message, result() should be used rather than warning(). append() will add more to the end of the message stored in the interpreter. error() will print a message to standard error (rather than store it in the interpreter) and return TCL_ERROR. This is normally used within a function that was not invoked by the interpreter, such as during initialization. fatalError() is similar to error() but it will kill the Lens process. Only use this when something really bad happens.
The Lens standard for function names is for the name to be in lower case except the first letter of each word following the first, as in "netForAllOtherGroups()". Macros (anything #defined) should be in all upper-case with underscores between words.
Each word of a global variable should start with an upper case letter, but the rest is lowercase, as in "LinkTypeName".
Most local variables follow the same style of naming as functions, although few will be more than one word. Additionally, single-character upper-case variables are commonly used for pointers to a particular type of structure. Here are some of the typical names:
Network N; Group G, H; Unit U, V; Block B; Link L; ExampleSet S; Example E; Event V, W;
Any function that is not used outside of the module in which it was
defined should be declared static
. This will prevent other
modules from using the function, which will make the code easier to read
and modify. It will also provide a hint to the compiler that could make
calls to that function faster. Any function that is not declared
static should be given an appropriate declaration in the corresponding
.h
file.
Some arrays used in the Lens structures, such as the group
array, are arrays of pointers to structures. Others, such as the
unit
array within a group or the link array within a unit
are arrays of structures arranged sequentially in memory. However,
because traversing sub-structures is such a big part of network code,
special macros have been written to concisely handle most sorts of
traversal. These are located in network.h. For example, instead of:
int g; Group G; for (g = 0; g < Net->numGroups; g++) { G = Net->group[g]; foo(G); }you can simply do:
FOR_EACH_GROUP(foo(G));or
FOR_EACH_GROUP({ foo(G); });
There is no need to declare the G and g variables as they are declared locally by the macro. To traverse all the non-lesioned units in the network, you'd do:
FOR_EACH_GROUP(FOR_EACH_UNIT(G, foo(U)));Using macros is better than writing traversal functions because it is faster and the procedure, foo() in this case, can be an arbitrary piece of code rather than a function of fixed type. Also, with macros the underlying data structures can be changed without changing most of the code traversing those structures. When using macros, you should be aware of what local variables are declared in the macro. You should not change the value of any of these variables because you could break the macro.
Typically, when you create a new type of something, be it a network, group, weight update algorithm, command, or whatever, you will write some functions associated with that type. Lens provides a number of commands that allow you to register the type. This will associate the name of the type with the functions that define it. Then you can use the type just like it is a built-in Lens type.
All that was pretty vague and probably didn't make much sense, but it will be clear when you read the sections on how to create something of each of these types in the "How to Create..." section of the manual.
Built-in group types are declared in type.h. Default types and type conflicts are resolved in type.c. The group types are registered in act.c using the registerGroupType() function. It's good to know this things if you are trying to understand the Lens code.
Each unit contains a history array for its inputs, outputs, targets, and outputDerivs. Depending on the unit's type, some of these arrays may not be allocated, in which case they will be set to NULL. Otherwise, the arrays will have length equal to the historyLength field in the network and in the unit. The length of the histories can be changed by the user with the setTime command.
When setting or retrieving a value in a history array, the array is
treated as a circular buffer. The index should always be taken modulo
the historyLength. This will be done automatically if you always
set values in these arrays with the SET_HISTORY()
macro and
retrieve values with the GET_HISTORY()
macro, both defined
in network.h. These macros will also check to be sure the array exists.
SET_HISTORY()
takes four arguments: the unit, the name of
the history array, the index, and the new value.
GET_HISTORY()
takes three arguments: the unit, the name of
the history array, and the index and will evaluate to the value stored
in the array at the given index (modulo historyLength) or NaN is
the array doesn't exist. Thus, instead of doing:
U->outputHistory[Net->currentTick] = U->output;which will not work, you should do:
SET_HISTORY(U, outputHistory, Net->currentTick, U->output);When you write your own code, you can use the extraneous history arrays, like targets for non-output units, to store whatever values you want. A number of macros are provided in network.h for transfering values from units to history arrays of the same group or a different group and back again.
Rather than calling
They take the same arguments as the standard functions with the
exception of an additional label for the variable being allocated. This
label will be used in any messages that are produced to aid in tracking
down the error. The standard label is "functionName:variableName", as
in "initNetworkExtension:N->ext".
Any code that relies on the existence of the lastValue field
within each link should be surrounded by the following statements:
As mentioned, you should try to contain your changes as much as possible
to the extension module. However, on occasion you may need to make
changes to other files, type.h in particular. In this case, you should
clearly mark any alterations to the code so that you can move the
alteration to the new version if the original file changes. The method
I use to mark the code is to surround any change with the following
comments:
The Tcl code is interpreted, meaning that it is not compiled in advance
but just used on the fly. This is frustrating since it means that most
errors, including something simple like a misspelled variable, will go
undetected until it is too late. The advantage to interpreted code is
that development is faster. The code can be modified and
re-sourced immediately without
restarting Lens.
In order to avoid conflicts in naming functions and variables, the
built-in Lens Tcl code will precede all function names and global
variables with a period and all global arrays with an underscore.
As long as the user does not define or change variables that begin with
a period or underscore, there should be no problem. Recall that global
Tcl variables cannot be accessed from within a Tcl procedure unless they
are declared global.
malloc()
, calloc()
, and
realloc()
directly, all memory allocation should be done
with safeMalloc(),
safeCalloc()
, and
safeRealloc()
, which are defined in util.c. These will
perform some error checking and, if a memory error occurs, shut down the
program with an error message. Ok, so they aren't very safe, but at
least they tell you where the error occurred.
Advanced Lens
#ifdef ADVANCED
...
#endif /* ADVANCED */
This will prevent them from being compiled into the standard version of
Lens. This generally affects only the definition and registration of
weight update algorithms.
Making Changes
/* DOUG... */
new code
/* ...DOUG */
Tcl Modules
Douglas Rohde
Last modified: Mon Nov 13 23:52:25 EST 2000