Files: General File Usage

When reading a file with loadWeights or loadExamples, or writing a file with saveWeights, saveExamples, or openNetOutputFile, files whose names end in .gz, .bz, .bz2, or .Z are automatically decompressed or compressed as appropriate. When loading, if no file with the exact name exists but a file with the same name plus .gz, .bz, .bz2, or .Z does, the compressed file is decompressed and read instead. The user therefore need not know in advance whether the file is compressed. If a file name does not begin with "-" or "|", it is treated as an actual file rather than as a pipeline or Tcl channel.
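The lookup rule for loading can be sketched in plain Tcl. This is only an illustration of the behavior described above, not the Lens implementation: the procedure name resolveCompressedName is hypothetical, and the order in which the suffixes are tried is an assumption.

```tcl
# Hypothetical helper, not part of Lens: mimics the lookup described above.
proc resolveCompressedName {name} {
    # A file with the exact name always wins.
    if {[file exists $name]} {
        return $name
    }
    # Otherwise try each compressed suffix; the order here is an assumption.
    foreach ext {.gz .bz .bz2 .Z} {
        if {[file exists $name$ext]} {
            return $name$ext
        }
    }
    error "no file named \"$name\", compressed or otherwise"
}
```

With only digits.wt.gz on disk, resolveCompressedName digits.wt would return digits.wt.gz, which is why a loading command can be given the uncompressed name.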

If the file name is simply a dash ("-"), standard input or standard output will be used, as appropriate. If the name begins with a dash followed by other characters, the remainder of the name is treated as the name of an existing Tcl channel. A Tcl channel is created with the open command and closed with close. The channel can be a file or a command pipeline and may be opened for reading, writing, or both. The channel will still be open when the Lens command using it completes. The following example shows how to use a filter, chooseExample, that chooses the next training example given the network's output on the previous example:

    lens> set channel [open |chooseExample RDWR]
    lens> openNetOutputFile -$channel
    lens> loadExamples -$channel -s theSet -m PIPE
    lens> useTrainingSet theSet
    lens> train
    lens> deleteExampleSet theSet
    lens> close $channel
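The channel mechanics behind this example can be tried in plain Tcl, outside Lens. The sketch below assumes a Unix system and uses cat as a stand-in for a real filter such as chooseExample. The r+ access mode opens the pipeline for both reading and writing, as the RDWR flag does.

```tcl
# Open a two-way pipeline to cat, a stand-in for a real filter.
set channel [open "|cat" r+]
# Use line buffering so each line written reaches the filter immediately.
fconfigure $channel -buffering line
# Write one line to the filter and read back its reply.
puts $channel "hello"
set reply [gets $channel]
# The channel stays open until it is closed explicitly.
close $channel
```

Without line buffering or an explicit flush, the written line could sit in Tcl's output buffer while gets waits for a reply that never arrives.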

If the file name begins with a pipe ("|"), a process will be started and the data read from or written to the pipeline. This is an alternative way to interact with a pipeline that avoids having to open and close a Tcl channel. It differs in that the pipeline is closed when the command completes (except with the openNetOutputFile command). In addition, a two-way pipe like the one in the previous example cannot be created this way; it must be opened as a channel.
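The naming rules described so far can be summarized as a small Tcl procedure. This is a sketch for illustration only: classifyFileName and the labels it returns are hypothetical and are not Lens commands.

```tcl
# Hypothetical helper, not part of Lens: classifies a file-name argument
# according to the rules described in the text.
proc classifyFileName {name} {
    # A lone dash means standard input or standard output.
    if {$name eq "-"} {
        return stdio
    }
    # Otherwise the first character decides how the name is treated.
    switch -exact -- [string index $name 0] {
        "-"     { return channel }
        "|"     { return pipeline }
        default { return file }
    }
}
```

Under these rules, classifyFileName -file5 yields channel, classifyFileName "| genex 1000" yields pipeline, and classifyFileName mySet.ex.gz yields file.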

The description of the xor.in network gives an illustration of how an example-generating program can be used in this way.

The following will let you preview the output of saveExamples by piping it through more:

    lens> saveExamples "my set" |more

To write to a pipeline whose description consists of more than one word, you will need to enclose the command and the initial | in quotes:

    lens> saveExamples "my set" "| grep name | sort -r > names"

You can also read from a pipeline, which is convenient when generating example sets on the fly. Note that the usage of the pipe symbol is a bit unconventional here, since we are reading from the pipeline rather than writing to it. The first example below reads from a file, discarding the example tags. The second reads from a process that is presumably generating a finite number of examples. If you have trouble reading from pipes, read this note.

    lens> loadExamples "| grep -v tag mySet.ex" -s "my set"
    lens> loadExamples "| genex 1000" -s "my set"

Finally, you can specify example sets by hand by doing this:

    lens> loadExamples "|cat" -s toySet

...and then typing in your example set and pressing Ctrl-d when you are done. This is useful if you are just experimenting and want a very small example file, or if you would rather copy and paste a small example file with the mouse than transfer it by FTP.

Automatic decompression, reading from a pipe, and reading from stdin will not work for script files run with the source command or for PostScript files generated when a display is printed.

Douglas Rohde

Last modified: Mon Nov 20 18:38.30 EST 2000