Giuseppe Cocomazzi | sbudella at gmail dot com
[hg bundle, all changesets]
The only dependency needed is the Chicken Scheme implementation, available in pre-compiled packages for most platforms. The library comes with a tiny Makefile that generates the shared libs and the import files, so running a simple make is enough to use the library interactively from within the Chicken interpreter. Front-end facilities are provided in the form of the following "driver" scripts:
draw.scm: expect from stdin a list of pairs
((s a m p l e) . 0|1) that
is, the first element is a proper list of symbols representing a string the
inferred automaton shall accept or reject according to whether the second
element of the pair is 1 or 0, respectively. Examples of input samples are
provided in the samples directory. The script writes to stdout a complete
graph representation of the automaton inferred from the list of samples,
described in the Graphviz DOT language. Furthermore, a dump of the automaton in
a form suitable for the Scheme read function is written on file descriptor 4,
so redirect this output to a normal file if you want to preserve the machine
graph.sh: a user-friendly wrapper for the driver script described above. If a Graphviz installation is found, a proper drawing of the graph of the automaton is also generated.
test.scm: revive a DFA from its machine dump and test it against a list of samples provided from stdin.
We want to infer a DFA starting from the collection of sample strings in
samples/even.sample. The given file attempts to provide enough information
to the algorithm to reconstruct the automaton able to recognize even numbers in
binary form. To generate the graph and the machine dump:
$ ./draw.scm <samples/even.sample 4>even.dump
or, using the graph.sh script:
$ ./graph.sh samples/even.sample Machine dump: even.dump Graph: even.gv Drawing: even.png
Let's try a more difficult challenge. We want to reconstruct the DFA from one
of the training sets available from the Stamina competition
page. Provided that the downloaded files are in
the samples/stamina_sets directory, let's first convert the data to the
S-expr format; a simple script is provided for the purpose:
$ ./lispify.sh <samples/stamina_sets/1_training.txt >samples/1.sample $ ./graph.sh samples/1.sample Machine dump: 1.dump Graph: 1.gv Drawing: 1.png
We now use the reconstructed finite state machine to test it against the
competition test case:
$ ./lispify.sh -t <samples/stamina_sets/1_test.txt | ./test.scm 1.dump (0 1 1 0 0 0 0 0 0 0 0 0 0 0 0) ---> #t (0 1 0 0 0 0 0 1 1 1 0 1) ---> #t (0 1 1 0 1 0 0 0 0 0 1 1 1 0 0 0) ---> #t ...
Copyright (c) 2016 Giuseppe Cocomazzi sbudella at gmail dot com
Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.