ga-nn-ag 
Copyright 2004-2005 Oswaldo Morizaki Hirakata

License notice:

    ga-nn-ag is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    ga-nn-ag is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with ga-nn-ag; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA



Things you need:
	Syslog daemon
	gcc
	Dependencies listed in "my_header.h"	


Some things you need to know:

Put all the files in the same directory: binaries, config files, and the input
and output files (you could put the input and output files somewhere else, but
then you should add the path to the input_preffix and output_preffix values).

Five syslog facilities are required for the log (using the syslog daemon): 

	local0 (for the genetic algorithm server), 
	local1 (for the genetic algorithm client), 
	local2 (for the neural network), 
	local3 (for additional neural network programs), and 
	local4 (for the remote server)
	
Three config files are required: ga_server_config, nn_train_config, 
nn_test_config.


Configuration files:
=================
ga_server_config:
=================

This file controls the parameters of the genetic algorithm. Most of the
parameters are mandatory. Every value needs to be on the line below its
parameter name, followed by a newline. Every probability (_prob) should be
between 0.0 and 1.0.
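
As a concrete sketch of that layout (the parameter names are real, the values
below are only illustrative):

```
generations
100
level
0.0001
```

Each parameter name sits on its own line, with its value on the line right
below it.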


agregation (unsigned int)
Note the misspelling. If zero, the system generates a set to search for a
solution. If nonzero, agregation is the number of outputs to be added to a
current solution. This requires the result* files of a previous search.
(Please back up your result files, since they will be changed if agregation is
greater than zero.)

base (unsigned float)
The ratio between the best and worst fitness (corresponding to the best and
worst network). This implies that the best network has base times more chances
to breed.

boost (unsigned float)
This is the relative influence of the size of the network. It affects the
fitness value in an inverse exponential way. For example, if boost is equal to
one and we have 2 networks with the same error, but one is the biggest and the
other is the smallest in the set, the big one has just 37% (1 / e) of the
fitness value of the small one. If you're not sure, keep it at zero.

generations (unsigned int)
The maximum number of generations the system runs.

hostX (ip address)
X should be a number (starting at zero and counting up). This is a list of IPs
of remote machines (one hostX declaration per IP), in dotted-decimal format,
running the ga_server_remote daemon to activate remote clients. If there's
just one host, this isn't necessary.

init_preffix (const char[256])
Name prefix for the init (blueprint) files of the neural networks:
init_preffix0 is the blueprint for neural network 0, init_preffix1 for neural
network 1, and so on.

ip (ip address)
Bind IP address, in dotted-decimal format (xxx.xxx.xxx.xxx). This is the IP
where remote clients connect.

level (unsigned float)
The error level. If the error of any neural network drops below this value,
the system exits. This parameter is a bit tricky, since the error is the sum
of all errors in the testing phase (that is, squared errors). Just put a low
value if you want to run all the generations.

local_server_port (port number)
The port for incoming remote client connections. Actually, this is the first
port of a range, since every client requires a different port (for
simultaneous persistent connections), so ports up to "local_server_port" +
"poblation" are required.
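
For example (illustrative values only):

```
local_server_port
5000
poblation
10
```

With these values, ports 5000 through 5010 should be free for client
connections.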

master (boolean)
In case of more than one niche, a master niche (which starts everything) must
exist, so in a multi-niche model the master niche should be set to 1, and the
others must be set to zero.

max_local (unsigned int)
Maximum number of local clients running. If poblation is over this number, the system
tries to connect to a remote host to run the remaining clients (up to max_local clients
per host).

max_neuron_mut (unsigned int)
The maximum number of neurons that could be added or deleted per network per
generation in the event of a mutation. (Note that I said "could".)

max_num_layer (unsigned int)
Maximum number of layers for the starting poblation (including the input and
output layers).

max_num_neuron (unsigned int)
Maximum number of neurons per layer for the starting poblation (not including
the input and output layers).

mut_net_prob (unsigned float)
The probability of mutation of a whole network. Mutation would add or delete up 
to max_neuron_mut neurons.

mut_neuron_prob (unsigned float)
Probability of mutation of a single neuron. This mutation could affect one of 
the parameters of the neuron.

mut_new_prob (unsigned float)
The probability of getting a whole new offspring (a newly generated random 
network) instead of a child.

nicheX (ip address)
X should be a number (starting at zero and counting up). This is a list of IPs
of remote machines (one nicheX declaration per IP), in dotted-decimal format,
running the ga_server_remote daemon to run remote niches. If there's just one
niche, this parameter isn't necessary.

poblation (unsigned int)
This is the number of neural networks tested per generation. Please keep in
mind that every network requires two process slots and 4-5 open file
descriptors to run, so the upper limit will be different for every
distribution (I tested with up to 180 and it works fine).

prob_alpha
prob_conv_rate
prob_bias_corr
prob_delta_type
prob_momentum
prob_num_con
prob_bias 
In the event of a neuron mutation, one parameter of the neuron gets changed.
These values are the relative probabilities of each parameter being selected
for the change. If they are equal, every parameter has an equal probability of
being selected.

remote_server_port (port number)
Listening port for the ga_server_remote daemon, which spawns the remote clients.

result_preffix (const char[256])
Similar to init_preffix, but this is for the resulting (trained) network.
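
Putting some of the parameters above together, a ga_server_config fragment
could look like this (every value is only illustrative; consult the example
config files shipped with ga-nn-ag for realistic ones):

```
agregation
0
base
10.0
boost
0.0
generations
50
init_preffix
init
ip
127.0.0.1
level
0.0001
local_server_port
5000
master
1
max_local
20
poblation
20
result_preffix
result
```

The remaining parameters (the mutation settings and the prob_* weights) follow
the same name-then-value layout.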

==== END OF SECTION ga_server_config ====

================
nn_train_config:
================
This file sets the rules for the neural network under training. The format is
the same as in any other configuration file: values go below the parameter
name.

input_preffix (const char[256])
Input file names are formed by a prefix and a number. For example, if
"num_pat" is 30 and input_preffix is "input", then the input file names would
be: input0, input1, input2 ... input28, input29. This parameter can include
path values like "val/input" (directory "val" relative to the working
directory, with prefix "input").

The input file content format is simple: just put the values in a stack, that
is, values separated by the newline character ('\n'). Don't add comments or
any text, just the input values. For example, if the input is formed by the
values "0, 0.5, 3, 2.5", the input file content should look like:

0
0.5
3
2.5

Note that there are no spaces at the beginning of the file.

mode (unsigned int)
Operation mode flag. Zero means production slave (test mode for ga_client),
one means training, and 2 means production. Because this file is for the
training phase, this value needs to be one.

num_input (unsigned int)
This is the number of elements in the input layer.

num_output (unsigned int)
This is the number of elements in the output layer. If agregation is over zero, 
then this number should be the new number of outputs.

num_pat (unsigned int)
This is the number of training patterns. 

output_preffix (const char[256])
Similar to input_preffix. The content is also similar to the input files, with
one limitation: input values can be any floating point value (in the "C float"
range), but output values are limited to between zero and one, because of the
threshold function.
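
A minimal nn_train_config following the parameters above could look like this
(illustrative values only):

```
mode
1
num_input
4
num_output
1
num_pat
30
input_preffix
input
output_preffix
output
```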

==== END OF SECTION nn_train_config ====

==============
nn_test_config
==============
It's very similar to the nn_train_config file; indeed, it could be a carbon
copy if you want (in that case, you will be testing the network with the
training inputs and outputs), except for the mode value, which needs to be
zero in this case.

Please consult the example config files for further information. Or you could 
email me: cvn62@yahoo.com

