Skip to content

Using ATHENA

ATHENA can be run from the command-line by executing "athena.py" (or "python athena.py") and specifying the desired inputs, outputs and other optional settings. Options can be either passed on the command-line or can be included in a paramters file that can be passed to ATHENA on the command-line.

Every option (except for the parameter file option) is available on both the command-line and in the parameter file. The available options are the same no matter where they appear, but are formatted differently. Options on the command line are lower-case, start with two dashes and may contain single dashes to separate words (such as outcome-file), while in a configuration file the same option would be in upper-case, contain no dashes and instead use underscores to separate words (i.e. OUTCOME_FILE).

Parameters in the parameter file take precedence over any passed on the command-line. Many parameters have default values.

All options are listed here in both their command line and parameter file forms. If an option allows or requires any further arguments, they are also noted along with their default values, if any. Arguments which are required are enclosed in <angle brackets>, while arguments which are optional are enclosed in [square brackets].

File Options

Command-line Parameter file Arguments Information
--color-map-file COLOR_MAP_FILE [STR] Default: NONE. File for specifying colors in output plots
--contin-file CONTIN_FILE [STR] Default: NONE. File containing continuous input data
--geno-file GENO_FILE [STR] Default: NONE. File containing genotypic input data
--grammar-file GRAMMAR_FILE [STR] Default: NONE. File containing BNF grammar for GE
--outcome-file OUTCOME_FILE [STR] Default: NONE. File containing outcome (phenotype) variables
--param-file [STR] Default: NONE. File containing parameters
--test-contin-file TEST_CONTIN_FILE [STR] Default: NONE. File containing test continuous input data
--test-geno-file TEST_GENO_FILE [STR] Default: NONE. File containing test genotypic input data
--test-outcome-file TEST_OUTCOME_FILE [STR] Default: NONE. File containing test outcome (phenotype) variables

Data Options

Command-line Parameter file Arguments Information
--cv CV [INT] Default: 5. Number of cross-validations to split data
--geno-encode GENO_ENCODE [STR] Default: NONE. Sets genotype encoding. Must be either add_quad or additive
--includedvars INCLUDEDVARS [STR STR ...] Default: NONE. Variable names to be included in run (accepts multiple arguments)
--missing MISSING [STR] Default: NONE. Value denoting missing data in input files
--outcome OUTCOME [STR] Default: NONE. Column header designating which variable to use. Uses first when not designated.
--scale-contin SCALE_CONTIN Default: False. Scale continuous input variables from 0-1.0
--scale-outcome SCALE_OUTCOME Default: False. Scale outcome variables form 0-1.0

Reporting Options

Command-line Parameter file Arguments Information
--hof-size HOF_SIZE [INT] Default: 1. Sets number of the best networks to save for reporting. Must be >= to elite size
--out OUT [STR] Default: athena_results. Specifies name prefix for output file

Algorithm Options

Command-line Parameter file Arguments Information
--codon-consumption CODON_CONSUMPTION [STR] Default: eager. Whether grammar will consume codons when only one choice for a rule in the grammar. (eager, lazy)
--codon-size CODON_SIZE [INT] Default: NONE. Maximum value of a codon in the genome of an individual in the evolutionary population. At a minimum it should be>= the largest number of choices for a rule in the grammar
--fitness FITNESS [STR] Default: balanced_acc. Metric for fitness (balanced_acc, auc or r-squared)
--genome-type GENOME_TYPE [STR] Default: standard. GE genome type to use (standard, leap or mcge) generation
--gens GENS [INT] Default: 50. Number of generations in evolution
--init INIT [STR] Default: sensible. Initialization procedure (sensible or random)
--max-depth MAX_DEPTH [INT] Default: 100. Max depth for mapping with individuals exceeding this being invalid
--max-init-tree-depth MAX_INIT_TREE_DEPTH [INT] Default: 11 Maximum depth for tree created by sensible initialization
--max-init-genome-length MAX_INIT_GENOME_LENGTH [INT] Default: 250. Maximum genome length for random initializtion
--min-init-tree-depth MIN_INIT_TREE_DEPTH [INT] Default: 7. Minimum depth for tree created by sensible initialization
--min-init-genome-length MIN_INIT_GENOME_LENGTH [INT] Default: 50. Minimum genome length for random initializtion
--nelite NELITE [INT] Default: 1. number of best networks carried over to next generation
--p-crossover P_CROSSOVER [FLOAT] Default: 0.8. Probability of a crossover during selection
--crossover CROSSOVER [STR] Default: onepoint. Crossover operator type. (onepoint, match, block)
--crossover2 CROSSOVER2 [STR] Default: NONE. Crossover operator type for algorithm to switch to. (onepoint, match, block)
--gen-cross-switch GEN_CROSS_SWITCH [INT] Default: NONE. Generation at which to switch crossover operator
--p-mut P_MUT [FLOAT] Default: 0.01. Probability of a crossover during selection
--pop-size POP_SIZE [INT] Default: 250. Population size for GE algorithm
--random-seed RANDOM_SEED [INT] Default: 12345. Random seed
--selection SELECTION [STR] Default: tournament. Selection type (tournament, lexicase, epsilon_lexicase)

Parellization Options

Command-line Parameter file Arguments Information
--gens-migrate GENS_MIGRATE [INT] Default: 25. Generational interval for migrating best individuals for multi-process run