| Parameter | Default value | Description |
|---|---|---|
| FEAT | 200 | The number of feature words. |
| TARG | 10000 | The number of target words. |
| COS_THRESH | 0.66 | The cosine threshold. |
| CWTARG | 5000 | The number of nodes in the cosine graph for partitioning 1. |
| TOPADD | 9500 | The number of nodes in the cosine graph when building the lexicon. |
| NB_THRESH | 4 | Threshold for number of cooccurrences. |
| NB_MAX | 100 | Maximum number of neighbours that are considered when computing similarity. |
| CONF_OVERLAP | 2 | Minimal overlap of clusters from partitioning 1 and 2. |
| SING_ADD | 200 | The number of words to be added as single clusters. |
| BEHEAD | 2000 | The number of words to be skipped for partitioning 1. |
| TOKENIZER | medusa | Specifies the tokenizing procedure (medusa, normal). |
| PREPROC | true | Perform preprocessing.* |
| PART1 | true | Perform partitioning 1.* |
| PART2 | true | Perform partitioning 2.* |
| JOINPARTS | true | Perform joining of partitionings.* |
| BUILDLEX | force | Perform building of the lexicon.* |
| TAGGING | false | Perform tagging.* |
| DELTEMPFILES (optional) | true | Controls whether temporary files are deleted or not (possible values are 'true' and 'false'). |

Table 1: Most parameters are specified in the unsupos/config/unsupos.conf file. Note that the parameter S_THRESH (significance threshold for nb-cooccurrences) is set in the Medusa configuration file.

* Possible values are true (create files if they do not exist), false (do not create the corresponding files, i.e. skip this step), force (create or overwrite files), and break (stop the UnsuPosTagger).
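
The four values allowed for the step parameters marked with * can be read as a small decision rule. The following Python sketch illustrates that rule only; it is not taken from the UnsuPosTagger source. The names `StepMode` and `decide` are hypothetical, and the sketch assumes that `true` leaves a step's existing output files untouched.

```python
from enum import Enum


class StepMode(Enum):
    """Hypothetical enum for the values of PREPROC, PART1, PART2, etc."""
    TRUE = "true"    # create files if they do not exist
    FALSE = "false"  # do not create files, i.e. skip this step
    FORCE = "force"  # create/overwrite files
    BREAK = "break"  # stop the tagger


def decide(mode: str, outputs_exist: bool) -> str:
    """Return 'run', 'skip', or 'stop' for one pipeline step.

    Raises ValueError if `mode` is not one of the four documented values.
    """
    parsed = StepMode(mode.strip().lower())
    if parsed is StepMode.BREAK:
        return "stop"
    if parsed is StepMode.FALSE:
        return "skip"
    if parsed is StepMode.FORCE:
        return "run"
    # StepMode.TRUE: assumed to run only when the output files are missing.
    return "skip" if outputs_exist else "run"
```

Under these assumptions, `decide("force", True)` returns `"run"`, whereas `decide("true", True)` returns `"skip"` because the files are already there.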