Option                                  Description
------                                  -----------
-?, --help                              show help
--algorithm <MachineLearningAlgorithm>  machine learning algorithm: [MaxEnt, LinearSVM,
                                          Perceptron, LinearSVMOneVsRest]
--analyse                               analyse text
--beamWidth <Integer>                   beam width in pos-tagger and parser beam search
--blockSize <Integer>                   The block size to use when applying filters - if a
                                          text filter regex goes beyond the blocksize,
                                          Talismane will fail.
--builtInTemplate                       pre-defined output template:
  <Talismane$BuiltInTemplate>             [standard, with_location, with_prob,
                                          with_comments, original]
--compare                               compare two annotated corpora
--crossValidationSize <Integer>         number of cross-validation folds
--csvEncoding <String>                  CSV file encoding in output
--csvLocale <String>                    CSV file locale in output
--csvSeparator <String>                 CSV file separator in output
--cutoff <Integer>                      in how many distinct events should a feature appear
                                          in order to get included in the model?
--earlyStop <Boolean>                   stop as soon as the beam contains n terminal
                                          configurations
--encoding <String>                     encoding for input and output
--endModule <Talismane$Module>          where to end analysis: [languageDetector,
                                          sentenceDetector, tokeniser, posTagger, parser]
--evalFile <File>                       evaluation corpus file
--evalPattern <String>                  input pattern for evaluation
--evalPatternFile <File>                input pattern file for evaluation
--evaluate                              evaluate annotated corpus
--excludeIndex <Integer>                cross-validation index to exclude for training
--features <File>                       a file containing the training feature descriptors
--inFile <File>                         input file or directory
--includeIndex <Integer>                cross-validation index to include for evaluation
--includeUnknownWordResults <Boolean>   if true, will add a file ending with
                                          ".lexiconCoverage.csv" giving lexicon word
                                          coverage
--inputEncoding <String>                encoding for input
--inputPattern <String>                 input pattern
--inputPatternFile <File>               input pattern file
--iterations <Integer>                  the number of training iterations (MaxEnt,
                                          Perceptron)
--keepDirStructure <Boolean>            for analyse and process: if true, and inFile is a
                                          directory, outFile will be generated as a
                                          directory and the inFile directory structure
                                          will be maintained
--labeledEvaluation <Boolean>           if true, takes both governor and dependency label
                                          into account when determining errors
--languageCorpusMap <File>              a file giving a mapping of languages to corpora
                                          for language-detection training
--languageModel <File>                  statistical model for language recognition
--lexicalEntryRegex <File>              file describing regex for reading lexical entries
                                          in the corpus
--lexicon <File>                        semi-colon delimited list of pre-compiled lexicon
                                          files
--linearSVMCost <Double>                parameter C, typical values are powers of 2, from
                                          2^-5 to 2^5
--linearSVMEpsilon <Double>             parameter epsilon, typical values are 0.01, 0.05,
                                          0.1, 0.5
--locale <String>                       locale
--logConfigFile <File>                  logback configuration file
--maxParseAnalysisTime <Integer>        how long we will attempt to parse a sentence
                                          before leaving the parse as is, in seconds
--minFreeMemory <Integer>               minimum amount of remaining free memory to
                                          continue a parse, in kilobytes
--mode <Talismane$Mode>                 execution mode: [normal, server]
--module <Talismane$Module>             training / evaluation / processing module:
                                          [languageDetector, sentenceDetector, tokeniser,
                                          posTagger, parser]
--newline <String>                      how to handle newlines: options are SPACE (will be
                                          replaced by a space) and SENTENCE_BREAK (will
                                          break sentences)
--oneVsRest <Boolean>                   should we treat each outcome explicitly as one vs.
                                          rest, allowing for an event to have multiple
                                          outcomes?
--option <Talismane$ProcessingOption>   process command option: [output,
                                          posTagFeatureTester, parseFeatureTester]
--outDir <File>                         output directory (for writing evaluation and
                                          analysis files other than the standard output)
--outFile <File>                        output file or directory (when inFile is a
                                          directory)
--outputDivider <String>                a string to insert between sections marked for
                                          output (e.g. XML tags to be kept in the analysed
                                          output). The String NEWLINE is interpreted as
                                          "\n". Otherwise, used literally.
--outputEncoding <String>               encoding for output
--parserModel <File>                    statistical model for dependency parsing
--parserRules <File>                    semi-colon delimited list of files containing
                                          parser rules
--port <Integer>                        which port to listen on
--posTaggerModel <File>                 statistical model for pos-tagging
--posTaggerRules <File>                 semi-colon delimited list of files containing
                                          pos-tagger rules
--predictTransitions                    should the transitions leading to the corpus
  <Parser$PredictTransitions>             dependencies be predicted - normally only
                                          required for training (leave at "depends").
                                          Options are: [yes, no, depends]
--process                               process annotated corpus
--processByDefault <Boolean>            If true, the input file is processed from the very
                                          start (e.g. TXT files). If false, we wait until
                                          a text filter tells us to start processing
                                          (e.g. XML files).
--propagateBeam <Boolean>               should we propagate the pos-tagger beam to the
                                          parser
--sentenceAnnotators <File>             semi-colon delimited list of files containing
                                          sentence annotators
--sentenceCount <Integer>               max sentences to process
--sentenceFile <File>                   the text of sentences represented by the tokenised
                                          input is provided by this file, one sentence
                                          per line
--sentenceModel <File>                  statistical model for sentence detection
--sessionId <String>                    the current session id - configuration read as
                                          talismane.core.[sessionId]
--startModule <Talismane$Module>        where to start analysis (or evaluation):
                                          [languageDetector, sentenceDetector, tokeniser,
                                          posTagger, parser]
--startSentence <Integer>               first sentence index to process
--suffix <String>                       suffix to all output files
--template <File>                       user-defined template for output
--testWords <String>                    comma-delimited test words for pos-tagger feature
                                          tester
--textAnnotators <File>                 semi-colon delimited list of files containing text
                                          annotators
--tokeniserBeamWidth <Integer>          beam width in tokeniser beam search
--tokeniserModel <File>                 statistical model for tokenisation
--tokeniserPatterns <File>              a file containing the patterns for tokeniser
                                          training
--train                                 train model
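
The options above combine one command flag (--analyse, --compare, --evaluate, --process or --train) with any number of name/value options. As a minimal sketch of how such a command line might be assembled from a script, the Python helper below builds an argument list. Several things here are assumptions rather than facts from the help text: the jar name talismane.jar is hypothetical, the file names are placeholders, and the --name=value spelling is assumed to be accepted by the option parser. A real run will probably need further options (model files, a session id) that are not shown.

```python
from typing import Dict, List

# Hypothetical jar name: the help text does not say how Talismane is packaged.
TALISMANE_JAR = "talismane.jar"

# Command flags listed in the help text (options that take no argument).
COMMANDS = {"analyse", "compare", "evaluate", "process", "train"}


def build_command(command: str, options: Dict[str, str]) -> List[str]:
    """Return an argument list: java -jar <jar> --<command> --name=value ..."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    args = ["java", "-jar", TALISMANE_JAR, f"--{command}"]
    for name, value in options.items():
        # Assumption: the parser accepts the --name=value form for long options.
        args.append(f"--{name}={value}")
    return args


if __name__ == "__main__":
    # Placeholder file names; stop the pipeline after pos-tagging.
    cmd = build_command(
        "analyse",
        {"inFile": "corpus.txt", "outFile": "corpus.tal", "endModule": "posTagger"},
    )
    print(" ".join(cmd))
```

Keeping the arguments as a list (rather than one string) means the result can be passed directly to subprocess.run without shell quoting issues.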