|
waitForProcess(process,
numCorpusSentences,
measureByGap,
outputFile,
counterName,
updateMessage,
timeout=None)
Waits for a process to finish and tracks the number of entities it
writes to its output file. |
|
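The wait-and-count behaviour described for waitForProcess can be sketched as a polling loop. This is only an illustration under stated assumptions, not the module's actual implementation: `wait_and_count`, `count_lines` and `poll_interval` are hypothetical names, "entities" are assumed to be lines, and the progress counter (`counterName`, `updateMessage`) is left out.

```python
import os
import subprocess
import sys
import time


def count_lines(path):
    """Count lines in the output file; a file that does not exist yet counts as 0."""
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def wait_and_count(process, output_path, timeout=None, poll_interval=0.05):
    """Poll a subprocess.Popen object until it exits or the timeout passes.

    Returns (finished, entity_count). Assumption: one entity per output line.
    """
    start = time.monotonic()
    while process.poll() is None:
        if timeout is not None and time.monotonic() - start > timeout:
            # Give up: stop the process and report what was written so far.
            process.kill()
            process.wait()
            return False, count_lines(output_path)
        time.sleep(poll_interval)
    return True, count_lines(output_path)
```

A caller would start the program with `subprocess.Popen(...)`, pass the resulting object in, and inspect the returned count to decide where to resume.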
|
|
makeSubset(input,
workdir,
fromLine)
Make a subset of the input data from "fromLine" to the end of the
input file. |
|
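As a rough sketch of what makeSubset describes, the following copies everything from a given line onward into a new file in the work directory. The zero-based interpretation of the line index and the subset file name are assumptions made for this example; the real function may differ on both.

```python
import os


def make_subset_sketch(input_path, workdir, from_line):
    """Copy lines from_line..end of input_path into a new file in workdir.

    Assumptions: from_line is zero-based, and the file name below is hypothetical.
    """
    subset_path = os.path.join(workdir, "subset-%d.txt" % from_line)
    with open(input_path, encoding="utf-8") as src, \
            open(subset_path, "w", encoding="utf-8") as dst:
        for index, line in enumerate(src):
            if index >= from_line:
                dst.write(line)
    return subset_path
```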
|
|
mergeOutput(dir,
numCorpusSentences,
measureByGap,
outputArgs={})
Merge output files (multiple files may have been created if the program
failed on a sentence). |
|
|
|
getSubsetEndPos(subsetFileName,
measureByGap)
Return the sentence count that this process reached, determined by
counting the sentences in the output file. |
|
|
|
getLines(filename,
measureByGap)
Return the number of sentences in the file, measured either in lines or
by empty "gap" lines. |
|
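The two counting modes described for getLines can be illustrated as follows. This is a minimal sketch, assuming that in "gap" mode every sentence is terminated by one empty line; `count_sentences` is a hypothetical name, not the documented function.

```python
def count_sentences(filename, measure_by_gap):
    """Count sentences either as lines or as empty "gap" lines.

    Assumption: in gap mode each sentence ends with exactly one empty line.
    """
    count = 0
    with open(filename, encoding="utf-8") as handle:
        for line in handle:
            if measure_by_gap:
                # Only blank separator lines mark the end of a sentence.
                if line.strip() == "":
                    count += 1
            else:
                # One sentence per line.
                count += 1
    return count
```

Gap mode suits multi-line formats in which each sentence spans several lines, while line mode suits one-sentence-per-line input.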
|
|
runSentenceProcess(launchProcess,
programDir,
input,
workdir,
measureByGap,
counterName,
updateMessage,
timeout=None,
processArgs={},
outputArgs={})
Runs a process on input sentences, and in case of problems skips one
sentence and reruns the process on the remaining ones. |
|
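The skip-and-rerun strategy described for runSentenceProcess can be sketched in memory. This is an illustrative simplification, not the actual implementation: `run_with_skips` and `process_batch` are hypothetical, the external program is replaced by a plain callable that returns outputs for the sentences it handled before failing, and a skipped sentence is assumed to be recorded as `None`.

```python
def run_with_skips(process_batch, sentences):
    """Run process_batch on the sentences, skipping each one it fails on.

    process_batch(batch) must return the outputs for the leading sentences
    it managed to process (all of them if nothing failed).
    Returns (results, skipped_indices); skipped sentences yield None.
    """
    results = []
    skipped = []
    start = 0
    while start < len(sentences):
        done = process_batch(sentences[start:])
        results.extend(done)
        start += len(done)
        if start < len(sentences):
            # The run stopped early, so sentences[start] is taken to be the
            # culprit: record it, skip it, and rerun on the remainder.
            skipped.append(start)
            results.append(None)
            start += 1
    return results, skipped


def fragile_upper(batch):
    """Toy stand-in for an external tool: stops at a sentence containing 'BAD'."""
    out = []
    for sentence in batch:
        if "BAD" in sentence:
            break
        out.append(sentence.upper())
    return out
```

Calling `run_with_skips(fragile_upper, ["a", "BAD", "b"])` reruns the toy tool once after the failure and keeps the output aligned with the input by leaving a placeholder for the skipped sentence.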
|
|