TEES.Utils.DetectHeads

Module DetectHeads

Functions

getTriggers(corpus)
Returns a dictionary of "entity type"->"entity text"->"count"

getDistribution(trigDict)
Converts a dictionary of "entity type"->"entity text"->"count" to "entity text"->"entity type"->"(count, fraction)"
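
The conversion described above can be sketched as follows. This is an illustration only, not the TEES implementation: the name get_distribution_sketch is hypothetical, and the sketch assumes that "fraction" means each type's share of the total count for a given entity text, which the description does not state.

```python
from collections import defaultdict


def get_distribution_sketch(trig_dict):
    """Invert type -> text -> count into text -> type -> (count, fraction).

    Assumption: the fraction is the type's share of all occurrences of the
    text, and every count is positive.
    """
    # First pass: regroup the counts by entity text.
    by_text = defaultdict(dict)
    for entity_type, texts in trig_dict.items():
        for text, count in texts.items():
            by_text[text][entity_type] = count

    # Second pass: attach each type's share of the text's total count.
    distribution = {}
    for text, type_counts in by_text.items():
        total = float(sum(type_counts.values()))
        distribution[text] = {
            entity_type: (count, count / total)
            for entity_type, count in type_counts.items()
        }
    return distribution


# Made-up counts, for illustration only.
example = {
    "Binding": {"binds": 3, "interacts": 1},
    "Regulation": {"interacts": 3},
}
result = get_distribution_sketch(example)
```

With these made-up counts, "interacts" is split between the two types, while "binds" belongs entirely to "Binding".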

getHeads(corpus)

getOverlap()

removeHeads(corpus)

findHeads(corpus, stringsFrom, methods, parse, tokenization)

mapSplits(splits, string, stringOffset)
Maps substrings to a string, and stems them

findHeadsDictionary(corpus, stringsFrom, parse, tokenization)

findHeadsSyntactic(corpus, parse, tokenization)
Determine the head token for a named entity or trigger.

getEntityHeadToken(entity, tokens, tokenHeadScores)

findHeadToken(candidateTokens, tokenHeadScores)
Select the candidate token that is closest to the root of the subtree of the dependency parse to which the candidate tokens belong.

getTokenHeadScores(tokens, dependencies, sentenceId=None)
A head token is chosen using a heuristic that prefers tokens closer to the root of the dependency parse.

Variables

__package__ = 'TEES.Utils'

Function Details

findHeadsSyntactic(corpus, parse, tokenization)

Determine the head token for a named entity or trigger. The head token is the token closest to the root of the subtree of the dependency parse spanned by the text of the element.

Parameters:

entityElement (cElementTree.Element) - a semantic node (trigger or named entity)
verbose (boolean) - Print selected head tokens on screen

findHeadToken(candidateTokens, tokenHeadScores)

Select the candidate token that is closest to the root of the subtree of the dependency parse to which the candidate tokens belong. See the getTokenHeadScores method for the algorithm.

Parameters:

candidateTokens (list of cElementTree.Element objects) - the list of syntactic tokens from which the head token is selected
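
The selection step can be sketched as below, assuming that a higher score means a token is closer to the root. The name find_head_token_sketch is hypothetical, plain strings stand in for the cElementTree.Element tokens, and the tie-breaking rule (the first candidate wins) is an assumption rather than documented behavior.

```python
def find_head_token_sketch(candidate_tokens, token_head_scores):
    """Return the candidate with the highest head score.

    Assumption: on a tie, the earliest candidate in the list is kept.
    Returns None for an empty candidate list.
    """
    if not candidate_tokens:
        return None
    best = candidate_tokens[0]
    for token in candidate_tokens[1:]:
        # Strict comparison keeps the earlier token when scores are equal.
        if token_head_scores[token] > token_head_scores[best]:
            best = token
    return best


# Made-up scores for the tokens of a two-word entity.
scores = {"protein": 1, "kinase": 2, "binds": 3}
head = find_head_token_sketch(["protein", "kinase"], scores)
tie = find_head_token_sketch(["a", "b"], {"a": 1, "b": 1})
```

Only the candidate tokens are compared, so a higher-scoring token outside the entity ("binds" here) does not affect the result.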

getTokenHeadScores(tokens, dependencies, sentenceId=None)

A head token is chosen using a heuristic that prefers tokens closer to the root of the dependency parse. In a list of candidate tokens, the one with the highest score is the head token. The return value of this method is a dictionary that maps token elements to their scores.
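
One way to obtain scores with this property is sketched below. It is not taken from the TEES source: the name token_head_scores_sketch, the (governor, dependent) pair representation, and the max_rounds cap are all assumptions made for illustration. The sketch repeatedly raises each governor above its dependents, so tokens nearer the root end up with higher scores.

```python
def token_head_scores_sketch(tokens, dependencies, max_rounds=20):
    """Score tokens so that governors outrank their dependents.

    tokens: list of hashable token identifiers (hypothetical stand-ins
        for token elements)
    dependencies: list of (governor, dependent) pairs
    max_rounds: iteration cap, so a cyclic dependency graph still terminates
    """
    # Tokens outside every dependency stay at 0; connected tokens start at 1.
    scores = {token: 0 for token in tokens}
    for governor, dependent in dependencies:
        scores[governor] = 1
        scores[dependent] = 1

    # Raise each governor above its dependents until nothing changes.
    for _ in range(max_rounds):
        changed = False
        for governor, dependent in dependencies:
            if scores[governor] <= scores[dependent]:
                scores[governor] = scores[dependent] + 1
                changed = True
        if not changed:
            break
    return scores


# Made-up parse of "IL-2 activates STAT5 in cells ."
tokens = ["IL-2", "activates", "STAT5", "in", "cells", "."]
dependencies = [
    ("activates", "IL-2"),
    ("activates", "STAT5"),
    ("activates", "cells"),
    ("cells", "in"),
]
scores = token_head_scores_sketch(tokens, dependencies)
root = max(scores, key=scores.get)
```

In this made-up parse the verb "activates" receives the highest score, "cells" outranks its dependent "in", and the unattached "." stays at zero.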