Class SentenceGraph

The main purpose of SentenceGraph is to connect the syntactic dependency parse (a graph where tokens are nodes and dependencies are edges) to the semantic interactions (a graph where entities are nodes and interactions are edges). SentenceGraph also provides several dictionaries, for example ones that map element ids to their corresponding elements.
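The id dictionaries mentioned above can be pictured with a minimal sketch. It uses the standard library's xml.etree.ElementTree in place of cElementTree; the helper index_by_id and the element and attribute names in the sample XML are assumptions made for illustration, not part of the SentenceGraph API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sentence element; the tag and attribute names are assumptions.
sentence = ET.fromstring(
    '<sentence id="s0">'
    '<token id="t1" text="BMP-6"/>'
    '<token id="t2" text="inhibits"/>'
    '<entity id="e1" text="BMP-6" type="Protein"/>'
    '</sentence>'
)

def index_by_id(elements):
    """Map each element's id attribute to the element itself."""
    return {element.get("id"): element for element in elements}

tokens_by_id = index_by_id(sentence.findall("token"))
entities_by_id = index_by_id(sentence.findall("entity"))
```

With such dictionaries, an id found on an edge (a dependency or an interaction) can be resolved to its node element in constant time.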

Instance Methods
 
__init__(self, sentenceElement, tokenElements, dependencyElements)
Creates the syntactic graph part of the SentenceGraph.
 
getSentenceId(self)
 
makeEntityGraph(self, entities, interactions, entityToDuplicates=None)
 
getInteractions(self, entity1, entity2, merged=False)
Return a list of interaction-elements which represent directed interactions from entity1 to entity2.
 
getOutInteractions(self, entity, merged=False)
 
mapInteractions(self, entityElements, interactionElements, verbose=False)
Maps the semantic interactions to the syntactic graph.
 
mapEntity(self, entityElement, verbose=False)
Determine the head token for a named entity or trigger.
 
findHeadToken(self, candidateTokens)
Select the candidate token that is closest to the root of the subtree of the dependency parse to which the candidate tokens belong.
 
getTokenHeadScores(self)
A head token is chosen using a heuristic that prefers tokens closer to the root of the dependency parse.
 
_markNamedEntities(self)
This method is used to define which tokens belong to _named_ entities.
 
getTokenText(self, token)
Returns the text of a token, and masks it if the token is the head token of a named entity.
 
getCleared(self)
 
mergeInteractionGraph(self, merge=True)
Merges duplicate entities.

Method Details

__init__(self, sentenceElement, tokenElements, dependencyElements)
(Constructor)

Creates the syntactic graph part of the SentenceGraph. The semantic graph can be added with mapInteractions.

Parameters:
  • sentenceElement (cElementTree.Element) - interaction-XML sentence-element
  • tokenElements (list of cElementTree.Element objects) - interaction-XML syntactic token elements
  • dependencyElements (list of cElementTree.Element objects) - interaction-XML syntactic dependency elements
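As a rough illustration of the three constructor arguments, the sketch below builds a sentence element, token elements and dependency elements by hand and derives a simple adjacency structure from them. The attribute names (id, text, charOffset, t1, t2, type), the end-exclusive offsets and the helper build_syntactic_graph are assumptions for this example; xml.etree.ElementTree stands in for cElementTree.

```python
import xml.etree.ElementTree as ET

# Attribute names and offset conventions below are assumptions.
sentence_element = ET.Element("sentence", {"id": "s0", "text": "BMP-6 inhibits growth"})
token_elements = [
    ET.Element("token", {"id": "t1", "text": "BMP-6", "charOffset": "0-5"}),
    ET.Element("token", {"id": "t2", "text": "inhibits", "charOffset": "6-14"}),
    ET.Element("token", {"id": "t3", "text": "growth", "charOffset": "15-21"}),
]
dependency_elements = [
    ET.Element("dependency", {"id": "d1", "t1": "t2", "t2": "t1", "type": "nsubj"}),
    ET.Element("dependency", {"id": "d2", "t1": "t2", "t2": "t3", "type": "dobj"}),
]

def build_syntactic_graph(tokens, dependencies):
    """Hypothetical helper: token id -> list of (dependent id, dependency type)."""
    graph = {token.get("id"): [] for token in tokens}
    for dep in dependencies:
        graph[dep.get("t1")].append((dep.get("t2"), dep.get("type")))
    return graph

syntactic_graph = build_syntactic_graph(token_elements, dependency_elements)
# With TEES importable, the same three arguments would be passed as
# SentenceGraph(sentence_element, token_elements, dependency_elements).
```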

getInteractions(self, entity1, entity2, merged=False)

Return a list of interaction-elements which represent directed interactions from entity1 to entity2.

Parameters:
  • entity1 (cElementTree.Element) - a semantic node (trigger or named entity)
  • entity2 (cElementTree.Element) - a semantic node (trigger or named entity)
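The directed lookup can be sketched as a filter over interaction elements. This is not the SentenceGraph implementation: the function get_interactions and the attribute names e1 and e2 for the source and target entity ids are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def get_interactions(interactions, entity1, entity2):
    """Return interactions directed from entity1 to entity2 (assumed e1 -> e2)."""
    id1, id2 = entity1.get("id"), entity2.get("id")
    return [i for i in interactions if i.get("e1") == id1 and i.get("e2") == id2]

trigger = ET.Element("entity", {"id": "e2", "type": "Negative_regulation"})
protein = ET.Element("entity", {"id": "e1", "type": "Protein"})
interactions = [
    ET.Element("interaction", {"id": "i1", "e1": "e2", "e2": "e1", "type": "Cause"}),
]

forward = get_interactions(interactions, trigger, protein)   # follows the edge direction
backward = get_interactions(interactions, protein, trigger)  # against it, so empty
```

Because the edges are directed, swapping the two arguments generally gives a different result.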

mapInteractions(self, entityElements, interactionElements, verbose=False)

Maps the semantic interactions to the syntactic graph.

Syntactic dependencies are defined between tokens. Semantic edges (interactions) are defined between annotated entities. To utilize the correlation of the dependency parse with the semantic interactions, the graphs must be aligned by mapping the interaction graph's nodes (entities) to the syntactic graph's nodes (tokens). This is done by determining the head tokens of the entities.

Parameters:
  • entityElements (list of cElementTree.Element objects) - the semantic nodes (triggers and named entities)
  • interactionElements (list of cElementTree.Element objects) - the semantic edges (e.g. Cause and Theme for GENIA)
  • verbose (boolean) - Print selected head tokens on screen
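The alignment described above can be sketched with plain tuples and dictionaries: once each entity has a head token, every interaction can be projected onto a pair of tokens. The function project_interactions and the sample ids are hypothetical; the real method works on interaction-XML elements.

```python
def project_interactions(interactions, head_token_of):
    """Map (source entity, target entity, type) edges onto head-token pairs."""
    projected = {}
    for source, target, interaction_type in interactions:
        pair = (head_token_of[source], head_token_of[target])
        projected.setdefault(pair, []).append(interaction_type)
    return projected

# Assumed result of head token selection: entity id -> token id.
head_token_of = {"e1": "t1", "e2": "t2", "e3": "t3"}
interactions = [("e2", "e1", "Cause"), ("e2", "e3", "Theme")]

token_edges = project_interactions(interactions, head_token_of)
```

After this step, semantic edges and syntactic dependencies share the same node set (tokens), so they can be compared directly.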

mapEntity(self, entityElement, verbose=False)

Determine the head token for a named entity or trigger. The head token is the token closest to the root of the subtree of the dependency parse spanned by the text of the element.

Parameters:
  • entityElement (cElementTree.Element) - a semantic node (trigger or named entity)
  • verbose (boolean) - Print selected head tokens on screen
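One way to find the tokens spanned by an entity's text is to compare character offsets. The sketch below assumes offsets written as "begin-end" with an exclusive end; both that convention and the helper names are assumptions, not the documented behaviour of mapEntity.

```python
def parse_offset(text):
    """Parse a 'begin-end' string into a pair of integers."""
    begin, end = text.split("-")
    return int(begin), int(end)

def overlaps(a, b):
    """True if two end-exclusive spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def candidate_tokens(entity_offset, token_offsets):
    """Return ids of the tokens whose span overlaps the entity's span."""
    span = parse_offset(entity_offset)
    return [tid for tid, off in token_offsets.items() if overlaps(span, parse_offset(off))]

token_offsets = {"t1": "0-5", "t2": "6-14", "t3": "15-21"}
candidates = candidate_tokens("6-21", token_offsets)
```

The head token would then be chosen among these candidates.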

findHeadToken(self, candidateTokens)

Select the candidate token that is closest to the root of the subtree of the dependency parse to which the candidate tokens belong. See getTokenHeadScores method for the algorithm.

Parameters:
  • candidateTokens (list of cElementTree.Element objects) - the list of syntactic tokens from which the head token is selected
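Given per-token scores, selecting the head reduces to taking the highest-scoring candidate. The sketch below assumes the scores are already available as a dictionary; the function name find_head_token and the behaviour on ties and on empty input are assumptions.

```python
def find_head_token(candidate_tokens, head_scores):
    """Pick the candidate with the highest head score (first one wins on ties)."""
    if not candidate_tokens:
        return None
    return max(candidate_tokens, key=lambda token: head_scores[token])

# Hypothetical scores: a higher value means closer to the root.
head_scores = {"t1": 0, "t2": 1, "t3": 0}
head = find_head_token(["t2", "t3"], head_scores)
```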

getTokenHeadScores(self)

A head token is chosen using a heuristic that prefers tokens closer to the root of the dependency parse. In a list of candidate tokens, the one with the highest score is the head token. The return value of this method is a dictionary that maps token elements to their scores.
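The exact scoring algorithm is not given here, so the following is a simplified stand-in rather than the TEES heuristic: it scores each token by its negated depth below a root of the dependency parse, which likewise prefers tokens closer to the root. The function name and the (governor, dependent) tuple format are assumptions.

```python
from collections import deque

def token_head_scores(token_ids, dependencies):
    """Score tokens by negated depth from the parse roots (simplified stand-in)."""
    children = {t: [] for t in token_ids}
    has_governor = set()
    for governor, dependent in dependencies:
        children[governor].append(dependent)
        has_governor.add(dependent)

    scores = {}
    # Breadth-first traversal starting from tokens that nothing governs.
    queue = deque((t, 0) for t in token_ids if t not in has_governor)
    while queue:
        token, depth = queue.popleft()
        if token in scores:
            continue  # already reached by a shorter path
        scores[token] = -depth
        for child in children[token]:
            queue.append((child, depth + 1))

    # Tokens unreachable from any root (e.g. inside a cycle) get the lowest score.
    for t in token_ids:
        scores.setdefault(t, -len(token_ids))
    return scores

scores = token_head_scores(["t1", "t2", "t3"], [("t2", "t1"), ("t2", "t3")])
```

In this example the verb t2 governs both other tokens, so it receives the highest score and would be preferred as a head.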

_markNamedEntities(self)

This method is used to define which tokens belong to _named_ entities. Named entities are sometimes masked when testing learning of interactions, to prevent the system from making a trivial decision based on commonly interacting names.

getTokenText(self, token)

Returns the text of a token, and masks it if the token is the head token of a named entity.

Parameters:
  • token (cElementTree.Element) - interaction-XML syntactic token.
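Masking can be sketched as a lookup with a substitution. The placeholder string "NAMED_ENT", the function name and the set of named-entity head tokens are assumptions for illustration; the value actually used by getTokenText is not stated here.

```python
def get_token_text(token_id, token_texts, named_entity_heads, mask="NAMED_ENT"):
    """Return the token text, or a placeholder if it heads a named entity."""
    if token_id in named_entity_heads:
        return mask
    return token_texts[token_id]

token_texts = {"t1": "BMP-6", "t2": "inhibits", "t3": "growth"}
named_entity_heads = {"t1"}  # assumed output of a marking step

masked = [get_token_text(t, token_texts, named_entity_heads) for t in ("t1", "t2", "t3")]
```

With the name hidden, a learner has to rely on the surrounding context rather than on which names frequently interact.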

mergeInteractionGraph(self, merge=True)

Merges duplicate entities.

Parameters:
  • merge (named keepDuplicates in the docstring) - allows calling the function with no effect, so that the same code can be used for merged and unmerged cases
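The no-effect switch can be illustrated with a small sketch. What counts as a duplicate is an assumption here (same entity type and same head token), as are the function name and the tuple format; the actual criterion used by mergeInteractionGraph is not documented on this page.

```python
def merge_duplicates(entities, merge=True):
    """Map each kept entity id to the ids of its duplicates.

    entities: list of (entity id, entity type, head token id) tuples.
    With merge=False every entity is kept and nothing is merged.
    """
    if not merge:
        return {entity_id: [] for entity_id, _, _ in entities}
    first_seen = {}
    duplicates = {}
    for entity_id, entity_type, head in entities:
        key = (entity_type, head)  # assumed duplicate criterion
        if key in first_seen:
            duplicates[first_seen[key]].append(entity_id)
        else:
            first_seen[key] = entity_id
            duplicates[entity_id] = []
    return duplicates

entities = [("e1", "Protein", "t1"), ("e2", "Protein", "t1"), ("e3", "Binding", "t2")]
merged = merge_duplicates(entities)
unmerged = merge_duplicates(entities, merge=False)
```

Both calls return the same kind of mapping, so calling code does not need a separate branch for the unmerged case.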