Senseval-2 was a set of competitions held in early 2001 to evaluate how well Word Sense Disambiguation (WSD) systems identify word senses. Twelve competitions in eleven languages were held. Each competition required a system to take as input text containing <head> tags around the words or phrases whose sense is to be identified. The workshop was held in Toulouse, France, on July 5 and 6, 2001.
Woody Haynes had three entries in the competition, creatively named IIT1, IIT2, and IIT3. All three used similar methodologies. IIT1 and IIT2 were entered in the English Lexical Sample competition and all three were entered in the English All Word competition.
Systems entered in the English Lexical Sample and English All Word competitions used WordNet 1.7 (a prerelease version with limited distribution) as the inventory of word senses. The XML below contains the first four test items in the English Lexical Sample competition for adjective instances of faithful (from eng-lex-samp.evaluation.xml, distributed in test.gz):
<instance id="faithful.40006" docsrc="bnc_CL2_135">
<context>
Nothing was lacking.
They were very strong, you see, the Jenkins in Pontrhydyfen.
They kept the chapel going and they cleaned it as well.
They were all very funny, they could all perform, you know, do a turn.
Quite unselfconscious.
Quite outstanding,&equo; she said, as she had said before.
All Dic's &bquo;golden children&equo; stayed true through their great loss. &bracket-slash-p; &bracket-p;
If parents are your primary role models, then Richard &bracket-pb; Jenkins was formed by an alcoholic father and a beloved mother who died and left him when he was two.
They say that in such a case you search for your mother for the rest of your life.
Oddly, though he could never bear to be out of her sight and cried whenever she left him, as soon as his eldest sister, the <head>faithful</head> Cis, got him to her own home he ceased to cry.
</context>
</instance>
<instance id="faithful.40012" docsrc="bnc_BMV_103">
<context>
The feudal bond established in the first place a special relationship between the lord and a man whom we should call a cavalry officer, whom they called a vassal.
It enabled the lord both to recruit a knight and to reward him.
The vassal knelt before his lord, and placed his hands between his: and so became his man.
This was the act of homage.
Then he rose and swore a solemn oath to keep faith, to be true to his lord.
By this act he performed fealty.
These oaths, and the bond they established, had their origin in the relation of lord and follower in the courts of the barbarian chieftains and they had a powerful religious aura, though this did not always prevent them being broken.
&bquo;They are <head>faithful</head> to their lords&equo;, wrote William of Malmesbury of the Normans, &bquo;but swift to break faith for a slight occasion.
</context>
</instance>
<instance id="faithful.40019" docsrc="bnc_CH1_8233">
<context>
RACE ace Willie Carson keeps Genesis star Mike Rutherford spellbound with tales of his exciting Derby and St Leger wins.
I hear that Rutherford is now plotting his revenge.
He plans to hypnotise Carson with fascinating, and no doubt endless, Genesis stories. &bracket-slash-p; &bracket-slash-div2; &bracket-div2; &bracket-head;
OFF LIMITS &bracket-slash-head; &bracket-p;
BONO is to join GEORGE HARRISON, ERIC CLAPTON, SINEAD O'CONNOR and TOM PETTY at a show to mark BOB DYLAN's 30-year career in New York on October 16…
THE SISTERS OF MERCY have split up…
NENEH CHERRY has recorded a duet with R.E.M. 's MICHAEL STIPE…
DEXY'S MIDNIGHT RUNNERS are reuniting for an album and tour…
SUEDE'S first show was attended by 20 people. &bracket-slash-p; &bracket-slash-div2; &bracket-div2; &bracket-head;
OVER THE LIMIT &bracket-slash-head; &bracket-p;
HOLLYWOOD celebs may not be <head>faithful</head> to their partners but, according to Dr Andrew Geeley's new book Faithful Attraction, most Americans don't cheat.
</context>
</instance>
<instance id="faithful.40025" docsrc="bnc_B11_439">
<context>
One ardent supporter, no doubt worked up as a result of our pressure campaign, suggested a telegram to H.M.
the King!
It was reported that over-ripe tomatoes and some oldish eggs had been tossed at the door of two southern commercial stations, which was a smear tactic we deplored with emphasis.
We telephoned Mr Hector Charlesworth but he was, as yet, unable to take any remedial action, although I had the distinct impression that he would have liked to do so.
I organized a daily reminder in the press, on the &bquo;So Many Shopping Days to Christmas&equo; idea with a slow count-down until Radio Station 1OAB would be no more.
This motif was used on the air every day and we rang all possible changes on the &bquo;we're being done in by the big bogey-man&equo; theme, so much so that we had some of our <head>faithful</head> old lady listeners actually in tears.
</context>
</instance>
(Note: some minor transformations have been made to this sample. Bracketed items in the text appear as XML entities, such as &bracket-slash-p;, which was [/p] in the original text. Also, an extra blank has been inserted after the </head> token.)
The first item to be disambiguated is ... his eldest sister, the <head>faithful</head> Cis ... WordNet offers three adjective senses of faithful:
Of these three alternatives, faithful%3:00:00:: would be the intended sense of the first, second, and fourth instances, and faithful%3:00:01:: that of the third.
(WordNet organizes words in synsets. A synset represents a distinct concept or meaning. Each synset contains those words or collocations that can have that meaning. WordNet contains binary relations between synsets such as hypernym and between specific words within synsets such as antonym. See the WordNet web site or the WordNet book, MIT Press, 1999, edited by C. Fellbaum.)
References to word sense hereafter mean a synset or a synset word. References to a context mean the data within the <context> ... </context> tags. References to target, target word, and context anchor mean the data within the <head> ... </head> tags.
The three entries by Woody Haynes all have the same general approach. In order to show how these methods operate, consider the second example above (id="faithful.40012"). Begin by finding all WordNet 1.7 synsets that might apply to the context. These include the three synsets above.
For each of these synsets, consider the WordNet relations for this synset or the words within the synset. If the relation is not a child relation, such as hyponym or has part, collect all synsets by performing a transitive closure over that relation. For child relations, just include the synsets that are direct children. (Two relations are handled specially. The verb group relation is completely ignored since it would add in examples from competing senses. No transitive closure is performed on the see also relation because it too frequently pulls significant numbers of duplicate synsets for competing senses.)
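The collection rule above can be sketched as follows. This is only an illustrative reading of the description, not the systems' actual Java code: the relation names, the dictionary layout, and the toy synset labels are all assumptions made for the example.

```python
from collections import deque

# Hypothetical relation names; WordNet's real pointer inventory is larger.
CHILD_RELATIONS = {"hyponym", "has_part"}   # direct children only
IGNORED_RELATIONS = {"verb_group"}          # skipped entirely
NO_CLOSURE_RELATIONS = {"see_also"}         # one step, no transitive closure


def related_synsets(start, relations):
    """Collect synsets related to `start`.

    `relations` maps synset -> {relation name -> list of target synsets}.
    """
    collected = {start}
    for rel, targets in relations.get(start, {}).items():
        if rel in IGNORED_RELATIONS:
            continue
        if rel in CHILD_RELATIONS or rel in NO_CLOSURE_RELATIONS:
            collected.update(targets)
            continue
        # Transitive closure over this single relation.
        seen = {start}
        queue = deque(targets)
        while queue:
            synset = queue.popleft()
            if synset in seen:
                continue
            seen.add(synset)
            queue.extend(relations.get(synset, {}).get(rel, []))
        collected |= seen
    return collected


# Toy graph (labels are made up for illustration).
toy = {
    "faithful": {"similar": ["firm"], "see_also": ["constant"],
                 "verb_group": ["x"], "hyponym": ["child"]},
    "firm": {"similar": ["true_believer"]},
    "constant": {"see_also": ["unchanging"]},
    "child": {"hyponym": ["grandchild"]},
}
print(sorted(related_synsets("faithful", toy)))
```

In the toy graph, the similarity chain is followed to its end, while see also and the child relation each contribute only one step.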
For faithful.40012, the collection of synsets for faithful is as follows:
faithful%3:00:00:: 1 Adj 00921783 faithful%3:00:00:: steadfast in affection or allegiance {years of faithful service}{faithful employees}{we do not doubt that England has a faithful patriot in the Lord Chancellor}
Similarity(&) to synset Adj 00922111 firm%5:00:00:faithful:00, loyal%5:00:00:faithful:00, truehearted%5:00:00:faithful:00, fast%5:00:00:faithful:00{a firm ally}{loyal supporters}{"the true-hearted soldier...of Tippecanoe" - Campaign song for William Henry Harrison;}{fast friends}
Adj 00922379 true%5:00:02:faithful:00 {true believers bonded together against all who disagreed with them}
Similarity(&) to synset Adj 00922379 true%5:00:02:faithful:00{true believers bonded together against all who disagreed with them}
Adj 00922111 firm%5:00:00:faithful:00, loyal%5:00:00:faithful:00, truehearted%5:00:00:faithful:00, fast%5:00:00:faithful:00 {a firm ally}{loyal supporters}{"the true-hearted soldier...of Tippecanoe" - Campaign song for William Henry Harrison;}{fast friends}
See Also(^) to synset Adj 00558924 constant%3:00:00::{a man constant in adherence to his ideals}{a constant lover}{constant as the northern star}
See Also(^) to synset Adj 02382381 true%3:00:00::{the story is true}{"it is undesirable to believe a proposition when there is no ground whatever for supposing it true" - B. Russell;}{the true meaning of the statement}
See Also(^) to synset Adj 02386310 trustworthy%3:00:00::, trusty%3:00:00::{a trustworthy report}{an experienced and trustworthy traveling companion}
Attribute(=) to synset N 04129069 fidelity%1:07:00::, faithfulness%1:07:00::
Adj 00922562 unfaithful%3:00:00:: {an unfaithful lover}
Antonym(!) to word unfaithful%3:00:00:: Adj 1 00922562{an unfaithful lover}
faithful%5:00:00:accurate:00 2 Adj 00023935 close%5:00:00:accurate:00, faithful%5:00:00:accurate:00 marked by fidelity to an original {a close translation}{a faithful copy of the portrait}{a faithful rendering of the observed facts}
Similarity(&) to synset Adj 00023500 accurate%3:00:00::{an accurate reproduction}{the accounting was accurate}{accurate measurements}{an accurate scale}
Adj 00024135 dead-on%5:00:00:accurate:00 {a dead-on feel for characterization}{"She avoids big scenes...preferring to rely on small gestures and dead-on dialogue" - Peter S.Prescott}
Adj 00024370 high-fidelity%5:00:00:accurate:00, hi-fi%5:00:00:accurate:00 {a high-fidelity recording}{a hi-fi system}
Adj 00024542 straight%5:00:00:accurate:00 {set the record straight}{made sure the facts were straight in the report}
Adj 00024707 true%5:00:00:accurate:00, dead_on_target%5:00:00:accurate:00 {his aim was true}{he was dead on target}
Adj 00024847 veracious%5:00:00:accurate:00 {a veracious account}
faithful%3:00:01:: 3 Adj 00923334 faithful%3:00:01:: not having sexual relations with anyone except your husband or wife, or your boyfriend or girlfriend {he remained faithful to his wife}
Similarity(&) to synset Adj 00923543 true_to%5:00:00:faithful:01{she was true to her significant other}
Antonym(!) to word unfaithful%3:00:01:: Adj 2 00923658{her husband was unfaithful}
The sets of related synsets are transformed into a set of the examples occurring as part of the synset definitions. The result of this transformation is as follows:
faithful%3:00:00:: Adj 1 00921783 with 17 examples: steadfast in affection or allegiance
[Adj 00558924 constant%3:00:00::/0] a man constant in adherence to his ideals
[Adj 00558924 constant%3:00:00::/1] a constant lover
[Adj 00558924 constant%3:00:00::/2] constant as the northern star
[Adj 00921783 faithful%3:00:00::/0] years of faithful service
[Adj 00921783 faithful%3:00:00::/1] faithful employees
[Adj 00921783 faithful%3:00:00::/2] we do not doubt that England has a faithful patriot in the Lord Chancellor
[Adj 00922111 firm%5:faithful:00, loyal%5:faithful:00, truehearted%5:faithful:00, fast%5:faithful:00/0] a firm ally
[Adj 00922111 firm%5:faithful:00, loyal%5:faithful:00, truehearted%5:faithful:00, fast%5:faithful:00/1] loyal supporters
[Adj 00922111 firm%5:faithful:00, loyal%5:faithful:00, truehearted%5:faithful:00, fast%5:faithful:00/2] "the true-hearted soldier...of Tippecanoe" - Campaign song for William Henry Harrison;
[Adj 00922111 firm%5:faithful:00, loyal%5:faithful:00, truehearted%5:faithful:00, fast%5:faithful:00/3] fast friends
[Adj 00922379 true%5:faithful:00/0] true believers bonded together against all who disagreed with them
[Adj 00922562 unfaithful%3:00:00::/0] an unfaithful lover
[Adj 02382381 true%3:00:00::/0] the story is true
[Adj 02382381 true%3:00:00::/1] "it is undesirable to believe a proposition when there is no ground whatever for supposing it true" - B. Russell;
[Adj 02382381 true%3:00:00::/2] the true meaning of the statement
[Adj 02386310 trustworthy%3:00:00::, trusty%3:00:00::/0] a trustworthy report
[Adj 02386310 trustworthy%3:00:00::, trusty%3:00:00::/1] an experienced and trustworthy traveling companion
faithful%5:accurate:00 Adj 2 00023935 with 7 examples: marked by fidelity to an original
[Adj 00023500 accurate%3:00:00::/0] an accurate reproduction
[Adj 00023500 accurate%3:00:00::/1] the accounting was accurate
[Adj 00023500 accurate%3:00:00::/2] accurate measurements
[Adj 00023500 accurate%3:00:00::/3] an accurate scale
[Adj 00023935 close%5:accurate:00, faithful%5:accurate:00/0] a close translation
[Adj 00023935 close%5:accurate:00, faithful%5:accurate:00/1] a faithful copy of the portrait
[Adj 00023935 close%5:accurate:00, faithful%5:accurate:00/2] a faithful rendering of the observed facts
faithful%3:00:01:: Adj 3 00923334 with 3 examples: not having sexual relations with anyone except your husband or wife, or your boyfriend or girlfriend
[Adj 00923334 faithful%3:00:01::/0] he remained faithful to his wife
[Adj 00923543 true_to%5:faithful:01/0] she was true to her significant other
[Adj 00923658 unfaithful%3:00:01::/0] her husband was unfaithful
Each of the examples in this set is compared to the context surrounding the target word, looking for the single best matching example. The synset for the target word that is related to that example is chosen as the answer.
The first example listed above, a man constant in adherence to his ideals, will serve to illustrate the context matching.
Since each of the examples contains one of the words being defined, consider the example to be a tagged instance of that word. For example, the constant synset example contains the word constant. It is reasonable to assume that the sense of constant activated by the example is the synset sense. (If the example does not contain one of the synset words, the example is discarded.) Henceforth, example anchor refers to the occurrence of the defined word in an example. This example of constant and the context instance of faithful.40012 is used for the following discussion.
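A minimal sketch of locating the example anchor is given below. It assumes exact, case-insensitive token matching; the systems described here also use inflectional morphology and handle collocations, which this sketch leaves out. The function name is hypothetical.

```python
def example_anchor(example_tokens, synset_words):
    """Return the index of the first token that is one of the synset's words,
    or None when the example should be discarded.

    Assumption: plain case-insensitive equality, no morphology or collocations.
    """
    words = {w.lower() for w in synset_words}
    for index, token in enumerate(example_tokens):
        if token.lower() in words:
            return index
    return None


tokens = "a man constant in adherence to his ideals".split()
print(example_anchor(tokens, ["constant"]))   # index of "constant"
print(example_anchor(tokens, ["faithful"]))   # None: example would be discarded
```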
All three systems restrict the portion of the context considered to the sentence containing the target word. So for this comparison, the context is restricted to "They are faithful to their lords", wrote William of Malmesbury of the Normans, "but swift to break faith for a slight occasion. In addition, tokens are only allowed to match within a search frame, which was set to 10 words or two times the example length, whichever was greater.
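The two restrictions just described can be sketched as below. The sketch assumes the context holds one sentence per line, as in the sample instances shown earlier, and it only computes the size of the search frame; how the frame is positioned around the anchor is not stated here, so it is not modeled. Function names are hypothetical.

```python
import re

HEAD = re.compile(r"<head>(.*?)</head>")


def target_sentence(context):
    """Return (sentence without head tags, target word) for the line that
    holds the <head> element, or (None, None) if there is none.

    Assumption: one sentence per line, as in the sample data.
    """
    for line in context.splitlines():
        match = HEAD.search(line)
        if match:
            return HEAD.sub(r"\1", line).strip(), match.group(1)
    return None, None


def frame_size(example_tokens):
    """Search frame: 10 words or twice the example length, whichever is greater."""
    return max(10, 2 * len(example_tokens))


context = ("By this act he performed fealty.\n"
           "They are <head>faithful</head> to their lords.\n")
print(target_sentence(context))
print(frame_size("a man constant in adherence to his ideals".split()))
```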
In comparing an example to the context, the systems assume a match between the example anchor and the context anchor, even though the words are not the same, so constant and faithful are matched. The assumption being made by the system is that words in related synsets will show similar syntactic behavior. A small scoring penalty is applied in all three systems for an example when, like constant, it is not the original synset faithful%3:00:00::.
Next, iterate through the remaining words of the example, moving out from the example anchor. The systems consider the remaining example words in the order man, in, a, adherence, to, his, and ideals (that is, one to the left, one to the right, two to the left, two to the right, and so on, skipping positions that fall outside the example). For each in turn, the systems look at the context for a match. The search begins at the same offset from the context anchor as the example word has from the example anchor. So for man, which is one word to the left of the example anchor, the search begins one word to the left of the context anchor, that is, from the corresponding position in the context relative to the context anchor.
The order of consideration of context words moves outwards from the starting position, ignoring any context words that have already been aligned. So the sequence of comparisons for man would be are, They, faithful (ignored, since it is already mapped), their, lords, ", ,, wrote, ...
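The alternating left/right order in which example words are visited can be sketched as follows. This is an illustration of the ordering only (the function name is hypothetical), not of the matching itself.

```python
def outward_order(length, anchor):
    """Indices of the non-anchor tokens, visited one to the left, one to the
    right, two to the left, two to the right, and so on."""
    order = []
    for distance in range(1, length):
        for index in (anchor - distance, anchor + distance):
            if 0 <= index < length:
                order.append(index)
    return order


example = "a man constant in adherence to his ideals".split()
anchor = example.index("constant")
print([example[i] for i in outward_order(len(example), anchor)])
# ['man', 'in', 'a', 'adherence', 'to', 'his', 'ideals']
```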
Looking at context words continues until a "match" is found. The comparison for a match is performed as follows:
So the matching of a man constant in adherence to his ideals with "They are faithful to their lords", wrote William of Malmesbury of the Normans, "but swift to break faith for a slight occasion proceeds as follows:
The scoring methodologies are all based on a formula:
score = 1/( 1 + penalty( r ) )
where r is the set of match results and 0 <= penalty( r ). Thus 0 < score <= 1. A score of zero is explicitly assigned when nothing but the context anchor and example match (i.e. we have no evidence).
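As a small sketch, the score formula and its zero-evidence special case can be written as below. The penalty value is taken as an input because the individual penalty terms differ between the entries; the parameter names are assumptions made for the example.

```python
def score(penalty, matched_non_anchor_tokens):
    """score = 1 / (1 + penalty), with 0 assigned explicitly when nothing
    but the anchors matched (no evidence)."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    if matched_non_anchor_tokens == 0:
        return 0.0
    return 1.0 / (1.0 + penalty)


print(score(0.0, 3))  # 1.0, a perfect match
print(score(1.0, 3))  # 0.5
print(score(5.0, 0))  # 0.0, only the anchors matched
```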
r has the following components that may contribute to the penalty:
| Example Word | Matching Context Word | Position of Context Word |
| a | the | 14 |
| man | lords | 6 |
| constant | faithful | 3 |
| in | to | 4 |
| adherence | break | 20 |
| to | of | 11 |
| his | their | 5 |
| ideals | swift | 18 |
Looking at the sequence 14, 6, 3, 4, 20, 11, 5, 18, the differences between adjacent items are -8, -3, 1, 16, -9, -6, 13. Assuming that the sequence should start in a positive (ascending) state, we have 4 direction changes.
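The direction-change count can be reproduced with a short sketch. It assumes, as stated above, that the sequence starts in an ascending state; treating equal adjacent positions as no change is an additional assumption of the sketch.

```python
def direction_changes(positions):
    """Count sign changes in the differences between adjacent positions,
    starting from an assumed ascending state."""
    changes = 0
    ascending = True
    for previous, current in zip(positions, positions[1:]):
        difference = current - previous
        if difference == 0:
            continue  # assumption: a repeated position is not a change
        now_ascending = difference > 0
        if now_ascending != ascending:
            changes += 1
            ascending = now_ascending
    return changes


print(direction_changes([14, 6, 3, 4, 20, 11, 5, 18]))  # 4
```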
In all systems, the examples are taken through the scoring algorithm. If there is a single winner, choose it. If there are multiple winners and the winning examples all happen to be the same (because the two senses were related to the same synset), eliminate all of the losing senses, drop the winning example, and repeat the process of looking for the highest score. If there is a tie with distinct examples, report all winners (with no weight, so they will be judged equally).
If no example gets a score > 0 (i.e. all examples matched only the example anchor to the context anchor), the first word sense that has a sense number of 1 is chosen.
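One possible reading of this selection procedure is sketched below. The tuple layout and function name are hypothetical, and one case is not covered by the description: what to report when the shared winning example is dropped and nothing else remains. The sketch then reports the senses that were tied, which is purely an assumption.

```python
def choose_senses(scored, default_sense):
    """Pick the answer sense(s) from (sense, example, score) triples.

    `default_sense` stands for the first word sense with sense number 1.
    """
    candidates = [c for c in scored if c[2] > 0]
    if not candidates:
        return [default_sense]          # no evidence at all
    while True:
        best = max(s for _, _, s in candidates)
        winners = [(sense, ex) for sense, ex, s in candidates if s == best]
        senses = sorted({sense for sense, _ in winners})
        examples = {ex for _, ex in winners}
        if len(senses) == 1 or len(examples) > 1:
            return senses               # single winner, or tie on distinct examples
        # Several senses tied on one shared example: keep only those senses,
        # drop the shared example, and look for the next highest score.
        remaining = [c for c in candidates
                     if c[0] in senses and c[1] not in examples]
        if not remaining:
            return senses               # assumption: report the tied senses
        candidates = remaining


tied = [("s1", "shared", 0.5), ("s2", "shared", 0.5),
        ("s1", "e1", 0.2), ("s2", "e2", 0.4), ("s3", "e3", 0.45)]
print(choose_senses(tied, "s1"))
```

In the usage example, s3 is eliminated because it lost the first round, and s2 then wins on its own example.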
The first entry applies penalties independent of the distance of the example token from the example anchor. The penalty function is computed as the sum of the following:
The IIT1 system tokenizes the sentence containing the context word and the WordNet examples, restricting the possible senses of all words to those that are plausible given the inflectional morphology. In addition, it identifies some noun phrases (those beginning with a determiner) and some prepositional phrases (those where the preposition does not appear as an adverb in WordNet), allowing further restriction of the possible parts of speech. The input is not processed by a part-of-speech tagger.
IIT1 determines only the word sense of the context anchor. That is, it makes no effort to fully disambiguate other words in the sentence.
No tuning was done to the scoring algorithm against the sample data, training data, or test data. The debugging of the scoring algorithm was done against SEMCOR and the sample data distribution for Senseval2 English Lexical Sample and Senseval2 English All Word. The debugging of the algorithm was only to verify that the penalties were being calculated as intended. The formula was constructed based only on my intuition.
The second entry is similar to the first, but the impact of each penalty is reduced by the factor a, where
a = 1/( max( 1, | example anchor position - example token position | ) )
So the penalty function sums the following:
The tokenization and development methodology for IIT2 is the same as for IIT1.
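The IIT2 distance factor a can be sketched as a one-line function. Positions are taken here to be token indices within the example, which is an assumption about how positions are counted.

```python
def distance_factor(anchor_position, token_position):
    """a = 1 / max(1, |anchor position - token position|)."""
    return 1.0 / max(1, abs(anchor_position - token_position))


# For "a man constant in adherence to his ideals" the anchor is at index 2.
print(distance_factor(2, 1))  # 1.0 for the adjacent word "man"
print(distance_factor(2, 0))  # 0.5 for "a"
print(distance_factor(2, 7))  # 0.2 for "ideals"
```

Under this factor, a penalty incurred by a word next to the anchor counts in full, while one incurred five words away counts for a fifth.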
As with IIT1, no tuning of the scoring algorithm was done against the sample data, training data, or test data.
The IIT3 methodology was only used in the English All Word competition. It uses the IIT1 scoring algorithm, but treats all context tokens as context anchors, processing from left to right. This reduces the relations available for matching for tokens to the left of a context anchor to those available to the disambiguated sense.
During the frenetic competition period, a very serious bug was introduced to the English Lexical Sample tokenization process which severely restricted the context words available for matching. In particular, the only word senses available for matching were those that matched the part-of-speech of the target word. The official results for IIT1 and IIT2 in the English Lexical Sample will have little to say about the potential of the methodology.
In addition, a tokenizer bug prevented proper handling of collocations, significantly affecting the performance on multi-word answers in the English Lexical Sample.
Due to time constraints (pitiful hardware constraints?) the English All Word entry only had answers for about the first 20% of the test data, so pitiful recall results will not be due to the algorithm deciding not to answer.
The systems are written in Java and a pointer to the source code and answer sets will be placed here soon.
The underlying philosophy of this method is to disambiguate by example. It seems plausible that we learn new words and word senses by filing away early encounters with them and only later, after their occurrence count or frequency reaches some threshold, derive or extend generalizations about them.
I hope this method will prove to help address the sparse training data problem, by providing evidence when evidence might not otherwise be present.
The method of matching example to context is a first step. It seems obvious that a number of improvements could be made: