The coreference editor identifies chains using the surface string of the longest annotation in a chain. This is the string that is displayed in the editor list of all chains, and also in the pick list when adding a new annotation.
This causes problems when two distinct chains share the same longest surface string. There are three problems:
(a) when you select the checkbox of one chain, all chains that share the same surface string get checked / unchecked by the GUI.
(b) when you want to add something to a new chain, the pick list will only show one of the equivalent longest surface strings, and there is no way of telling which of the chains it is referring to.
(c) you can't distinguish between chains identified with the same string in the editor list.
This is probably not often a problem when dealing with pronominal coreference against person and company names, as each chain will usually have a distinct longest surface string.
But it can be a problem when coreferring other things. For example, a clinical document with drugs and treatments marked could mention "chemotherapy" twice, referring to two distinct chemotherapies. Both could also be referred to as "this" or "it" in other parts of the text. When you look at the coreference editor, there will be two "chemotherapy" chains that are indistinguishable. Selecting either will result in both being checked, and the members of both chains being highlighted in the text. When adding another annotation to a chain, only one "chemotherapy" appears in the pick list, and it is unclear which it is.
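To make the collision concrete, here is a minimal sketch of the chemotherapy example. It is not the editor's actual code: the chain data, the offsets, and the helper names `longest_surface` and `display_key` are all made up for illustration. It shows how keying chains by longest surface string merges the two chains, and how one possible fix (appending the offset of the first mention to the label) would keep them apart.

```python
from collections import defaultdict

# Hypothetical data: each chain is a list of (start_offset, surface_string) mentions.
# The offsets are invented for this illustration.
chains = [
    [(120, "chemotherapy"), (188, "this"), (240, "it")],
    [(560, "chemotherapy"), (602, "it")],
]


def longest_surface(chain):
    """Return the longest surface string in a chain (the label described above)."""
    return max((text for _, text in chain), key=len)


# Keying by the label alone: both chains end up under the same entry,
# so a checkbox or pick-list item for that label cannot tell them apart.
by_label = defaultdict(list)
for idx, chain in enumerate(chains):
    by_label[longest_surface(chain)].append(idx)


def display_key(chain):
    """One possible disambiguation (an assumption, not the editor's behaviour):
    the label plus the offset of the chain's first mention."""
    first_offset = min(start for start, _ in chain)
    return f"{longest_surface(chain)} @ {first_offset}"


display_keys = [display_key(chain) for chain in chains]

print(dict(by_label))  # {'chemotherapy': [0, 1]}
print(display_keys)    # ['chemotherapy @ 120', 'chemotherapy @ 560']
```

Any identifier that is unique per chain would do here; the first-mention offset is only one option, chosen because it also tells the user where in the text the chain starts.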
thanks
Angus
Logged In: YES
user_id=1280870
Originator: NO
Changed this to a feature request, as it's not exactly a bug. I also set the priority quite high, so hopefully it will get fixed soonish (though probably not before the final 4.0 release, as it requires a complete re-engineering of the coref editor).