EMPIRICAL INVESTIGATIONS OF ANAPHORA AND SALIENCE

CONTEXT DEPENDENCE

Plan of these lectures

MOTIVATIONS FOR ANNOTATING ANAPHORIC INFORMATION

Chains of object mentions in text

The Big Issue

More difficult choices

Today’s lecture

Nominal anaphoric expressions

Interpretive differences between nominal expressions

Non-nominal anaphoric expressions

Not all ‘anaphoric’ expressions are always anaphoric

REFERENCES TO VISUAL SITUATION (‘EXOPHORA’) IN TRAINS

References to visual situation (‘exophora’ / deixis)

EXOPHORA IN THE MAPTASK

Discourse deixis

First-mention definites

Not all ‘anaphoric’ expressions are always anaphoric

Types of anaphoric relations

Associative anaphora

Explicit and implicit antecedents

Theoretical foundations

ANAPHORIC RELATIONS IN A DISCOURSE MODEL

IMPLICIT OBJECTS IN A DISCOURSE MODEL: PLURALS

IMPLICIT OBJECTS IN A DISCOURSE MODEL: DISCOURSE DEIXIS

Some terminology

Anaphora ≠ Coreference

Coding schemes for context-dependence

Differences between coding schemes

MapTask Reference Coding
(Aylett, 2000)

MUC coreference scheme (Hirschman & Sundheim, 1997)

The coding scheme

Problems with the MUC scheme

‘Extended coreference’ in MUC

Problems with ‘extended coreference’

THE MATE PROJECT

EXAMPLE OF STANDOFF

COREFERENCE IN MATE

MATE coreference markup

Links in the Text Encoding Initiative

ANAPHORIC RELATIONS IN A DISCOURSE MODEL

INDEPENDENT LINKS IN MATE

IDENTITY AND PREDICATION

INDEPENDENT LINKS AND BRIDGING

Marking multiple semantic relations

COREFERENCE STANDOFF

AMBIGUITY VS. MULTIPLE RELATIONS

AMBIGUOUS ANAPHORIC EXPRESSIONS

Ambiguous anaphoric expressions in the MATE/GNOME scheme

Other markup ideas in MATE

THE GNOME ANNOTATION

FROM MATE TO GNOME

The GNOME markup scheme for anaphoric information

GUIDELINES

The GNOME annotation manual

Limiting the amount of work

Agreement on annotation

A measure of agreement: the K statistic

Agreement on familiarity
(Poesio and Vieira, 1998)

A ‘knowledge-based’ classification of bridging descriptions (Vieira, 1998)

… continued

Results

Achieving agreement (but not completeness) in GNOME

GNOME: Agreement results on bridging references

Problem: K for antecedents

The GNOME corpus

An example museum text

Other information marked up in the GNOME corpus

The GNOME annotation of NEs

Coding for familiarity

Subsequent projects

VENEX
(Poesio, Bristot, Delmonte, Tonelli 2004)

DEVELOPMENTS FOR THE VENEX ANNOTATION

MMAX (Mueller and Strube, 2002, 2003)

Standoff in MMAX: Words

Standoff in MMAX: Markables

Standoff in MMAX: Anaphoric information

Larger scale anaphoric annotation efforts

PRAGUE DEPENDENCY TREEBANK

Coreference annotation in the PDT

PDT standoff

ONTONOTES

OntoNotes coreference
(Ramshaw & Weischedel)

AGREEMENT ON ANAPHORA, 2

K for anaphora

The problem

From K to α

Distance metrics in α

Distance metrics for anaphora

Example

K vs α

Caveats

α’s dependence on distance metric

AMBIGUOUS ANAPHORIC EXPRESSIONS

Summary of results

An example

Conclusions: some lessons

Open questions

URLs