Automatic Mapping Among Lexico-Grammatical Annotation Models (AMALGAM)

HOMEPAGE



SCHOOL OF COMPUTER STUDIES HOME PAGE | PREVIOUS PAGE | NEXT PAGE


The AMALGAM project has developed a set of resources for qualitative comparisons between the main Part-of-Speech tagsets and phrase structure grammar schemes used in English corpus linguistics.

Software has been developed to tag text with up to 8 different PoS-tag schemes. This software was used to create a Multi-Tagged Corpus, a sample of text annotated with a range of alternative PoS-tagging schemes, to enable researchers to compare how the schemes apply to a common "gold standard" corpus. We have also collected a MultiTreebank, a set of sentences each annotated with a range of parse-trees from rival parsers and parsing schemes.


*BRIEF OVERVIEW OF AMALGAM

*IN-DEPTH REVIEW OF AMALGAM

*PUBLICATIONS

*THE AMALGAM MULTI-TAGGED CORPUS
(A collection of 180 sentences tagged with 9 different tagging schemes)

*A MULTI-TREEBANK
(A collection of 60 sentences tagged with several rival parsing schemes)

*LINKS TO OTHER SITES






This site is (occasionally) maintained by Eric Atwell (eric@comp.leeds.ac.uk), using text provided by the staff and students of the NLP research group of the School of Computing at Leeds University.