SCHOOL OF COMPUTER STUDIES HOME PAGE | PREVIOUS PAGE| NEXT PAGE
The AMALGAM project has developed a set of resources for qualitative comparisons between the main Part-of-Speech tagsets and phrase structure grammar schemes used in English corpus linguistics. Software has been developed to tag text with up to 8 different PoS-tag schemes. This software was used to create a Multi-Tagged Corpus, a sample of text annotated with a range of alternative PoS-tagging schemes, to enable researchers to compare how the schemes apply to a common "gold standard" corpus. We have also collected a MultiTreebank, a set of sentences each annotated with a range of parse-trees from rival parsers and parsing schemes.
BRIEF OVERVIEW OF AMALGAM
IN-DEPTH REVIEW OF AMALGAM
THE AMALGAM MULTI-TAGGED CORPUS
(A collection of 180 sentences tagged with 9 different tagging schemes)
THE AMALGAM MULTITREEBANK
(A collection of 60 sentences annotated with several rival parsing schemes)
LINKS TO OTHER SITES
This site is (occasionally) maintained by Eric Atwell (firstname.lastname@example.org), using text provided by the staff and students of the NLP research group of the School of Computing at Leeds University.