Next generation tools for linguistic research in grammatical treebanks

  • Start date: 01 January 2006
  • End date: 31 December 2007

This research will develop, in conjunction with an existing community of researchers, a software environment for carrying out experimental research in corpus linguistics. The outcome will be a computer program to assist linguists in carrying out scientific experiments on parsed corpora - databases of sentences of naturally occurring speech and writing for which a grammatical tree analysis has been produced.

Previous research has focused on developing grammatical query systems that permit linguists to retrieve instances of particular grammatical patterns. However, formal experiments require careful planning, are time-consuming and are prone to error.

The program will support an entire experimental cycle including:

  • defining the structure of an experiment and its variables
  • extracting a sample from the corpus
  • carrying out a statistical analysis on this sample, and
  • evaluating the results by considering the original sentences
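As a rough illustration of this cycle, the sketch below walks through the four steps in plain Python. It does not use ICECUP or reflect its interface: the variable names (`mode`, `pronoun`), the sentence identifiers and the counts are all invented for the example, and the chi-square test of independence is an assumption about the kind of statistical analysis involved, not a method named in the project description.

```python
from collections import Counter

# Step 1: define the experiment. Both variable names are hypothetical.
INDEPENDENT = "mode"      # e.g. spoken vs written text
DEPENDENT = "pronoun"     # e.g. choice of "who" vs "whom"

# Step 2: a sample of cases. The counts are invented purely for illustration;
# in practice each case would come from a grammatical query over a parsed corpus.
def make_cases(mode, pronoun, n, start):
    return [{"id": f"S{start + i}", "mode": mode, "pronoun": pronoun} for i in range(n)]

cases = (make_cases("spoken", "who", 30, 0) + make_cases("spoken", "whom", 10, 30)
         + make_cases("written", "who", 20, 40) + make_cases("written", "whom", 20, 60))

def contingency_table(cases, iv, dv):
    """Cross-tabulate cases by independent (rows) and dependent (columns) value."""
    counts = Counter((c[iv], c[dv]) for c in cases)
    rows = sorted({c[iv] for c in cases})
    cols = sorted({c[dv] for c in cases})
    return rows, cols, [[counts[(r, k)] for k in cols] for r in rows]

# Step 3: chi-square test of independence on the table.
def chi_square(table):
    """Return the chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Step 4: go back to the original cases behind one cell of the table.
def cases_in_cell(cases, iv_value, dv_value):
    return [c["id"] for c in cases
            if c[INDEPENDENT] == iv_value and c[DEPENDENT] == dv_value]

rows, cols, table = contingency_table(cases, INDEPENDENT, DEPENDENT)
stat, dof = chi_square(table)
to_review = cases_in_cell(cases, "written", "whom")
print(rows, cols, table, round(stat, 3), dof, len(to_review))
```

The statistic would then be compared against the chi-square critical value for the computed degrees of freedom (3.841 at the 0.05 level for one degree of freedom), and the identifiers returned by `cases_in_cell` give the sentences to inspect when evaluating the result.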

A range of techniques will be available to help researchers define and carry out experiments. The platform will be based on the user-friendly ICECUP (International Corpus of English Corpus Utility Program) software, and new tools will be developed to help researchers focus their experiments, define new variables and interpret the results. Researchers will be able to contrast competing explanations for the same event, while the software will also test explanations that they are unlikely to have considered.

  • Outputs (7)
ICECUP IV (beta)

Creator: Sean Wallis
Date: 31 March 2008
Type: Software/multimedia package

ICE-GB Tagger

Creator: Sean Wallis
Date: 18 October 2006
Type: Software/multimedia package