Author: Sebastiaan Aarts Date: 18 April 2008 Full research report
Next generation tools for linguistic research in grammatical treebanks
- Start date: 01 January 2006
- End date: 31 December 2007
This research will develop, in conjunction with an existing community of researchers, a software environment for carrying out experimental research in corpus linguistics. The outcome will be a computer program to assist linguists in carrying out scientific experiments on parsed corpora - databases of sentences of naturally occurring speech and writing for which a grammatical tree analysis has been produced.
Previous research has focused on developing grammatical query systems permitting linguists to retrieve instances of particular grammatical patterns. However, formal experiments require careful planning, are time-consuming and are prone to error.
The program will support an entire experimental cycle including:
- defining the structure of an experiment and its variables
- extracting a sample from the corpus
- carrying out a statistical analysis on this sample, and
- evaluating the results by considering the original sentences
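The middle steps of this cycle can be illustrated with a minimal sketch. It is not the project's software: the sample data, the variable names (`mode`, `clause`) and the helper functions are all hypothetical, and a Pearson chi-square test of independence is assumed as one plausible choice of statistical analysis.

```python
from collections import Counter

# Hypothetical sample: each record stands for one case retrieved from a
# parsed corpus, annotated with two categorical variables (assumed names).
SAMPLE = (
    [{"mode": "spoken", "clause": "finite"}] * 30
    + [{"mode": "spoken", "clause": "nonfinite"}] * 10
    + [{"mode": "written", "clause": "finite"}] * 20
    + [{"mode": "written", "clause": "nonfinite"}] * 20
)


def contingency_table(cases, independent, dependent):
    """Cross-tabulate the cases by the two named variables."""
    counts = Counter((c[independent], c[dependent]) for c in cases)
    rows = sorted({c[independent] for c in cases})
    cols = sorted({c[dependent] for c in cases})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def chi_square(table):
    """Pearson chi-square statistic for a table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    rows, cols, table = contingency_table(SAMPLE, "mode", "clause")
    print(rows, cols, table)
    print(round(chi_square(table), 3))
```

With one degree of freedom, a statistic above 3.841 would indicate a significant association at the 0.05 level. The final step, evaluating the result against the original sentences, remains a matter for the researcher's judgement.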
A range of techniques will be available to help researchers define and carry out experiments. The platform will be based on the user-friendly ICECUP (International Corpus of English Corpus Utility Program) software, and new tools will be developed to help researchers focus their experiments, define new variables and interpret the results. Researchers will be able to contrast explanations for the same event, while the software will try out explanations that are unlikely to have been considered.
- Outputs (7)
Author: Sebastiaan Aarts Date: 18 April 2008 Research summary
Author: Sean Wallis Date: 31 March 2008 Book chapter
Creator: Sean Wallis Date: 31 March 2008 Software/multimedia package
Author: Sean Wallis Date: 31 March 2008 Booklet
Author: Sean Wallis Date: 03 March 2008 Journal article
Creator: Sean Wallis Date: 18 October 2006 Software/multimedia package