The CQPweb family
Submitted by xujiajin on Thu, 08/20/2015 - 8:50pm
CQPweb based corpus interfaces
(See http://cwb.sourceforge.net/cqpweb.php for the relationship between CQPweb and CWB. In simple terms, CWB is the back end, handling concordancing in particular, while CQPweb offers further functions beyond that.)
Britain
CQPweb at Lancaster, UK, 71 corpora as of 20 August 2015 (maintained by Andrew Hardie) (English, Arabic, Chinese, Punjabi, Norwegian, Latin, Russian, Italian, Hindi etc.)
http://cqpweb.lancs.ac.uk
BNCweb at Lancaster University (English)
http://bncweb.lancs.ac.uk/
Canada
Corpora of biomedical and health literature (Neil Millar of bmhlinguistics.org) (English)
http://cqpweb.wetware.ca
China
BFSU CQPweb, over 100 corpora, at the National Research Centre for Foreign Language Education, Beijing Foreign Studies University (maintained by Jiajin Xu and Liangping Wu) (English, Chinese, Japanese, Arabic, Icelandic, German, Spanish, Russian, etc.)
http://114.251.154.212/cqp/
Department of English, The Hong Kong Polytechnic University, Hong Kong (English)
http://lamalcorpora.engl.polyu.edu.hk/cqpweb/
CQPweb at Huazhong Agricultural University, Wuhan, Hubei Province (English)
http://211.69.132.28/
National Taiwan Normal University (NTNU) (English and Chinese)
http://140.122.83.250/cqpweb
Germany (German, English and French)
Fußballlinguistik auf CQPweb: https://fussballlinguistik.linguistik.tu-berlin.de/
Fachgebiet Allgemeine Linguistik, Technische Universität Berlin (the Berlin Institute of Technology), Institut für Sprache und Kommunikation
Israel (Hebrew, English etc.)
CQPweb interface at MILA, Knowledge Center for Processing Hebrew, Technion Faculty of Computer Science, Technion City, Haifa and the Computational Linguistics Group (http://cl.haifa.ac.il/index.shtml), the University of Haifa (see http://yeda.cs.technion.ac.il/files/HebrewInterfaceToCQP.pdf for a quick user's guide). The MILA CQP interface is a lightly modified version of CQPweb v3.0.
http://yeda.cs.technion.ac.il/HebrewCqpWeb/
Italy
CQPweb at University for Foreigners Perugia (Perugia corpus - a reference corpus of written and spoken Italian and CAIL2 - a written learner corpus of Italian) (Italian)
https://www.unistrapg.it/cqpweb/
Malta
The Maltese Language Resource Server (MLRS) at the Institute of Linguistics and the Department of Intelligent Computer Systems of the University of Malta (Corpus of Learner English in Malta and Korpus Malti) (English)
http://mlrs.research.um.edu.mt/CQPweb/
Portugal
University of Lisbon
http://alfclul.clul.ul.pt/CQPweb
South Korea
Korean, English, and German corpora, created and maintained by Prof. Minhaeng Lee at Yonsei University
http://cqpweb.kr
Spain
Universitat Autònoma de Barcelona
http://sfncorpora.uab.es/CQPweb/cea
Switzerland
CQPweb at the Zurich Center for Linguistics, Universität Zürich
http://server.linguistik.uzh.ch/cqpweb
Turkey
Taner Sezer Turkish Corpus Server
http://cqpweb.tscorpus.com/cqpweb/
USA
CQPweb at Department of Linguistics, Georgetown University, 51 corpora as of 21 August 2015 (including BNC, Brown, ICE family, COCA, WaC family etc.)
https://corpling.uis.georgetown.edu/cqp/
**********
CWB/CQP based corpus interfaces
IMS Corpus Workbench (CWB), University of Stuttgart, Institute for Natural Language Processing
http://www.ims.uni-stuttgart.de/forschung/projekte/CorpusWorkbench.html
Britain
IntelliText Corpus Queries, the Centre for Translation Studies (CTS) at the University of Leeds
http://corpus.leeds.ac.uk/itweb/htdocs/Query.html
Denmark (Danish, Portuguese, German, English, French, Spanish, Esperanto, Italian, Romanian, Swedish, Norwegian, Icelandic etc.)
CorpusEye at the Institute of Language and Communication (ISK) at the University of Southern Denmark (http://corp.hum.sdu.dk/copyright.html)
http://corp.hum.sdu.dk
KorpusDK at the Department for Digital Dictionaries and Corpora, Society for Danish Language and Literature, Copenhagen (Danish)
http://ordnet.dk/korpusdk
Germany
Colibri², the German Grammar Group, Freie Universität Berlin, developed by Roland Schäfer (German, English, Spanish, Dutch, and Swedish)
https://webcorpora.org/
ParaSol: A Parallel Corpus of Slavic and other languages, developed by Ruprecht von Waldenfels and hosted at the Humboldt University of Berlin
http://parasolcorpus.org/
http://parasolcorpus.org/ParaVoz/
PolMine Corpus Server at Universität Duisburg-Essen
http://polmine.sowi.uni-due.de/cwb
Italy (Italian and English)
CQP based corpora at Corpus and Computational Linguistics Research Group, University of Bologna
http://corpora.ficlit.unibo.it/
Norway (Norwegian, French, etc.)
Corpora at the Text laboratory (e.g. The Corpus for Bokmål Lexicography LBK, The French Newspaper Corpus, Two Corpora with music reviews, NoWaC, SKRIV Corpus, The BigBrother Corpus, Corpus of Doctor-Patient Conversations from Ahus, Nordic Dialect Corpus, Norwegian in America, NoTa-Oslo, The Ruija Corpus, Talko, TAUS), The University of Oslo
http://www.hf.uio.no/iln/english/about/organization/text-laboratory/serv...
Portugal
Linguateca, AC/DC corpora, or Internet Access to Corpora: The AC/DC project, Oslo, Norway, Lisbon and other places.
http://www.linguateca.pt/ACDC/
Parallel corpora involving Portuguese
http://www.linguateca.pt/COMPARA
http://www.linguateca.pt/CorTrad
http://www.linguateca.pt/PoNTE
http://www.linguateca.pt/PANTERA
Sweden
OPUS: The open parallel corpus at the Department of Linguistics and Philology, Uppsala University
http://opus.lingfil.uu.se/
Korp (Språkbanken)
http://spraakbanken.gu.se/korp/
Switzerland
Center for the Study of Language and Society, University of Berne
Roland Meyer, Ruprecht von Waldenfels, Michal Wozniak, Andreas Zeman (2006-2015): ParaVoz: A simple web interface for querying parallel corpora. Second Version. Bern, Regensburg, Berlin, Krakow.
https://bitbucket.org/rvwfels/paravoz2
https://bitbucket.org/rvwfels/paravoz
Key references
Hardie, A. (2012). CQPweb – combining power, flexibility and usability in a corpus analysis tool. International Journal of Corpus Linguistics, 17(3): 380-409.
Christ, O. (1994). A modular and flexible architecture for an integrated corpus query system. Proceedings of COMPLEX'94, 3rd Conference on Computational Lexicography and Text Research, Budapest, Hungary, July 7-9, pp. 23-32.
More information about the origin and development of CQPweb (quoted from the 'About CQPweb--Who did it' section of http://cqpweb.lancs.ac.uk):
"CQPweb was created by Andrew Hardie, Lancaster University, UK.
Most of the architecture, the look-and-feel, and even some snippets of code were shamelessly half-inched from BNCweb.
BNCweb's most recent version was written by Sebastian Hoffmann (University of Trier) and Stefan Evert (University of Osnabrück). It was originally created by Hans-Martin Lehmann, Sebastian Hoffmann, and Peter Schneider.
The underlying technology of CQPweb is manifold.
Concordancing is done using the IMS Corpus Workbench with its CQP corpus query processor. Thus the name. Other functions (collocations, corpus management etc.) are powered by MySQL databases.
The system uses Stefan Evert's Simple Query (CEQL) parser, which is written in Perl. The web-scripts are written in PHP. Some JavaScript is used to create interactive links and forms. The look-and-feel relies on Cascading Style Sheets plus good old fashioned HTML."
(Updated 26 April, 2017 by Jiajin Xu)