The CQPweb family

Published: 2022-09-22

Submitted by xujiajin on Thu, 08/20/2015 - 8:50pm      



CQPweb-based corpus interfaces

(See http://cwb.sourceforge.net/cqpweb.php for the relationship between CQPweb and CWB. In simple terms, CWB is the back end, handling concordancing in particular, but CQPweb offers more than that.)


Britain

CQPweb at Lancaster, UK, 71 corpora as of 20 August 2015 (maintained by Andrew Hardie) (English, Arabic, Chinese, Punjabi, Norwegian, Latin, Russian, Italian, Hindi etc.)

http://cqpweb.lancs.ac.uk

BNCweb at Lancaster University (English)

http://bncweb.lancs.ac.uk/


Canada

Corpora of biomedical and health literature (Neil Millar of bmhlinguistics.org) (English)

http://cqpweb.wetware.ca


China

BFSU CQPweb, over 100 corpora, the National Research Centre for Foreign Language Education, Beijing Foreign Studies University (maintained by Jiajin Xu and Liangping Wu) (English, Chinese, Japanese, Arabic, Icelandic, German, Spanish, Russian, etc.)

http://114.251.154.212/cqp/

Department of English, The Hong Kong Polytechnic University, Hong Kong (English)

http://lamalcorpora.engl.polyu.edu.hk/cqpweb/

CQPweb at Huazhong Agricultural University, Wuhan, Hubei Province (English)

http://211.69.132.28/

National Taiwan Normal University (NTNU) (English and Chinese)

http://140.122.83.250/cqpweb


Germany (German, English and French)

Fußballlinguistik auf CQPweb: https://fussballlinguistik.linguistik.tu-berlin.de/

Fachgebiet Allgemeine Linguistik, Technische Universität Berlin (the Berlin Institute of Technology), Institut für Sprache und Kommunikation


Israel (Hebrew, English etc.)


CQPweb interface at MILA, Knowledge Center for Processing Hebrew, Technion Faculty of Computer Science, Technion City, Haifa and the Computational Linguistics Group (http://cl.haifa.ac.il/index.shtml), the University of Haifa (see http://yeda.cs.technion.ac.il/files/HebrewInterfaceToCQP.pdf for a quick user's guide). The MILA CQP interface is a lightly modified version of CQPweb v3.0.

http://yeda.cs.technion.ac.il/HebrewCqpWeb/


Italy

CQPweb at University for Foreigners Perugia (Perugia corpus - a reference corpus of written and spoken Italian and CAIL2 - a written learner corpus of Italian) (Italian)

https://www.unistrapg.it/cqpweb/


Malta

The Maltese Language Resource Server (MLRS) at the Institute of Linguistics and the Department of Intelligent Computer Systems of the University of Malta (Corpus of Learner English in Malta and Korpus Malti) (English)

http://mlrs.research.um.edu.mt/CQPweb/


Portugal

University of Lisbon

http://alfclul.clul.ul.pt/CQPweb


South Korea

http://cqpweb.kr

Korean, English, and German Corpora

It is created and maintained by Prof. Minhaeng Lee at Yonsei University


Spain

Universitat Autònoma de Barcelona

http://sfncorpora.uab.es/CQPweb/cea


Switzerland

CQPweb at the Zurich Center for Linguistics, Universität Zürich

http://server.linguistik.uzh.ch/cqpweb


Turkey

Taner Sezer Turkish Corpus Server

http://cqpweb.tscorpus.com/cqpweb/


USA

CQPweb at Department of Linguistics, Georgetown University, 51 corpora as of 21 August 2015 (including BNC, Brown, ICE family, COCA, WaC family etc.)

https://corpling.uis.georgetown.edu/cqp/


**********


CWB/CQP-based corpus interfaces


IMS Corpus Workbench (CWB), University of Stuttgart, Institute for Natural Language Processing

http://www.ims.uni-stuttgart.de/forschung/projekte/CorpusWorkbench.html


Britain

IntelliText Corpus Queries, the Centre for Translation Studies (CTS) at the University of Leeds

http://corpus.leeds.ac.uk/itweb/htdocs/Query.html


Denmark (Danish, Portuguese, German, English, French, Spanish, Esperanto, Italian, Romanian, Swedish, Norwegian, Icelandic etc.)

CorpusEye at the Institute of Language and Communication (ISK) at the University of Southern Denmark (http://corp.hum.sdu.dk/copyright.html)

http://corp.hum.sdu.dk

KorpusDK at the Department for Digital Dictionaries and Corpora, Society for Danish Language and Literature, Copenhagen (Danish)

http://ordnet.dk/korpusdk


Germany

Colibri², the German Grammar Group, Freie Universität Berlin, developed by Roland Schäfer (German, English, Spanish, Dutch, and Swedish)

https://webcorpora.org/

ParaSol: A Parallel Corpus of Slavic and other languages, developed by Ruprecht von Waldenfels and hosted at the Humboldt University of Berlin

http://parasolcorpus.org/

http://parasolcorpus.org/ParaVoz/

PolMine Corpus Server at Universität Duisburg-Essen

http://polmine.sowi.uni-due.de/cwb


Italy (Italian and English)

CQP based corpora at Corpus and Computational Linguistics Research Group, University of Bologna

http://corpora.ficlit.unibo.it/


Norway (Norwegian, French, etc.)

Corpora at the Text laboratory (e.g. The Corpus for Bokmål Lexicography LBK, The French Newspaper Corpus, Two Corpora with music reviews, NoWaC, SKRIV Corpus, The BigBrother Corpus, Corpus of Doctor-Patient Conversations from Ahus, Nordic Dialect Corpus, Norwegian in America, NoTa-Oslo, The Ruija Corpus, Talko, TAUS), The University of Oslo

http://www.hf.uio.no/iln/english/about/organization/text-laboratory/serv...


Portugal

Linguateca, AC/DC corpora, or Internet Access to Corpora: The AC/DC project, Oslo, Norway, Lisbon and other places.

http://www.linguateca.pt/ACDC/

Parallel corpora involving Portuguese

http://www.linguateca.pt/COMPARA

http://www.linguateca.pt/CorTrad

http://www.linguateca.pt/PoNTE

http://www.linguateca.pt/PANTERA


Sweden

OPUS: The open parallel corpus at the Department of Linguistics and Philology, Uppsala University

http://opus.lingfil.uu.se/

Korp (Språkbanken)

http://spraakbanken.gu.se/korp/


Switzerland

Center for the Study of Language and Society, University of Berne

Roland Meyer, Ruprecht von Waldenfels, Michal Wozniak, Andreas Zeman (2006-2015): ParaVoz: A simple web interface for querying parallel corpora. Second Version. Bern, Regensburg, Berlin, Krakow.

https://bitbucket.org/rvwfels/paravoz2

https://bitbucket.org/rvwfels/paravoz


Key references


Hardie, A. (2012). CQPweb – combining power, flexibility and usability in a corpus analysis tool. International Journal of Corpus Linguistics, 17(3): 380-409.


Christ, O. (1994). A modular and flexible architecture for an integrated corpus query system. Proceedings of COMPLEX'94, 3rd Conference on Computational Lexicography and Text Research, Budapest, Hungary, July 7-9, pp. 23-32.


More information about the origins and development of CQPweb (quoted from the 'About CQPweb--Who did it' section of http://cqpweb.lancs.ac.uk):


"CQPweb was created by Andrew Hardie, Lancaster University, UK.

Most of the architecture, the look-and-feel, and even some snippets of code were shamelessly half-inched from BNCweb.

BNCweb's most recent version was written by Sebastian Hoffmann (University of Trier) and Stefan Evert (University of Osnabrück). It was originally created by Hans-Martin Lehmann, Sebastian Hoffmann, and Peter Schneider.


The underlying technology of CQPweb is manifold.

Concordancing is done using the IMS Corpus Workbench with its CQP corpus query processor. Thus the name. Other functions (collocations, corpus management etc.) are powered by MySQL databases.

The system uses Stefan Evert's Simple Query (CEQL) parser, which is written in Perl. The web-scripts are written in PHP. Some JavaScript is used to create interactive links and forms. The look-and-feel relies on Cascading Style Sheets plus good old fashioned HTML."
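For readers new to the term, the sketch below shows roughly what "concordancing" produces: each hit of a search word with a few words of context on either side (a keyword-in-context, or KWIC, display). It is only a toy illustration in Python. The function name kwic, the sample sentence and the window width are assumptions made up for this example, and it does not reflect how CWB/CQP actually work, which query pre-indexed, annotated corpora.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, node, right context) for every hit of keyword.

    Toy tokenisation: runs of word characters; matching is case-insensitive.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Hypothetical sample text, used only to show the output layout.
sample = "The corpus is large. A corpus query returns every hit in the corpus."

for left, node, right in kwic(sample, "corpus"):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20} [{node}] {right}")
```

A real CQPweb installation adds much more on top of this, such as annotation-aware queries, sorting, collocation statistics and corpus management, as the quotation above describes.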


(Updated 26 April, 2017 by Jiajin Xu)