CLARIN-D for Computational Linguistics and Applied Linguistics
CLARIN-D supports research in applied and computational linguistics by providing services to explore language resources, computationally analyse written and spoken text, and archive corpora and research results. Working Group 6 »Applied Linguistics, Computational Linguistics« is a network of scholars within CLARIN-D whose aim is to support computational linguistics by providing a digital research infrastructure.
Data for research
The Virtual Language Observatory (VLO) provides access to many resources for research, e.g. annotated treebank corpora of German.
A dedicated full-text search, the Federated Content Search (FCS) of CLARIN-D, allows researchers to search the full text of many of the CLARIN community's resources. This makes it possible to find examples of how a term is used. The search results and the source documents can be extracted and saved as a corpus for further analysis.
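For readers who want to query such a search service from a script, the following is a minimal sketch. It assumes that the endpoint speaks SRU with CQL queries, the protocol on which the CLARIN FCS specification builds. The endpoint URL and the helper name `build_fcs_query_url` are hypothetical and only serve as illustration; the sketch constructs the request URL and does not contact any server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_fcs_query_url(endpoint: str, term: str, max_records: int = 10) -> str:
    """Build an SRU searchRetrieve URL for a simple full-text term query.

    `endpoint` is whatever SRU base URL a centre publishes; the value used
    below is a placeholder, not a real CLARIN-D address.
    """
    # Escape backslashes and quotes, then quote the term so that a
    # multi-word search is sent as a single CQL phrase.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "operation": "searchRetrieve",   # standard SRU operation name
        "version": "1.2",                # assumed SRU version
        "query": f'"{escaped}"',         # CQL term query
        "maximumRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
url = build_fcs_query_url("https://fcs.example.org/sru", "digital humanities", 5)
print(url)

# Decode the URL again to see the parameters a server would receive.
print(parse_qs(urlsplit(url).query))
```

The response to such a request would be an XML document containing the matching records. How the hits are extracted and stored as a corpus depends on the endpoint and is not shown here.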
Software tools for research projects
CLARIN-D provides software and web services for the analysis and preparation of language data. These include WebAnno for the manual and semi-automatic annotation of texts and WebLicht for the automatic annotation of texts with a variety of tools. These tools can be combined according to the needs and preferences of the user.
Providing your own research data
Apart from tools for the analysis of language data, the CLARIN network supports researchers in archiving their own research data and making it available to the research community for reuse. In cooperation with a CLARIN centre, the data can be prepared so that it is sufficiently described. One tool for describing data is the CMDI-Maker, which creates descriptions that make the data easy for the research community to access.
Would you like to provide your data through the CLARIN-D infrastructure? Contact a specialised centre or contact the CLARIN-D Helpdesk.
Contacts in the disciplines
Within CLARIN-D, the disciplines are organised in Working Groups (WGs). The task of WG6 »Applied Linguistics, Computational Linguistics« is to identify resources and tools to be included in the CLARIN-D e-Science infrastructure for natural language processing in the digital humanities. The working group
- develops criteria for the inclusion of resources and tools into the service architecture of CLARIN-D
- creates a roadmap for the inclusion of resources and tools by the CLARIN-D centres
- identifies gaps in the resource inventory for the German language and
- defines and supervises corresponding curation projects
To this end, we will identify common workflows for natural language processing in the digital humanities. These may range from fully automatic processing to selective annotation using annotation control methodologies. In this way, the working group functions as an interface between users in the digital humanities and the developers of natural language processing (NLP) tools in the computational linguistics community. Our working group aims to ensure ongoing communication with scientists in the areas of applied linguistics, language learning research and computational linguistics. This is strengthened by involving representatives from scientific organisations. In addition, suggestions for the shared development of NLP resources for the German language will be advertised in the scientific communities.
Responsible CLARIN-D Centres
- Prof. Dr. Erhard Hinrichs, Universität Tübingen
- Dr. Daniel de Kok, Universität Tübingen
- Dr. Scott Martens, Universität Tübingen
- Prof. Dr. Jonas Kuhn, Universität Stuttgart
- Jens Stegmann, Universität Stuttgart
Head and Contact
- Prof. Dr. Erhard Hinrichs, Universität Tübingen
- Dr. Thorsten Trippel, Universität Tübingen
- Prof. Dr. Michael Beißwenger, Universität Duisburg-Essen
- Prof. Dr. Beatrix Busse, Universität Heidelberg
- Prof. Dr. Stefanie Dipper, Ruhr-Universität Bochum
- Prof. Dr. Stefan Evert, Universität Erlangen
- Jun-Prof. Dr. Chris Biemann, Technische Universität Darmstadt
- Prof. Dr. Iryna Gurevych, Technische Universität Darmstadt
- Prof. Dr. Christiane Fellbaum, Princeton University
- Prof. Dr. Uli Heid, Universität Hildesheim
- Prof. Dr. Anke Lüdeling, Humboldt-Universität Berlin
- Prof. Dr. Sebastian Padó, Universität Stuttgart
- Dr. Nils Reiter, Universität Stuttgart
- PD Dr. Sabine Schulte im Walde, Universität Stuttgart
- Prof. Dr. Stefan Schierholz, Universität Erlangen
- Dr. Caroline Sporleder, Universität Göttingen
- Prof. Dr. Manfred Stede, Universität Potsdam
- Prof. Dr. Angelika Storrer, Universität Mannheim
- Dr. Yannick Versley, Universität Heidelberg
- Dr. Andrea Zielinski, Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (IOSB), Karlsruhe
- Prof. Dr. Heike Zinsmeister, Universität Hamburg
- Dr. Andreas Witt, IDS Mannheim
- Prof. Dr. Philipp Cimiano, Universität Bielefeld
- Prof. Dr. Beatrix Busse, Universität Heidelberg
Resources from the Discipline for the Discipline
During the implementation phase of CLARIN-D, the WG identified important resources and tools, which have been developed and prepared for reuse. These small projects are called curation projects within CLARIN-D.
- Curation project 1: Implementation of a web-based annotation platform for linguistic annotations
- Curation project 2: Linguistic Annotation of Non-standard Varieties – Guidelines and Best Practices
- Curation project 3: Semantic Annotation for Digital Humanities