CLARIN-D for Computational Linguistics and Applied Linguistics

CLARIN-D supports research in applied linguistics and computational linguistics by providing services to explore language resources, computationally analyse written and spoken text, and archive corpora and research results. Working Group 6 »Applied Linguistics, Computational Linguistics« is a network of scholars within CLARIN-D whose aim is to support computational linguistics by providing a digital research infrastructure.

Data for research

The Virtual Language Observatory (VLO) provides access to many resources for research, for example annotated German treebanks.

A dedicated full-text search, the Federated Content Search (FCS) of CLARIN-D, allows researchers to search the full text of many of the CLARIN community's resources. This makes it possible to find examples of how terms are used. The search results and the source documents can be extracted and saved as a corpus for further analysis.

→ More about "Accessing"

Software tools for research projects

CLARIN-D provides software and web services for the analysis and preparation of language data. These include WebAnno for the manual and semi-automatic annotation of texts and WebLicht for the automatic annotation of texts with a variety of tools. The tools can be combined according to the needs and preferences of the user.

→ More about "Analysing"

Providing your own research data

Apart from tools for the analysis of language data, the CLARIN network supports researchers in archiving their own research data and making it available to the research community for reuse. In cooperation with a CLARIN centre, the data can be prepared so that it is sufficiently described. One tool for describing data is the CMDI-Maker, which creates descriptions that make the data easy for the research community to find and access.

Would you like to provide your data through the CLARIN-D infrastructure? Contact a specialised centre or the CLARIN-D Helpdesk.

→ More about "Preparation and Depositing"

Contacts in the disciplines

Within CLARIN-D, the disciplines are organised in Working Groups (WGs). The task of WG6 »Applied Linguistics, Computational Linguistics« is to identify resources and tools to be included in the CLARIN-D e-Science infrastructure for natural language processing in the digital humanities. The working group

develops criteria for the inclusion of resources and tools into the service architecture of CLARIN-D,
creates a roadmap for the inclusion of resources and tools by the CLARIN-D centres,
identifies gaps in the resource inventory for the German language, and
defines and supervises corresponding curation projects.

To this end, the working group will identify common workflows for natural language processing in the digital humanities. These may range from fully automatic processing to selective annotation using annotation control methodologies. In this way, the working group functions as an interface between users in the digital humanities and the developers of natural language processing (NLP) tools in the computational linguistics community. The working group aims to ensure ongoing communication with scientists in the areas of applied linguistics, language learning research and computational linguistics. This communication is strengthened by involving representatives from scientific organisations. In addition, proposals for the shared development of NLP resources for the German language will be publicised in the scientific communities.

Responsible CLARIN-D Centres

Prof. Dr. Erhard Hinrichs, Universität Tübingen
Dr. Daniel de Kok, Universität Tübingen
Dr. Scott Martens, Universität Tübingen
Prof. Dr. Jonas Kuhn, Universität Stuttgart
Jens Stegmann, Universität Stuttgart

Head and Contact

Prof. Dr. Erhard Hinrichs, Universität Tübingen
Dr. Thorsten Trippel, Universität Tübingen

Members

Prof. Dr. Michael Beißwenger, Universität Duisburg-Essen
Prof. Dr. Beatrix Busse, Universität Heidelberg
Prof. Dr. Stefanie Dipper, Ruhr-Universität Bochum
Prof. Dr. Stefan Evert, Universität Erlangen
Jun.-Prof. Dr. Chris Biemann, Technische Universität Darmstadt
Prof. Dr. Iryna Gurevych, Technische Universität Darmstadt
Prof. Dr. Christiane Fellbaum, Princeton University
Prof. Dr. Uli Heid, Universität Hildesheim
Prof. Dr. Anke Lüdeling, Humboldt-Universität Berlin
Prof. Dr. Sebastian Padó, Universität Stuttgart
Dr. Nils Reiter, Universität Stuttgart
PD Dr. Sabine Schulte im Walde, Universität Stuttgart
Prof. Dr. Stefan Schierholz, Universität Erlangen
Dr. Caroline Sporleder, Universität Göttingen
Prof. Dr. Manfred Stede, Universität Potsdam
Prof. Dr. Angelika Storrer, Universität Mannheim
Dr. Yannick Versley, Universität Heidelberg
Dr. Andrea Zielinski, Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (IOSB), Karlsruhe
Prof. Dr. Heike Zinsmeister, Universität Hamburg
Dr. Andreas Witt, IDS Mannheim
Prof. Dr. Philipp Cimiano, Universität Bielefeld

Resources from the Discipline for the Discipline

During the implementation phase of CLARIN-D, the WG identified important resources and tools, which have been developed and prepared for reuse. The small projects that carried out this work are called curation projects within CLARIN-D.

Curation projects

Curation project 1: Implementation of a web-based annotation platform for linguistic annotations | Information about the project
Curation project 2: Linguistic Annotation of Non-standard Varieties – Guidelines and Best Practices | Information about the project
Curation project 3: Semantic Annotation for Digital Humanities | Information about the project