Leipzig Corpora Collection / Wortschatz Leipzig / Deutscher Wortschatz

Search in more than 30 million sentences of German newspaper material

Welcome to the Leipzig Corpora Collection / Deutscher Wortschatz

a project of Leipzig University, the Saxon Academy of Sciences and Humanities in Leipzig and the Institute for Applied Informatics.

Corpora portal

The international corpora portal offers access to more than 900 corpora of the Leipzig Corpora Collection (LCC) in more than 250 languages.

To the corpora portal

CURL portal

On this website you can contribute to corpus collection for under-resourced languages by simply entering a URL.

To the CURL portal

Words of the day

The words of the day, based on a selection of newspapers and news services, are currently not available.

To the words of the day

Book "Wissensrohstoff Text"

Our new book explains how digital text can be prepared, processed and used in applications with the help of text mining. Download its glossary here.

Data and information about the book

ASV Online Toolbox

The ASV Toolbox is a modular collection of tools for the exploration of written language data.

To the online toolbox

Corpus statistics

The corpus and language statistics contain analyses of various aspects of natural language based on our corpora.

To the corpus statistics

RESTful webservices

Our REST web services allow direct access to our corpora from any software.

To the RESTful webservices

Downloads

Some of our tools and large parts of our data are available for download.

To the download page

RDF portal

Some of our data are available in RDF.

To the RDF portal

Data is automatically collected from carefully selected public sources. The example sentences are selected automatically and do not express the views of this project. Their original authors are solely responsible for the content and opinions contained therein.