To download a corpus, select a corpus size (given in number of sentences) and download the corresponding data file.
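The size labels in the tables below can be read as sentence counts. The following is a minimal sketch, assuming that `K` stands for thousand and `M` for million sentences; the helper names `sentences_in` and `largest_size_within` are hypothetical and not part of the download site.

```python
# Multipliers for the size suffixes used in the download tables
# (assumption: K = thousand sentences, M = million sentences).
SUFFIXES = {"K": 1_000, "M": 1_000_000}

# Size labels as they appear in every table row.
SIZES = ["10K", "30K", "100K", "300K", "1M"]


def sentences_in(label: str) -> int:
    """Convert a size label such as '300K' or '1M' to a sentence count."""
    label = label.strip().upper()
    if label and label[-1] in SUFFIXES:
        return int(label[:-1]) * SUFFIXES[label[-1]]
    return int(label)


def largest_size_within(labels, budget):
    """Return the largest label whose sentence count fits the budget, or None."""
    fitting = [label for label in labels if sentences_in(label) <= budget]
    return max(fitting, key=sentences_in) if fitting else None
```

For example, `largest_size_within(SIZES, 250_000)` picks the `100K` file, since the next size up (`300K`) exceeds the budget.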
Mixed-typical
Year         Country        Downloads
2017         -              10K  30K  100K  300K  1M
News
Year         Country        Downloads
2005-2009    -              10K  30K  100K  300K  1M
2010         -              10K  30K  100K  300K  1M
2019         -              10K  30K  100K  300K  1M
2020         -              10K  30K  100K  300K  1M
2021         -              10K  30K  100K  300K  1M
2022         -              10K  30K  100K  300K  1M
2023         -              10K  30K  100K  300K  1M
Newscrawl
Year         Country        Downloads
2015         -              10K  30K  100K  300K  1M
2016         -              10K  30K  100K  300K  1M
2019         -              10K  30K  100K  300K  1M
2020         -              10K  30K  100K  300K  1M
Web
Year         Country        Downloads
2011         -              10K  30K  100K  300K  1M
2015         San Marino     10K  30K  100K  300K  1M
2015         Switzerland    10K  30K  100K  300K  1M
2016         San Marino     10K  30K  100K  300K  1M
2016         Switzerland    10K  30K  100K  300K  1M
2017         Switzerland    10K  30K  100K  300K  1M
2020         Switzerland    10K  30K  100K  300K  1M
2023         Switzerland    10K  30K  100K  300K  1M
Web-public
Year         Country        Downloads
2019         Italy          10K  30K  100K  300K  1M
Wikipedia
Year         Country        Downloads
2010         -              10K  30K  100K  300K  1M
2014         -              10K  30K  100K  300K  1M
2016         -              10K  30K  100K  300K  1M
2021         -              10K  30K  100K  300K  1M