Go back to main download site
To download a corpus, select a corpus size (given as the number of sentences) and download the corresponding data file.
News
Year Downloads
2019 10K 30K 100K 300K
2020 10K 30K 100K 300K
Newscrawl
Year Downloads
2011 10K 30K 100K 300K
2012 10K 30K 100K 300K
Web
Year Downloads
2002 10K 30K 100K 300K
2011 10K 30K 100K 300K
Wikipedia
Year Downloads
2012 10K 30K 100K 300K
2014 10K 30K 100K 300K
2016 10K 30K 100K 300K
2021 10K 30K 100K 300K