20 Big Data Repositories You Should Check Out

Posted on Jan 04, 2016

This is an interesting list compiled by Bernard Marr. I would add the following great sources:

•  DataScienceCentral selection of big data sets - check out the first itemized bullet list after clicking on this link

•  Data sets used in our data science apprenticeship - includes both real data and simulated data - and tips to create artificial, rich, big data sets for testing models

•  KDNuggets repository

•  Data sets used in Kaggle competitions

 Bernard's selection:


2.    US Census Bureau 

3.    European Union Open Data Portal 


5.    The CIA World Factbook 


7.    NHS Health and Social Care Information Centre 

8.    Amazon Web Services public datasets 

9.    Facebook Graph 

10.  Gapminder 

11.  Google Trends 

12.  Google Finance 

13.  Google Books Ngrams 

14.  National Climatic Data Center 

15.  DBPedia 

16.  Topsy 

17.  Likebutton 

18.  New York Times 

19.  Freebase 

20.  Million Song Data Set

