I will keep updating this post as I come across more interesting datasets.
- MovieLens – movie ratings data
- NOAA – National Oceanic and Atmospheric Administration
- Climate Data Online – climate and weather related datasets
- Imhotep Sample Data – sample web log data from NASA and Wikipedia, available for download
- Internet Traffic Archive – web logs available for download
- Million Song Dataset – collection of audio features and metadata for 1 million popular music tracks (300 GB)