In my work with Kurt on MyPySpace we've been dealing with fairly large amounts of data, at least compared to typical loads in content-based MIR research. As a point of reference, I'd say a fairly standard music similarity or classification study has a data set on the order of 10^3 songs, while our initial research efforts have used a data set of about 55,000 songs by approximately 16,000 artists. Further, some studies (this one, for instance) use entire commercial mp3 collections (said paper used Yahoo!'s digital download library, on the order of 10^7 songs). These papers tend to focus on the problems of scale, because once a data set gets that large it becomes impossible to brute force your way out of the situation.
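A rough sketch of why brute force stops working at that scale, assuming "brute force" means comparing every song against every other song (that reading, and the function name `pairwise_comparisons`, are illustrative assumptions, not something taken from the papers):

```python
def pairwise_comparisons(n_songs: int) -> int:
    """Count the unordered song pairs in an all-pairs similarity pass."""
    return n_songs * (n_songs - 1) // 2


# Data set sizes mentioned above: a typical study, our set, a commercial catalog.
sizes = [("typical study", 10**3), ("our data set", 55_000), ("commercial catalog", 10**7)]

for label, n in sizes:
    print(f"{label:>20}: {n:>12,} songs -> {pairwise_comparisons(n):,} pairs")
```

The pair count grows with the square of the number of songs, so going from 10^3 to 10^7 songs multiplies the work by roughly 10^8, from about 5 x 10^5 pairs to about 5 x 10^13.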
So anyway, all of this has me thinking: how much music data is out there? How many musical recordings exist? Anybody know? I could Google it a bit, but I'm lazy.