Tuesday, 21 December 2010


I finally submitted my PhD thesis. I'll post some bits of it over the next few weeks, and the whole thing after my viva. To start, here's the abstract:

It is not hyperbole to note that a revolution has occurred in the way that we as a society distribute data and information. This revolution has come about through the confluence of Web-related technologies and the approaching-universal adoption of internet connectivity. Add to this mix the normalised use of lossy compression in digital music and the uptick in digital music download and streaming services; the result is an environment where nearly anyone can listen to nearly any piece of music nearly anywhere. This is in many respects the pinnacle in music access and availability. Yet, a listener is now faced with a dilemma of choice. Without being familiar with the ever-expanding millions of songs available, how does a listener know what to listen to? If a near-complete collection of recorded music is available what does one listen to next? While the world of music distribution underwent a revolution, the ubiquitous access and availability it created brought new problems in recommendation and discovery.

In this thesis, a solution to these problems of recommendation and discovery is presented. We begin with an introduction to the core concepts around the playlist (i.e. sequential ordering of musical works). Next, we examine the history of the playlist as a recommendation technique, starting from before the invention of audio recording and moving through to modern automatic methods. This leads to an awareness that the creation of suitable playlists requires a high degree of knowledge of the relation between songs in a collection (e.g. song similarity). To better inform our base of knowledge of the relationships between songs we explore the use of social network analysis in combination with content-based music information retrieval. In an effort to show the promise of this more complex relational space, a fully automatic interactive radio system is proposed, using audio-content and social network data as a backbone. The implementation of the system is detailed. The creation of this system presents another problem in the area of evaluation. To that end, a novel distance metric between playlists is specified and tested. This distance method is then applied as a means of evaluation to our interactive radio system. We then conclude with a discussion of what has been shown and what future work remains.

Sunday, 26 September 2010

womrad live blog part the last

last session: Long tail stuff:

Kibeom Lee (presenting), Woon Seung Yeo and Kyogu Lee
  • focusing on popularity bias - referencing oscar's thesis work (Help! I'm stuck in the head)
  • Goal: keep the awesome of collaborative filtering but sort out popularity bias
  • the mystery of unpopular but 'loved' songs on last.fm -- shouldn't loved songs be played frequently... perhaps an area of music the user likes but doesn't venture very far into
  • 'My tail is your head' - find the users who have a 'head' that overlaps with your 'tail' to draw recs from
  • personal story about how this idea came about -- one person's popularity bias is another person's novel rec.
  • refs oscar and paul's ISMIR 07 rec tutorial - this system is geared toward the top half of the user type pyramid
  • scraped last.fm to get more tracks per user (API gives 50/user, scrape gives 500)
  • lots of tracks (about 9 million)
  • eval by asking users how things worked out comparing recs from proposed algor v. trad model rate; used a 1-5 rating scale
  • promo'd the website in various ways, but not too much response
  • but, the limited response did show some improvement over traditional approach
  • overall - some improvement, much potential
Q how many users?: see above
Q so were your recs in the global head?:
sorta, mostly in the midsection

Mark Levy (presenting) and Klaas Bosteels
  • an overview of lit showing various rec bias especially the idea of positive feedback reinforcing the head (not this kind of bias though)
  • this work looks at 7 billion scrobbles all scrobbles from Jan - Mar this year (holy crap, that's some scale)
  • recs just from the last.fm radio
  • how do you define the long tail? use a fixed ref of overall artist ranks (number of listeners from last) + a fit model ~50-60k artists in the 'head'
  • looked at rec radio, non-rec radio, all music
  • the last.fm radio has less head bias than general listening, but only just
  • used an experimental cohort of listeners: new, active, but not insane spamming amounts of scrobbling. two subsets : radio users and not so much
  • this shows very little difference in the non-radio long tail listening among those who use last.fm radio v. those who don't
  • but: perhaps there's some demographic trouble
  • so split radio users into high users and low users
  • still no tail bias to speak of
  • perhaps from the fact that real systems only rec new tracks, mitigating reinforcement
  • so: built a simple item-based rec which limited candidates to the 'play direct-from-artist' scheme, not allowed to give artists with more than 10000 fans
  • deployed on playground.last.fm
  • eval based on a sample of the last.fm user traffic
  • effectively pushes curve out another order of magnitude
  • try online
  • [me: this is great!]
Q Do you see a problem, in terms of scholarship, with the fact that in practice you have access to all this data and the public does not?
well, hrm. how about being an intern
Q Does this make better recs?
Better, eh, interesting sure.

And WOMRAD done. feedback is elicited

afternoon papers

content-based stuff now:

Dmitry Bogdanov, Martín Haro, Ferdinand Fuhrmann, Emilia Gómez and Perfecto Herrera

Dmitry presenting
  • Sim is not rec. need similarity
  • can we improve content based rec by merging pref data?
  • gmm + pref model
  • process:
  1. ask user for small set of tracks that specify the user's preference by example
  2. get bag of frames on these
  3. SVMs to get semantics (probabilistic)
  4. in this semantic space, search for tracks
  • can search in a variety of ways (use of Pearson's correlation is taken from prev work)
  • for eval compare our method to a bunch of existing methods: content-based, contextual, random
  • some users did a test to get a pref set (varies from 19 to 178 tracks for a user); this takes a long time
  • get lots of tracks from all the methods, shuffle, stick in front of user ask lots of Qs per track
  • created three categories based on the evals: Hits, trusts, fails
  1. Hits -user likes, is new
  2. trusts - user likes, is not new
  3. fail - no to all
  4. unclear - the rest (18%)
  • A good system should provide many hits and some trusts avoiding fails
  • in the results, last.fm (via api) is very good for hits and trusts
  • everyone else was bad at trusts
  • the new method was best for non-last.fm with hits, but last.fm is different drawing set of music so they're better
  • proposed semantics offer an improvement over pure timbral features
  • but still inferior to industrial approaches, though this proposed work improves considerably, a good way to cold start perhaps
Q (oscar) I don't understand the last.fm? why didn't you use for sim?
we tried, couldn't get enough info
(oscar follow up) low trust on the content, do you think it's tied to a lack of transparency?
maybe, but our definition of trust just meant user likes and knows.

Q() was the SEM-ALL about finding songs that are close to any or all?

UPDATE (~5pm):

Pedro Mercado and Hanna Lukashevich
Hanna is presenting

  • clustering can help you swim in the sea of data
  • users can fix incorrect clusters, positive feedback
  • system diagram:

  • similarity can be considered as a graph, then you can do random walks, calc eigenvalues etc.
  • but, what if this user doesn't care about some things? User pref based feature selection.
  • in the given space, you can then find distance (paper uses Pearson's but other dist could be used)
  • constrain the space (tricky math, see paper...)
  • eval: used the MIREX 04 content description data
  • constraints from genre labels
  • using test train as an example: what's in constraint space, what isn't
  • mutual information, something else I didn't catch
  • some graphs showing that there's more awesome with presented method
  • when looking at outliers, things are less clear but still seem positive
  • [graphs are page 6 of the pdf, have a look for details]
  • to wrap up: ML approaches can improve recs at least with our simulated user...
  • our clustering methods are speedy, though scale is tricky but since our matrix sparse should be doable
  • Way better than random constraints
  • future work: stick constraints in feature selector, we did this, to appear in ICML, gives significant improvement, but causes some trouble, read paper for detail [excellent ICML tease...]
-- coffee and demos now...

WOMRAD Afternoon live blog

Afternoon live blog for WOMRAD. Intro in first post.

Afternoon session.

1500: Industrial panel

  • Òscar Celma (BMAT, Spain), moderator
  • Tom Butcher (Microsoft Bing, US)
  • Mark Levy (last.fm, UK)
  • Michael S. Papish (Rovi Corporation) subbing for Henri-Pierre Mousset (Yozik, France)
  • Gilad Shlang (Meemix, Israel)

Q (asked by OC): Do we need recommenders anymore? Are they relevant? (SFMusicTech quote about music rec only needed for people w/o friends)

TB: still valid, but personally more interested in discovery now...
ML: don't sell things, but tremendous effort in this direction, important to users, builds trust. Plus we complement not replace social connections
MP: no need to draw lines, reinforces complementary service idea. Many users may not need them, but perhaps that's not who these systems are for
GS: What's wrong with not having friends? Also, a tight group of friends may not have discovery, as a group. The removal of place opens more possibility to access long tail or different parts of the head. Perhaps more personalization than rec, but this a fine line
MP: the opposition of the individual v. group. If you listen to music without a community you lose the social experience.
GS: fair enough. some points about individual optimization of education. Group important, but also personal growth.

Q (asked by OC): Netflix prize. What is a good recommendation (in music)? How do we evaluate (in music)?
GS: a good rec will get people interested. wow factor. acknowledge you and surprise you at same time. music is short, which makes it easier to tune a rec profile. sharing implies liking, that's useful. tagging; more tags=more popular
(...small aside...)
ML: we run controlled experiments. quietly divide users to test different methods. Netflix fails in that it evaluates with data that's already been seen not new data.
MP: good rec means different things at different times. gives an example of a good rec that is not interesting: an artist you've previously bought releases a new album. not interesting but good. this would be bad for radio.
TB: in industry there are many ways to test. more purchase is different than more enjoyment.
ML: we at last would love for some theory to be developed for rec based on user logs.
OC: more data please
ML: you can always ask...
(this goes back and forth)
MP: yet there will never be good data. sparse data is hard, but makes you a better human (eat your greens)

Q (OC) discuss user interfaces, user experience etc. :
TB: pandora is a winner, don't ignore the interaction
ML: thesixtyone is great. interface v. playful good long tail, would love for a last.fm interaction, but discussing issues
MP: name checks Paul Lamere, who he cites saying thesixtyone is an exotic rec, but MP thinks this is the way people normally use music, we should work to have systems that act like this. Need a toolbelt not one ubiquitous tool
GS: we tackle similar things. in B2B you need systems that complete clients' existing systems. If you over rec, you can scare people off, social dynamics
MP: think about the inverse rec - what should you not rec? Also from a UX stand point, to build trust, change recs over time. General to personal as a user interacts with a system
GS: different recs can be very personal
ML: last.fm takes a sort of opposite
MP: this is possible from last.fm's transparency

Q (OC) What do you want solved?

TB: we're hiring. Also, algorithms must scale or they're scalable
ML: how to merge data sources? how to use human-to-human recs
MP: see my keynote, exploit user psych, What are good Qs to ask users to build profiles
GS: more info for recs, params in audiofiles. map user params to extractable params. Moving techniques to non-western musics. What about China and India? We should serve them.
MP: is the sonic data really the key? I don't think so, too much effort in this direction sim is different than rec
GS: but sim is a good start

Q(OC) Do you use content-based features (y/n please)? ISMIR fun, glass ceiling, do you follow this work in academics, do you think it's solved?
GS: Yes. see last discussion core to our business, vital to start a relationship, move to social and such over time.
OC: what sorts of things do you use
GS: of things. 10 (does not list). aggressiveness very important.
ML: I come from the MIR community. we do content-based ID, have tried to intro content-based stuff and it's never been successful. But our hearts are in it. We have enough users that cold start doesn't matter. auto-tagging would be sweet though for the holes in our social data (musicological tags for instance). maybe youtube
TB: yes for the most part to ML's comments. content-based is too costly, tags and metadata are super effective
GS: what about the new company, that doesn't have lots of data. if you're just getting into the game. these people need results. can't tell them go gather data for a year and we'll sort it out
MP: item to item is very different than personalized
ML: check that P2P paper from ismir
Hanna from Fraunhofer: we have clients (like film producers) no data, what then?
MP: exactly.
(Eugenio Tacchini): GS is the DNA all of it? really?
GS: no, not really. music DNA for rich space, but still need personalized info

Q (OC) if you were to hire a researcher (aside: researchers cannot program) what kind of skills do you want (not resume skills, fancy skills)?
TB: domain experience, audio music, computer vision, breadth better than depth. production coding skill in some language
ML: we're hiring as well. If you don't want to code, probably won't work. CS skills really important. Big database skills. hadoop win. strong C++ and research also python. data and viz as well.
MP: we are hiring as well. growing r & d group. we have offices all over. we like building things. though we have room for research. we like solving problems. again broad. can you pivot. don't need a PhD to be useful.
GS: we're also hiring (that's 4 for 4) we're a start up. data analysis and mining. core CS + creative skills, willing to sweat.
OC: perhaps also adaptability
GS: yes. you're there to invent. plus we're in Tel Aviv and that's sweet

Q (claudio): What is the relationship your company has with musicians are they just a commodity?
OC: our missing speaker (Henri) does this
GS: I spoke with him, he thinks: for young musicians it's hard to reach your audience.
OC: BMAT does this with jamendo. when they type in 'Michael Jackson' what do you do?
MP: but don't sell recsys as a way to push new artists only. In a certain context, ie. neg search, but careful. Don't exploit users or artists
GS: but the state of the art pushes new bits

(from audience) What about piracy?
GS: it's not good.
TB: there are 2 view points: piracy increases consumption. other side: do we know that?
OC: now we're over time sorry.

in light of the near transcript I just typed, I'm starting a new post for the afternoon talks.

Updated (5:33) : corrected questioner ID

A womrad live blog

I'm in Barcelona today for the Workshop on Music Recommendation and Discovery (WOMRAD). The theme is 'Is Music Recommendation Broken? How Can We Fix It?'
I'm giving a talk at 11am ( in about 2 hours ) and I'll be doing some (mostly) live updates about the program...

Update (10:05am):
UPDATE ( 28 Sept 2010, 11:52am): Michael has posted the slides to his talk.
  • The view from outside, as his industrial has used and observed recs
  • Been there since the beginning (which appears to be about 2000)
  • Recommenders must combine humans and machines
  • understand both content and listeners, transparency, embrace emosocio aspects, optimize trust
  • What is science? Must be falsifiable (Popper) or Solvable, reproducible puzzles (gah, missed name)
  • Puzzle - understand the listeners preferences -- foundations (ISMIR 2001 resolution) - testable reusable
  • Lots of metrics though (too many?) (do we need a metric for metrics?)
  • MIREX (summary of AMS task) (haha it's automated, tell that to andy and mert) - very acoustically focused, not exactly recommendation similarity != recommendation
  • use of statistical measures across datasets e.g. Netflix prize -- but what about discovery? -- Netflix produces better numbers but does it produce better recommendations?
  • More holistic measures -- survey users about trust and satisfaction (Swearingen & Sinha) -- may miss UI issues -- practical 'business' metrics -- bottomline measurements -- does this remove the science?
  • appreciated history of MIR (from a rec POV) will stick pic here -- currently hitting 'Wall of Good Recs' since recs don't suck it's now harder to test
  • easy to test for bad recs -- hard to test for good recs
  • What if the emerging problems (like UI and trust) are no longer measurable
  • Is user preference too variable and unstable to be useful?
  • from science to art?
  • 2 options:
  • 1: focus on unsolved MIR: better encoding of preference (more socio-cultural research)
  • What are the limits of the avg listener (hey it's our playlist survey!)--playlist turing tests, understand artist v. album v. tracks -- can we build tools/games to expand this
  • listener profile -- can you quantify the sonic v. social preference -- add relevance layers to search and retrieval
  • 2: adjourn to the Beach
  • Questions:
  • Mark Levy: Do you think you're too embarrassed about good engineering? What about controlled experiments by people like google/last? -- Move from science to engineering (this confuses me slightly ISMIR has always been Engineering not Pure Science) It is fruitful but is it science.
  • Claudio: Can you speak a bit about your experience combining human knowledge vs. algorithms --- yes. what do you do with human knowledge? it's tricky. look for the ideal rec experience - sit around with your friends and play records: how do you scale that in a system? It's not about classification - humans are good at putting things together - train people to be qualitative assessors
  • Oscar: Since you used to be in college radio, how do you think this experience could inform playlist? Do you use playlisters? Well only a 1.5yr experience, but made me think about the groups of listeners. Name checks John Peel. What about presentation - In terms of what rovi does: Minimally we can stop making bad playlists: gives example then breaks - v. hard to differentiate btwn good - v. good - excellent
  • Me: what about bypassing order by selecting good sets:
  • (Eugenio Tacchini): how much is the expert transparency necessary? yes give justification but need to avoid the feeling of stereotyping, weird vague directions, not just look at this user but look at this part of this user.
  • tom butcher: Is music rec really a unique snowflake? - Every domain is unique. -- One thing: a bad user rec in music costs 2 minutes, a bad film rec costs you 2hrs music has a lower penalty cost for bad recs. Also diff in features will sonic features get you to pref, prob not in music [I think this is a thing which may improve...]
(update 2 10:31am)
session 1:
Time Dependency
  • personal ex. showing diff between early day v. late night playlist
  • trying to link 2 concepts - Day- hour - (weather?) and Music track selection
  • few papers on this idea -- take things from Human Dynamics -- trying to enable playing music 'at the right moment' -- explore circular stats
  • Circular stats (eqs in paper at link) basically transform raw data by a periodicity (days, weeks)
  • Circular stats have analogous tools to trad stats - hyp tests for instance
  • Data for eval is full listening history of 992 unique last.fm users with artist/title + time of day (ToD) also got genre via track.getTopTags, keeping genre -- discarded users w/o enough data
  • scraped about half the data
  • attempt to make predictions - use two years of data to predict the ToD of play in next year
  • results: by day about 2.5x better than chance, by hour about 3-5x better than chance (move from half hour to hour tolerance doubles data)
  • note that the figures are overall, some users are v. predictable in this way, some are not.
  • Concl: temporal patterns can be predicted - not just what but when. plugs the last.fm clocks
  • Q (dunno who asked): what about user to user offsets (eg. if a user gets up at 6 v 8am 8 am means something different)? Currently can't do this, need sensor data. Would be sweet if we could, though note that the predictions are per user, so this is to some effect already dealt with
  • Q (again, people say who you are): Method issue - when comp day v. hour there's a percentile diff in the err tolerance? Sure this could work look at baseline compare...
  • Q (Eugenio Tacchini): I tried this awhile ago, aggregated data, didn't find much spread do you think aggregation is the issue? yeah, must be specific to the user, right time + right user not just right user
  • Q (Klaas):do you think it would work with less data (can't wait 2 years)? Probably. This was a very conservative methodology, could probably get by with maybe three months. For this work we wanted lots of data to make things clear
  • Q (seriously ID guys): did you use a popularity filter? No. tested if pref for a genre is different than the average for that genre
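A quick aside on the circular stats mentioned in these notes: the basic move is to map a time of day onto an angle on a 24-hour circle, so that 23:00 and 01:00 end up close together instead of 22 hours apart. The snippet below is only a minimal sketch of that idea using the standard circular mean; the function name and the 24-hour period are my own choices for illustration, not anything taken from the paper.

```python
import math


def circular_mean_hour(hours, period=24.0):
    """Return (mean_hour, resultant_length) for times on a periodic clock.

    hours  -- iterable of times, e.g. 23.5 for 23:30
    period -- length of the cycle (24 for time of day, 7 for day of week)

    resultant_length is in [0, 1]: near 1 means the times cluster tightly,
    near 0 means they are spread around the whole cycle.
    """
    hours = list(hours)
    if not hours:
        raise ValueError("need at least one observation")
    # Map each time onto an angle on the unit circle.
    angles = [2.0 * math.pi * (h % period) / period for h in hours]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    # The direction of the average vector is the circular mean.
    mean_angle = math.atan2(s, c)
    mean_hour = (mean_angle * period / (2.0 * math.pi)) % period
    return mean_hour, math.hypot(c, s)


# Plays at 23:00 and 01:00 average out to (about) midnight, not noon.
mean, spread = circular_mean_hour([23, 1])
print(mean, spread)
```

An ordinary arithmetic mean of 23 and 1 would give 12 (noon), which is why the periodic transform matters for listening-time data.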
Break time then my talk. no notes for my talk as I'm talking...

Update (12:16): I was without my machine for the social tag session, not just my talk. I'll get my hand notes in another post but for now here are the papers:
next paper is being skipped since the author was unable to attend due to illness:

Now joining the presentation already in progress by Audrey Laplante:
  • qualitative study of adolescents
  • 'Did your music taste change significantly in the last three years?' Yes, whys: New boyfriend, New school therefore new friends, important discussion topic
  • "who in your 'gang' or group exert the most influence on others in terms of music?" -- 3 self-identified. Characteristics: highly invested in music, good comm, willing to share info. People who are opinion leaders want to stay opinion leaders, will invest heavily in effort
  • in other domains work shows that weak ties are more important than strong ties in finding new information works almost all the time -- for 2 participants weak ties important -- for others strong ties with significantly different social network are important -- music as vehicle for social interaction
  • strong ties have different roles -- not important for discovery, but critical for legitimization of musical taste
  • similar and reliable social connections are critical
  • social network maps (pic forthcoming...)
  • unknown how common these results are (same survey) as yet unknown exact implications for recommenders
  • Q (unknown)- Weak ties v. Strong ties -- how do you define the difference?: not really about newness, but it's entirely possible with new detail
  • Q (claudio) - What kinds of systems are implied with this work? Not necessarily a different system for adolescents. tight connectivity is critical, perhaps the difference is that strong ties may become more critical
  • (claudio) - does the notion that music describes you change as you get older?: not really actually, adolescents are interested in individual uniqueness
  • Q (Mark L) are social networks online somewhat different?: yes and no. in facebook you can find relatives, but noise is a big problem. But trust is not known
  • I asked about using graph difference. Answer could work, also other automatic methods...
lunch now. I'll make a new post for the afternoon session.

updated again (5:14pm) Eugenio Tacchini is Italian not Finnish (oops)

Thursday, 9 September 2010

Roomba Recon - A musichackday brain dump

So this past weekend I attended the 2nd (annual?) London Music Hackday at the Guardian's offices at King's Cross. For the hack I created an algorithm that generates playlists between arbitrary start and end songs on soundcloud. It does this with almost no pre-indexing, allowing for playlists to cover the entire network and always use an up-to-date graph. It's (mostly) running live if you'd like to play with it.

Briefly, it performs a sort of bastardized A* search, bilaterally from both the start and the end song to form the playlist. There's a parameter to limit the length of the two playlist segments, by default this is 4 so the max playlist length is 10 (2*4+2 for the end points).

The search algorithm collects social links of the artist corresponding to the given song. For each of these connections (you know, 'friends' or in soundcloud jargon 'followings') a determination of the cost of adding that song is calculated in the following way (for the half built from the start song):
where c(n, m) is the cost to add song m to the list after song n, d(n, m) is some measure of distance from song n to song m, and d(m, e) is the same measure of distance from song m to song e. Song e is the end song for the whole playlist. So basically the idea is that the cost of moving to a node is a ratio of how far away it is from where you were to how far it is to where you're trying to get. The whole thing is reversed for the other half, so the cost function makes it cheap to move toward the start song. If you simply want to randomly traverse social links the cost can be set to an arbitrary equal value (I used 1) for all links.

This leaves the matter of distance.

Starting with what I know best, I decided to try a content-based distance first. I should say that from the outset I figured this would be insanely slow, but nonetheless, I gave it a go. I implemented (available directly as well) a little object that will grab the echonest timbre features for any two soundcloud songs, summarize the features into a single multidimensional gaussian (mean and std), then take the cosine distance between the two tracks (other distance metrics could be computed as well, but cosine seemed reasonable). That takes something on the order of 45 seconds to do for every pair of tracks. When using it in the above playlister the whole thing would take maybe 4 hours (I think, I never actually let it complete). Clearly way too slow.
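To make the summarize-then-compare step a bit more concrete, here is a rough sketch in plain Python. It assumes the timbre features have already been fetched and arrive as a list of equal-length frames (one list of floats per segment); the fetching itself and the function names `summarize` and `cosine_distance` are placeholders for illustration, not the code from the hack.

```python
import math


def summarize(frames):
    """Collapse per-segment feature frames into one vector: [means..., stds...].

    frames -- non-empty list of equal-length lists of floats
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Population standard deviation per dimension.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return means + stds


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Two toy "tracks" with two-dimensional features.
track_a = summarize([[1.0, 2.0], [3.0, 2.0]])
track_b = summarize([[2.0, 1.0], [2.0, 5.0]])
print(cosine_distance(track_a, track_b))
```

The arithmetic here is trivial; the 45 seconds per pair quoted above would be spent almost entirely on getting the features, not on this comparison.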

So taking inspiration from my about-to-be-published work at WOMRAD, I thought some NLP could save the day. So the other distance measure I implemented (no direct access yet) is based on a track's tags and comments. First I tokenize the comments and combine them with the tags to create a vector space model of a track's descriptive text. I then weighted everything using tfidf (the idf was populated with a random sample of tracks from across soundcloud that I gathered over the weekend, about 41,000 tracks in total. This is the only indexing that is done in advance). From the tfidf weighted terms in a vector space, I took the cosine distance. This is both quite quick and gives pretty good results.
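For anyone who wants to see the shape of the text-based distance, the following is a standard-library-only sketch of tfidf weighting plus cosine distance over sparse term vectors. It is a stand-in rather than the gensim-based code used in the hack: the tokenizer, the smoothed idf formula and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_idf(background_docs):
    """Build idf weights from a background sample (a list of token lists).

    Returns (idf_by_term, default_idf); default_idf is used for terms that
    never appeared in the sample. Uses a smoothed idf, one common variant.
    """
    n = len(background_docs)
    doc_freq = Counter()
    for tokens in background_docs:
        doc_freq.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, math.log(1 + n) + 1.0


def tfidf_vector(tokens, idf, default_idf):
    """Sparse tfidf vector as a dict of term -> weight."""
    return {t: c * idf.get(t, default_idf) for t, c in Counter(tokens).items()}


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # no text to compare: treat as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


# Toy background sample standing in for the pre-gathered track sample.
sample = [tokenize(t) for t in ["heavy bass dubstep", "acoustic folk guitar", "bass guitar jam"]]
idf, default_idf = build_idf(sample)
a = tfidf_vector(tokenize("Heavy dubstep, massive bass!"), idf, default_idf)
b = tfidf_vector(tokenize("quiet acoustic folk"), idf, default_idf)
print(cosine_distance(a, b))
```

Only the idf table needs to be computed ahead of time, which matches the point above that the background sample is the only pre-indexing involved.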

Everything was built in python, the app is running in cherrypy, using numpy and scipy for the data handling and gensim for the tfidf related bits. Soundcloud and echonest interaction is all via their respective python wrappers. Also there's a more terse write up over at the musichackday wiki. I'll stick the code on my repository on github once it's cleaned up a bit (though that might be a little while as I seem to be rather busy with something at the moment...)

Right. Back to writing my thesis.

Saturday, 29 May 2010

publications on playlists in ISMIR

So for this year's ISMIR I'll be doing another tutorial. This one is entitled "Finding A Path Through The Jukebox – The Playlist Tutorial" and I'll be presenting it with Paul Lamere. As you may have guessed by the title it's all about playlists. So to frame some of my background work I thought I'd poke around the ISMIR proceedings to get a more complete idea of all of the papers that dealt with the topic across the 10 years of proceedings (plus the just announced titles from this year).

First I did a simple title search using the tool at ismir.net. This shows that from 2000 - 2009 there were 14 papers with 'playlist' occurring somewhere in the title. Here they are over time:

Well, that doesn't show very much, just some interest, no trends or anything. So from there I took a look at the results of the full-text search available from Rainer Typke's website. The full text search found some 123 papers mentioning playlist, certainly a few more than the title search. From there I wanted to see what the distribution of these papers was over time (as above), though this took a bit more work, as I couldn't sort out a means to export the search results... Anyway after a bit of counting I got this:

Well, now we're getting somewhere! Clearly there's an increasing number of papers discussing playlists at ISMIR. But wait, you say, this doesn't take into account the considerable expansion of the size of the conference over its existence. So we can normalize to the number of papers per year that are known to the Cumulative ISMIR proceedings ( [35, 43, 62, 56, 108, 119, 99, 131, 111, 148] from 2000 - 2009 if anyone is interested). Below you can see both the title only and full paper search results normalized to the total number of papers:

The normalization didn't seem to change the trend much. But this leaves me wondering, what can be drawn from the massive (and growing) disparity between title mentions and fulltext mentions? Obviously one would expect a higher number of hits, but a tenfold increase seems very large. My first suspicion is that a great deal of this disparity comes from the fact that many papers at ISMIR that mention playlists are actually about something else (music similarity for instance) and then throw on a playlist as something of an afterthought. Perhaps this is an implicit acknowledgment of the great human-factor power of the playlist (as discussed in for instance this paper) or perhaps it's something else entirely.
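For the curious, the normalization is just a per-year division by the proceedings totals listed above. The sketch below uses only numbers already quoted in this post (the yearly totals, 14 title hits and 123 full-text hits); the per-year hit counts are not listed here, so `normalize` takes them as an argument instead of guessing at them, and the function name is simply my choice.

```python
# Papers per year in the cumulative ISMIR proceedings, 2000 - 2009 (from the post).
TOTAL_PAPERS = dict(zip(range(2000, 2010),
                        [35, 43, 62, 56, 108, 119, 99, 131, 111, 148]))


def normalize(hits_by_year, totals_by_year=TOTAL_PAPERS):
    """Turn raw per-year hit counts into a fraction of that year's papers."""
    return {year: hits / totals_by_year[year] for year, hits in hits_by_year.items()}


all_papers = sum(TOTAL_PAPERS.values())
title_hits, fulltext_hits = 14, 123

print("papers 2000-2009:", all_papers)
print("share with 'playlist' in the title: %.1f%%" % (100.0 * title_hits / all_papers))
print("share mentioning 'playlist' anywhere: %.1f%%" % (100.0 * fulltext_hits / all_papers))
print("full-text hits per title hit: %.1f" % (fulltext_hits / title_hits))
```

Over the whole decade that works out to a bit under nine full-text mentions for every title mention, so "tenfold" is a fair rounding.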

Regardless of these finer points, it's clearly fair to say that there is a great deal of interest in playlist generation and analysis. If you're interested in these things, why not sign up for our tutorial?

Sunday, 21 March 2010


Hi blog readers. You may notice a slight change of scenery and the addition of some links just above the main text body. The links go to the other parts of my homepage and the color scheme shift is to keep everything consistent. I'm not much of a designer so I'll happily take any critique of the color scheme and such...

Monday, 1 March 2010

IEEE-THEMES --shameless self promotion--

I'm going to be presenting work at IEEE-THEMES, a workshop collocated with ICASSP, on March 15th in Dallas, TX. The talk is associated with an article to be published in the August issue of Select Topics in Signal Processing, which is a special issue on signal processing and social networks. Here's the title/abstract (note: link is to a preprint, camera-ready isn't due till after the talk so paper may well change a touch...) :

Abstract—This paper presents an extensive analysis of a sample of a social network of musicians. The network sample is first analyzed using standard complex network techniques to verify that it has similar properties to other web-derived complex networks. Content-based pairwise dissimilarity values between the musical data associated with the network sample are computed, and the relation- ship between those content-based distances and distances from network theory explored. Following this exploration, hybrid graphs and distance measures are constructed, and used to examine the community structure of the artist network. Finally, results of these investigations are presented and considered in the light of recommendation and discovery applications with these hybrid measures as their basis.
The paper mostly covers content that has been discussed elsewhere (much of it with Kurt Jacobson), refactored for a broader audience and with wider narratives in mind. That said, there are some notable new findings in the paper as well. We have run another acoustic dissimilarity measure across the entire set (the 2009 MIREX entry in audio music similarity using Marsyas) which for the most part confirms our earlier findings (that acoustic similarity and social similarity [mostly] aren't linearly correlated and that community genre labeling becomes more homogeneous [again, mostly] when using the audio sim as a weight). Additionally, we have broadened our comparison metrics to include an examination of the mutual information between the different dissimilarity sets. This also basically confirms our earlier findings, though mutual information provides a very satisfying level of nuance that is not possible from simply testing (using Pearson's) for linear correlation, especially given that our data is quite far from a normal distribution. So, if you're planning to be at ICASSP, I'd highly recommend IEEE-THEMES (the rest of the program looks to be very interesting as well...) and if you aren't going to be in Dallas, there are a few options for you.
  1. If you're in London right now, you can come to Goldsmiths today at 4pm to rm 144 in the main building, where I'll be giving a trial run of the talk.
  2. Slides (and perhaps some video) will be made available at some point (probably just after the talk is given).
  3. IEEE is running a pay-to-watch live stream of THEMES, so there's that as well.
Generally, if you're going to be in Dallas from March 15-19, much discussion can happen in person. Also, between now and then I'll be doing some traveling (tomorrow till 6 March I'll be at UIUC, then from there till the 14th of March I'll be in San Diego) so if any readers are interested in some in-person discussion and our locations overlap, let me know and perhaps something can be arranged.
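
Coming back to the Pearson versus mutual information point above, here is a minimal sketch of why the two can disagree. It is not the analysis from the paper (that used pyentropy on the real dissimilarity sets); it is a toy with made-up data, a plain histogram estimator, and function names (`pearson`, `mutual_information`) that are purely illustrative. The relationship y = x² is strongly dependent but not linear, so the correlation coefficient sits near zero while the mutual information does not.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def _bin_index(value, lo, hi, bins):
    """Map a value to one of `bins` equal-width bins spanning [lo, hi]."""
    if hi == lo:
        return 0
    index = int((value - lo) / (hi - lo) * bins)
    return min(index, bins - 1)  # the maximum value lands in the last bin


def mutual_information(xs, ys, bins=8):
    """Plug-in estimate of mutual information (in bits) from a 2-D histogram."""
    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    bx = [_bin_index(x, x_lo, x_hi, bins) for x in xs]
    by = [_bin_index(y, y_lo, y_hi, bins) for y in ys]
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written in terms of counts
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi


if __name__ == "__main__":
    random.seed(0)
    xs = [random.uniform(-1.0, 1.0) for _ in range(2000)]
    ys = [x * x for x in xs]  # dependent on x, but not linearly
    print("Pearson r:", pearson(xs, ys))
    print("Mutual information (bits):", mutual_information(xs, ys))
```

The histogram estimator is biased for small samples and sensitive to the number of bins, so treat the numbers as qualitative only.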

Thursday, 11 February 2010

scipy and numpy from source, revisited

A while back I posted some instructions for getting scipy and numpy mostly up and running from current svn checkouts under python 2.6 with mac os x 10.5.8. I updated to 10.6 some time back and have been using the preinstalled version of numpy (1.2) for my array needs, without any scipy, with solid results. However, I needed to get at some scipy functionality (doing some mutual information analysis via pyentropy) so I thought I'd give the process a go with the newer OS version. I'm pleased to report that everything works and was relatively easy to install/build. Basically the old instructions still hold, with a couple of additional points.

  1. It is necessary to update to a newer version of numpy, compiled with the same fortran compiler you'll use for scipy.
  2. If you're using the build of macpython that comes with 10.6 (which is py2.6) you'll need to add the option --install-lib=/Library/Python/2.6/site-packages/ to any commands using distutils to install (e.g. setup.py install)
And that's about it. I used fresh checkouts of scipy (r6233, v0.8.0.dev) and numpy (r8106, v1.5.0.dev), but the same versions I've had for a while of Sparse and gFortran (the details of which are in the old post). As a bonus this seems to result in fewer unittest failures in scipy (now only 10!) for whatever that's worth.
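
For anyone scripting this, the install step from point 2 can be sketched in Python. This is only a convenience wrapper around the command described above; the helper names (`install_command`, `install_checkout`) and the checkout directory names in the usage note are hypothetical, and the build options from the old post are not reproduced here.

```python
import subprocess
import sys

# Site-packages directory for the macpython that ships with 10.6 (from point 2).
SITE_PACKAGES = "/Library/Python/2.6/site-packages/"


def install_command(python=sys.executable, install_lib=SITE_PACKAGES):
    """Build the argument list for `setup.py install` with an explicit --install-lib."""
    return [python, "setup.py", "install", "--install-lib=" + install_lib]


def install_checkout(checkout_dir):
    """Run the install from inside a source checkout; raises if the command fails."""
    subprocess.check_call(install_command(), cwd=checkout_dir)
```

Assuming the checkouts live in directories called numpy and scipy, you would call install_checkout("numpy") first and install_checkout("scipy") second, so that scipy builds against the freshly installed numpy.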

Wednesday, 27 January 2010

MusicHackday: Stockholm

So in a touch more than 48hrs I'll be hopping on a plane to go to the Stockholm MusicHackday. It should be excellent, if the last one I went to is any guide. I'll be joined by fellow ISMS member Mike Jewell. The hack is being formulated, but may involve The World Bank's api and some yet to be determined sources of listener statistics. Also, somehow the echonest's api will be involved because I need to leave Stockholm with one of these. We may need some further assistance to get something done in 24hrs, so if you're going to be at the hack and are looking for some folk to hack with, drop a line in the comments...

A bit about playlists and similarity

Sorry about the general radio silence of late. Many things going on, most of them interesting.
Lately I've been spending quite a bit of time considering various aspects of playlist generation and how they all fit together. Here are some of my lines of thought:
  1. Evaluation of a playlist. How? Along which dimension? (Good v. Bad, Appropriate v. Offensive, Interesting v. Boring)
  2. How do people in various functions create playlists? How does this process and its output compare to common (or state of the art) methods employed in automatic playlist construction? This is to say, are we doing it right? Are the correct questions even being asked?
  3. What is the relationship between notions of music similarity (or pairwise relationship in the generic) and playlist construction?

While all these ideas are interrelated, for now I'm going to pick at point (3) a bit. I'm coming to believe this is central in understanding the other two points as well, at least to an extent. There are many ways to consider how two songs are related. In music informatics this similarity is almost always content-based, even if it isn't content derived. This can include methods based on timbral or harmonic features, or on most tags or similar labels (though these sometimes get away from content descriptors). This paints some kind of picture but leaves out something that can be critical to manual playlist construction as it is commonly understood (e.g. in radio or the creation of a 'mixtape'): socio-cultural context. In order to have the widest array of possible playlist constructions, it is necessary to have as complete an understanding as possible of the relationships between member songs (not just neighbors...). Put another way, the complexity of your playlist is maximally bound by the complexity of your similarity measure.
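
As a rough sketch of that bound (the linear form below is an assumption, chosen as the simplest shape it could take):

```latex
M \le C \cdot s
```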
Where M is some not yet existent measure of the possible semantic complexity of a playlist and s is a similar measure of the semantic complexity of the similarity measure used in the construction of that playlist. C is our fudge factor constant. Now, obviously there are plenty of situations where complex structure isn't required. But if the goal is to make playlists for a wide range of functions and settings, it will be required sometimes.

In practice what this means is that you can make a bag of songs from a bag of features. However, imparting long form structure is at a minimum dependent on a much more complex understanding of the relationships (e.g. sim) between songs (say from social networks or radio logs...)
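
To make the "bag of songs from a bag of features" half of that concrete, here is a minimal sketch. The feature vectors, song titles and the function name `bag_of_songs` are all made up for illustration, and cosine similarity simply stands in for whatever content-based measure you prefer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bag_of_songs(seed, features, k=3):
    """Return the k songs most similar to the seed, most similar first."""
    scores = [
        (cosine_similarity(features[seed], vector), title)
        for title, vector in features.items()
        if title != seed
    ]
    scores.sort(reverse=True)
    return [title for _, title in scores[:k]]


if __name__ == "__main__":
    # Hypothetical two-dimensional "timbre" features, invented for this example.
    features = {
        "song a": [1.0, 0.0],
        "song b": [0.9, 0.1],
        "song c": [0.0, 1.0],
        "song d": [0.5, 0.5],
    }
    print(bag_of_songs("song a", features, k=2))
```

All this gives you is a set of songs ranked by closeness to the seed. Any ordering comes from a single pairwise number, so there is nothing here that could express an arc, a transition or a cultural reference across the whole list.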

Anyway, this is all a bit vague right now. I'm working on some better formalization, we'll see how that goes. Anyone have any thoughts?