Compare average - SPARQL
There is a dataset of users ranking movies. I need to find the users with a similar taste to user1. Similar taste is defined as follows: consider the average rank per genre for user1, avgr1, and for the same genre for user2, avgr2; user1 and user2 have similar taste if abs(avgr1-avgr2)<1. So far I am able to get the names, the genre, and the absolute value of the difference between the averages, but the filtering on the comparison is not working.
    select ?p ?p1 ?genre (abs(avg(?rating)-avg(?ratingp1)) as ?rdiff)
    where {
      ?p movies:hasrated ?rate.
      ?p1 foaf:knows ?p.
      ?rate movies:ratedmovie ?mov.
      ?rate movies:hasrating ?rating.
      ?mov movies:hasgenre ?genre.
      ?p1 movies:hasrated ?ratep1.
      ?ratep1 movies:ratedmovie ?movp1.
      ?ratep1 movies:hasrating ?ratingp1.
      ?movp1 movies:hasgenre ?genre.
      filter (xsd:float(?rdiff)<1.0 && ?p=movies:user1)
    }
    group by ?p ?p1 ?genre
It's hard to answer these kinds of questions without sample data to work with. Here's some sample data that has two users who have similar rankings on comedy, but different rankings on romance:
    @prefix : <urn:ex:> .

    :a :ranks [ :genre :comedy ; :value 2 ],
              [ :genre :comedy ; :value 3 ],
              [ :genre :comedy ; :value 3 ],
              [ :genre :romance ; :value 7 ],
              [ :genre :romance ; :value 8 ],
              [ :genre :romance ; :value 9 ].

    :b :ranks [ :genre :comedy ; :value 3 ],
              [ :genre :comedy ; :value 3 ],
              [ :genre :comedy ; :value 4 ],
              [ :genre :romance ; :value 0 ],
              [ :genre :romance ; :value 1 ],
              [ :genre :romance ; :value 0 ].
Here's a query that computes the difference of the average rankings on genres:
    prefix : <urn:ex:>

    select ?user1 ?user2 ?genre (abs(avg(?value1)-avg(?value2)) as ?diff) {
      ?user1 :ranks [ :genre ?genre ; :value ?value1 ].
      ?user2 :ranks [ :genre ?genre ; :value ?value2 ].
      filter (str(?user1) < str(?user2)) #-- avoid duplicate user1/user2, user2/user1 results
    }
    group by ?user1 ?user2 ?genre
    order by ?diff
    ---------------------------------------------------------
    | user1 | user2 | genre    | diff                       |
    =========================================================
    | :a    | :b    | :comedy  | 0.666666666666666666666667 |
    | :a    | :b    | :romance | 7.666666666666666666666667 |
    ---------------------------------------------------------
Now, you can't filter on aggregate results; you have to use having. To take only the values where the diff is less than a particular value, you'd do this:
    prefix : <urn:ex:>

    select ?user1 ?user2 ?genre (abs(avg(?value1)-avg(?value2)) as ?diff) {
      ?user1 :ranks [ :genre ?genre ; :value ?value1 ].
      ?user2 :ranks [ :genre ?genre ; :value ?value2 ].
      filter (str(?user1) < str(?user2))
    }
    group by ?user1 ?user2 ?genre
    having (?diff < 1)
    order by ?diff
    --------------------------------------------------------
    | user1 | user2 | genre   | diff                       |
    ========================================================
    | :a    | :b    | :comedy | 0.666666666666666666666667 |
    --------------------------------------------------------
If you don't care about the actual diff, except that it's below the threshold, you can put the expression in the having directly, and do:
    select ?user1 ?user2 ?genre {
      #-- ...
    }
    group by ?user1 ?user2 ?genre
    having (abs(avg(?value1)-avg(?value2)) < 1)
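For reference, the same idea applied to the vocabulary from the question might look roughly like the sketch below. It is untested, since there is no data for that schema here. The property names are taken from the question as written; the movies: namespace IRI is a placeholder I made up, so substitute the real one. The filter in the question cannot see ?rdiff because a filter inside the where block is evaluated before grouping and aggregation, so the threshold test moves into having. Pinning ?p to movies:user1 with values instead of a filter is just one way to do it, and the xsd:float cast is dropped on the assumption that the ratings are already numeric.

```sparql
prefix movies: <http://example.org/movies#>   # hypothetical namespace, replace with the real one
prefix foaf:   <http://xmlns.com/foaf/0.1/>

select ?p ?p1 ?genre (abs(avg(?rating) - avg(?ratingp1)) as ?rdiff)
where {
  values ?p { movies:user1 }          # only compare against user1

  ?p1 foaf:knows ?p .                 # candidate users, as in the question

  # user1's ratings per genre
  ?p      movies:hasrated   ?rate .
  ?rate   movies:ratedmovie ?mov ;
          movies:hasrating  ?rating .
  ?mov    movies:hasgenre   ?genre .

  # the other user's ratings for the same genre
  ?p1     movies:hasrated   ?ratep1 .
  ?ratep1 movies:ratedmovie ?movp1 ;
          movies:hasrating  ?ratingp1 .
  ?movp1  movies:hasgenre   ?genre .
}
group by ?p ?p1 ?genre
having (abs(avg(?rating) - avg(?ratingp1)) < 1)   # aggregate test belongs in having, not filter
```

The aggregate expression is repeated in having rather than referring to ?rdiff, in case the endpoint does not accept a select alias there.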