Friends of Friends

can you share it? assuming they aren’t real names, of course

attach a text file consisting of the dumped CREATE TABLE and INSERT statements

doesn’t have to be all 100, just a few will do



CREATE TABLE IF NOT EXISTS `friends_list` (
  `user_1` int(11) NOT NULL,
  `user_2` int(11) NOT NULL,
  `created_on` datetime NOT NULL,
  KEY `user_1` (`user_1`,`user_2`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `friends_list`
--

INSERT INTO `friends_list` (`user_1`, `user_2`, `created_on`) VALUES
(3, 2, '0000-00-00 00:00:00'),
(1, 5, '0000-00-00 00:00:00'),
(3, 1, '0000-00-00 00:00:00'),
(1, 2, '0000-00-00 00:00:00'),
(1, 4, '0000-00-00 00:00:00'),
(1, 7, '0000-00-00 00:00:00'),
(1, 8, '0000-00-00 00:00:00'),
(2, 7, '0000-00-00 00:00:00'),
(3, 8, '0000-00-00 00:00:00'),
(3, 4, '0000-00-00 00:00:00'),
(3, 10, '0000-00-00 00:00:00'),
(4, 7, '0000-00-00 00:00:00'),
(4, 8, '0000-00-00 00:00:00'),
(4, 9, '0000-00-00 00:00:00'),
(5, 8, '0000-00-00 00:00:00'),
(6, 3, '0000-00-00 00:00:00'),
(6, 2, '0000-00-00 00:00:00'),
(6, 10, '0000-00-00 00:00:00'),
(7, 8, '0000-00-00 00:00:00'),
(7, 9, '0000-00-00 00:00:00'),
(7, 5, '0000-00-00 00:00:00'),
(9, 2, '0000-00-00 00:00:00'),
(10, 1, '0000-00-00 00:00:00'),
(10, 2, '0000-00-00 00:00:00'),
(10, 5, '0000-00-00 00:00:00'),
(10, 7, '0000-00-00 00:00:00'),
(10, 9, '0000-00-00 00:00:00');


On the original data given on my first post this query worked fine. However using the data in the post above, the query seems to break down.

sorry for delay, here ya go…

first, let’s go get all my friends


SELECT 'x', user_2 AS my_friend
  FROM friends_list 
 WHERE user_1 = 1  -- that's me!!
UNION ALL 
SELECT 'y', user_1 
  FROM friends_list 
 WHERE user_2 = 1  -- that's me!!
ORDER
    BY my_friend

x  my_friend
x     2
y     3
x     4
x     5
x     7
x     8
y    10 

i used ‘x’ and ‘y’ to flag the results based on which “direction” the friendship is stored as, since you are storing only one row for each friendship

now let’s get the friends of my friends

SELECT my_f_of_fs.user_2  AS f_of_f
  FROM friends_list AS my_friends
INNER
  JOIN friends_list AS my_f_of_fs
    ON my_f_of_fs.user_1 = my_friends.user_2
 WHERE my_friends.user_1 = 1  -- that's me!!
UNION 
SELECT my_f_of_fs.user_2
  FROM friends_list AS my_friends
INNER
  JOIN friends_list AS my_f_of_fs
    ON my_f_of_fs.user_1 = my_friends.user_1
 WHERE my_friends.user_2 = 1  -- that's me!!
 ORDER 
     BY f_of_f

f_of_f
1
2
4
5
7
8
9
10

notice that 1 is included, which means i am a friend of one of my friends :wink:

so now all we have to do is find the f_of_f which aren’t my_friend

let’s use a LEFT OUTER JOIN with an IS NULL test to do this

SELECT l.f_of_f
  FROM ( SELECT my_f_of_fs.user_2  AS f_of_f         
           FROM friends_list AS my_friends           
         INNER                                       
           JOIN friends_list AS my_f_of_fs           
             ON my_f_of_fs.user_1 = my_friends.user_2
          WHERE my_friends.user_1 = 1  -- that's me!!
         UNION                                       
         SELECT my_f_of_fs.user_2                    
           FROM friends_list AS my_friends           
         INNER                                       
           JOIN friends_list AS my_f_of_fs           
             ON my_f_of_fs.user_1 = my_friends.user_1
          WHERE my_friends.user_2 = 1  -- that's me!!
      ) AS l
LEFT OUTER
  JOIN ( SELECT user_2 AS my_friend  
           FROM friends_list              
          WHERE user_1 = 1  -- that's me!!
         UNION ALL                        
         SELECT user_1               
           FROM friends_list              
          WHERE user_2 = 1  -- that's me!!
      ) AS r
    ON r.my_friend = l.f_of_f
 WHERE l.f_of_f <> 1 -- don't want myself
   AND r.my_friend IS NULL

f_of_f
  9

so 9 is a friend of a friend, but isn’t my friend

see the UNIONs involved in both subqueries? you wouldn’t have to do that if you stored two rows for each friendship

as i said, the SQL would be a lot simpler

:slight_smile:

Thanks for that solution, certainly beyond what I would have came up with ! Further down the road, with all those unions as you pointed out, I have a bad feeling this will not scale well when I have say 1000 friend connections ?

it is not necessary to quote my entire post in order to reply to it :slight_smile:

1000 friend connections is trivially small

but you are right, there is a chance of poor performance

as you may know, sometimes mysql will use a table scan if a query involves a UNION

the only way to ensure you have good performance is (1) do an EXPLAIN on the query, and (2) test it on a large volume

all things considered, if there is a chance that you could have less-than-optimal performance by storing only one row per relationship, and almost a sure guarantee that performance will be as good as possible by storing two rows per relationship, why wouldn’t you?

what does a terabyte of storage cost nowadays? under a hunnert bucks!!

you should not optimize storage, rather, you should use storage to optimize your performance

Right about the table scan, especially with the number of unions I am using.

I guess the only way I will be able to tell the real performace issues will be when I stress test the system under a heavy load.

Good point re the storage over performance.

I’m working the query you provided now so arrange the results by number of “friends of friends” in descending order so will post back here if I ever get a “clean” solution for any others that might need the same thing.

Cheers :slight_smile: