can you share it? assuming they aren’t real names, of course
attach a text file consisting of the dumped CREATE TABLE and INSERT statements
doesn’t have to be all 100, just a few will do
can you share it? assuming they aren’t real names, of course
attach a text file consisting of the dumped CREATE TABLE and INSERT statements
doesn’t have to be all 100, just a few will do
CREATE TABLE IF NOT EXISTS `friends_list` (
`user_1` int(11) NOT NULL,
`user_2` int(11) NOT NULL,
`created_on` datetime NOT NULL,
KEY `user_1` (`user_1`,`user_2`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `friends_list`
--
INSERT INTO `friends_list` (`user_1`, `user_2`, `created_on`) VALUES
(3, 2, '0000-00-00 00:00:00'),
(1, 5, '0000-00-00 00:00:00'),
(3, 1, '0000-00-00 00:00:00'),
(1, 2, '0000-00-00 00:00:00'),
(1, 4, '0000-00-00 00:00:00'),
(1, 7, '0000-00-00 00:00:00'),
(1, 8, '0000-00-00 00:00:00'),
(2, 7, '0000-00-00 00:00:00'),
(3, 8, '0000-00-00 00:00:00'),
(3, 4, '0000-00-00 00:00:00'),
(3, 10, '0000-00-00 00:00:00'),
(4, 7, '0000-00-00 00:00:00'),
(4, 8, '0000-00-00 00:00:00'),
(4, 9, '0000-00-00 00:00:00'),
(5, 8, '0000-00-00 00:00:00'),
(6, 3, '0000-00-00 00:00:00'),
(6, 2, '0000-00-00 00:00:00'),
(6, 10, '0000-00-00 00:00:00'),
(7, 8, '0000-00-00 00:00:00'),
(7, 9, '0000-00-00 00:00:00'),
(7, 5, '0000-00-00 00:00:00'),
(9, 2, '0000-00-00 00:00:00'),
(10, 1, '0000-00-00 00:00:00'),
(10, 2, '0000-00-00 00:00:00'),
(10, 5, '0000-00-00 00:00:00'),
(10, 7, '0000-00-00 00:00:00'),
(10, 9, '0000-00-00 00:00:00');
On the original data given on my first post this query worked fine. However using the data in the post above, the query seems to break down.
sorry for delay, here ya go…
first, let’s go get all my friends
SELECT 'x', user_2 AS my_friend
FROM friends_list
WHERE user_1 = 1 -- that's me!!
UNION ALL
SELECT 'y', user_1
FROM friends_list
WHERE user_2 = 1 -- that's me!!
ORDER
BY my_friend
x my_friend
x 2
y 3
x 4
x 5
x 7
x 8
y 10
i used ‘x’ and ‘y’ to flag the results based on which “direction” the friendship is stored as, since you are storing only one row for each friendship
now let’s get the friends of my friends
SELECT my_f_of_fs.user_2 AS f_of_f
FROM friends_list AS my_friends
INNER
JOIN friends_list AS my_f_of_fs
ON my_f_of_fs.user_1 = my_friends.user_2
WHERE my_friends.user_1 = 1 -- that's me!!
UNION
SELECT my_f_of_fs.user_2
FROM friends_list AS my_friends
INNER
JOIN friends_list AS my_f_of_fs
ON my_f_of_fs.user_1 = my_friends.user_1
WHERE my_friends.user_2 = 1 -- that's me!!
ORDER
BY f_of_f
f_of_f
1
2
4
5
7
8
9
10
notice that 1 is included, which means i am a friend of one of my friends
so now all we have to do is find the f_of_f which aren’t my_friend
let’s use a LEFT OUTER JOIN with an IS NULL test to do this
SELECT l.f_of_f
FROM ( SELECT my_f_of_fs.user_2 AS f_of_f
FROM friends_list AS my_friends
INNER
JOIN friends_list AS my_f_of_fs
ON my_f_of_fs.user_1 = my_friends.user_2
WHERE my_friends.user_1 = 1 -- that's me!!
UNION
SELECT my_f_of_fs.user_2
FROM friends_list AS my_friends
INNER
JOIN friends_list AS my_f_of_fs
ON my_f_of_fs.user_1 = my_friends.user_1
WHERE my_friends.user_2 = 1 -- that's me!!
) AS l
LEFT OUTER
JOIN ( SELECT user_2 AS my_friend
FROM friends_list
WHERE user_1 = 1 -- that's me!!
UNION ALL
SELECT user_1
FROM friends_list
WHERE user_2 = 1 -- that's me!!
) AS r
ON r.my_friend = l.f_of_f
WHERE l.f_of_f <> 1 -- don't want myself
AND r.my_friend IS NULL
f_of_f
9
so 9 is a friend of a friend, but isn’t my friend
see the UNIONs involved in both subqueries? you wouldn’t have to do that if you stored two rows for each friendship
as i said, the SQL would be a lot simpler
Thanks for that solution, certainly beyond what I would have came up with ! Further down the road, with all those unions as you pointed out, I have a bad feeling this will not scale well when I have say 1000 friend connections ?
it is not necessary to quote my entire post in order to reply to it
1000 friend connections is trivially small
but you are right, there is a chance of poor performance
as you may know, sometimes mysql will use a table scan if a query involves a UNION
the only way to ensure you have good performance is (1) do an EXPLAIN on the query, and (2) test it on a large volume
all things considered, if there is a chance that you could have less-than-optimal performance by storing only one row per relationship, and almost a sure guarantee that performance will be as good as possible by storing two rows per relationship, why wouldn’t you?
what does a terabyte of storage cost nowadays? under a hunnert bucks!!
you should not optimize storage, rather, you should use storage to optimize your performance
Right about the table scan, especially with the number of unions I am using.
I guess the only way I will be able to tell the real performace issues will be when I stress test the system under a heavy load.
Good point re the storage over performance.
I’m working the query you provided now so arrange the results by number of “friends of friends” in descending order so will post back here if I ever get a “clean” solution for any others that might need the same thing.
Cheers