Hi guys I am developing open source social network and my skills dont really lend themselves to mysql.
Trying to implement a feature to find the friends of a friend who I am not already friends with. Its not going well !
My DB looks like this:
TABLE: friends_list
user_1 user_2
2 1
3 2
1 4
1 5
So in theory if my userid is ‘1’ then finding the friends of friends who I am not friends with should return ‘3’.
My current query is the following, although I dont think its even close !
SELECT * FROM friends_list WHERE user_2 != ‘1’ AND user_1 IN(SELECT user_2 FROM friends_list WHERE user_1 = ‘1’) AND user_1 NOT IN(SELECT user_1 FROM friends_list WHERE user_2 = ‘1’);
This might work - and if it does, there might very well be an easier way to do it
SELECT
user_2
FROM
friends_list
WHERE
user_2 <> 1
AND
user_1
IN (
SELECT
user_2
FROM
friends_list
WHERE
user_1 = 1
UNION SELECT
user_1
FROM
friends_list
WHERE
user_2 = 1
)
UNION SELECT
user_1
FROM
friends_list
WHERE
user_1 <> 1
AND
user_2
IN (
SELECT
user_2
FROM
friends_list
WHERE
user_1 = 1
UNION SELECT
user_1
FROM
friends_list
WHERE
user_2 = 1
)
Because it’s highly impractical, if you use database tables properly. I can’t think of any example in which it is possible to properly design a database with a many-to-many relation between two tables, without a table in between to sort out the who’s-which.
Example:
[list][]A person may own many companies.
[]A company may be owned by many persons.[/list]
If you were to build this relation using only two tables, you would have two options:
[list][]The person table has as many fields as there are companies, registering potential ownership of each company under each person. This will result in absurdly large tables.
[]The person table has a single field with a comma-seperated list of companies owned. This will make it extremely inefficient and complicated to extract data using normal database queries, making the point of using a database in the first place moot.[/list]
(Note: I’m aware that this is technically four different options, as the person and company tables can be interchanged in the example, but either solution is equally impractical).
On the subject of relations, one-to-one relations should not be used either, as this is analogous to using a single table data model-wise, but much more complicated and slow when making queries.
Edit: Hmmm… Seing as you are an SQL consultant, I assume this is some sort of trick question?
okay, if you had said “without a table in between” then i would’ve understood immediately
Fair enough - my terminology is that of a hobbyist, and quite possibly a bit crude
make sense?
It does, yes. I agree that performance issues can call for a one-to-one relation. I will maintain that it’s still analogous data model-wise, though. I doubt that this will affect very many users here, however, compared to the numbers of users who would be negatively affected by using multiple tables where one is sufficient. Those who have to build such large tables are likely to know a lot more about databases than me anyway, and wouldn’t listen to anything I have to say
(I would appreciate your input on my SQL, by the way - I’m still learning :))
SELECT user_2
FROM friends_list
WHERE user_2 <> 1
AND user_1 IN
( SELECT user_2
FROM friends_list
WHERE user_1 = 1
UNION
SELECT user_1
FROM friends_list
WHERE user_2 = 1 )
UNION
SELECT user_1
FROM friends_list
WHERE user_1 <> 1
AND user_2 IN
( SELECT user_2
FROM friends_list
WHERE user_1 = 1
UNION
SELECT user_1
FROM friends_list
WHERE user_2 = 1 )
i’m not sure if that’s actually going to return what “justme” is looking for, but it sure looks like it may have some redundancy in it
let’s wait and find out if relationships are stored as two rows or one …
Hope you’re not as picky with the ladies, as you are with your SQL styling
Your suggestion will take some getting used to (been using my current style for some years now, and coding habits die hard), but I’ll definitely give it some though. Thanks for the suggestion
Thanks for all the replies. It certainly makes for an interesting read !
The relationships are stored as 1 row only. I saw most DB designs for friends to be something like User_ID, Friend_ID. However friendships are bidirectional, ie when user_2 logs in he is friends with user_1 and vice versa, so either user should be able to appear in the user_1 or user_2 column.
But i’m still not sure this is the best way to go about things. Kudos for the sql statement, so far works just as expected, thanks !
Agreed, the SQL queries have become increasingly complex, but so far ok !
Does having two rows affect performance at all ? Baring in mind I am using this for a open source social network, there is a possiblity of a large number of relationships.
Thanks for the heads up, I naturally assumed more records = longer query times. However thinking about all the bloated SQL queries I am having, you might just be on the money !
With regards to:
SELECT user_2
FROM friends_list
WHERE user_2 <> 1
AND user_1 IN
( SELECT user_2
FROM friends_list
WHERE user_1 = 1
UNION
SELECT user_1
FROM friends_list
WHERE user_2 = 1 )
UNION
SELECT user_1
FROM friends_list
WHERE user_1 <> 1
AND user_2 IN
( SELECT user_2
FROM friends_list
WHERE user_1 = 1
UNION
SELECT user_1
FROM friends_list
WHERE user_2 = 1 )
How can I tell which is the friend and which is the friend of the friend. I dont really want to make another query call to check who is the friend. Or would this complicated things even more ?
opps, so it isn’t, i saw the code that you styled and assumed.
I have a codebase of around 100 relationships for testing. Further down the road what I think I will need to do is sort the results by weighting the count of mutual friends. ie the more friends a user has in common with us who is not a friend already will appear as more relevant.