
Time: 2022-01-21 00:32:55

I have a pretty basic question on which is the preferred way of storing data in my database.


I have a table called "users" with each user getting a username and user_id. Now, I want to make a table called "comments" for users to comment on news.


Is it better to have a column in comments called "username" that stores the logged-in user's name, or a column called "user_id"? If I use user_id, my SQL statement would need another SELECT: "(SELECT username FROM users WHERE users.id = comments.user_id) AS username". It seems like performance would be better if I just stored the username.


I thought I had read that you should avoid duplicating data in a database, though.


Which is better?



8 Solutions


Typically, you use ID fields to link tables together. In your situation, the reason is that you might allow a person to change their username, and you don't want to have to update every place where it is stored.


Therefore, put the user_id in your comments table and pull the username out on a join, as you've shown.
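To make the rename argument concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The `comment_id` and `body` columns and the sample rows are assumptions made up for illustration; only `users`, `comments`, `user_id`, and `username` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    -- comment_id and body are hypothetical columns for this sketch.
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username) VALUES (1, 'alice');
    INSERT INTO comments (user_id, body)
    VALUES (1, 'First!'), (1, 'Nice article.');
""")

def comments_with_usernames(conn):
    """Return (username, body) pairs, resolving the name through a join."""
    return conn.execute("""
        SELECT users.username, comments.body
        FROM comments
        JOIN users ON users.user_id = comments.user_id
        ORDER BY comments.comment_id
    """).fetchall()

before = comments_with_usernames(conn)

# The rename touches exactly one row in users; comments is not modified.
conn.execute("UPDATE users SET username = 'alice_2' WHERE user_id = 1")

after = comments_with_usernames(conn)
```

Because the comments only hold the ID, every existing comment picks up the new name on the next read without any update to the comments table.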



If the user_id is the primary key, then you should use user_id instead of username. If you want to use username instead of user_id, then why do you have a user_id in the first place?



If the database could grow large, store the user_id in the comments table: there is less overhead. Also consider that usernames may be modified more easily this way.



Data should be stored in (at least) third normal form, so you should use the user_id as the primary key in the users table and as a foreign key in the comments table, and use this to get the details:


SELECT comments.*, users.username
FROM comments
JOIN users ON users.user_id = comments.user_id;
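The primary key / foreign key relationship described above could be declared roughly as follows. This is a sketch using Python's sqlite3 module, not a definitive schema: the `comment_id` and `body` columns are hypothetical, and the `PRAGMA foreign_keys` line is specific to SQLite, which does not enforce foreign keys unless it is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    -- comment_id and body are hypothetical columns for this sketch.
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username) VALUES (1, 'alice');
""")

# A comment that points at an existing user is accepted.
conn.execute("INSERT INTO comments (user_id, body) VALUES (?, ?)", (1, "Hello"))

# A comment that points at a missing user violates the foreign key.
try:
    conn.execute("INSERT INTO comments (user_id, body) VALUES (?, ?)", (999, "Orphan"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

stored = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

With the constraint in place, the database itself guarantees that every comment can be joined back to a row in users.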

If you are getting the comments based on an article, you could do this like this:


-- Bind the article ID as a query parameter instead of interpolating it into the string.
SELECT comments.*, users.username
FROM comments
JOIN users ON users.user_id = comments.user_id
WHERE comments.article_id = ?;
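As a hedged illustration of the per-article query, the sketch below runs it through Python's sqlite3 module and passes the article ID as a bound parameter rather than pasting it into the SQL string. The function name `comments_for_article`, the `comment_id` and `body` columns, and the sample rows are all assumptions for this example; `article_id` is taken from the query above.

```python
import sqlite3

def comments_for_article(conn, article_id):
    """Return (username, body) pairs for one article, oldest comment first."""
    query = """
        SELECT users.username, comments.body
        FROM comments
        JOIN users ON users.user_id = comments.user_id
        WHERE comments.article_id = ?
        ORDER BY comments.comment_id
    """
    # The driver binds article_id as data; it is never parsed as SQL.
    return conn.execute(query, (article_id,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    -- comment_id and body are hypothetical columns for this sketch.
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        article_id INTEGER NOT NULL,
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO comments (user_id, article_id, body) VALUES
        (1, 10, 'Great read'),
        (2, 10, 'Agreed'),
        (1, 11, 'Other article');
""")

rows = comments_for_article(conn, 10)
```

Other database drivers use different placeholder styles (for example `%s` or named parameters), so the `?` here should be read as SQLite's convention rather than a universal one.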


Storing the user ID (an integer) will mean faster JOINs later. Unless you plan on having people dig through the database by hand, there's really no reason to store the username.



I'm pretty sure storing the user id in the comments table is sufficient. If you're returning rows from the comments table, just use the JOIN statement.




Which is going to be a unique identifier? The user_id, I'd bet, or you can't have two "John Smith"s in your system.


And if volume is much of a concern, text matching on the username field is going to be more expensive in the long run than linking to the users table in your query.
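The "two John Smiths" point can be shown with a small sketch, again using Python's sqlite3 module. It assumes names are not required to be unique, and the `comment_id` and `body` columns and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL          -- deliberately not UNIQUE in this sketch
    );
    -- comment_id and body are hypothetical columns for this sketch.
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username)
    VALUES (1, 'John Smith'), (2, 'John Smith');
    INSERT INTO comments (user_id, body) VALUES
        (1, 'From the first John'),
        (2, 'From the second John'),
        (2, 'Second John again');
""")

# Matching on the name lumps both people together.
by_name = conn.execute("""
    SELECT COUNT(*)
    FROM comments
    JOIN users ON users.user_id = comments.user_id
    WHERE users.username = 'John Smith'
""").fetchone()[0]

# Grouping on the ID keeps each person's comments separate.
by_id = dict(conn.execute(
    "SELECT user_id, COUNT(*) FROM comments GROUP BY user_id"
).fetchall())
```

If the comments table stored only the name, the two authors could not be told apart; with the ID, the name is free to be non-unique.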



Numeric values are cheaper to join and index than an alphanumeric ID. Use a number to uniquely identify a row. Another benefit is that the primary key doesn't need to change if a user needs to change their username. The last benefit is that this is the design used by most modern web frameworks, such as Django and Rails.



