Enforcing unique values for a long varchar field in a MySQL database

Date: 2022-09-27 09:00:22

I have a database field which is very long (2083 characters) and therefore I can't set the UNIQUE constraint on it in MySQL.

I've tried to modify a stored procedure so I can check if the value being attempted to insert is unique and if not, to stop the insert. But the Stored procedure always errors. I've tried lots of different syntaxes. It's really strange, the stored procedure is fine with just the if statement or the insert but not with both.

Any help gratefully appreciated.

BEGIN
SET @insertedid := 0;
SET @alreadyfound := 0;
SELECT COUNT(lref_id) FROM listing_referrers WHERE lref_referring_url=in_lref_referring_url INTO @alreadyfound;

IF (@alreadyfound = 0) THEN
BEGIN
    INSERT INTO listing_referrers (lref_listing_id,lref_referring_url,lref_createdby_userid,lref_created) VALUES (in_lref_listing_id, in_lref_referring_url, in_lref_createdby_userid, in_lref_created);
    SET @insertedid = last_insert_id();
END
SELECT @insertedid;
END

1 Solution

#1



Idea using a Trigger

I would not use a procedure here, but would prefer a BEFORE INSERT trigger. You may find my answer at How to prevent using digits in a VARCHAR column using mysql? interesting for this; it mainly combines a CREATE TRIGGER with SIGNALing an error when an error situation is detected.

So, in your case it may be something like

DELIMITER //

CREATE TRIGGER trg_listing_referrers_unit_ins before insert on listing_referrers
for each row
begin
    IF EXISTS (SELECT * FROM listing_referrers WHERE lref_referring_url = new.lref_referring_url)
        THEN signal sqlstate '45000' set message_text = 'trying to insert duplicate lref_referring_url';
    END IF;
END
//

DELIMITER ;

Note, however, that this requires MySQL 5.5 or higher. If you have an earlier version of MySQL running, this might become ugly...
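If a trigger with SIGNAL is not an option, the procedure from the question can probably be repaired instead. As far as I can tell, it fails to compile because the IF ... THEN block is never closed with END IF; and the inner END lacks its semicolon. The following is only a sketch: the procedure name and all parameter types are assumptions, since the question shows neither.

```sql
DELIMITER //

-- Procedure name and parameter types are assumptions; adjust them to the
-- real column definitions of listing_referrers.
CREATE PROCEDURE insert_listing_referrer (
    IN in_lref_listing_id       INT,
    IN in_lref_referring_url    VARCHAR(2083),
    IN in_lref_createdby_userid INT,
    IN in_lref_created          DATETIME
)
BEGIN
    -- 0 means "nothing inserted", as in the original procedure.
    DECLARE v_inserted_id BIGINT UNSIGNED DEFAULT 0;

    IF NOT EXISTS (SELECT 1
                     FROM listing_referrers
                    WHERE lref_referring_url = in_lref_referring_url) THEN
        INSERT INTO listing_referrers
            (lref_listing_id, lref_referring_url, lref_createdby_userid, lref_created)
        VALUES
            (in_lref_listing_id, in_lref_referring_url, in_lref_createdby_userid, in_lref_created);
        SET v_inserted_id = LAST_INSERT_ID();
    END IF;  -- this closing END IF is what the original was missing

    SELECT v_inserted_id AS inserted_id;
END
//

DELIMITER ;

-- Example call with placeholder values.
CALL insert_listing_referrer(1, 'http://server.bogus/mylong/url', 42, NOW());
```

Keep in mind that a check followed by an insert is not atomic: two sessions inserting the same URL at the same moment can both pass the check. The same applies to the trigger above, so a real UNIQUE constraint (for example on a hash column) is the more robust choice.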

NB: Depending on the size of your table listing_referrers, the SELECT statement in the trigger might become a performance hog. Even if you can't define a UNIQUE index over lref_referring_url, make sure that you have a NON-UNIQUE index in place. Otherwise every insert ends up scanning the table in O(n) instead of doing an O(log n) index lookup, which may make a huge difference.
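A non-unique index for this lookup might look like the sketch below. Because the column is longer than the key length limit, it indexes only a prefix. The index name and the prefix length of 255 are assumptions; 255 characters fit into a 767-byte key for a 3-byte utf8 column, while a utf8mb4 column under the same limit would need a prefix of 191.

```sql
-- Non-unique prefix index to support the trigger's duplicate lookup.
-- Index name and prefix length are assumptions.
CREATE INDEX idx_lref_referring_url
    ON listing_referrers (lref_referring_url(255));

-- Check that the lookup actually uses the index (the URL is a placeholder).
EXPLAIN
SELECT 1
  FROM listing_referrers
 WHERE lref_referring_url = 'http://server.bogus/mylong/url';
```

The prefix index only narrows down the candidates; MySQL still compares the full column value for the rows it finds, so the result of the check stays correct.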

Idea using a Hashed Key Column

Instead of tracking uniqueness over the full VARCHAR, you also may check the uniqueness of a hashed value of your real key. For instance, you may add another column called key_hash typed CHAR(64) ASCII to your table, on which you will put a UNIQUE KEY constraint (you may even make this the primary key, if you want).
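Adding such a column to a table that already holds data could be done in three steps, roughly as sketched here. The constraint name is made up, and the sketch assumes that lref_referring_url is never NULL.

```sql
-- Step 1: add the column as nullable so that existing rows are not affected yet.
ALTER TABLE listing_referrers
    ADD COLUMN key_hash CHAR(64) CHARACTER SET ascii NULL;

-- Step 2: backfill the hash for the rows that already exist.
UPDATE listing_referrers
   SET key_hash = SHA2(lref_referring_url, 256);

-- Step 3: enforce uniqueness. This step fails if the table already contains
-- duplicate URLs, which would have to be cleaned up first.
ALTER TABLE listing_referrers
    MODIFY key_hash CHAR(64) CHARACTER SET ascii NOT NULL,
    ADD UNIQUE KEY uq_listing_referrers_key_hash (key_hash);
```

SHA2(..., 256) returns 64 hexadecimal characters, which is why CHAR(64) with a single-byte charset is sufficient.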

When inserting, you let MySQL compute the SHA2-hashed value of it on the fly, like

INSERT INTO listing_referrers (lref_referring_url, key_hash)
  VALUES("http://server.bogus/mylong/url", SHA2("http://server.bogus/mylong/url", 256));

If you enter the same URL twice, its SHA2 value will be the same, and the UNIQUE KEY constraint on key_hash will reject the second insert with a duplicate-key error.
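If the duplicate-key error should not reach the application, there are two common ways to handle it in the INSERT itself. Both are sketches; the second one assumes that lref_id is the AUTO_INCREMENT primary key of the table, which the question suggests but does not state.

```sql
-- Variant 1: silently skip the row if the hash already exists.
-- Note that IGNORE also downgrades some other errors to warnings.
INSERT IGNORE INTO listing_referrers (lref_referring_url, key_hash)
VALUES ('http://server.bogus/mylong/url',
        SHA2('http://server.bogus/mylong/url', 256));

-- 1 if the row was inserted, 0 if it was skipped as a duplicate.
SELECT ROW_COUNT() AS rows_inserted;

-- Variant 2: insert, or fetch the id of the row that is already there.
-- Assumes lref_id is an AUTO_INCREMENT primary key.
INSERT INTO listing_referrers (lref_referring_url, key_hash)
VALUES ('http://server.bogus/mylong/url',
        SHA2('http://server.bogus/mylong/url', 256))
ON DUPLICATE KEY UPDATE lref_id = LAST_INSERT_ID(lref_id);

-- Id of the new row, or of the existing row with the same hash.
SELECT LAST_INSERT_ID() AS lref_id;
```

Variant 2 comes close to what the original stored procedure tried to do, except that it returns the id of the existing row instead of 0.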

SHA-256 is designed to be collision resistant and no collision has been published to date, so it is practically impossible that you will find two URLs which have the same hash value. If you do encounter one, please post it to the cryptography community; they will be very keen to hear about your case (and so will many intelligence services).

NB: If the URL of a record changes (via UPDATE), you always need to make sure that key_hash gets updated accordingly. Otherwise, you will go crazy. If this is a real use case for you, you may also want to have a look at MySQL: Computed Column (I don't know offhand whether this also works on a primary key column).
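One way to keep the hash in sync automatically is a stored generated column, which to my knowledge requires MySQL 5.7 or later. The sketch below assumes that key_hash does not exist yet; a manually maintained column of that name would have to be dropped first. The constraint and trigger names are made up.

```sql
-- Option A: let MySQL derive the hash from the URL on every INSERT and UPDATE.
ALTER TABLE listing_referrers
    ADD COLUMN key_hash CHAR(64) CHARACTER SET ascii
        GENERATED ALWAYS AS (SHA2(lref_referring_url, 256)) STORED,
    ADD UNIQUE KEY uq_listing_referrers_key_hash (key_hash);

-- With a generated column, inserts simply leave key_hash out.
INSERT INTO listing_referrers (lref_referring_url)
VALUES ('http://server.bogus/mylong/url');

-- Option B (instead of A, for versions without generated columns):
-- keep a regular key_hash column and refresh it in a trigger.
CREATE TRIGGER trg_listing_referrers_hash_upd
BEFORE UPDATE ON listing_referrers
FOR EACH ROW
    SET NEW.key_hash = SHA2(NEW.lref_referring_url, 256);
```

Use one option or the other, not both: assigning to a generated column inside a trigger is not allowed.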

Idea of limiting the chars for uniqueness

If you are able to limit the uniqueness for the first n chars (with n < 255), then you might find Storing email VARCHAR(320) as UNIQUE, #1071 - Specified key was too long; max key length is 767 interesting.
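As a minimal sketch of that idea, a UNIQUE key can be declared on a column prefix. The key name and the prefix length of 250 are assumptions chosen to satisfy n < 255.

```sql
-- Uniqueness is enforced on the first 250 characters only.
ALTER TABLE listing_referrers
    ADD UNIQUE KEY uq_lref_referring_url_prefix (lref_referring_url(250));
```

The trade-off is that two URLs which differ only after character 250 count as duplicates, so the second one is rejected.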

Idea with limiting to 3072 chars

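If you can make sure that your URL won't exceed 3072 chars, changing the column's charset to ASCII raises the number of characters that fit into an index key, because the limit is counted in bytes and ASCII needs only one byte per character. With a 3072-byte key limit (InnoDB with large index prefixes enabled) that is 3072 characters; with the older 767-byte limit it is 767 characters, compared to 255 characters in 3-byte UTF-8. Details are described in MySQL unique 1500 varchar field error (#1071 - Specified key was too long)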
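A sketch of the charset change follows. It assumes the column is VARCHAR(2083) NOT NULL (MODIFY needs the complete column definition, and the question gives only the length) and that the table supports index keys longer than 767 bytes, which as far as I know depends on the InnoDB row format and server version. The key name is made up.

```sql
-- Check the row format first; large index prefixes need DYNAMIC or COMPRESSED.
SHOW TABLE STATUS LIKE 'listing_referrers';

-- Switch the column to a single-byte charset and index it in full.
-- If the key length limit is still 767 bytes, this fails with error 1071.
ALTER TABLE listing_referrers
    MODIFY lref_referring_url VARCHAR(2083) CHARACTER SET ascii NOT NULL,
    ADD UNIQUE KEY uq_lref_referring_url (lref_referring_url);
```

Converting to ASCII loses any non-ASCII characters already stored in the column, so this only makes sense if the URLs are stored in percent-encoded form.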

