How to find similar binary strings in a MySQL database?

Time: 2021-09-12 01:11:10

I have a database with binary strings like these:

record no 1: 1111111111111011000100110001100100010000000000000011000000000000
record no 2: 1111111111111111111111100001100000010000000000000011000000000000
record no 3: 1110000011110000111010001110111011110000111100001100000011000000
...

So I want to find out which records have a binary string similar to this one: 1111111111111011000100110001100100010000000000000011000000001100

As you can see, record 1 is about 98% similar, record 2 is about 70% similar, and record 3 is only about 45% similar.

This is a huge database (200,000 records)...

1 solution

#1


SELECT * FROM MY_TABLE ORDER BY BIT_COUNT(CAST(CONV(record, 2, 10) AS UNSIGNED INTEGER) ^ CAST(b'11...0' AS UNSIGNED INTEGER)) LIMIT 1;

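The query above returns the most similar record, where b'11...0' stands for the 64-bit input string. It XORs each record with the input and counts the set bits of the result, which is the number of bit positions where the two differ (the Hamming distance); ordering by that count puts the closest record first.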

You can also SELECT the BIT_COUNT value itself: its minimum of 0 means the record is identical to the input (100% similar), and its maximum of 64 means that every bit differs (record = ~input), i.e. 0% similar.
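To see what this XOR-and-count metric yields on the sample data, here is a minimal Python sketch that mirrors it outside MySQL. The helper names (`hamming`, `similarity`, `rank`) and the `WIDTH` constant are illustrative assumptions, not part of the original answer or of MySQL; the percentage is taken as matching bits divided by 64, following the 0-to-64 range described above.

```python
# Sketch of the BIT_COUNT(x ^ y) metric, assuming fixed-width bit strings.
# WIDTH = 64 matches the strings in the question; all names here are
# hypothetical helpers for illustration only.
WIDTH = 64


def hamming(a: str, b: str) -> int:
    """Number of differing bit positions, like BIT_COUNT(x ^ y)."""
    return bin(int(a, 2) ^ int(b, 2)).count("1")


def similarity(a: str, b: str, width: int = WIDTH) -> float:
    """Percentage of matching bits: 100 * (width - distance) / width."""
    return 100.0 * (width - hamming(a, b)) / width


def rank(records: dict, query: str) -> list:
    """Record ids ordered from most to least similar (smallest distance first)."""
    return sorted(records, key=lambda rid: hamming(records[rid], query))


# Sample data copied from the question.
records = {
    1: "1111111111111011000100110001100100010000000000000011000000000000",
    2: "1111111111111111111111100001100000010000000000000011000000000000",
    3: "1110000011110000111010001110111011110000111100001100000011000000",
}
query = "1111111111111011000100110001100100010000000000000011000000001100"

for rid in rank(records, query):
    dist = hamming(records[rid], query)
    print(f"record {rid}: {dist} differing bits, "
          f"{similarity(records[rid], query):.1f}% similar")
```

Because this counts exact bit matches, the percentages it prints need not agree with the rough estimates given in the question, which do not state how they were computed.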
