Implementing database search through queries using LIKE

Time: 2022-09-12 23:57:46

I am planning to implement database search through a website. I know that MySQL offers full-text search, but it turns out that it is not supported for the InnoDB engine (which I need for transaction support). Other options are Sphinx or similar indexing applications. However, they require some refactoring of the database structure and may take more time to implement than I have.

So what I decided on was to take each table and concatenate all its relevant columns into a newly added QUERY column. This QUERY column should also pull in columns from other relevant tables.

This accomplished, I will use the LIKE clause on the QUERY column of the table being searched to return results for specific domains (groups of related tables).
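
To make the plan above concrete, here is a minimal sketch in MySQL. The `Author` and `Article` tables, their columns, and the `query_text` column name are all hypothetical and chosen only for illustration; the post itself just speaks of a QUERY column.

```sql
-- Hypothetical schema; none of these names come from the original post.
CREATE TABLE Author (
  author_id INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE Article (
  article_id INT PRIMARY KEY,
  author_id INT NOT NULL,
  title VARCHAR(200) NOT NULL,
  body TEXT,
  query_text TEXT  -- the concatenated "QUERY" column
) ENGINE=InnoDB;

-- Fill the search column from the table's own columns plus a related table.
-- CONCAT_WS skips NULL arguments, so a NULL body does not turn the result into NULL.
UPDATE Article AS a
JOIN Author AS au ON au.author_id = a.author_id
SET a.query_text = CONCAT_WS(' ', a.title, a.body, au.name);

-- A "contains" search over the concatenated text.
SELECT article_id, title
FROM Article
WHERE query_text LIKE '%transaction%';
```

Under this sketch the concatenated column has to be refreshed whenever a source row in either table changes, for example by the application or by triggers.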

Since my database is not expected to be too huge (< 1mn rows in the biggest table), I am expecting reasonable query times.

Does anyone agree with this method or have a better idea?

3 Solutions

#1


6  

You will not be happy with the solution of using LIKE with wildcards. It performs hundreds or thousands of times slower than using a fulltext search technology.

See my presentation Practical Full-Text Search in MySQL.

Instead of copying the values into a QUERY column, I would recommend copying the values into a MyISAM table where you have a FULLTEXT index defined. You could use triggers to do this.

You don't need to concatenate the values together; you just need the primary key column and each of your searchable text columns.

CREATE TABLE OriginalTable (
  original_id SERIAL PRIMARY KEY,
  author_id INT,
  author_date DATETIME,
  summary TEXT,
  body TEXT
) ENGINE=InnoDB;

CREATE TABLE SearchTable (
  original_id BIGINT UNSIGNED PRIMARY KEY, -- not auto-increment
  -- author_id INT,
  -- author_date DATETIME,
  summary TEXT,
  body TEXT,
  FULLTEXT KEY (summary, body)
) ENGINE=MyISAM;
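
The trigger idea mentioned above could be sketched roughly as follows. This is an illustration built on the two tables just defined, not code from the answer; the trigger names are made up, and the search phrase is only an example.

```sql
-- Keep SearchTable in step with OriginalTable (trigger names are hypothetical).
CREATE TRIGGER OriginalTable_ins AFTER INSERT ON OriginalTable
FOR EACH ROW
  INSERT INTO SearchTable (original_id, summary, body)
  VALUES (NEW.original_id, NEW.summary, NEW.body);

CREATE TRIGGER OriginalTable_upd AFTER UPDATE ON OriginalTable
FOR EACH ROW
  UPDATE SearchTable
     SET summary = NEW.summary, body = NEW.body
   WHERE original_id = NEW.original_id;

CREATE TRIGGER OriginalTable_del AFTER DELETE ON OriginalTable
FOR EACH ROW
  DELETE FROM SearchTable WHERE original_id = OLD.original_id;

-- Search the MyISAM copy, then join back to the InnoDB table for the full row.
-- The column list in MATCH() must be the same as in the FULLTEXT index definition.
SELECT o.*
FROM SearchTable AS s
JOIN OriginalTable AS o ON o.original_id = s.original_id
WHERE MATCH(s.summary, s.body) AGAINST ('transaction support');
```

One caveat with this arrangement: MyISAM is not transactional, so if a transaction on OriginalTable is rolled back, the change the trigger already made to SearchTable stays in place. The search copy may therefore need an occasional resync.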

#2


1  

You'll want to add an index to your query column. If there is a wildcard at the beginning of the search expression, MySQL cannot use the index.

If you do any search other than "equals" (LIKE 'test') or "begins with" (LIKE 'test%'), MySQL will have to scan every row. For example, a "contains" search (LIKE '%test%') is unable to use the index.
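
One way to see the difference is to compare the plans with EXPLAIN. The sketch below assumes a hypothetical `Article` table with a `query_text` column; both names are illustrative. Because the column is TEXT, the index needs a prefix length, and 100 is an arbitrary choice.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE Article (
  article_id INT PRIMARY KEY,
  title VARCHAR(200) NOT NULL,
  query_text TEXT NOT NULL,
  KEY idx_query_text (query_text(100))  -- TEXT columns require an index prefix length
) ENGINE=InnoDB;

-- "Begins with": the fixed prefix lets the optimizer consider a range scan on idx_query_text.
EXPLAIN SELECT * FROM Article WHERE query_text LIKE 'test%';

-- "Contains": the leading wildcard rules out a range scan on the index,
-- so the rows have to be examined one by one.
EXPLAIN SELECT * FROM Article WHERE query_text LIKE '%test%';
```

On a populated table, the `type` column of the first plan would typically read `range` and the second `ALL`, although the optimizer's actual choice depends on the MySQL version and the table statistics.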

You could allow an "ends with" search (LIKE '%test'), but you'd have to build a reversed column to index on, so that the query actually becomes a "begins with" search on the reversed text (LIKE 'tset%') and can use the index.
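
A minimal sketch of the reversed-column trick, assuming a hypothetical `Article` table with a `query_text` column and a `query_text_rev` companion column (all names are illustrative):

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE Article (
  article_id INT PRIMARY KEY,
  query_text VARCHAR(190) NOT NULL,
  query_text_rev VARCHAR(190) NOT NULL,  -- holds REVERSE(query_text)
  KEY idx_query_text_rev (query_text_rev)
) ENGINE=InnoDB;

-- Populate (or refresh) the reversed copy.
UPDATE Article SET query_text_rev = REVERSE(query_text);

-- "Ends with test" becomes "begins with tset" on the reversed column.
SELECT article_id
FROM Article
WHERE query_text_rev LIKE CONCAT(REVERSE('test'), '%');
```

The reversed copy has to be kept current whenever `query_text` changes, which is the extra maintenance cost of supporting "ends with" this way.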

Any full scan is going to be slow, and the more rows, the slower it will be. The larger the field, the slower it will be.

You can see the limitation of using LIKE. Therefore, you might create a table called Tags, where you link individual key words to each entry rather than using the entire text, but I would still stick to "equals" and "begins with", even with tags.
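
A Tags table along these lines might look like the sketch below. The column names and sizes are assumptions made for illustration; the answer only names the table.

```sql
-- Hypothetical schema; column names and sizes are illustrative only.
CREATE TABLE Tags (
  entry_id BIGINT UNSIGNED NOT NULL,  -- primary key of the tagged entry
  tag VARCHAR(64) NOT NULL,
  PRIMARY KEY (entry_id, tag),
  KEY idx_tag (tag)
) ENGINE=InnoDB;

-- "Equals": entries tagged exactly 'mysql'.
SELECT entry_id FROM Tags WHERE tag = 'mysql';

-- "Begins with": entries having any tag that starts with 'my'.
SELECT DISTINCT entry_id FROM Tags WHERE tag LIKE 'my%';
```

Both queries have a fixed prefix, so they can be served by `idx_tag` rather than a scan of every row.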

Using LIKE without the aid of an index should be limited to the rare ad-hoc query or very small data sets.

#3


0  

No, it is not optimal, since it forces MySQL to read every row. But if your table is small (I don't know what <1mn means), then it could be acceptable to some extent.

Also, you can limit the search feature. For example, some sites limit searches to no more than one request every x minutes, while others force you to enter a CAPTCHA.
