Which is faster/more efficient - lots of small MySQL queries or one big PHP array?

Date: 2022-05-19 21:03:39

I have a PHP/MySQL based web application which has multiple language support by way of a MySQL table "language_strings" with string_id, lang_id, lang_text fields. I then call the following function when I need to display a string in the selected language...

public function get_lang_string($string_id, $lang_id) {
    $db = new Database();
    $sql = sprintf(
        "SELECT lang_string FROM language_strings WHERE lang_id IN (1, %s) AND string_id=%s ORDER BY lang_id DESC LIMIT 1",
        $db->escape($lang_id, "int"),
        $db->escape($string_id, "int")
    );
    $row = $db->query_first($sql);
    return $row['lang_string'];
}

This works perfectly but I am concerned that there could be a lot of database queries going on. e.g. the main menu has 5 link texts, all of which call this function.

Would it be faster to load the entire language_strings table results for the selected lang_id into a PHP array and then call that from the function? Potentially that would be a huge array with much of it redundant but clearly it would be one database query per pageload instead of lots.

Can anyone suggest another more efficient way of doing this?

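For reference, the array approach I have in mind would look something like this (a sketch reusing the hypothetical Database wrapper from the function above; the selected language's rows overwrite the lang_id 1 defaults, mirroring the IN (1, %s) fallback):

```php
// Sketch: one query per pageload instead of one query per string.
// Assumes $db->query() returns iterable associative rows.
function load_lang_strings($lang_id) {
    $db = new Database();
    $sql = sprintf(
        "SELECT string_id, lang_string FROM language_strings WHERE lang_id IN (1, %s) ORDER BY lang_id ASC",
        $db->escape($lang_id, "int")
    );
    $strings = array();
    foreach ($db->query($sql) as $row) {
        // Rows for the selected lang_id sort after lang_id 1, so they
        // overwrite the default-language entries.
        $strings[$row['string_id']] = $row['lang_string'];
    }
    return $strings;
}
```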

6 Answers

#1


7  

There isn't a one-size-fits-all answer; you really have to look at it case by case. Having said that, the majority of the time it will be quicker to get all the data in one query, pop it into an array or object and refer to it from there.

The caveat is whether you can pull all the data you need in one query as quickly as running the five individual ones. That is where the performance of the query itself comes into play.

Sometimes a query that contains a subquery or two will actually be less time-efficient than running a few queries individually.

My suggestion is to test it out. Put together a query that gets all the data you need and see how long it takes to execute. Time each of the other five queries and see how long they take combined. If it is almost identical, stick the output into an array; that will be more efficient, because you avoid making frequent connections to the database itself.

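A minimal way to time the two approaches (a generic sketch; `microtime(true)` returns a float timestamp in seconds, and the commented callbacks are placeholders for your own queries):

```php
// Run each variant many times and compare wall-clock time.
function time_it($label, $fn, $iterations = 1000) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsed = microtime(true) - $start;
    printf("%s: %.4f seconds for %d iterations\n", $label, $elapsed, $iterations);
    return $elapsed;
}

// Hypothetical usage:
// time_it('five individual queries', function () { /* run the 5 queries */ });
// time_it('one combined query',      function () { /* run the single query */ });
```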

If, however, your combined query takes longer to return data (it might cause a full table scan instead of using indexes, for example), then stick to individual ones.

Lastly, if you are going to use the same data over and over, an array or object will win hands down every single time, as accessing it will be much faster than getting it from a database.

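For the "same data over and over" case, even a simple request-level cache keeps each string to at most one query per pageload (a sketch that wraps the question's function, assuming it is callable as shown):

```php
// Memoize lookups so each (lang_id, string_id) pair hits MySQL at most
// once per request.
$GLOBALS['lang_cache'] = array();

function get_lang_string_cached($string_id, $lang_id) {
    $key = $lang_id . ':' . $string_id;
    if (!array_key_exists($key, $GLOBALS['lang_cache'])) {
        // First lookup falls through to the original per-string query.
        $GLOBALS['lang_cache'][$key] = get_lang_string($string_id, $lang_id);
    }
    return $GLOBALS['lang_cache'][$key];
}
```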

#2


4  

OK - I did some benchmarking and was surprised to find that putting things into an array rather than using individual queries was, on average, 10-15% SLOWER.

I think the reason for this was that, even when I filtered out the "uncommon" elements, there were inevitably always going to be unused elements as a matter of course.

With the individual queries I am only ever getting out what I need and as the queries are so simple I think I am best sticking with that method.

This works for me; of course, in other situations where the individual queries are more complex, I think the method of storing common data in an array would turn out to be more efficient.

#3


3  

Agree with what everybody says here.. it's all about the numbers.

Some additional tips:

  1. Try to create a single memory array which holds the minimum you require. This means removing most of the obvious redundancies.

  2. There are standard approaches for these issues in performance critical environments, like using memcached with mysql. It's a bit overkill, but this basically lets you allocate some external memory and cache your queries there. Since you choose how much memory you want to allocate, you can plan it according to how much memory your system has.

  3. Just play with the numbers. Try using separate queries (the simplest approach) and stress-test your PHP script (e.g. by calling it hundreds of times from the command line). Measure how long this takes and see how big the performance loss actually is. Speaking from personal experience, I usually cache everything in memory and then one day, when the data gets too big, I run out of memory. Then I split everything into separate queries to save memory, and find that the performance impact wasn't that bad in the first place :)

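As a sketch of point 2, using PHP's Memcached extension (assumes a memcached server on localhost:11211; the key prefix, the 10-minute TTL, and the `load_lang_strings_from_db()` helper are all placeholders):

```php
function get_lang_strings_cached($lang_id) {
    $mc = new Memcached();
    $mc->addServer('localhost', 11211);

    $key = 'lang_strings_' . (int) $lang_id;
    $strings = $mc->get($key);
    if ($strings === false) {
        // Cache miss: query MySQL once, then keep the result for 10 minutes.
        $strings = load_lang_strings_from_db($lang_id); // hypothetical helper
        $mc->set($key, $strings, 600);
    }
    return $strings;
}
```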

#4


1  

I'm with Fluffeh on this: look into the other options at your disposal (joins, subqueries, making sure your indexes reflect the relationships in the data - but don't over-index, and test). Most likely you'll end up with an array at some point, so here's a little performance tip. Contrary to what you might expect, something like

$all = $stmt->fetchAll(PDO::FETCH_ASSOC);

is less memory efficient compared to:

$all = array(); // or $all = []; in PHP 5.4+
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $all[] = $row['lang_string'];
}

What's more: you can check for redundant data while fetching the data.

#5


1  

My answer is to do something in between. Retrieve all strings for a lang_id that are shorter than a certain length (say, 100 characters). Shorter text strings are more likely to be used in multiple places than longer ones. Cache the entries in a static associative array in get_lang_string(). If an item isn't found, then retrieve it through a query.

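That hybrid could look roughly like this (a sketch following the question's code; the 100-character cutoff is arbitrary, and the static cache assumes one lang_id per request):

```php
public function get_lang_string($string_id, $lang_id) {
    static $cache = null;
    $db = new Database();
    if ($cache === null) {
        // Preload short strings once; longer ones are fetched on demand.
        $cache = array();
        $sql = sprintf(
            "SELECT string_id, lang_string FROM language_strings WHERE lang_id IN (1, %s) AND CHAR_LENGTH(lang_string) < 100 ORDER BY lang_id ASC",
            $db->escape($lang_id, "int")
        );
        foreach ($db->query($sql) as $row) {
            $cache[$row['string_id']] = $row['lang_string'];
        }
    }
    if (!isset($cache[$string_id])) {
        // Cache miss (a long string): fall back to the original one-row query.
        $sql = sprintf(
            "SELECT lang_string FROM language_strings WHERE lang_id IN (1, %s) AND string_id=%s ORDER BY lang_id DESC LIMIT 1",
            $db->escape($lang_id, "int"),
            $db->escape($string_id, "int")
        );
        $row = $db->query_first($sql);
        $cache[$string_id] = $row['lang_string'];
    }
    return $cache[$string_id];
}
```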

#6


0  

I am currently at the point in my site/application where I have had to put the brakes on and think very carefully about speed. I think the speed tests mentioned should treat the volume of traffic on your server as an important variable that will affect the results. If you put data into JavaScript data structures and process it on the client machine, the processing time should be more consistent. If you request lots of data through MySQL via PHP (for example), that puts demand on one machine/server rather than spreading it. As your traffic grows you have to share server resources among many users, and I think this is where getting JavaScript to do more will lighten the load on the server. You can also store data on the local machine via localStorage.setItem() / localStorage.getItem() (most browsers allow about 5 MB of space per domain). If you have data in the database that does not change often, you can store it on the client and then just check at 'start-up' whether it is still valid.

This is my first comment posted after having the account for a year, so I might need to fine-tune my rambling - just voicing what I'm thinking through at present.
