How do I make my application scale well?

Time: 2022-02-19 11:33:01

In general, what kinds of design decisions help an application scale well?

(Note: Having just learned about Big O Notation, I'm looking to gather more principles of programming here. I've attempted to explain Big O Notation by answering my own question below, but I want the community to improve both this question and the answers.)

Responses so far
1) Define scaling. Do you need to scale for lots of users, traffic, objects in a virtual environment?
2) Look at your algorithms. Will the amount of work they do scale linearly with the actual amount of work - i.e. number of items to loop through, number of users, etc.? (A quick timing check is sketched after this list.)
3) Look at your hardware. Is your application designed such that you can run it on multiple machines if one can't keep up?
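
On point 2, a quick empirical check is often enough before reaching for formal analysis: time the same code at two input sizes and see how the ratio behaves. A minimal sketch follows; the `process` function is a hypothetical stand-in for whatever work you are measuring.

    import time

    def process(items):
        # Hypothetical stand-in for the work under test: a quadratic all-pairs pass.
        return sum(1 for a in items for b in items)

    def measure(n):
        data = list(range(n))
        start = time.perf_counter()
        process(data)
        return time.perf_counter() - start

    t1, t2 = measure(2000), measure(4000)
    # Linear work roughly doubles when N doubles; quadratic work roughly quadruples.
    print(f"N=2000: {t1:.3f}s  N=4000: {t2:.3f}s  ratio: {t2 / t1:.1f}x")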

Secondary thoughts
1) Don't optimize too much too soon - test first. Maybe bottlenecks will happen in unforeseen places.
2) Maybe the need to scale will not outpace Moore's Law, and maybe upgrading hardware will be cheaper than refactoring.

7 solutions

#1 (11 votes)

The only thing I would say is write your application so that it can be deployed on a cluster from the very start. Anything above that is a premature optimisation. Your first job should be getting enough users to have a scaling problem.

Build the code as simple as you can first, then profile the system second and optimise only when there is an obvious performance problem.

Often the figures from profiling your code are counter-intuitive; the bottle-necks tend to reside in modules you didn't think would be slow. Data is king when it comes to optimisation. If you optimise the parts you think will be slow, you will often optimise the wrong things.
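
As a concrete starting point, here is a minimal sketch of collecting those figures with Python's built-in cProfile; `handle_request` is a hypothetical hot path standing in for an entry point in your own application.

    import cProfile
    import pstats

    def handle_request():
        # Hypothetical hot path; swap in a real entry point from your application.
        return sum(i * i for i in range(200_000))

    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(50):
        handle_request()
    profiler.disable()

    # Sort by cumulative time so the real bottlenecks surface first,
    # whether or not they live in the modules you expected to be slow.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)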

#2 (6 votes)

Ok, so you've hit on a key point in using the "big O notation". That's one dimension that can certainly bite you in the rear if you're not paying attention. There are also other dimensions at play that some folks don't see through the "big O" glasses (but if you look closer they really are).

A simple example of that dimension is a database join. There are "best practices" in constructing, say, a left or inner join which will help the SQL execute more efficiently. If you break down the relational calculus or even look at an explain plan (Oracle) you can easily see which indexes are being used in which order and if any table scans or nested operations are occurring.
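
As a small illustration of the same idea, SQLite's EXPLAIN QUERY PLAN (a lightweight cousin of Oracle's explain plan) shows whether a join hits an index or falls back to a table scan; the table and column names below are made up for the example.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE address (id INTEGER PRIMARY KEY, person_id INTEGER, city TEXT);
    """)

    query = """
        SELECT p.name, a.city
        FROM person p JOIN address a ON a.person_id = p.id
        WHERE p.id = ?
    """

    # Without an index on address.person_id the planner scans the whole address table.
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,)):
        print(row)

    conn.execute("CREATE INDEX idx_address_person ON address(person_id)")

    # With the index in place the same join becomes an index search.
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,)):
        print(row)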

The concept of profiling is also key. You have to instrument thoroughly, and at the right granularity, across all the moving parts of the architecture in order to identify and fix any inefficiencies. Say for example you're building a 3-tier, multi-threaded, MVC2 web-based application with liberal use of AJAX and client side processing along with an OR Mapper between your app and the DB. A simplistic linear single request/response flow looks like:

browser -> web server -> app server -> DB -> app server -> XSLT -> web server -> browser JS engine execution & rendering

You should have some method for measuring performance (response times, throughput measured in "stuff per unit time", etc.) in each of those distinct areas, not only at the box and OS level (CPU, memory, disk i/o, etc.), but specific to each tier's service. So on the web server you'll need to know all the counters for the web server you're using. In the app tier, you'll need that plus visibility into whatever virtual machine you're using (jvm, clr, whatever). Most OR mappers manifest inside the virtual machine, so make sure you're paying attention to all the specifics if they're visible to you at that layer. Inside the DB, you'll need to know everything that's being executed and all the specific tuning parameters for your flavor of DB. If you have big bucks, BMC Patrol is a pretty good bet for most of it (with appropriate knowledge modules (KMs)). At the cheap end, you can certainly roll your own but your mileage will vary based on your depth of expertise.
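
A roll-your-own version of that instrumentation can be as small as a timing decorator that keeps per-tier call counts and latencies. The sketch below assumes a hypothetical render_page call; in practice these numbers would feed whatever monitoring each tier already exposes.

    import time
    from collections import defaultdict

    class TierMetrics:
        """Rudimentary per-tier response-time and throughput counters."""

        def __init__(self):
            self.calls = defaultdict(int)
            self.seconds = defaultdict(float)

        def timed(self, tier):
            def decorator(fn):
                def wrapper(*args, **kwargs):
                    start = time.perf_counter()
                    try:
                        return fn(*args, **kwargs)
                    finally:
                        self.calls[tier] += 1
                        self.seconds[tier] += time.perf_counter() - start
                return wrapper
            return decorator

        def report(self):
            for tier, count in self.calls.items():
                print(f"{tier}: {count} calls, avg {1000 * self.seconds[tier] / count:.1f} ms")

    metrics = TierMetrics()

    @metrics.timed("app-tier")
    def render_page():
        # Hypothetical app-tier work standing in for controller + OR mapper + XSLT.
        time.sleep(0.01)

    for _ in range(20):
        render_page()
    metrics.report()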

Presuming everything is synchronous (no queue-based things going on that you need to wait for), there are tons of opportunities for performance and/or scalability issues. But since your post is about scalability, let's ignore the browser except for any remote XHR calls that will invoke another request/response from the web server.

So given this problem domain, what decisions could you make to help with scalability?

  1. Connection handling. This is also bound to session management and authentication. That has to be as clean and lightweight as possible without compromising security. The metric is maximum connections per unit time.

  2. Session failover at each tier. Necessary or not? We assume that each tier will be a cluster of boxes horizontally under some load balancing mechanism. Load balancing is typically very lightweight, but some implementations of session failover can be heavier than desired. Also whether you're running with sticky sessions can impact your options deeper in the architecture. You also have to decide whether to tie a web server to a specific app server or not. In the .NET remoting world, it's probably easier to tether them together. If you use the Microsoft stack, it may be more scalable to do 2-tier (skip the remoting), but you have to make a substantial security tradeoff. On the java side, I've always seen it at least 3-tier. No reason to do it otherwise.

  3. Object hierarchy. Inside the app, you need the cleanest, lightest-weight object structure possible. Only bring the data you need when you need it. Viciously excise any unnecessary or superfluous getting of data.

  4. OR mapper inefficiencies. There is an impedance mismatch between object design and relational design. The many-to-many construct in an RDBMS is in direct conflict with object hierarchies (person.address vs. location.resident). The more complex your data structures, the less efficient your OR mapper will be. At some point you may have to cut bait in a one-off situation and do a more...uh...primitive data access approach (Stored Procedure + Data Access Layer) in order to squeeze more performance or scalability out of a particularly ugly module. Understand the cost involved and make it a conscious decision.

  5. XSL transforms. XML is a wonderful, normalized mechanism for data transport, but man can it be a huge performance dog! Depending on how much data you're carrying around with you and which parser you choose and how complex your structure is, you could easily paint yourself into a very dark corner with XSLT. Yes, academically it's a brilliantly clean way of doing a presentation layer, but in the real world there can be catastrophic performance issues if you don't pay particular attention to this. I've seen a system consume over 30% of transaction time just in XSLT. Not pretty if you're trying to ramp up 4x the user base without buying additional boxes.

  6. Can you buy your way out of a scalability jam? Absolutely. I've watched it happen more times than I'd like to admit. Moore's Law (as you already mentioned) is still valid today. Have some extra cash handy just in case.

  7. Caching is a great tool to reduce the strain on the engine (increasing speed and throughput is a handy side-effect). It comes at a cost though in terms of memory footprint and complexity in invalidating the cache when it's stale. My decision would be to start completely clean and slowly add caching only where you decide it's useful to you. Too many times the complexities are underestimated and what started out as a way to fix performance problems turns out to cause functional problems. Also, back to the data usage comment. If you're creating gigabytes worth of objects every minute, it doesn't matter if you cache or not. You'll quickly max out your memory footprint and garbage collection will ruin your day. So I guess the takeaway is to make sure you understand exactly what's going on inside your virtual machine (object creation, destruction, GCs, etc.) so that you can make the best possible decisions.
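
A minimal sketch of point 7's "add caching only where it pays off" approach: a tiny TTL cache around a hypothetical expensive data-tier call, with explicit invalidation so staleness remains a conscious decision rather than a surprise.

    import time

    class TTLCache:
        """Tiny per-key cache with expiry; a stand-in for memcached or an in-process cache layer."""

        def __init__(self, ttl_seconds):
            self.ttl = ttl_seconds
            self._store = {}

        def get_or_compute(self, key, compute):
            entry = self._store.get(key)
            if entry is not None:
                value, stored_at = entry
                if time.time() - stored_at < self.ttl:
                    return value              # fresh hit: skip the expensive work
            value = compute()                 # miss or stale: recompute and store
            self._store[key] = (value, time.time())
            return value

        def invalidate(self, key):
            self._store.pop(key, None)        # call when the underlying data changes

    def load_profile_from_db(user_id):
        # Hypothetical expensive call into the data tier.
        time.sleep(0.05)
        return {"user_id": user_id, "name": "example"}

    cache = TTLCache(ttl_seconds=30)
    profile = cache.get_or_compute("user:42", lambda: load_profile_from_db(42))
    print(profile)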

Sorry for the verbosity. Just got rolling and forgot to look up. Hope some of this touches on the spirit of your inquiry and isn't too rudimentary a conversation.

#3 (4 votes)

Well there's this blog called High Scalability that contains a lot of information on this topic. Some useful stuff.

#4 (3 votes)

Often the most effective way to do this is with a well-thought-through design in which scaling is part of it from the start.

Decide what scaling actually means for your project. Is it an unlimited number of users? Is it being able to handle a slashdotting of your website? Is it development cycles?

Use this to focus your development efforts.

#5 (2 votes)

Jeff and Joel discuss scaling in the Stack Overflow Podcast #19.

#6 (1 vote)

FWIW, most systems will scale most effectively by ignoring this until it's a problem - Moore's law is still holding, and unless your traffic is growing faster than Moore's law does, it's usually cheaper to just buy a bigger box (at $2K or $3K a pop) than to pay developers.

That said, the most important place to focus is your data tier; that is the hardest part of your application to scale out, as it usually needs to be authoritative, and clustered commercial databases are very expensive - the open source variations are usually very tricky to get right.

If you think there is a high likelihood that your application will need to scale, it may be intelligent to look into systems like memcached or map reduce relatively early in your development.
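
A toy sketch of the map-reduce programming model, just to show why it scales: if work is expressed as independent map steps plus an associative reduce, the map side can later be spread across processes or machines. This is only an illustration of the idea, not how a production framework is wired up.

    from functools import reduce
    from multiprocessing import Pool

    def map_step(line):
        # Independent per-record work: count the words in one line.
        return len(line.split())

    def reduce_step(a, b):
        # Associative combine, so partial results can be merged in any order.
        return a + b

    if __name__ == "__main__":
        lines = ["the quick brown fox", "jumps over", "the lazy dog"] * 1000
        with Pool() as pool:
            partial_counts = pool.map(map_step, lines)   # scale out by adding workers
        print(reduce(reduce_step, partial_counts, 0))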

#7 (1 vote)

One good idea is to determine how much work each additional task creates. This can depend on how the algorithm is structured.

For example, imagine you have some virtual cars in a city. At any moment, you want each car to have a map showing where all the cars are.

One way to approach this would be:

    for each car {
       determine my position;  
       for each car {  
         add my position to this car's map;  
       }
    }

This seems straightforward: look at the first car's position, add it to the map of every other car. Then look at the second car's position, add it to the map of every other car. Etc.

But there is a scalability problem. When there are 2 cars, this strategy takes 4 "add my position" steps; when there are 3 cars, it takes 9 steps. For each "position update," you have to cycle through the whole list of cars - and every car needs its position updated.

Ignoring how many other things must be done to each car (for example, it may take a fixed number of steps to calculate the position of an individual car), for N cars, it takes N² "visits to cars" to run this algorithm. This is no problem when you've got 5 cars and 25 steps. But as you add cars, you will see the system bog down. 100 cars will take 10,000 steps, and 101 cars will take 10,201 steps!

A better approach would be to undo the nesting of the for loops.

    for each car {  
      add my position to a list;  
    }  
    for each car {    
      give me an updated copy of the master list;  
    }

With this strategy, the number of steps is a multiple of N, not of N². So 100 cars will take 100 times the work of 1 car - NOT 10,000 times the work.
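
Here is a runnable version of the two strategies, counting "visits to cars" rather than timing them (each car is represented as just a position tuple, an assumption made for the sketch): with 100 cars the nested version does 10,000 visits and the flattened one does 200.

    def nested_update(cars):
        # O(N^2): every car pushes its position into every car's map.
        visits = 0
        maps = [dict() for _ in cars]
        for i, pos in enumerate(cars):
            for car_map in maps:
                car_map[i] = pos
                visits += 1
        return visits

    def flattened_update(cars):
        # O(N): build the master list once, then hand each car a copy of it.
        visits = 0
        master = {}
        for i, pos in enumerate(cars):
            master[i] = pos
            visits += 1
        copies = [dict(master) for _ in cars]
        visits += len(copies)
        return visits

    cars = [(x, x) for x in range(100)]
    print(nested_update(cars), flattened_update(cars))   # prints: 10000 200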

This concept is sometimes expressed in "big O notation" - the number of steps needed is "big O of N" or "big O of N²."

Note that this concept is only concerned with scalability - not optimizing the number of steps for each car. Here we don't care if it takes 5 steps or 50 steps per car - the main thing is that N cars take (X * N) steps, not (X * N²).
