Why is MPI considered harder than shared memory, and Erlang considered easier, when they are both message passing?

Date: 2022-05-19 13:51:59

There's a lot of interest these days in Erlang as a language for writing parallel programs on multicore. I've heard people argue that Erlang's message-passing model is easier to program than the dominant shared-memory models such as threads.

Conversely, in the high-performance computing community the dominant parallel programming model has been MPI, which also implements a message-passing model. But in the HPC world, this message-passing model is generally considered very difficult to program in, and people argue that shared memory models such as OpenMP or UPC are easier to program in.

Does anybody know why there is such a difference in the perception of message-passing vs. shared memory in the IT and HPC worlds? Is it due to some fundamental difference in how Erlang and MPI implement message passing that makes Erlang-style message-passing much easier than MPI? Or is there some other reason?

7 Answers

#1 (36 votes)

I agree with all the previous answers, but I think a key point that is not made totally clear is that one reason MPI might be considered hard and Erlang easy is how well each model matches its domain.

Erlang is based on a concept of local memory, asynchronous message passing, and shared state solved by using some form of global database that all threads can get to. It is designed for applications that do not move a whole lot of data around, and that are not supposed to explode out to 100k separate nodes that need coordination.

MPI is based on local memory and message passing, and is intended for problems where moving data around is a key part of the domain. High-performance computing is very much about taking the dataset for a problem and splitting it up among a host of compute resources. And that is pretty hard work in a message-passing system, as data has to be explicitly distributed with balancing in mind. Essentially, MPI can be viewed as a grudging admission that shared memory does not scale. And it is targeting high-performance computation spread across 100k processors or more.

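To make the "explicit distribution" point concrete, here is a minimal C sketch of handing out a dataset with MPI_Scatter and collecting a result with MPI_Reduce; the array size, root rank, and sum reduction are illustrative assumptions, not anything from the answer itself.

```c
/* Minimal sketch: distributing an array across MPI ranks.
 * Array size and the sum reduction are illustrative assumptions. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int N = 1 << 20;            /* assumed divisible by size, for brevity */
    int chunk = N / size;
    double *data = NULL;
    if (rank == 0) {                  /* only the root holds the full dataset */
        data = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) data[i] = 1.0;
    }

    /* The programmer, not the runtime, decides who gets which piece. */
    double *local = malloc(chunk * sizeof(double));
    MPI_Scatter(data, chunk, MPI_DOUBLE, local, chunk, MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    double partial = 0.0, total = 0.0;
    for (int i = 0; i < chunk; i++) partial += local[i];
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("sum = %f\n", total);
    free(local); free(data);
    MPI_Finalize();
    return 0;
}
```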

Erlang is not trying to achieve the highest possible performance, but rather to decompose a naturally parallel problem into its natural threads. It was designed with a totally different type of programming task in mind compared to MPI.

So Erlang is best compared to pthreads and other rather local heterogeneous thread solutions, rather than to MPI, which is really aimed at a very different (and to some extent inherently harder) problem set.

#2 (12 votes)

Parallelism in Erlang is still pretty hard to implement. By that I mean that you still have to figure out how to split up your problem, but there are a few minor things that ease this difficulty compared to some MPI library in C or C++.

First, since Erlang's message-passing is a first-class language feature, the syntactic sugar makes it feel easier.

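For contrast, even a bare point-to-point exchange through an MPI library in C spells out buffers, counts, types, ranks, tags, and communicators, where Erlang gets by with roughly `Pid ! Msg` and a `receive` clause. A minimal sketch; the tag and payload are arbitrary:

```c
/* Minimal sketch of a point-to-point exchange in MPI (C).
 * Every call names buffer, count, type, peer rank, tag, and communicator. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int TAG = 42;               /* arbitrary tag, an assumption */
    if (rank == 0) {
        double payload = 3.14;
        MPI_Send(&payload, 1, MPI_DOUBLE, 1, TAG, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double payload;
        MPI_Recv(&payload, 1, MPI_DOUBLE, 0, TAG, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 got %f\n", payload);
    }
    MPI_Finalize();
    return 0;
}
```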

Also, Erlang libraries are all built around Erlang's message passing. This support structure helps give you a boost into parallel-processing land. Take a look at the components of OTP like gen_server, gen_fsm, and gen_event. These are very easy-to-use structures that can help your program become parallel.

I think it's more the robustness of the available standard library that differentiates Erlang's message passing from MPI implementations, not really any specific feature of the language itself.

#3 (9 votes)

Usually concurrency in HPC means working on large amounts of data. This kind of parallelism is called data parallelism and is indeed easier to implement using a shared memory approach like OpenMP, because the operating system takes care of things like scheduling and placement of tasks, which one would have to implement oneself if using a message passing paradigm.

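To illustrate, a data-parallel loop in OpenMP needs little more than a single pragma, with scheduling and thread placement left to the runtime; a minimal sketch with an arbitrary loop body:

```c
/* Minimal sketch: data parallelism with OpenMP.
 * One pragma parallelizes the loop; the runtime handles scheduling
 * and thread placement. Array size and computation are arbitrary. */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    for (int i = 0; i < N; i++) b[i] = i;

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * b[i];            /* each thread gets a slice of i */

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}
```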

In contrast, Erlang was designed to cope with task parallelism encountered in telephone systems, where different pieces of code have to be executed concurrently with only a limited amount of communication and strong requirements for fault tolerance and recovery.

This model is similar to what most people use PThreads for. It fits applications like web servers, where each request can be handled by a different thread, while HPC applications do pretty much the same thing on huge amounts of data which also have to be exchanged between workers.

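A rough C sketch of that thread-per-request style with pthreads; the request type and handler are hypothetical placeholders:

```c
/* Rough sketch of the thread-per-request style described above.
 * The request type and handler are hypothetical placeholders. */
#include <pthread.h>
#include <stdio.h>

typedef struct { int id; } request_t;

static void *handle_request(void *arg) {
    request_t *req = arg;
    /* ... work on this request, independent of the others ... */
    printf("handled request %d\n", req->id);
    return NULL;
}

int main(void) {
    pthread_t threads[4];
    request_t reqs[4] = {{0}, {1}, {2}, {3}};

    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, handle_request, &reqs[i]);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
```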

#4 (8 votes)

I think it has something to do with the mind-set when you're programming with MPI and when you're programming with Erlang. For instance, MPI is not built into the language, whereas Erlang has built-in support for message passing. Another possible reason is the disconnect between merely sending/receiving messages and partitioning solutions into concurrent units of execution.

With Erlang you are forced to think in a functional programming frame where data actually zips by from function call to function call -- and receiving is an active act which looks like a normal construct in the language. This gives you a closer connection between the computation you're actually performing and the act of sending/receiving messages.

With MPI on the other hand you are forced to think merely about the actual message passing but not really the decomposition of work. This frame of thinking requires somewhat of a context switch between writing the solution and the messaging infrastructure in your code.

The discussion can go on but the common view is that if the construct for message passing is actually built into the programming language and paradigm that you're using, usually that's a better means of expressing the solution compared to something else that is "tacked on" or exists as an add-on to a language (in the form of a library or extension).

#5 (4 votes)

Does anybody know why there is such a difference in the perception of message-passing vs. shared memory in the IT and HPC worlds? Is it due to some fundamental difference in how Erlang and MPI implement message passing that makes Erlang-style message-passing much easier than MPI? Or is there some other reason?

The reason is simply parallelism vs concurrency. Erlang is bred for concurrent programming. HPC is all about parallel programming. These are related but different objectives.

Concurrent programming is greatly complicated by heavily non-deterministic control flow, and latency is often an important objective. Erlang's use of immutable data structures greatly simplifies concurrent programming.

Parallel programming has much simpler control flow and the objective is all about maximal total throughput and not latency. Efficient cache usage is much more important here, which renders both Erlang and immutable data structures largely unsuitable. Mutating shared memory is both tractable and substantially better in this context. In effect, cache coherence is providing hardware-accelerated message passing for you.

Finally, in addition to these technical differences there is also a political issue. The Erlang guys are trying to ride the multicore hype by pretending that Erlang is relevant to multicore when it isn't. In particular, they are touting great scalability so it is essential to consider absolute performance as well. Erlang scales effortlessly from poor absolute performance on one core to poor absolute performance on any number of cores. As you can imagine, that does not impress the HPC community (but it is adequate for a lot of heavily concurrent code).

#6 (0 votes)

Regarding MPI vs OpenMP/UPC: MPI forces you to slice the problem into small pieces and take responsibility for moving data around. With OpenMP/UPC, "all the data is there"; you just have to dereference a pointer. The MPI advantage is that 32-512 CPU clusters are much cheaper than 32-512 CPU single machines. Also, with MPI the expense is paid upfront, when you design the algorithm. With OpenMP/UPC the costs are hidden and show up as latencies at runtime, especially if your system uses NUMA (and all big systems do): your program won't scale, and it will take a while to figure out why.

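As one concrete example of such a hidden latency: under the first-touch page-placement policy that Linux uses by default, serially initialized data all lands near one socket, and remote-node accesses then eat the speedup. A common mitigation, sketched below under that first-touch assumption, is to initialize in parallel with the same schedule as the compute loop:

```c
/* Sketch: first-touch initialization for OpenMP on NUMA systems.
 * Assumes a first-touch page-placement policy (the Linux default):
 * pages land on the node of the thread that first writes them, so a
 * serial init would put everything near one socket. */
#include <omp.h>
#include <stdlib.h>

#define N 100000000   /* large enough that pages span NUMA nodes */

int main(void) {
    double *a = malloc(N * sizeof(double));

    /* Parallel init: each thread first-touches the pages it will use later. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) a[i] = 0.0;

    /* Same static schedule: threads now work mostly on local memory. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) a[i] += 1.0;

    free(a);
    return 0;
}
```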

#7 (0 votes)

This article actually explains it well: Erlang is best when we are sending small pieces of data around, and MPI does much better on more complex things. Also, the Erlang model is easy to understand :-)

Erlang Versus MPI - Final Results and Source Code
