How do singletons work in Google App Engine (or in a distributed server environment more generally)?

Time: 2020-11-27 02:17:02

I am intrigued as to how singletons work in Google App Engine (or any distributed server environment). Given that your application can be running in multiple processes (on multiple machines) at once, and requests can get routed all over the place, what actually happens under the hood when an app does something like 'CacheManager.getInstance()'?

I'm just using the (GAE) CacheManager as an example, but my point is, there is a single global application instance of a singleton somewhere, so where does it live? Is an RPC invoked? In fact, how is global application state (like sessions) actually handled generally?

Regards, Shane

3 Answers

#1


The singletons in App Engine Java are per-runtime, not per-webapp. Their purpose is simply to provide a single point of access to the underlying service (which in the case of both Memcache and Users API, is accessed via an RPC), but that's purely a design pattern for the library - there's no per-app singleton anywhere that these methods access.
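
To make that concrete, here is a minimal sketch of the shape being described (illustrative only; the in-process stub stands in for the real RPC client, and none of this is App Engine's actual source):

    // Minimal sketch of the pattern described above; NOT App Engine's actual
    // source. getInstance() returns one object per JVM (runtime); that object
    // holds no application data itself and just forwards every call to the
    // shared backend service (over RPC in the real implementation).
    public final class CacheManager {

        // Stand-in for the remote memcache backend. In App Engine this would
        // be an RPC client, not an in-process map.
        interface CacheBackend {
            Object get(String key);
            void put(String key, Object value);
        }

        // One instance per runtime, created when the class is first loaded.
        private static final CacheManager INSTANCE =
                new CacheManager(new InMemoryStubBackend());

        private final CacheBackend backend;

        private CacheManager(CacheBackend backend) {
            this.backend = backend;
        }

        public static CacheManager getInstance() {
            return INSTANCE;
        }

        public Object get(String key) {
            return backend.get(key);
        }

        public void put(String key, Object value) {
            backend.put(key, value);
        }

        // Trivial stand-in so the sketch compiles on its own; it exists only
        // to make the "singleton is a facade over the service" shape explicit.
        static final class InMemoryStubBackend implements CacheBackend {
            private final java.util.concurrent.ConcurrentMap<String, Object> map =
                    new java.util.concurrent.ConcurrentHashMap<>();
            public Object get(String key) { return map.get(key); }
            public void put(String key, Object value) { map.put(key, value); }
        }
    }

Two instances of your app on two different machines will each get their own CacheManager object from getInstance(), but both objects talk to the same backing memcache service.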

#2


Caches are generally linked up with some sort of distributed replicated cache. For example, GAE uses a custom version of memcached to maintain a shared cache of objects across a cluster while keeping the stored state consistent. In general there are lots of solutions to this problem, with lots of different tradeoffs to be made in terms of performance and cache coherence (e.g., is it critical that all caches match 100% of the time, must the cache be written to disk to protect against loss, etc.).

Here are some sample products with distributed caching features (most have documentation describing the tradeoffs of the various approaches in great detail):

  • memcached - C with lots of client APIs and language ports

  • Ehcache - OSS Java cache, with widespread adoption

  • JBoss Cache - Another popular Java OSS solution

  • Oracle Coherence (formerly Tangosol Coherence) - Probably the best known Java commercial cache.

  • Indexus Cache - A popular .Net OSS solution

  • NCache - Likely the most popular .Net commercial caching solution

As you can see, many projects have approached this problem. One possible solution is simply to share a single cache on a single machine; however, most projects make some sort of replication and distributed failover possible.
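
Whichever product you pick, the application-side usage typically follows the cache-aside pattern sketched below (a hedged illustration against a made-up DistributedCache interface; the class and method names are not any particular product's API):

    // Cache-aside ("read on miss, then repopulate") against a distributed cache.
    // DistributedCache is a generic stand-in for whichever client you use
    // (memcached, Ehcache, Coherence, ...), not a real library interface.
    import java.util.Optional;
    import java.util.function.Function;

    public class UserProfileRepository {

        interface DistributedCache {
            Optional<Object> get(String key);
            void put(String key, Object value, int ttlSeconds);
        }

        private final DistributedCache cache;
        private final Function<String, Object> loadFromDatabase;

        public UserProfileRepository(DistributedCache cache,
                                     Function<String, Object> loadFromDatabase) {
            this.cache = cache;
            this.loadFromDatabase = loadFromDatabase;
        }

        public Object findProfile(String userId) {
            String key = "profile:" + userId;

            // Any server in the cluster may already have populated the shared
            // cache, so check it first.
            Optional<Object> cached = cache.get(key);
            if (cached.isPresent()) {
                return cached.get();
            }

            // On a miss, load from the authoritative store and repopulate.
            // The TTL bounds how stale the entry can look to other servers,
            // which is one of the coherence tradeoffs mentioned above.
            Object profile = loadFromDatabase.apply(userId);
            cache.put(key, profile, 300);
            return profile;
        }
    }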

#3


I'm not sure about the specifics of GAE, but typically in a web app of this size you'll have multiple processes running across a number of machines (with load balancing between them). Within each process, if you're using a multi-threaded web server, you can be handling multiple requests at once. So this allows you to share objects between requests handled by the same web server process (a singleton, for example, would be instantiated when the web app process starts).
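
As a hedged illustration of that per-process sharing (the class and method names are made up for the example), something like this only works because all request-handling threads in one JVM see the same static field; a second server process gets its own independent copy:

    // Per-process cache: every request-handling thread in the SAME JVM shares
    // this map, so it behaves like a singleton cache for that one server
    // process only. Another process on another machine has its own copy, so
    // this is not application-wide state.
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public final class LocalRequestCache {

        // Initialized when the web app process starts (on class loading),
        // exactly as described above.
        private static final ConcurrentMap<String, Object> SHARED =
                new ConcurrentHashMap<>();

        private LocalRequestCache() {}

        public static Object get(String key) {
            return SHARED.get(key);
        }

        public static void put(String key, Object value) {
            SHARED.put(key, value);
        }
    }

A thread handling one request can put() a value and a thread handling a later request will see it, but only if both requests happen to hit the same process.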

If the web server is multi-process rather than multi-threaded, then as far as I know you can't share objects between requests without talking to a separate caching process.

GAE seems to support what the docs call "App Caching", which essentially lets you do the same thing, but it wasn't clear to me from the docs whether this is done with multi-threaded web servers or with some other caching process running alongside the web servers.

I'd be intrigued to know if CacheManager.getInstance() always resolves to the same object, or if it's only the same object for requests handled by the same web server. In reality, it doesn't matter as it's only being used to talk to the separate memcached process anyway.
