Repost: What is Redis and what do I use it for?

Time: 2021-12-31 15:44:07

Original: http://*.com/questions/7888880/what-is-redis-and-what-do-i-use-it-for

Redis = Remote Dictionary Server

TL;DR: If you can map a use case to Redis and you aren't at risk of running out of RAM by doing so, there is a good chance you should use Redis.

It's a "NoSQL" key-value data store. More precisely, it is a data structure server.

It is not like MongoDB (which is a disk-based document store), though MongoDB could be used for similar key/value use cases.

The closest analog is probably to think of Redis as Memcached, but with built-in persistence (snapshotting or journaling to disk) and more datatypes.

Those two additions may seem pretty minor, but they are what make Redis pretty incredible. Persistence to disk means you can use Redis as a real database instead of just a volatile cache. The data won't disappear when you restart, like with memcached.

The additional data types are probably even more important. Key
values can be simple strings, like you'll find in memcached, but they
can also be more complex types like Hashes, Lists (ordered collection,
makes a great queue), Sets (unordered collection of non-repeating
values), or Sorted Sets (ordered/ranked collection of non-repeating
values).
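As a rough mental model, these value types line up with structures most languages already have. The sketch below uses plain Python objects as stand-ins; it does not talk to a Redis server, and the key names are made up for illustration.

```python
from collections import deque

# Plain-Python stand-ins for the Redis value types (no server involved).
store = {}

store["page:title"] = "Hello"                      # String
store["user:1"] = {"name": "ann", "visits": "3"}   # Hash: field -> value
store["jobs"] = deque(["job-a", "job-b"])          # List: ordered, cheap push/pop at both ends
store["tags"] = {"redis", "nosql"}                 # Set: unordered, no duplicates
store["scores"] = {"ann": 42.0, "bob": 17.0}       # Sorted Set: member -> score

# Lists make a natural queue: push on one end, pop from the other.
store["jobs"].appendleft("job-c")   # in the spirit of LPUSH
next_job = store["jobs"].pop()      # in the spirit of RPOP

# Adding a member that already exists leaves the set unchanged.
store["tags"].add("redis")

# A sorted set is read back ordered by score (highest first here).
ranked = sorted(store["scores"], key=store["scores"].get, reverse=True)
```

The real server keeps sorted sets ordered as they are updated rather than sorting on every read; the `sorted()` call here is only a convenient way to show the ordering.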

This is only the tip of the Redis iceberg, as there are other
powerful features like built-in pub/sub, transactions (with optimistic
locking), and Lua scripting.
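To make "optimistic locking" a bit more concrete, here is a minimal single-threaded sketch of the idea: remember what a key looked like when you read it, and let the write go through only if nobody changed it in the meantime. `TinyStore` and its method names are hypothetical and only model the behaviour in memory; this is not the Redis client API.

```python
class TinyStore:
    """Toy in-memory store that tracks a version per key (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def watch(self, key):
        # Remember the version seen at read time, in the spirit of WATCH.
        return self.version.get(key, 0)

    def commit_if_unchanged(self, key, seen_version, new_value):
        # In the spirit of EXEC: apply the write only if nobody touched the key.
        if self.version.get(key, 0) != seen_version:
            return False
        self.set(key, new_value)
        return True


def increment_with_retry(store, key, max_attempts=5):
    """Read-modify-write that retries when another writer got in first."""
    for _ in range(max_attempts):
        seen = store.watch(key)
        current = store.get(key) or 0
        if store.commit_if_unchanged(key, seen, current + 1):
            return current + 1
    raise RuntimeError("too much contention")


store = TinyStore()
store.set("balance", 10)

seen = store.watch("balance")
store.set("balance", 99)  # a competing writer sneaks in
stale_commit = store.commit_if_unchanged("balance", seen, 11)  # rejected
fresh_value = increment_with_retry(store, "balance")
```

Nothing is locked up front. The writer that loses the race simply finds out at commit time and retries, which is cheap when conflicts are rare.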

Like memcached, Redis keeps the entire data set in memory, so it is
extremely fast, often even faster than memcached. Redis used to have
virtual memory, where rarely used values were swapped out to disk so
that only the keys had to fit into memory, but this has been
deprecated. Going forward, the use cases for Redis are those where it's
possible (and desirable) for the entire data set to fit into memory.

Redis is a fantastic choice if you want a highly scalable data store
shared by multiple processes, multiple applications, or multiple
servers. As just an inter-process communication mechanism it is tough to
beat. The fact that you can communicate cross-platform, cross-server,
or cross-application just as easily makes it a pretty great choice for
a great many use cases. Its speed also makes it great as a caching layer.

Update 4/1/2015: Redis 3.0 (stable) was released today. This version of Redis brings cluster support, which makes it much easier to scale Redis.

@acidzombie24 It's possible you could use Redis in place of MySQL, but it really depends on the use case. If your data set could grow to 20GB, or you need to use business analytics tools or make heavy use of joins, etc., then it might not make sense. It's really hard to make a blanket statement except to say that there are certainly cases where Redis would be appropriate in place of MySQL.

@KO. Just like Redis offers features which memcached doesn't, an RDBMS offers many features which Redis does not. Redis should be FAR faster than any RDBMS, but it also can't do as much. If you can map your RDBMS use case to Redis then it might be worth taking a look, but if you can't, don't force it. Best tool for the job and all that. A NoSQL store that has a better chance of replacing your RDBMS is MongoDB, but even that needs to be evaluated carefully and you should go with the best fit, which may be an RDBMS.

A very thought-provoking thread here. Being an old dog, it was quite a thunder-clap when I realized the reason most data is now persisted to DBs is simply because a whole generation of programmers has grown up with cheap, ubiquitous DBs and they don't KNOW any other way of persisting data. No clue what qsort() & bsearch() are capable of, for example. RE: joins with Redis, if people knew how simple it is to do joins in memory they'd be shocked. The data value you join on simply becomes a token that is replaced by the data the FK indexes. It's roughly a string replacement problem.
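A minimal sketch of that in-memory join, assuming two made-up tables (`customers` and `orders`): sort once, then binary-search the foreign key. Python's `sort` and `bisect_left` stand in for the C `qsort()` and `bsearch()` the comment mentions.

```python
from bisect import bisect_left

# Hypothetical tables: customers keyed by id, orders holding a customer_id foreign key.
customers = [(3, "Carol"), (1, "Alice"), (2, "Bob")]
orders = [("o-100", 2), ("o-101", 1), ("o-102", 2), ("o-103", 9)]

# Sort once (the qsort() step), then keep the keys in a parallel list for searching.
customers.sort(key=lambda row: row[0])
customer_ids = [row[0] for row in customers]


def lookup(customer_id):
    """Binary search for a customer (the bsearch() step); None if absent."""
    i = bisect_left(customer_ids, customer_id)
    if i < len(customer_ids) and customer_ids[i] == customer_id:
        return customers[i][1]
    return None


# The join: replace each foreign key with the data it points at.
joined = [(order_id, lookup(cid)) for order_id, cid in orders]
```

Order `o-103` points at a customer that does not exist, so it comes back paired with `None`, much like an outer join would leave the column empty.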

DBs are wonderful, and very flexible, but horribly expensive in terms of resources, and very slow. Slow, not because the internals aren't fast, but because they have a large fixed cost which shows up as a lot of latency. As an example, my employer has an ENUM server with 45 ms latency, which is good by industry standards. I am writing a CPS throttle with 0.067 microsecond latency. Not even in the same ballpark. Likewise, bsearch() against a pointer array (the result of a pointer qsort()), plus fetching the data from SSDs, is thousands of times faster than any DB, even in-memory ones like ours.

With respect to Redis only holding indexes in memory, with SSDs, I hope this option is still available. The latest generation of SSD-focused RAIDs by Adaptec (ASR-8885 RAID) and LSI perform at 12Gbps, which is spectacular for the 256-byte to 2k-byte random I/Os a data structure server would be fetching. The reason so many alternatives to SQL, like NoSQL, are showing up is that SQL is the problem. Too much parsing, too much data conversion, metadata driving the very internals of your database, and too much conversion again on the backend. With data structures you have the answer before the SQL gets to the NIC.

What can it be used for? A few examples from http://highscalability.com/blog/2011/7/6/11-common-web-use-cases-solved-in-redis.html:

  1. Show the latest item listings on your home page. This is a live in-memory cache and is very fast. LPUSH is used to insert a content ID at the head of the list stored at a key. LTRIM is used to limit the number of items in the list to 5000. Only if the user needs to page beyond this cache are they sent to the database.
  2. Deletion and filtering. If a cached article is deleted it can be removed from the cache using LREM.
  3. Leaderboards and related problems. A leaderboard is a set sorted by score. The ZADD command implements this directly, the ZREVRANGE command can be used to get the top 100 users by score, and ZRANK can be used to get a user's rank. Very direct and easy.
  4. Order by user votes and time. This is a leaderboard like Reddit's, where the score is a formula that changes over time. LPUSH + LTRIM are used to add an article to a list. A background task polls the list and recomputes the order of the list, and ZADD is used to populate the list in the new order. This list can be retrieved very fast by even a heavily loaded site. This should be easier; the need for the polling code isn't elegant.
  5. Implement expires on items. To keep a list sorted by time, use Unix time as the key. The difficult task of expiring items is implemented by indexing current_time+time_to_live. Another background worker is used to make queries using ZRANGE ... with SCORES and delete timed-out entries.
  6. Counting stuff. Keeping stats of all kinds is common; say you want to know when to block an IP address. The INCRBY command makes it easy to atomically keep counters, GETSET can atomically clear the counter, and the expire attribute can be used to tell when a key should be deleted.
  7. Unique N items in a given amount of time. This is the unique visitors problem and can be solved using SADD for each pageview. SADD won't add a member to a set if it already exists.
  8. Real time analysis of what is happening, for stats, anti spam, or whatever. Using Redis primitives it's much simpler to implement a spam filtering system or other real-time tracking system.
  9. Pub/Sub. Keeping a map of who is interested in updates to what data is a common task in systems. Redis has a pub/sub feature to make this easy using commands like SUBSCRIBE, UNSUBSCRIBE, and PUBLISH.
  10. Queues. Queues are everywhere in programming. In addition to the push and pop type commands, Redis has blocking queue commands so a program can wait on work being added to the queue by another program. You can also do interesting things like implementing a rotating queue of RSS feeds to update.
  11. Caching. Redis can be used in the same manner as memcached.
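A few of the patterns above (1, 3, and 7) can be sketched with plain Python structures to show what the commands are doing conceptually. This is an in-memory stand-in under simplifying assumptions, not Redis client code; the helper names `add_score`, `top`, and `rank` are made up for illustration.

```python
from collections import deque

# 1. Latest items: LPUSH + LTRIM keeps only the newest N ids.
MAX_ITEMS = 3  # the text uses 5000; kept small so the demo is easy to read
latest = deque(maxlen=MAX_ITEMS)
for item_id in [101, 102, 103, 104]:
    latest.appendleft(item_id)  # newest at the head; the oldest falls off the tail
latest_ids = list(latest)

# 3. Leaderboard: member -> score, read back highest score first.
scores = {}


def add_score(user, score):
    """In the spirit of ZADD: insert the member or overwrite its score."""
    scores[user] = score


def top(n):
    """In the spirit of ZREVRANGE 0 n-1 (ties broken by name, a simplification)."""
    return sorted(scores, key=lambda u: (-scores[u], u))[:n]


def rank(user):
    """0-based position counted from the top of the board."""
    return top(len(scores)).index(user)


add_score("ann", 42)
add_score("bob", 17)
add_score("cy", 99)

# 7. Unique visitors: a set ignores repeats, like SADD.
visitors = set()
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    visitors.add(ip)
unique_count = len(visitors)
```

With real Redis the same structures live on the server, so every process and machine that connects sees the same list, leaderboard, and visitor set.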

Yet another one: a very simple but useful tool for programmers, a build-counter server. E.g., your application version may be 1.2.7 (build #473). It's very easy to add a new build number for your project, and your build script can easily request a new (unique) number.
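What makes this work is that the increment is atomic, so two build scripts asking at the same moment can never receive the same number. A minimal in-memory sketch of that guarantee follows; the `BuildCounter` class is hypothetical, and a lock plays the role that an atomic increment command such as INCRBY plays on the server.

```python
import threading


class BuildCounter:
    """Hands out unique, increasing build numbers (in-memory stand-in)."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def next_build(self):
        # The lock makes read-increment-return one indivisible step,
        # which is the guarantee an atomic increment gives concurrent callers.
        with self._lock:
            self._value += 1
            return self._value


counter = BuildCounter(start=472)
issued = []


def worker():
    # Each simulated build script requests 100 build numbers.
    for _ in range(100):
        issued.append(counter.next_build())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Four workers request 100 numbers each, and every one of the 400 numbers handed out is distinct, starting at 473.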

Oh, and it's awfully easy to use this for making two scripts that run on two different platforms communicate with each other. For instance, one of those Cortex-A based TV boxes can send data back and forth to my desktop computer, so I'm using it like a FIFO.
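The FIFO pattern looks roughly like this. In the sketch below two threads stand in for the two machines and Python's `queue.Queue` stands in for a Redis list with a blocking pop; the message strings and the "done" sentinel are assumptions made up for the example.

```python
import queue
import threading

fifo = queue.Queue()  # stand-in for a Redis list shared by the two scripts
received = []


def producer():
    # The "TV box" side: push readings onto the tail of the queue.
    for reading in ["temp=21", "temp=22", "done"]:
        fifo.put(reading)


def consumer():
    # The "desktop" side: block until a message arrives, then handle it.
    while True:
        msg = fifo.get(timeout=5)
        if msg == "done":
            break
        received.append(msg)


c = threading.Thread(target=consumer)
p = threading.Thread(target=producer)
c.start()
p.start()
p.join()
c.join()
```

The consumer sleeps inside `get()` until there is something to read, which is the same idea as the blocking queue commands mentioned in the list above: no polling loop is needed on the receiving side.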