What type of collection should I use?

Time: 2021-10-29 20:02:17

I have approximately 10,000 records. Each record has 2 fields: one field is a string up to 300 characters in length and the other field is a decimal value. This is like a product catalog with product names and the price of each product.

What I need to do is allow the user to type any word and display all products containing that word together with their prices in a listbox. That's all.

  1. What type of collection is best for this scenario?
  2. If I need to sort based on either product name or price, will the choice still be the same?

Right now I am using an XML file, but I thought it would be simpler to use a collection so that I can embed all the values in the code. Thanks for your suggestions.

2 solutions

#1


10  

A Dictionary will do the job. However, if you are doing rapid partial matches (e.g. search as the user types) you may get better performance by creating multiple keys which point to the same item. For example, the word "Apple" could be located with "Ap", "App", "Appl", and "Apple".
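
To make the "Ap", "App", "Appl", "Apple" idea concrete, here is a minimal sketch of a key generator. The class name `PrefixKeys`, the lower-casing, and the default 2 to 6 character range are my own assumptions for illustration, not a fixed recipe.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixKeys
{
    // Yields every prefix of `word` between minLength and maxLength characters,
    // lower-cased so that lookups do not depend on how the user types the term.
    public static IEnumerable<string> For(string word, int minLength = 2, int maxLength = 6)
    {
        string w = word.ToLowerInvariant();
        int longest = Math.Min(w.Length, maxLength);
        for (int len = minLength; len <= longest; len++)
            yield return w.Substring(0, len);
    }
}
```

Calling `PrefixKeys.For("Apple")` yields "ap", "app", "appl" and "apple". A word shorter than `minLength` yields no keys at all, so one-letter words are simply not indexed in this sketch.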

I have used this approach on a similar number of records with very good results. I have turned my 10K source items into about 50K unique keys. Each of these Dictionary entries points to a list containing references to all matches for that term. You can then search this much smaller list more efficiently. Despite the large number of lists this creates, the memory footprint is quite reasonable.

You can also make up your own keys if desired to redirect common misspellings or point to related items. This also eliminates most of the issues with unique keys because each key points to a list. A single item may be classified by each of the words in its name; this is extremely useful if you have long product names with multiple words in them. When classifying your items, each word in the name can be mapped to one or more keys.
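
A rough sketch of that classification step, under some assumptions of mine: items are plain product-name strings, words are split on a few common separators, and `Classifier`, `Build` and `AddAlias` are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

public static class Classifier
{
    // Adds `item` to the list stored under `key`, creating the list on first use.
    static void AddUnder(Dictionary<string, List<string>> index, string key, string item)
    {
        if (!index.TryGetValue(key, out List<string> bucket))
        {
            bucket = new List<string>();
            index[key] = bucket;
        }
        // Linear check; it keeps a name from being listed twice when a word repeats.
        if (!bucket.Contains(item)) bucket.Add(item);
    }

    // Files each product name under the 2..maxKeyLength character prefixes
    // of every word in that name.
    public static Dictionary<string, List<string>> Build(IEnumerable<string> names, int maxKeyLength = 6)
    {
        var index = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        char[] separators = { ' ', '-', ',', '/' };
        foreach (string name in names)
        {
            foreach (string word in name.Split(separators, StringSplitOptions.RemoveEmptyEntries))
            {
                int longest = Math.Min(word.Length, maxKeyLength);
                for (int len = 2; len <= longest; len++)
                    AddUnder(index, word.Substring(0, len), name);
            }
        }
        return index;
    }

    // Makes a hand-written key (for example a common misspelling) share
    // the list that already belongs to an existing key.
    public static void AddAlias(Dictionary<string, List<string>> index, string alias, string existingKey)
    {
        if (index.TryGetValue(existingKey, out List<string> bucket))
            index[alias] = bucket;
    }
}
```

Because an alias shares the same list object rather than copying it, `AddAlias(index, "aple", "apple")` costs one extra dictionary entry and stays in sync with later additions under "apple".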

I should also point out that building and classifying 10K items shouldn't take long if done correctly (a couple hundred milliseconds is reasonable). The results can be cached for as long as you want using Application, Cache, or static members.
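
For the static-member option, one possible sketch uses `Lazy<T>` so the index is built once, on first access. `CatalogCache`, `BuildIndex` and the `BuildCount` counter are hypothetical names, and the one-entry dictionary stands in for the real build over your 10K records.

```csharp
using System;
using System.Collections.Generic;

public static class CatalogCache
{
    // Counts how often the index was built; only here to make the caching visible.
    public static int BuildCount;

    // Lazy<T> runs BuildIndex the first time Value is read and reuses the result afterwards.
    static readonly Lazy<Dictionary<string, List<string>>> index =
        new Lazy<Dictionary<string, List<string>>>(BuildIndex);

    public static Dictionary<string, List<string>> Index => index.Value;

    static Dictionary<string, List<string>> BuildIndex()
    {
        BuildCount++;
        // Placeholder data; a real application would load and classify its records here.
        return new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase)
        {
            ["ap"] = new List<string> { "Apple Juice" },
        };
    }
}
```

Every caller that reads `CatalogCache.Index` gets the same dictionary instance, so the build cost is paid once per process.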

To summarize, the resulting structure is a Dictionary<string, List<T>> where the string is a short (2-6 characters works well) but unique key. Each key points to a List<T> (or other collection, if you are so inclined) of items which match that key. When a search is performed, you locate the key which matches the term provided by the user. Depending on the length of your keys, you may truncate the user's search to your maximum key length. After locating the correct child collection, you then search that collection for a complete or partial match using whatever methodology you wish.
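
The lookup described above might be sketched as follows. I am assuming the index stores product names as strings, that its keys are at most `maxKeyLength` characters, that the dictionary was created with a case-insensitive comparer, and that the user types a single word; `PrefixSearch.Find` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSearch
{
    public static List<string> Find(
        IDictionary<string, List<string>> index, string term, int maxKeyLength = 6)
    {
        string trimmed = term.Trim();

        // Truncate the user's input to the longest key the index stores.
        string key = trimmed.Length > maxKeyLength
            ? trimmed.Substring(0, maxKeyLength)
            : trimmed;

        // Step 1: a single dictionary lookup selects the small candidate list.
        if (!index.TryGetValue(key, out List<string> candidates))
            return new List<string>();

        // Step 2: confirm each candidate against the full, untruncated term.
        return candidates
            .Where(name => name.IndexOf(trimmed, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

Step 2 is where you can swap in whatever matching method you prefer; the substring check here is just the simplest choice.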

Lastly, you may wish to create a lightweight structure for each item in the list so that you can store additional information about the item. For example, you might create a small Product class which stores the name, price, department, and popularity of the product. This can help you refine the results you show to the user.
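
As an illustration, such a class and one way to use the extra fields could look like this. The property types, the meaning of `Popularity` (higher is better), and the `ResultRanking.Top` helper are assumptions of mine.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Product
{
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
    public string Department { get; set; } = "";
    public int Popularity { get; set; }   // hypothetical score; higher means more popular
}

public static class ResultRanking
{
    // Shows the most popular matches first and breaks ties by the lower price.
    public static List<Product> Top(IEnumerable<Product> matches, int count)
    {
        return matches
            .OrderByDescending(p => p.Popularity)
            .ThenBy(p => p.Price)
            .Take(count)
            .ToList();
    }
}
```

With this in place the index would be a `Dictionary<string, List<Product>>`, and the listbox can show `Name` and `Price` straight from each match.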

All-in-all, you can perform intelligent, detailed, fuzzy searches in real-time.

The aforementioned structures should provide functionality roughly equivalent to a trie.

#2


9  

10K records is not that much.

A Dictionary<string, decimal> would fit the bill. You can sort by key or by value using LINQ, as well as do searches.
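
A small sketch of what that could look like with LINQ. The `Catalog` class and its method names are made up for the example, and the search is a plain case-insensitive substring scan over all entries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Catalog
{
    // Products whose name contains `word`, ignoring case.
    public static List<KeyValuePair<string, decimal>> Search(
        Dictionary<string, decimal> prices, string word)
        => prices
            .Where(p => p.Key.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();

    // Sorts any set of entries by product name.
    public static List<KeyValuePair<string, decimal>> ByName(
        IEnumerable<KeyValuePair<string, decimal>> items)
        => items.OrderBy(p => p.Key, StringComparer.OrdinalIgnoreCase).ToList();

    // Sorts any set of entries by price, cheapest first.
    public static List<KeyValuePair<string, decimal>> ByPrice(
        IEnumerable<KeyValuePair<string, decimal>> items)
        => items.OrderBy(p => p.Value).ToList();
}
```

For example, `Catalog.ByPrice(Catalog.Search(prices, "apple"))` returns the matching products cheapest first, ready to bind to a listbox. The scan touches every entry on each search, which is usually fine at 10K records.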

This assumes that product names are unique.
