Core Data insert and save is slow

Asked: 2023-01-30 16:31:25

I'm parsing data from a JSON file that has approximately 20,000 objects. I've been running the time profiler to figure out where my bottlenecks are and speed up the parse, and I've managed to reduce the parse time by 45%. However, according to the time profiler, 78% of my time is being taken by context.save(), and much of the heavy portions throughout the parse trace back to where I call NSEntityDescription.insertNewObjectForEntityForName.

Does anyone have any idea if there's any way to speed this up? I'm currently batching my saves every 5000 objects. I tried groupings of 100, 1000, 2000, 5000, and 10000, and found that 5000 was optimal on the device I'm running. I've read through the Core Data Programming Guide, but most of the advice it gives is aimed at optimizing fetching of large amounts of data, not parsing or inserting.
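For context, my batched saving follows the common import pattern of saving every N inserts and then draining the context. A simplified sketch (the `importItems` function name and `items` parameter are placeholders, not my real code; Swift 2 syntax to match the rest of the question):

```swift
import CoreData

// Sketch of a batched import. Saving every `batchSize` inserts and then
// resetting the context keeps the number of pending (dirty) objects small,
// so each save stays cheap and memory stays flat.
func importItems(items: NSArray, context: NSManagedObjectContext) throws {
    let batchSize = 5000
    for (index, item) in items.enumerate() {
        let category = NSEntityDescription.insertNewObjectForEntityForName("Categories",
            inManagedObjectContext: context) as! Categories
        category.name = item.valueForKey("name") as? String ?? ""

        if (index + 1) % batchSize == 0 {
            try context.save()
            context.reset() // release the saved objects from the context
        }
    }
    try context.save() // save the remainder of the last partial batch
}
```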

The answer could very well be that Core Data has its limitations, but I wanted to know if anyone has found ways to further optimize inserting thousands of objects.

UPDATE

As requested, here is some sample code showing how I handle parsing:

class func parseCategories(data: NSDictionary, context: NSManagedObjectContext, completion: ((success: Bool) -> Void)) {

    let totalCategories = data.allValues.count
    var categoriesParsed = 0

    for category in data.allValues {
        let privateContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
        privateContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator!
        privateContext.mergePolicy = NSMergeByPropertyStoreTrumpMergePolicy

        //Do the parsing for this iteration on the context's private background queue
        privateContext.performBlock({ () -> Void in

            guard category.valueForKey("category") is NSArray else {
                print("Fatal Error: could not parse the category data into an NSArray. This should never happen")
                completion(success: false)
                return
            }

            //insertNewObjectForEntityForName does not throw, so no do/catch is needed here
            let newCategory = NSEntityDescription.insertNewObjectForEntityForName("Categories", inManagedObjectContext: privateContext) as! Categories
            newCategory.name = category.valueForKey("name") as? String ?? ""
            newCategory.sortOrder = category.valueForKey("sortOrder") as? NSNumber ?? -1

            SubCategory.parseSubcategories(category.valueForKey("subcategories") as! NSArray, parentCategory: newCategory, context: privateContext)

            do {
                print("Num Objects Inserted: \(privateContext.insertedObjects.count)") //Num is between 3-5k
                try privateContext.save()
            } catch {
                completion(success: false)
                return
            }

            categoriesParsed += 1
            if categoriesParsed == totalCategories {
                completion(success: true)
            }
        })
    }
}

In the above code, I loop through the top-level data objects, which I call a "Category", and spin off a background context for each one so they parse concurrently. There are only three of these top-level objects, so it doesn't get too thread-heavy.

Each Category has SubCategories, plus several further levels of child objects, which together yield several thousand objects, each of which gets inserted.

My Core Data stack is configured with a single SQLite database, the standard way a project is set up when you create an app with Core Data.
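In other words, the stack is roughly the Xcode template boilerplate (the `Model.sqlite` filename here is illustrative):

```swift
import CoreData

// Roughly the Xcode-generated stack (Swift 2 era): a single SQLite store
// added to one NSPersistentStoreCoordinator, living in the AppDelegate.
lazy var persistentStoreCoordinator: NSPersistentStoreCoordinator = {
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: self.managedObjectModel)
    let url = self.applicationDocumentsDirectory.URLByAppendingPathComponent("Model.sqlite")
    do {
        try coordinator.addPersistentStoreWithType(NSSQLiteStoreType,
            configuration: nil, URL: url, options: nil)
    } catch {
        fatalError("Failed to add persistent store: \(error)")
    }
    return coordinator
}()
```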

1 Answer

#1


One reason is that you're saving the managed object context in every single iteration, which is expensive and unnecessary. Save it once, after the last item has been inserted.
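Applied to the question's code, that means inserting everything into one private context and saving a single time at the end, along these lines (a sketch in the question's Swift 2 syntax, with the subcategory parsing elided):

```swift
import CoreData

// Insert all categories into a single private-queue context,
// then perform one save for the whole import.
privateContext.performBlock {
    for category in data.allValues {
        let newCategory = NSEntityDescription.insertNewObjectForEntityForName("Categories",
            inManagedObjectContext: privateContext) as! Categories
        newCategory.name = category.valueForKey("name") as? String ?? ""
        newCategory.sortOrder = category.valueForKey("sortOrder") as? NSNumber ?? -1
        // ...parse subcategories into the same context...
    }
    do {
        try privateContext.save() // one save for the entire import
        completion(success: true)
    } catch {
        completion(success: false)
    }
}
```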
