Laravel chunk and delete

Submitted on 2019-12-24 22:33:50

Question


I have a large number of items (1M+) that I want to delete from a database. I fork a background job to take care of that, so the user won't have to wait for it to finish before carrying on with whatever he/she was doing. The problem is that the app becomes unresponsive while the items are being deleted, so I thought I would process the items chunk by chunk, sleep for a couple of seconds, then carry on.

Here is the code that handles the delete:

// laravel job class
// ...
public function handle()
{
    $posts_archive = PostArchive::find(1); // just for the purpose of testing ;)
    Post::where('arch_id', $posts_archive->id)->chunk(1000, function ($posts) {
        // go through the collection and delete every post.
        foreach($posts as $post) {
            $post->delete();
        }
        // throttle
        sleep(2);
    });
}

Expected result: the posts are chunked, each chunk is processed, the job idles for 2 seconds, and this repeats until all the items are deleted.

Actual result: a seemingly random number of items is deleted, then the process ends. No errors, no indicators, no clue why.

Is there a better way to implement this?


Answer 1:


There is nothing Laravel-specific about the way you'd handle this. It sounds like your database server needs review or optimization if a delete query running in a background job freezes the rest of the UI.

Retrieving each model and running a delete query on it individually definitely isn't a good way to optimize this, as you'd be executing millions of queries. If you want to limit the load your application generates, instead of optimizing your database server to handle the full query, you could use a do/while loop that deletes a limited batch per pass; delete() returns the number of rows it removed, so the loop exits once nothing is left:

do {
    // Delete up to 1,000 matching rows per pass; delete() returns the affected-row count.
    $deleted = Post::where('arch_id', $posts_archive->id)->limit(1000)->delete();
    sleep(2); // throttle between passes
} while ($deleted > 0);



Answer 2:


The reason your actual outcome differs from the expected outcome has to do with how Laravel chunks your dataset.

Laravel paginates through your dataset one page at a time, passing the Collection of Post models to your callback.

Since you're deleting records from the set as you go, Laravel effectively skips a page of data on each iteration, and you end up missing roughly half the data that matched the original query.

Take the following scenario, where there are 24 records that you wish to delete in chunks of 10:

Expected

+-------------+--------------------+---------------------------+
|  Iteration  |   Eloquent query   | Rows returned to callback |
+-------------+--------------------+---------------------------+
| Iteration 1 | OFFSET 0 LIMIT 10  |                        10 |
| Iteration 2 | OFFSET 10 LIMIT 10 |                        10 |
| Iteration 3 | OFFSET 20 LIMIT 10 |                         4 |
+-------------+--------------------+---------------------------+

Actual

+-------------+--------------------+----------------------------+
|  Iteration  |   Eloquent query   | Rows returned to callback  |
+-------------+--------------------+----------------------------+
| Iteration 1 | OFFSET 0 LIMIT 10  |                         10 | (« but these are deleted)
| Iteration 2 | OFFSET 10 LIMIT 10 |                          4 |
| Iteration 3 | NONE               |                       NONE |
+-------------+--------------------+----------------------------+

After the 1st iteration there were only 14 records left, so when Laravel fetched page 2 it found only 4 of them.

The result is that only 14 of the 24 records were deleted. This feels a bit random, but it makes sense in terms of how Laravel processes the data.
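
Worth noting: Laravel also provides chunkById(), which pages by the last-seen primary key (WHERE id > ?) rather than by OFFSET, so rows deleted inside the callback never shift the next page. A minimal sketch of that approach, assuming Post uses an auto-incrementing id:

Post::where('arch_id', $posts_archive->id)->chunkById(1000, function ($posts) {
    // Safe to delete here: the next chunk is selected by id, not by offset.
    foreach ($posts as $post) {
        $post->delete();
    }
    sleep(2); // throttle, as in the original job
});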

Another solution to the problem would be to process your query with a cursor. This steps through your DB result set one record at a time, which makes far better use of memory.

E.g.

// laravel job class
// ...
public function handle()
{
    $posts_archive = PostArchive::find(1); // just for the purpose of testing ;)
    $query = Post::where('arch_id', $posts_archive->id);

    // cursor() runs a single query and hydrates one model at a time,
    // keeping memory usage flat even for millions of rows.
    foreach ($query->cursor() as $post) {
        $post->delete();
    }
}

NB: the other solutions here are better if you only want to delete the records from the DB. If you have other processing that needs to occur per record, then using a cursor is the better option.




Answer 3:


If I understand correctly, the issue is that deleting a large number of entries takes too many resources, while doing it one post at a time would take too long as well.

Try getting the min and max of post.id, then looping over that range in fixed-size chunks:

$minId = Post::where('arch_id', $posts_archive->id)->min('id');
$maxId = Post::where('arch_id', $posts_archive->id)->max('id');

for ($i = $minId; $i <= $maxId; $i += 1000) {
    Post::where('arch_id', $posts_archive->id)->whereBetween('id', [$i, $i + 999])->delete();
    sleep(2);
}

Customize the chunk size and the sleep period to suit your server's resources.



Source: https://stackoverflow.com/questions/52483342/laravel-chunk-and-delete
