问题
I would like to know what is the difference between laravel chunk and laravel cursor method. Which method is more suitable to use? What will be the use cases for both of them? I know that you should use cursor to save memory but how it actually works in the backend?
A detailed explanation with example would be useful because I have searched on stackoverflow and other sites but I didn't found much information.
Here is the code snippet's from the laravel documentation.
Chunking Results
Flight::chunk(200, function ($flights) {
foreach ($flights as $flight) {
//
}
});
Using Cursors
foreach (Flight::where('foo', 'bar')->cursor() as $flight) {
//
}
回答1:
Indeed This question might attract some opinionated answer, however the simple answer is here in Laravel Docs
Just for reference:
This is chunk:
This is Cursor:
Chunk retrieves the records from the database, and load it into memory while setting a cursor on the last record retrieved so there is no clash.
So the advantage here is if you want to reformat the large record before they are sent out, or you want to perform an operation on an nth number of records per time then this is useful. An example is if you are building a view out/excel sheet, so you can take the record in counts till they are done so that all of them are not loaded into the memory at once and thereby hitting the memory limit.
Cursor uses PHP Generators, you can check the php generators page however here is an interesting caption:
While I cannot guarantee that I understand fully the concept of Cursor, but for Chunk, chunk runs the query at every record size, retrieving it, and passing it into the closure for further works on the records.
Hope this is useful.
回答2:
We have a comparison: chunk() vs cursor()
- cursor(): High Speed
- chunk(): Constant Memory Usage
10,000 records:
+-------------+-----------+------------+
| | Time(sec) | Memory(MB) |
+-------------+-----------+------------+
| get() | 0.17 | 22 |
| chunk(100) | 0.38 | 10 |
| chunk(1000) | 0.17 | 12 |
| cursor() | 0.16 | 14 |
+-------------+-----------+------------+
100,000 records:
+--------------+------------+------------+
| | Time(sec) | Memory(MB) |
+--------------+------------+------------+
| get() | 0.8 | 132 |
| chunk(100) | 19.9 | 10 |
| chunk(1000) | 2.3 | 12 |
| chunk(10000) | 1.1 | 34 |
| cursor() | 0.5 | 45 |
+--------------+------------+------------+
- TestData: users table of Laravel default migration
- Homestead 0.5.0
- PHP 7.0.12
- MySQL 5.7.16
- Laravel 5.3.22
回答3:
chunk
is based on pagination, it maintains a page number, and do the looping for you.
For example, DB::table('users')->select('*')->chunk(100, function($e) {})
will do multiple queries until the result set is smaller than the chunk size(100
):
select * from `users` limit 100 offset 0;
select * from `users` limit 100 offset 100;
select * from `users` limit 100 offset 200;
select * from `users` limit 100 offset 300;
select * from `users` limit 100 offset 400;
...
cursor
is based on PDOStatement::fetch
and Generator.
$cursor = DB::table('users')->select('*')->cursor()
foreach ($cursor as $e) { }
will do issue a single query:
select * from `users`
But the driver doesn't fetch the result set at once.
回答4:
Cursor()
- only single query
- fetch result by call
PDOStatement::fetch()
- by default buffered query is used and fetch all result at once.
- turned only current row into eloquent model
Pros
- minimize eloquent model memory overhead
- easy to manipulate
Cons
- huge result leads to out of memory
- buffered or unbuffered is a trade-off
Chunk()
- chunk query in to queries with limit and offset
- fetch result by call
PDOStatement::fetchAll
- turned results into eloquent models batchly
Pros
- controllable used memory size
Cons
- turned results in to eloquent models batchly may cause some memory overhead
- queries and memory usage is a traid-off
TL;DR
I used to think cursor() will do query each time and only keep one row result in memory. So when I saw @mohammad-asghari's comparison table I got really confused. It must be some buffer behind the scenes.
By tracking Laravel Code as below
/**
* Run a select statement against the database and returns a generator.
*
* @param string $query
* @param array $bindings
* @param bool $useReadPdo
* @return \Generator
*/
public function cursor($query, $bindings = [], $useReadPdo = true)
{
$statement = $this->run($query, $bindings, function ($query, $bindings) use ($useReadPdo) {
if ($this->pretending()) {
return [];
}
// First we will create a statement for the query. Then, we will set the fetch
// mode and prepare the bindings for the query. Once that's done we will be
// ready to execute the query against the database and return the cursor.
$statement = $this->prepared($this->getPdoForSelect($useReadPdo)
->prepare($query));
$this->bindValues(
$statement, $this->prepareBindings($bindings)
);
// Next, we'll execute the query against the database and return the statement
// so we can return the cursor. The cursor will use a PHP generator to give
// back one row at a time without using a bunch of memory to render them.
$statement->execute();
return $statement;
});
while ($record = $statement->fetch()) {
yield $record;
}
}
I understood Laravel build this feature by wrap PDOStatement::fetch(). And by search buffer PDO fetch and MySQL, I found this document.
https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php
Queries are using the buffered mode by default. This means that query results are immediately transferred from the MySQL Server to PHP and then are kept in the memory of the PHP process.
so by doing PDOStatement::execute() we actually fetch whole result rows at ones and stored in the memory, not only one row. So if the result is too huge, this will lead to out of memory exception.
Though the Document shown we could use $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
to get rid of buffered query. But the drawback should be caution.
Unbuffered MySQL queries execute the query and then return a resource while the data is still waiting on the MySQL server for being fetched. This uses less memory on the PHP-side, but can increase the load on the server. Unless the full result set was fetched from the server no further queries can be sent over the same connection. Unbuffered queries can also be referred to as "use result".
回答5:
I made some benchmark using cursor and where
foreach (\App\Models\Category::where('type','child')->get() as $res){
}
foreach (\App\Models\Category::where('type', 'child')->cursor() as $flight) {
//
}
return view('welcome');
here is the result:
来源:https://stackoverflow.com/questions/45464676/what-is-the-difference-between-laravel-cursor-and-laravel-chunk-method