PHP's memory-saving helpers: Generators, Iterators, and yield


#PHP#Laravel ORM#Generators#yield#Iterators#LazyCollection

Generator


Generators provide an easy way to implement simple iterators without the overhead or complexity of implementing a class that implements the Iterator interface.

A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate. Instead, you can write a generator function, which is the same as a normal function, except that instead of returning once, a generator can yield as many times as it needs to in order to provide the values to be iterated over.

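Generators offer a simple way to implement iterators: a foreach loop can consume the data one value at a time, instead of loading all of it into memory and running into the memory limit.

Any function whose body contains the yield keyword is a generator function.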
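To make that rule concrete, here is a minimal sketch. The function name `countTo` and the `$log` parameter are made up for illustration. It suggests two things: calling a function that contains yield returns a \Generator object (which is also an \Iterator), and the function body does not start running until something iterates over it.

```php
<?php

// Hypothetical helper for illustration: $log records when the body actually runs.
function countTo(int $n, array &$log): \Generator
{
    $log[] = 'body started';
    for ($i = 1; $i <= $n; $i++) {
        yield $i;
    }
}

$log = [];
$gen = countTo(3, $log);                 // only creates the generator object

$isGenerator = $gen instanceof \Generator;
$isIterator  = $gen instanceof \Iterator;
$logBefore   = $log;                     // snapshot: the body has not run yet

$values = iterator_to_array($gen);       // iterating is what drives the body

var_dump($isGenerator, $isIterator, $logBefore, $values, $log);
```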

yield


The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.

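A yield statement looks a lot like return. The difference is that yield does not terminate the function: it hands a value back to the caller and pauses. On the next read, execution resumes and runs up to the next yield, and this repeats until the function has no more yield statements to reach.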

<?php

function yieldDemo(): \Generator
{
    for ($i = 0; $i < 3; $i++) {
        yield $i;
    }
    yield 'A';
    yield 'B';
    yield 'C';
}

foreach (yieldDemo() as $val) {
    echo $val;
}

// output 012ABC
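foreach hides the pause-and-resume steps. The sketch below drives a generator by hand with the Generator methods current(), next(), and valid(), so each resume is visible. The function name `steps` and the echoed messages are illustrative only.

```php
<?php

function steps(): \Generator
{
    echo "running up to the first yield\n";
    yield 'A';
    echo "resumed, running up to the second yield\n";
    yield 'B';
    echo "resumed, no more yields left\n";
}

$gen = steps();

$first = $gen->current();   // starts the body and pauses at yield 'A'
$gen->next();               // resumes after 'A' and pauses at yield 'B'
$second = $gen->current();
$gen->next();               // resumes after 'B' and runs to the end
$finished = !$gen->valid(); // valid() is false once the body has returned

var_dump($first, $second, $finished);
```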

xrange


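The official manual offers an easy-to-follow example. The native range() returns an array, so calling range(1, 5000000) builds an array of five million elements, which is bound to eat a lot of memory. This is where a generator can be used to save memory.

In the example below, memory usage differs by a factor of roughly 700 between the two approaches (about 256 MB with the array versus about 0.37 MB with the generator)!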

<?php
$max = 5000000;

var_dump('start: ' . m());

foreach (xrange(1, $max) as $value) {
    if ($value == 1) var_dump('yield: ' . m());
}

foreach (range(1, $max) as $value) {
    if ($value == 1) var_dump('array: ' . m());
}

var_dump('end: ' . m());

function xrange($start, $limit, $step = 1): \Generator
{
    if ($start <= $limit) {
        if ($step <= 0) {
            throw new LogicException('Step must be positive');
        }

        for ($i = $start; $i <= $limit; $i += $step) {
            yield $i;
        }
    } else {
        if ($step >= 0) {
            throw new LogicException('Step must be negative');
        }

        for ($i = $start; $i >= $limit; $i += $step) {
            yield $i;
        }
    }
}

function m()
{
    return round(memory_get_usage() / 1024 / 1024, 5) . ' MB';
}

Output:

string(17) "start: 0.37267 MB"
string(17) "yield: 0.37347 MB"
string(19) "array: 256.37669 MB"
string(14) "end: 0.3727 MB"

Try it yourself: https://paiza.io/projects/hVn0q3N7Vkch9T23YJowdw

Real-world usage


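After all that hype, it is only fair to point at some real usage. In Laravel's ORM, for example, cursor() returns a LazyCollection, and LazyCollection is a memory-efficient collection class built on top of generators. It is worth digging into if you are interested.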
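To give a feel for how a lazy collection can sit on top of generators, here is a minimal plain-PHP sketch. The helpers `lazyMap`, `lazyFilter`, `lazyTake`, and `naturals` are hypothetical names invented for this example. This is not Laravel's implementation or API, only the general idea of chaining generators so each value flows through the whole pipeline one at a time.

```php
<?php

// Hypothetical helpers for illustration only; not Laravel's API.
function lazyMap(iterable $items, callable $fn): \Generator
{
    foreach ($items as $key => $item) {
        yield $key => $fn($item);
    }
}

function lazyFilter(iterable $items, callable $fn): \Generator
{
    foreach ($items as $key => $item) {
        if ($fn($item)) {
            yield $key => $item;
        }
    }
}

function lazyTake(iterable $items, int $limit): \Generator
{
    if ($limit <= 0) {
        return;
    }
    foreach ($items as $key => $item) {
        yield $key => $item;
        if (--$limit === 0) {
            return; // stop before pulling another value from the source
        }
    }
}

// An endless source: only safe because nothing downstream asks for everything.
function naturals(): \Generator
{
    for ($i = 1; ; $i++) {
        yield $i;
    }
}

$squares = lazyMap(naturals(), fn ($n) => $n * $n);
$even    = lazyFilter($squares, fn ($n) => $n % 2 === 0);

$result = [];
foreach (lazyTake($even, 3) as $value) {
    $result[] = $value;
}

var_dump($result); // the first three even squares
```

Because every stage is a generator, no intermediate array is ever built, which is why the pipeline can start from an infinite source.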

The same idea applies when reading large files: File::lines() also returns a LazyCollection.
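Outside of Laravel, the same line-by-line reading can be sketched with a plain generator around fgets(). The function name `readLines` is hypothetical and the temporary file only exists to make the demo self-contained. This is a rough illustration of the technique, not how File::lines() is implemented.

```php
<?php

// Hypothetical helper for illustration; not part of PHP or Laravel.
function readLines(string $path): \Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    try {
        // Only one line is held in memory at a time.
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}

// Demo data: a small temporary file standing in for a large one.
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "alpha\nbeta\ngamma\n");

$lines = [];
foreach (readLines($path) as $line) {
    $lines[] = $line;
}

unlink($path);
var_dump($lines);
```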


References:
https://www.php.net/manual/zh/language.generators.overview.php
https://www.php.net/manual/en/language.generators.syntax.php
https://docfunc.com/posts/74
https://ithelp.ithome.com.tw/articles/10194457