Can You See a Slow Spot?
Tuesday, September 24, 2024I create a project with Laravel Framework called Larva Interactions, a common use cases using Larva stack (Laravel + Livewire + AlpineJS + TailwindCSS).
Hold on! There’s a term called TALL stack. Do you know that?
Yes I know. But, I want to create a another term and Larva sounds cool!
One of use cases in Larva Interactions is Job Batching that shipped in Laravel since version 8. In this case, I use CSV file as an example for Job Batching. The flow is simple:
- Upload CSV;
- Press “import” button for import CSV records into a table in database;
- Process the CSV file behind the scene using Job Batching, keep progress of jobs using
$batch
andwire:poll
.
Ok, now I want to give you two codes and I want you analize both of them. Which one is a slow spot?
3
2
1
The answer is Code 1. The reason is we push data into array inside while
loops. Then, after while
loops finish, we chunk it per 1000 then add the chunk into job batching. For 100 to 1000 lines (exclude the header) are not the problem. But, after 1001 until million or billion lines of record it will lead a big problem! At that time, I wasn’t realize that after I see an answers from Stackoverflow about store a large dataset inside array will increase memory.
Now repeat after me 3 times.
Store data inside array then memory increased!
Store data inside array then memory increased!
Store data inside array then memory increased!
That’s why you and I need to learn fundamentals of Computer Science. Sometimes, I learn the fundamentals but I cannot figure out when to use these things in real world. Wow! I learn a thing today!
Let’s take a look in Code 2. Instead of wait while
loop finish, let’s interrupt it by set a limit named $chunkSize
. Then, push the $chunk
into the job batching when the amount of chunk matches with $chunkSize
, set $chunk
into empty array again and loop again. If there’s still any left, just execute it outside the loop. That’s it.
My friend, Wayan Jimmy told to me that I can use yield
keyword. I will quote it from PHP documentation about the purpose of yield
because it’s the best explaination.
The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.
In the end, yield
act as play and pause execution of function. It acts as it needs to. We usually see yield
in looping statement like for
, foreach
, and while
. Let’s make combine it with LazyCollection
feature in Laravel.
That’s it! Now, the question is: Do I really need to use user interface to process job batching for above than one million records or billion records? Hmm, maybe not. It takes a lot of time! Let’s process it behind the scene and just use single Job instead of Job Batching for that case.
Reference