This specific project was inherited from another developer, who choose the Laravel as the framework. While he had coded the project, small datasets were used (so it was fast), but since inheriting the project, the dataset grew as we started working with real data.
Most of the code is running efficiently, however there is this one part (which I will explain below) that is simply too slow to be usable (regular timeouts after minutes of processing - sometimes it finish within the allowed 2 minutes, sometimes it does not).
The website is a booking website. Resorts will register and list their stands (in stand groups) on the website, allowing people to book the available stands.
The part that determines if a resort have available stands for the date range selected (and if so, how many are available), is the one that is extremely slow - and the worst part is that I have no idea how to make it go faster (taking 2 minutes just for a number is way too long).
As per Laravel convention, I am running it locally on a Homestead, and push it to both a staging and public server using Git (dev branch, and master branch respectively).
Now for how the code works (not my code, it was inherited):
- Find all resorts (with the keyword in the name, or all resorts; with their stand groups).
- Get the date range start and finish from the request.
- Loop through each of the resorts to check their availability:
3.1. Get all the stand group for this resort, and loop through each of them:
3.1.1. Now we get a collection of DateRate (the table in the database that contains an entry for each stand, for every day, and is also what is used to link the stand to a booking on a date - currently this table contains ~50k rows, and this is for a single resort with about 250 stands total, for a few months) with the stands where the stand is in the current stand group and the dates in the correct range, that have a booking associated, or which is locked (dates get locked when someone choose a stand to prevent someone else to book it while one is checking out). This collection serves as the invert of the data we want.
3.1.2. Using the stand_id of the above collection (made into an array), a new query is run on the DateRate table, which excludes the dates where it is associated with any of the bookings from 3.1.1.
3.1.3. Finally the number of stands (in this stand group) is counted (it is not counted in the query, since its data is further used to see if there is a special on these stands, but that check is quick).
The main slowdown happens in the 3.1. sub-items where it queries the same table twice, to get an inverted collection and then what we need.
The code in the slow section
$unavailable_stands_booked = DateRate::with('stand')
->select(['stand_id'])
->whereHas('stand', function($query) use ($stand_group_id) {
$query->where('stand_group_id', $stand_group_id);
})
->where('date', '>=', $check_in_date)
->where('date', '<', $check_out_date)
->where(function($q) {
$q->whereNotNull('booking_id')
->orWhereNotNull('import_id')
->orWhere('unavailable', 1)
->orwhere('locked_until', '>=', Carbon::now('GMT+2')->toDateTimeString());
})
->distinct()
->get();
$unavailable_stand_ids = [];
foreach ($unavailable_stands_booked as $key => $unavailable_stand) {
$unavailable_stand_ids[$unavailable_stand->stand_id] = $unavailable_stand->stand_id;
}
$stands_available = DateRate::with('stand')
->select(['stand_id'])
->whereHas('stand', function($query) use ($stand_group_id) {
$query->where('stand_group_id', $stand_group_id);
})
->where('date', '>=', $check_in_date)
->where('date', '<', $check_out_date)
->whereNotIn('stand_id', $unavailable_stand_ids)
->distinct()
->get();
//do specials check, and count the number available
My questions are:
- How do I make it go faster? Ideally I don’t want to change too much, but speed is very important.
- What type of caching will make this work even better? These checks are unique for each visitor, and when one creates a booking, the cached data is no longer good.
- How are these types of websites typically built?
The website is hosted in a dedicated environment, with all the resources it might need, but it is simply slow.