It seems like a very simple thing, doesn’t it? After all, doing it in SQL is. But if you’re using Doctrine, the right thing to do is use the capabilities of the ORM.
Recently I had to solve a problem for a client: there had been duplicate records in the database and it was necessary to eliminate them.
The point is that not all records had to be deleted, but only the problematic ones.
What I needed to do was eliminate those transactions that, among other filter criteria, belonged to a particular provider.
The way I did it was by using a query that finds all the necessary objects and then removes them:
1 2 3 4 5 6 7 8 9 10 |
$qb = $this->em ->createQueryBuilder() ->select('t') ->from(Transaction::class, 't') ->where('t.date BETWEEN :f AND :t') ->setParameter('f', $fromDate) ->setParameter('t', $toDate) ->orderBy('t.date'); |
So far the generic query is built, only including the mandatory filters (Basically the dates).
Then, if you want to use the filter by provider, we have:
1 2 3 4 5 |
$qb->innerJoin(Account::class, 'a') ->andWhere('a.provider = :p') ->setParameter('p', $provider); |
Everything normal, right?
Well, this is where the problems started.
When I went to check on the app screen how many records corresponded to the date range and provider I found about 3000, however the script was telling me that it was going to delete about 100k transactions.
So I decided to add an option to the script to view the SQL it was about to execute, and this is what I found:
1 2 3 4 5 6 |
SELECT t0_.id AS id_1, t0_.amount AS amount_5, t0_.account_id AS account_id_10 FROM transaction t0_ INNER JOIN account a1_ WHERE (t0_.date BETWEEN ? AND ?) AND a1_.provider_id = ? ORDER BY t0_.date ASC |
Again I stayed a while looking at the SQL. Everything looked good.
What could be happening?
Well… the truth is that SQL was not so good. If you pay a little attention you will notice that the INNER JOIN is missing a detail: the ON clause.
The SQL that should have been generated should be more like:
1 2 3 4 5 6 7 |
SELECT t0_.id AS id_1, t0_.amount AS amount_5, t0_.account_id AS account_id_10 FROM transaction t0_ INNER JOIN account a1_ ON a1_.id = t0.account_id WHERE (t0_.date BETWEEN ? AND ?) AND a1_.provider_id = ? ORDER BY t0_.date ASC |
Yes. That simple detail was making a fundamental difference.
Perfect, the problem is identified. How do we fix it?
As usual, it’s about going back to the sources. The Doctrine documentation is pretty clear on this.
You can do something like:
1 2 3 4 5 |
$qb->innerJoin(Account::class, 'a', Join::ON, 't.account_id = a.id') ->andWhere('a.provider = :p') ->setParameter('p', $provider); |
And the result will be correct, but if we leave it that way we are doing extra work… and since we have an ORM, why do it?
With t being the alias of the Transaction class and the relation being defined as part of the object-relational mapping, a better version is this:
1 2 3 4 5 |
$qb->innerJoin('t.a', 'a') ->andWhere('a.provider = :p') ->setParameter('p', $provider); |
In conclusion SQL and DQL are similar but not so much. It pays to know the differences and use the best of each on each occasion.