Oracle’s way to get multiple values in a top 1 per group query


I’ve previously blogged about generic ways of writing top 1 or top n per category queries.

An Oracle-specific version in that post used the arcane KEEP syntax:

SELECT
  max(actor_id)   KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(first_name) KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(last_name)  KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id),
  max(c)          KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (
  SELECT actor_id, first_name, last_name, count(film_id) c
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  GROUP BY actor_id, first_name, last_name
) t;

This is a bit difficult to read when you see it for the first time. Think of it as a complicated way to say you want to get the first value per group. This hypothetical syntax would be much nicer:

SELECT
  FIRST(actor_id ORDER BY c DESC, actor_id),
  FIRST(first_name ORDER BY c DESC, actor_id),
  FIRST(last_name ORDER BY c DESC, actor_id),
  FIRST(c ORDER BY c DESC, actor_id)
FROM (...) t;

So, we’re getting the FIRST value of an expression per group when we order the group contents by the ORDER BY clause.

Oracle’s syntax takes into account that the ordering may be non-deterministic, leading to ties if you don’t include a unique value in the ORDER BY clause. In that case, you can aggregate all the tied rows, e.g. to get an AVG() if that makes sense in your business case. If you don’t care about ties, or if you ensure there are no ties, MAX() is an OK workaround, as is ANY_VALUE() since Oracle 21c.
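To illustrate the tie handling, here is a minimal sketch that deliberately orders by c only, so several actors may share the first rank. The column aliases are made up for this example, and the derived table is the same one as in the KEEP query above.

```sql
-- Ordering by c alone is not unique, so DENSE_RANK FIRST may keep
-- several rows. Each aggregate then runs over all of the tied rows.
SELECT
  count(actor_id) KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS tied_actors,
  min(actor_id)   KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS lowest_tied_actor_id,
  max(actor_id)   KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS highest_tied_actor_id,
  max(c)          KEEP (DENSE_RANK FIRST ORDER BY c DESC) AS top_film_count
FROM (
  SELECT actor_id, first_name, last_name, count(film_id) c
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  GROUP BY actor_id, first_name, last_name
) t;
```

If tied_actors is 1, the lowest and highest actor_id coincide and the query behaves like the tie-free version.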

Now, there’s quite a bit of repetition when you’re projecting multiple columns per group like that. Window functions have a WINDOW clause, where common window specifications can be named for repeated use. But GROUP BY doesn’t have such a feature, probably because only a few cases arise where this would be useful.
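For comparison, this is roughly what the named window specification looks like when the same result is computed with window functions instead of aggregate functions. It is only a sketch, assuming a version that supports the WINDOW clause (Oracle 21c or later); the window name w and the use of DISTINCT to collapse the identical rows are choices made for this example.

```sql
-- The ordering is written once in the WINDOW clause and reused by name.
-- Every row yields the same FIRST_VALUE results, so DISTINCT leaves one row.
SELECT DISTINCT
  FIRST_VALUE(actor_id)   OVER w AS actor_id,
  FIRST_VALUE(first_name) OVER w AS first_name,
  FIRST_VALUE(last_name)  OVER w AS last_name,
  FIRST_VALUE(c)          OVER w AS c
FROM (
  SELECT actor_id, first_name, last_name, count(film_id) c
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  GROUP BY actor_id, first_name, last_name
) t
WINDOW w AS (ORDER BY c DESC, actor_id);
```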

But luckily, Oracle has:

  • OBJECT types, which are just nominally typed row value expressions
  • ANY_VALUE, an aggregate function that returns an arbitrary value per group, which has been added in Oracle 21c

With these two utilities, we can do this:

CREATE TYPE o AS OBJECT (
  actor_id NUMBER(18),
  first_name VARCHAR2(50),
  last_name VARCHAR2(50),
  c NUMBER(18)
);

And now:

SELECT
  ANY_VALUE(o(actor_id, first_name, last_name, c))
    KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (...) t;
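The query above returns a single column of type o. To get the individual values back, the attributes can be dereferenced in an outer query. The following is a sketch; the aliases v and x are made up for this example, and Oracle requires a table alias such as x when accessing object attributes.

```sql
-- Aggregate once into an object, then unpack its attributes.
SELECT
  x.v.actor_id,
  x.v.first_name,
  x.v.last_name,
  x.v.c
FROM (
  SELECT
    ANY_VALUE(o(actor_id, first_name, last_name, c))
      KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id) AS v
  FROM (
    SELECT actor_id, first_name, last_name, count(film_id) c
    FROM actor
    LEFT JOIN film_actor USING (actor_id)
    GROUP BY actor_id, first_name, last_name
  ) t
) x;
```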

Note that MAX() could be used instead in older Oracle versions, provided you also work around this error message:

ORA-22950: cannot order objects without MAP or ORDER method
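One possible way around the error is to give the type a MAP method, as the message suggests. The following is an untested sketch: the type name o_mapped and the method name sort_key are hypothetical, and it assumes that MAX() accepts the object type once a MAP method exists. Since the ORDER BY clause is unique here, only one row is kept per group, so the value the MAP method returns hardly matters.

```sql
-- A variant of the type with a MAP method, so instances become comparable.
CREATE TYPE o_mapped AS OBJECT (
  actor_id NUMBER(18),
  first_name VARCHAR2(50),
  last_name VARCHAR2(50),
  c NUMBER(18),
  MAP MEMBER FUNCTION sort_key RETURN NUMBER
);
/

CREATE TYPE BODY o_mapped AS
  -- Map each object to a scalar; actor_id is an arbitrary choice here.
  MAP MEMBER FUNCTION sort_key RETURN NUMBER IS
  BEGIN
    RETURN actor_id;
  END;
END;
/

SELECT
  MAX(o_mapped(actor_id, first_name, last_name, c))
    KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (
  SELECT actor_id, first_name, last_name, count(film_id) c
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  GROUP BY actor_id, first_name, last_name
) t;
```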

This is just a workaround, of course. It’s tedious to manage named OBJECT types like that for every case of aggregation. If you don’t need the type safety, you can also just use JSON instead:

SELECT
  ANY_VALUE(JSON_OBJECT(actor_id, first_name, last_name, c))
    KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id)
FROM (...) t;
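As with the OBJECT type, the individual values can be extracted again in an outer query, here with JSON_VALUE. This is a sketch under a few assumptions: the aliases j and x are made up, and the keys are spelled out explicitly with VALUE so that the JSON paths don't depend on how the shorthand syntax derives key names from column names.

```sql
-- Aggregate once into a JSON object, then read the values back by path.
SELECT
  JSON_VALUE(x.j, '$.actor_id' RETURNING NUMBER) AS actor_id,
  JSON_VALUE(x.j, '$.first_name')                AS first_name,
  JSON_VALUE(x.j, '$.last_name')                 AS last_name,
  JSON_VALUE(x.j, '$.c' RETURNING NUMBER)        AS c
FROM (
  SELECT
    ANY_VALUE(JSON_OBJECT(
      'actor_id'   VALUE actor_id,
      'first_name' VALUE first_name,
      'last_name'  VALUE last_name,
      'c'          VALUE c
    )) KEEP (DENSE_RANK FIRST ORDER BY c DESC, actor_id) AS j
  FROM (
    SELECT actor_id, first_name, last_name, count(film_id) c
    FROM actor
    LEFT JOIN film_actor USING (actor_id)
    GROUP BY actor_id, first_name, last_name
  ) t
) x;
```

The trade-off compared to the OBJECT type is that the data types have to be restated in the RETURNING clauses, because the JSON document doesn't carry them.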


