Join Our Naming - myspace.com.mm

As a general rule, if you’re using an RDBMS and can solve your problem using SQL, you should solve your problem using SQL. It’s how we avoid doing joins or sorts in our application code, which is always a good thing.

But this is a general rule. And Jasmine sends us one where solving the problem as a query was a bad idea.

ALTER   FUNCTION [dbo].[GetName](@EntityID int)

RETURNS varchar(200)

AS

BEGIN

declare @Name varchar(200)

select @Name =
  case E.EntityType
    when 'Application'  then A.ApplicationName
    when 'Automation'   then 'Automated Process'
    when 'Group'        then G.GroupName
    when 'Organization' then O.OrgName
    when 'Person'       then P.FirstName + ' ' + P.LastName
    when 'Resource'     then R.ResourceName
    when 'Batch'        then B.BatchComment
  end
from Entities E
left join AP_Applications A   on E.EntityID = A.EntityID
left join CN_Groups G         on E.EntityID = G.EntityID
left join CN_Organizations O  on E.EntityID = O.EntityID
left join CN_People P         on E.EntityID = P.EntityID
left join Resources R         on E.EntityID = R.EntityID
left join AR_PaymentBatches B on E.EntityID = B.EntityID
where E.EntityID = @EntityID

return @Name

END

The purpose of this function is to look up the name of an entity. Depending on the kind of entity we’re talking about, we have to pull that name from a different table. This is a very common pattern in database normalization- a database equivalent of inheritance. All the common fields to all entities get stored in an Entities table, while specific classes of entity (like “Applications”) get their own table which joins back to the Entities table.

On the surface, this code doesn’t even really look like a WTF. By the book, this is really how you’d write this kind of function- if we were going by the book.

But the problem was that these tables were frequently very large, and even with indexes on the EntityID fields, it simply performed horribly. And since “showing the name of the thing you’re looking at” was a common query, that performance hit added up.

The fix was easy- write out seven unique functions- one for each entity type- and then re-write this function to use an IF statement to decide which one to execute. The code was simpler to understand and read, and performed much faster.

In the end, perhaps not really a WTF, or perhaps the root WTF is some of the architectural decisions which allow this to exist (why a function for getting the name, and the name alone, which means we execute this query independently and not part of a more meaningful join?). But I think it’s an interesting example of how “this is the right way to do it” can lead to some unusual outcomes.

Leave a Reply Cancel reply