We should be outraged by programmatic's 'missing 15%' because tech can solve it
A view from Dan de Sybel

We have the ability to create globally unique IDs for each impression generated by a publisher. What's stopping us from doing it?

Social media was abuzz after the recent ISBA report, and a large number of the posts seem to revolve around the report’s statement that, on average, 15% of every advertiser dollar simply goes missing. Reactions range from pure outrage to a simple shrug of indifference, the latter because discrepancies are a way of life in digital media buying.

The correct reaction is an even combination of the two. Yes, discrepancies are a way of life when doing digital media buying, but we should be outraged that more isn’t being done to mitigate them, as we have the technologies and skillsets in the industry to deal with them. What we’ve lacked is sufficient motivation.

Discrepancies arise for a variety of reasons, including (but not limited to):

  • Different counting methodologies – one supplier may charge based on auction wins, whereas another charges on impression delivery. In theory, the two counts should be the same, but auction wins are recorded server-side in data centres over fast network connections, whereas impressions need to be served to an end user’s computer, which could be hanging off the equivalent of a dial-up connection on the moon. The auction win will almost always be recorded, but the impression may never get delivered if it takes too long.

  • Data outages – managing millions of queries per second is hard and it’s fairly common for at least some impression records not to make it all the way through the data pipeline to a log-level data feed. All revenue/cost recording logic is based on real-time signals, so if records don’t make it to the log-level feed, you will see a discrepancy between the costs in the feed and the costs recorded by the supplier. This is usually relatively minimal – anything bigger than 1-2% should prompt a deeper look.

  • Currency conversions – as Julian Savitch-Lee highlighted in his response to the ISBA report, currency handling is not standardised across suppliers, so with numerous currency conversions happening across the industry all with different rates, it is unsurprising that the cost reported by one supplier is sometimes different to another for the same impression. At worst, this discrepancy can be as high as 10%, but typically it is less than 5% and should be only 1-2%.

  • Hidden fees – not all log-level data feeds are equal and fees, even fully disclosed contracted ones, are often not surfaced at these extreme levels of granularity. This can be because some of the fees are calculated only at invoice time, for example, to provide volume discounts. So while these fees are not shown for good reason, they will contribute to the 15%. When we analyse the data sets we receive from our clients, these hidden fees tend to account for only small discrepancies, usually less than 2%.

  • Undisclosed intermediaries – reselling inventory is rife across programmatic media and, for the most part, this is a good way to ensure liquidity in the marketplace, adjusting supply to meet demand wherever it surfaces. Where it falls down is when multiple resellers are involved for a single impression. Current log files only give the next link in the chain and it can be very difficult, if not impossible, to get log files from resellers that sit in the middle of these reselling chains. Since they are not contracted with either publisher or advertiser, the question of data ownership is murky and hence it is easy to deflect log-file requests, leading to missing information and thus costs. Undisclosed intermediaries tend to yield the highest levels of discrepancy, with almost all discrepancies of 15% and higher (we’ve seen up to 90%) down to reselling.

  • Advertiser, publisher, demand-side platform or supply-side platform obfuscation – sometimes the missing records of data are actually there, but identifying data gets lost from one supplier to another. I was talking to a vice-president of a medium-sized DSP recently who was horrified to find out that many of the impressions his platform bought across some SSPs were coming up with "null" as the DSP name in the SSP logs. His DSP used an SSP aggregator to help manage his throughput costs and, despite integrating according to spec, it turned out that some of the downstream SSP integrations were not so carefully engineered. This leads to difficulty in matching data and hence unknown discrepancies. This kind of obfuscation results in both positive and negative discrepancies – ie where it may look like the publisher is receiving more money than the advertiser, as well as the reverse. As such, it’s difficult to quantify the effect this has and hence we remove records from our analyses where we can’t be sure of the buyer, the seller, the main DSP and main SSP in the chain to reduce the pollution of our results.

  • Fraud – trying to cover a complex subject with a single list item will never do it justice, but a simple example is when an advertiser believes it is buying from a particular publisher and it is not – the DSP logs will show records for that publisher and the publisher’s SSP logs will not. This gets harder to detect if the advertiser is also genuinely buying inventory from the same publisher, as then it just surfaces as an unknown discrepancy. The specific type of fraud I mention above can account for anywhere from 0% to 20%.

Identifying and addressing each of these issues for any given media buy is not for the faint of heart and it genuinely takes a decent amount of industry knowledge and technical investment to come up with solutions for them (full disclosure: Fenestra, the company I work for, was literally set up to solve these issues). However, it’s not impossible and there’s a whole lot more the industry can do to make things easier.
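To make the matching problem concrete, here is a minimal sketch in Python of reconciling buy-side (DSP) and sell-side (SSP) log records. It assumes something most feeds do not offer today: that both sides carry the same impression ID. The row layout (`impression_id`, `cost`), the function name `reconcile` and the assumption that all costs are already in one currency are illustrative, not taken from any supplier's feed.

```python
from decimal import Decimal


def reconcile(dsp_rows, ssp_rows):
    """Match buy-side and sell-side log rows on a shared impression ID.

    Each row is a hypothetical dict: {"impression_id": str, "cost": Decimal}.
    Costs are assumed to have been converted to a single currency already.
    """
    dsp = {row["impression_id"]: row["cost"] for row in dsp_rows}
    ssp = {row["impression_id"]: row["cost"] for row in ssp_rows}

    matched = dsp.keys() & ssp.keys()    # seen by both buyer and seller
    dsp_only = dsp.keys() - ssp.keys()   # paid for, but absent from the seller's feed
    ssp_only = ssp.keys() - dsp.keys()   # sold, but absent from the buyer's feed

    buy_total = sum(dsp.values(), Decimal("0"))
    sell_matched = sum((ssp[i] for i in matched), Decimal("0"))

    # Buy-side spend that no matching sell-side record accounts for.
    gap = buy_total - sell_matched
    gap_pct = gap / buy_total * 100 if buy_total else Decimal("0")

    return {
        "matched": sorted(matched),
        "dsp_only": sorted(dsp_only),
        "ssp_only": sorted(ssp_only),
        "gap": gap,
        "gap_pct": gap_pct,
    }


dsp_log = [
    {"impression_id": "a", "cost": Decimal("1.00")},
    {"impression_id": "b", "cost": Decimal("1.00")},
    {"impression_id": "c", "cost": Decimal("2.00")},
]
ssp_log = [
    {"impression_id": "a", "cost": Decimal("0.90")},
    {"impression_id": "b", "cost": Decimal("0.80")},
    {"impression_id": "d", "cost": Decimal("0.50")},
]
report = reconcile(dsp_log, ssp_log)
```

The gap computed here still includes legitimate, disclosed fees, so it is a starting point for investigation rather than a measure of "missing" money. The unmatched lists are where the causes above tend to surface: data outages, obfuscated identifiers, undisclosed resellers or fraud.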

Coming up with a globally unique ID for each impression is not a difficult problem to solve (we already have many algorithms for doing this for cookie IDs), but the methodology needs to be agreed on and stuck to by every publisher. Indeed, something similar has already been proposed in the upcoming OpenRTB 3.0 spec, but that standard’s lack of backward compatibility with existing OpenRTB implementations gives industry players a convenient excuse to delay migration until a "critical mass" has started transacting using the new model.
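As an illustration of how little code the ID itself requires, the sketch below mints a random 128-bit identifier with Python's standard `uuid` module and passes it unchanged through a chain of hops. This is one possible scheme, not the one any standard prescribes; the `forward` function, the record layout and the hop names are hypothetical.

```python
import uuid


def new_impression_id() -> str:
    """Mint a random 128-bit identifier (UUID version 4) for one impression."""
    return str(uuid.uuid4())


def forward(record: dict, hop_name: str) -> dict:
    """Hypothetical supply-chain hop: appends its own name to the path
    and copies every other field, including the impression ID, untouched."""
    return {**record, "path": record["path"] + [hop_name]}


# The publisher mints the ID once, at the point the impression is generated.
impression = {"impression_id": new_impression_id(), "path": ["publisher"]}
original_id = impression["impression_id"]

# Every later party forwards the record without re-keying it.
for hop in ("ssp", "reseller", "dsp"):
    impression = forward(impression, hop)
```

The hard part is the second function, not the first: the ID only helps if every party in the chain agrees to carry it through to its own logs rather than replacing it with an internal key.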

While globally unique impression IDs don’t solve the above discrepancies, they make them far quicker and easier to identify and, perhaps more importantly, give suppliers and their clients the information they need to work together in addressing them. Coupling IDs with other industry initiatives such as the SupplyChain Object will ensure cleaner, more transparent media buying and start to rebuild trust in this highly effective media execution methodology. Living with discrepancies is inevitable – it’s why the auditing profession exists – but that shouldn’t give us an excuse to accept them wherever they appear.
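A rough sketch of how a buyer might screen a SupplyChain Object before trusting a record is shown below. The field names (`complete`, `nodes`, `asi`, `sid`) follow the IAB Tech Lab SupplyChain object as commonly documented, but treat them as assumptions and check them against the published spec; the function name `chain_issues` and the example domains are hypothetical.

```python
def chain_issues(schain: dict) -> list:
    """Return human-readable problems found in a SupplyChain-style object.

    An empty list means every hop is declared and identifiable.
    """
    issues = []

    # A value of 1 is taken to mean the chain lists every party back to the publisher.
    if schain.get("complete") != 1:
        issues.append("chain is not marked complete")

    nodes = schain.get("nodes", [])
    if not nodes:
        issues.append("no nodes listed")

    # Each hop should name the selling system ("asi") and the seller's ID on it ("sid").
    for position, node in enumerate(nodes):
        for field in ("asi", "sid"):
            if not node.get(field):
                issues.append(f"node {position} is missing '{field}'")

    return issues


clean = {
    "complete": 1,
    "nodes": [
        {"asi": "ssp.example", "sid": "pub-123"},
        {"asi": "reseller.example", "sid": "acct-9"},
    ],
}
murky = {
    "complete": 0,
    "nodes": [
        {"asi": "ssp.example", "sid": "pub-123"},
        {"asi": "reseller.example"},
    ],
}
```

Records that fail a check like this are the ones most likely to hide an undisclosed intermediary, so they are natural candidates for a log-file request or for exclusion from a buy.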

By using clean, transparent supply chains, "missing money" discrepancies should be reduced to less than 1% – maybe not immediately, but certainly within the space of a few months. Let’s not be having this same conversation in another three years' time.

Dan de Sybel is chief technology officer at Fenestra