Express data cleansing

Address management suppliers are developing a faster online data service that could rival traditional bureaux. Claire Foss finds out what this means for the sector.

Imagine a world where cleaning a database didn't involve sending files to a bureau, or uploading them to an online cleaning service. Notifications could be sent straight to the client, with updates downloaded directly into their database with a single click.

Such a product might sound like a pipe dream, but it could arrive sooner than you'd think. The service has been developed by address management provider Postcode Anywhere and, although still at a testing stage, is just one of several that have the potential to make the online data cleaning environment more competitive with traditional bureau services.

Online data services have moved on enormously in recent years. Less than a decade ago, 'online' data cleaning would have meant uploading packets of data via modem and phone line to a bureau such as UK Changes, which introduced this service in 1999 to clean phone numbers against the TPS file.

- Sophisticated data cleaning

By the year 2000, these types of products had transferred onto the internet, and online data cleaning in its current form was born. Market leaders include Experian Intact and Adare's Data360 product.

Since then, clients of all shapes and sizes, including Honda, the RNIB, Reader's Digest and Comic Relief, have embraced the concept of online data cleaning.

In the meantime, the technology on which web-enabled data cleaning relies has become ever more sophisticated. A key benefit of this method is speed.

"Once clients have uploaded their files to be cleaned and suppressed, they can be returned to them very quickly," says Paul Kennedy, head of professional services at Broadsystem. "We can clean a file of 300,000 records in 20 minutes, for example."

Clients that use internet-based services are also able to set hierarchies, enabling them to choose the order in which suppression files are applied, and to profile their customers online using systems such as Acorn.
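As a rough illustration of why the order of a suppression hierarchy matters, the sketch below attributes each record to the first suppression file that matches it. The file names, record keys and the `apply_hierarchy` function are all hypothetical and are not taken from any supplier's service.

```python
# Hypothetical suppression files: each one is a set of record keys.
SUPPRESSION_FILES = {
    "bereavement": {"A1", "A4"},
    "goneaway": {"A2", "A4"},
    "mps": {"A3"},
}


def apply_hierarchy(record_keys, hierarchy, files=SUPPRESSION_FILES):
    """Map each suppressed record to the first file in the hierarchy that hit it."""
    hits = {}
    for key in record_keys:
        for name in hierarchy:
            if key in files[name]:
                hits[key] = name
                break  # later files in the hierarchy are never consulted
    return hits


records = ["A1", "A2", "A3", "A4", "A5"]
bereavement_first = apply_hierarchy(records, ["bereavement", "goneaway", "mps"])
goneaway_first = apply_hierarchy(records, ["goneaway", "bereavement", "mps"])
```

Record A4 appears on two files, so it is attributed to whichever file comes first in the chosen order. The same records are suppressed either way, but the number of matches credited to each file changes.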

Companies are also working on ways to enable online services to offer more of what bureaux and hosted systems provide.

"We're considering launching more specific routines for our online service that would enable a client to handle different parts of the data file, such as feeds from a number of places, in a variety of ways," says Mike Fox, director of UK Changes.

This could allow clients to establish different ways of treating customer data that has arrived by email, compared with that received by post or phone.

Although online data cleaning providers can now offer a sleeker product, changes have generally been incremental, with the basic methods of uploading and automated suppression remaining essentially unchanged since the technology's inception.

But Postcode Anywhere's new notification technology could alter the way that model works in the future.

Clients that use a hosted database model have long benefited from continual updates to their files, while those that use a bureau or internet service have generally relied on periodic suppression and cleaning to maintain their database.

Postcode Anywhere's new system, however, can send alerts to users that have opted in to receive them as soon as updates for particular files are received.

It does this, with a client's permission, by collecting the unique delivery point reference number (UDPRN) allocated to each address when data is cleaned by the company and recording that number's association with the client.

The UDPRN was introduced as part of the Postcode Address File (PAF) in 2004 and, crucially, contains no personal details - so keeping it on record does not contravene data protection regulations.

- Immediate updates

As soon as updates for a significant proportion of a client's recorded UDPRNs are available, clients are notified and can download the updates immediately. The technology is in the testing stage with two undisclosed brands.
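The notification idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `UpdateNotifier` class, its method names and the threshold value are hypothetical, and the article does not say what counts as a "significant proportion" or how Postcode Anywhere implements the service.

```python
from collections import defaultdict


class UpdateNotifier:
    """Tracks which UDPRNs each client holds and flags clients worth alerting."""

    def __init__(self, threshold=0.05):
        # Assumed meaning of "significant proportion": share of a client's UDPRNs.
        self.threshold = threshold
        self.client_udprns = defaultdict(set)  # client -> UDPRNs seen when cleaning
        self.pending = defaultdict(set)        # client -> UDPRNs with updates waiting

    def record_cleaned(self, client, udprns):
        """Store only the reference numbers, never names or other personal details."""
        self.client_udprns[client].update(udprns)

    def receive_updates(self, updated_udprns):
        """Register a batch of changed UDPRNs and return the clients to notify."""
        updated = set(updated_udprns)
        notify = []
        for client, held in self.client_udprns.items():
            self.pending[client] |= held & updated
            if held and len(self.pending[client]) / len(held) >= self.threshold:
                notify.append(client)
        return sorted(notify)


notifier = UpdateNotifier(threshold=0.5)
notifier.record_cleaned("acme", [1, 2, 3, 4])
notifier.record_cleaned("bravo", [5, 6])
first_batch = notifier.receive_updates([1])
second_batch = notifier.receive_updates([2, 5])
```

In this sketch no client reaches the threshold after the first batch, while both do after the second, because pending updates accumulate between batches.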

"Technology is evolving to allow a much more intimate relationship with the customer," says Guy Mucklow, managing director of Postcode Anywhere. "It is allowing the data cleaning process to take place in a more 'real time' environment than was previously possible and businesses are realising the value in this."

So does an increasingly sophisticated online data cleaning technology spell the end of the traditional database bureau?

While online data cleaning is a convenient and quick option for many brands, those with large customer databases are still likely to find that a bureau option suits them better.

"Although online services are becoming more capable and offer almost the range of services that a bureau does, we would still recommend that larger and more complex jobs were handled by a bureau," says UK Changes' Fox.

One of the reasons for this is the complexity involved in handling diverse databases. Online data cleaning is becoming more intelligent, but this also means that the end user must assume a greater level of control over the process.

Many of the internet-based products offer a user-friendly entry-level interface for novice users, but to get the most out of some of the more complex online procedures, clients require a high level of knowledge and technical ability.

"Some of the suppliers that offer cleaning online require a strong knowledge of data and what it involves, and there are a lot of people who don't understand that," says Martin King, sales director at The REaD Group.

The accuracy of online services is another contentious issue for both clients and suppliers. Online data cleaning is a largely automated process, with queried or unmatchable files highlighted and dealt with manually by the client or account manager.

But despite manual intervention, this process, where files are compared using the same matching criteria, can leave databases vulnerable to over-suppression on the one hand and inaccurate or poor-quality matches on the other.

"Most matching technology uses a 'single key' process, where one field needs to match up. It's when you start to run multiple keys to be more accurate that you start to have degrees of correctness," says Iain Lovatt, business development director at B2B data vendor Bluesheep. "With a bureau, you can alter tolerance levels and improve match rates," he adds.

This is one area, it seems, where online will always lag behind a bureau service. "We have put a lot of effort into developing our matching routines and we are not comfortable with allowing people to change the tolerances online," says UK Changes' Mike Fox.

"Users of our service can set different levels for matching - address or family name, or full name, for example, but not the parameters that determine whether something is a match or not," he adds.

Fox says that UK Changes' bureau service still retains this flexibility and can alter its parameters in difficult circumstances, such as where information fields are missing or incomplete.
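The difference between single-key and multi-key matching, and the effect of changing the tolerance, can be sketched as follows. This is a minimal example rather than any supplier's matching routine: the field names, the sample records and the use of Python's `difflib` similarity ratio as a score are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

FIELDS = ("surname", "address", "postcode")  # hypothetical match keys


def normalise(value):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def single_key_match(a, b, key="postcode"):
    """Match on one field only: quick, but prone to over-suppression."""
    return normalise(a[key]) == normalise(b[key])


def multi_key_score(a, b, fields=FIELDS):
    """Average similarity across several fields, between 0.0 and 1.0."""
    ratios = [
        SequenceMatcher(None, normalise(a[f]), normalise(b[f])).ratio()
        for f in fields
    ]
    return sum(ratios) / len(ratios)


def multi_key_match(a, b, tolerance=0.9):
    """Treat two records as a match when their score reaches the tolerance."""
    return multi_key_score(a, b) >= tolerance


customer = {"surname": "Smith", "address": "1 High Street", "postcode": "WR1 1AA"}
neighbour = {"surname": "Jones", "address": "1 High Street", "postcode": "WR1 1AA"}
misspelt = {"surname": "Smyth", "address": "1 High Street", "postcode": "WR1 1AA"}
```

Here the single-key test would suppress Smith because Jones at the same postcode is on the file, whereas the multi-key score rejects that pair. The misspelt surname matches at a tolerance of 0.9 but not at 0.95, which is the kind of adjustment the suppliers quoted above reserve for their bureau services.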

Online data cleaning has a long way to go before it catches up with the services offered by data bureaux. But as the medium continues to develop and internet speeds increase, clients who use the online self-service model could find that their data hygiene process becomes a lot easier.

COMMENT


Database management system advances are allowing data companies to offer a more scaleable solution to the end client. We should see most SME companies managing their data using purely online data services.

David Arrowsmith, customer intelligence strategy manager, SAS

Online data cleaning is becoming more important because the corporate board recognises that they have a responsibility to look after their data, and there are strong punitive measures if it is not done correctly.

Peter Smith, director, TMN Group

The internet can bring data collection, cleaning and use together in near real-time. This puts the user in the driving seat because with the right CRM tools, they can look for patterns themselves in click-response data to further expand a customer's profile.


Data cleaning services provide marketers with full control of their data, but often lack the ability to highlight problems or recommend alternative approaches. So users must be familiar with data cleansing processes to make it work.



Product: Cameo Online

Supplier: EuroDirect

Cameo Online cleans, profiles and enhances data from 30 countries in a single online system. International data can be cleaned online as it is being profiled against CAMEO geodemographic characteristics.

Product: Pure Lead

Supplier: EDR

This cleans and dedupes data acquired for email campaigns and calculates potential ROI by comparing cost and likely open rates from different sources.

Product: Suppression Online

Supplier: The REaD Group

Suppression Online dedupes and cleans uploaded data against the Bereavement Register, the Gone Away Suppression (GAS) file, the Postcode Address File and the National Suppression File. Clients include smaller firms, such as the Hartlepool and District Hospice.

Companies can choose which suppression files to buy, based on cost and matches. There are no licence fees or processing costs.

Product: Experian Intact

Supplier: Experian Integrated Marketing

Intact allows customers to clean, suppress and enhance files online. Users vary from large blue-chips to small business owners and include Boden, Help the Aged and Nottingham Forest Football Club.

Intact was one of the first online data cleaning services on the market, and Experian has recently updated the product to add a free data audit as part of the service.

Intact has also recently been incorporated into Experian Integrated Marketing. This means clients that use several Experian products, including Intact, only have to deal with a single account manager.



- Preference service TPS

What it is: A register of UK consumers who have opted out of receiving unsolicited phone calls. Run by the Direct Marketing Association.

- Preference service MPS

What it is: A list of UK households that have opted out of receiving direct mail. Also run by the DMA.

- Preference service FPS

What it is: A register of numbers opted out of receiving fax marketing. Predominantly aimed at businesses, but individuals can also register.

- Goneaway service GAS

What it is: Compiled by The REaD Group, GAS is a list of people who have moved from a known address. GAS Reactive is a sister file that lists new known addresses.

- Goneaway service NCOA

What it is: A register of homemovers that have used Royal Mail's redirection service. The Business Changes File is the corporate equivalent.

- Goneaway service Xpression

What it is: Formerly known as the National Suppression File, Xpression is formed of several files including GAS and NCOA.

- Goneaway service disConnect

What it is: A file managed by Equifax that lists homemovers, deceased and opted-out consumers.

- Business suppression BSF

What it is: Run by The REaD Group, this file identifies company details that have altered due to factors such as closure and employee changes.

- Business suppression CTPS

What it is: The Corporate TPS is run by the DMA and is a file of firms that have registered to opt out of unsolicited calls.

- Bereavement file TBR

What it is: A REaD Group file of deceased persons.

- Bereavement file Mortascreen

What it is: A file of deceased persons compiled by Smee & Ford and marketed by agency Millennium.

- Bereavement file Baby MPS

What it is: A service for families who have lost infants and wish to stop baby-related mailings.