Data masking is a method of creating a structurally similar but inauthentic version of an organization’s data that can be used for purposes such as software testing and user training. The purpose is to protect the actual data while having a functional substitute for occasions when the real data is not required.
Although most organizations have stringent security controls in place to protect production data in storage or in business use, that same data is sometimes used in less secure operations, such as testing and training. The issue is often compounded if these operations are outsourced and the organization has less control over the environment. In the wake of compliance legislation, most organizations are no longer comfortable exposing real data unnecessarily.
In data masking, the format of the data remains the same; only the values are changed. The data may be altered in a number of ways, including encryption, character shuffling, and character or word substitution. Whichever method is chosen, the values must be changed in a way that makes it infeasible to detect or reverse engineer the original data.
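As a rough illustration of two of the methods named above, the following Python sketch shows character substitution and character shuffling that keep the format of a value intact. The function names `substitute_chars` and `shuffle_chars` are hypothetical and do not come from any vendor product. The sketch also assumes that "format" means length, separator positions, and character class (digit, uppercase, lowercase). It uses Python's general-purpose `random` module for simplicity; a real masking tool would likely need a cryptographically strong source of randomness and further safeguards, since a random replacement can coincide with the original character.

```python
import random
import string


def substitute_chars(value: str, rng: random.Random) -> str:
    """Replace each digit or letter with a random one of the same class.

    Separators such as '-' or '@' are left untouched, so the masked
    value has the same length and layout as the original.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep punctuation and whitespace in place
    return "".join(out)


def shuffle_chars(value: str, rng: random.Random) -> str:
    """Shuffle the alphanumeric characters among themselves.

    Non-alphanumeric characters keep their positions, so only the
    order of the letters and digits changes.
    """
    chars = [ch for ch in value if ch.isalnum()]
    rng.shuffle(chars)
    shuffled = iter(chars)
    return "".join(next(shuffled) if ch.isalnum() else ch for ch in value)


if __name__ == "__main__":
    rng = random.Random()  # unseeded: output differs on every run
    print(substitute_chars("123-45-6789", rng))
    print(shuffle_chars("AB-1234", rng))
```

Substitution discards the original characters entirely, whereas shuffling keeps the same set of characters in a new order. For short values, shuffling alone may therefore leave the original easier to guess.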
Vendors of data masking products include Compuware, dataguise, IBM, Informatica and Oracle.
See also: extract, transform and load (ETL)
This was last updated in October 2009