Data Masking

by Martha Kendall Custard
Data masking allows organizations to share data safely. Learn what it is, what types there are, and how to use it.

What is data masking?

Data masking is a method of protecting sensitive data in use from unintended exposure by obfuscating it while preserving its functional value. Techniques include substituting parts of datasets, shuffling or scrambling the data, translating specific numbers to ranges, and more. A common use case is masking certain data shown to call center representatives. For example, replacing customers’ birth dates with age ranges (such as 30 to 50 years old) protects the sensitive birth date while keeping the age information useful to the call center employee.
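The birth date example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ten-year bucket size are hypothetical choices, not part of any particular masking product.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Return the age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def mask_birth_date(birth_date: date, today: date, bucket_size: int = 10) -> str:
    """Replace an exact birth date with a coarse age range such as '30-39'."""
    age = age_on(birth_date, today)
    lower = (age // bucket_size) * bucket_size
    return f"{lower}-{lower + bucket_size - 1}"


# The representative sees only the range, never the date itself.
print(mask_birth_date(date(1985, 6, 15), today=date(2024, 1, 1)))  # 30-39
```

A wider bucket hides more about the individual but is less useful to the employee, so the bucket size is a trade-off to set per use case.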

Types of data masking

Types of data masking vary depending on how the original values are organized. The main types include:

  • Static: Creates one sanitized version of the database by altering all sensitive information. A backup of the production database is created and moved to a different location. After unnecessary data is removed, the remaining information is masked while at rest. Once this is complete, the new copy can be safely distributed.
  • Deterministic: Maps two data sets so they have the same type of data, with each value consistently replaced by the corresponding value. For example, the term “Verbena” would always be replaced by the term “Amina.” This method can be convenient but isn’t the most secure.
  • On-the-fly: Useful in a development environment, this type masks data as it is transferred from production systems to development systems, before it is saved. Instead of creating a backup, data is automatically masked while continuously streaming from production to the desired destination.
  • Dynamic: Whereas on-the-fly masking stores the masked information in a secondary data store in the development environment, dynamic data masking streams the data directly from production to the development environment, masking it as it is read without storing a separate copy.
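As a rough illustration of the deterministic type, the sketch below always maps the same input name to the same replacement. The replacement pool and the function name are hypothetical, and choosing the replacement from an unkeyed hash is just one simple way to get a consistent mapping.

```python
import hashlib

# Hypothetical pool of replacement names, for illustration only.
REPLACEMENT_NAMES = ["Amina", "Jonas", "Priya", "Mateo", "Ingrid"]


def mask_name(name: str) -> str:
    """Map a name to a replacement that is the same on every call."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(REPLACEMENT_NAMES)
    return REPLACEMENT_NAMES[index]


# The same input always produces the same masked value.
first = mask_name("Verbena")
second = mask_name("Verbena")
print(first == second)  # True
```

The consistency that makes this convenient is also its weakness: anyone who knows the function and the pool can test guesses against the masked values, which fits the note above that this type isn’t the most secure.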

Benefits of data masking

Data masking is a process that keeps sensitive information away from prying eyes while in use. Organizations using this strategy experience the following security benefits:

  • Proactive security measure: Helps organizations avoid critical threats like data loss, exfiltration, account compromise, insecure interfaces, and insider threats.
  • Safer cloud adoption: Some organizations might be hesitant to operate in the cloud due to potential security risks. Masking eases these concerns by reducing the amount of sensitive data exposed there.
  • Usable, low-risk data: While useless to attackers, masked data is still functional for the organization’s internal use.
  • Safe sharing: Data sets can be shared with testers and developers without exposing the original, unmasked sensitive details.

Data masking techniques

Organizations can choose from various masking techniques, which differ in method and level of security. The most common techniques include:

  • Encryption: Renders the data useless unless the viewer has the encryption key. This technique is the most secure, as it uses an algorithm to mask the data fully. It’s also the most complicated, as it relies on technology such as encryption software and on ongoing security measures.
  • Scrambling: Rearranges characters in a randomized order. This method is simple but not as secure as encryption.
  • Nulling: Presents specific values as missing (null) when viewed by certain users.
  • Value variance: Conceals original values by presenting the result of a function instead, such as the difference between the highest and lowest values in a series.
  • Substitution: Values are replaced with fake details that seem realistic. For example, names might be replaced by a random selection of other names.
  • Shuffling: Instead of replacing data values with fake alternatives, the actual values within the set are shuffled to represent existing records while safeguarding sensitive information.
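Several of the simpler techniques above can be sketched with Python’s standard library. The function names, sample records, and role names below are hypothetical and meant only to show the idea behind each technique, not a production implementation. Encryption is left out because it is better handled by a vetted cryptography library.

```python
import random


def scramble(value: str, rng: random.Random) -> str:
    """Scrambling: rearrange the characters of a value in random order."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def null_out(record: dict, field: str, viewer_role: str, allowed_roles: set) -> dict:
    """Nulling: show the field as missing unless the viewer's role is allowed."""
    masked = dict(record)
    if viewer_role not in allowed_roles:
        masked[field] = None
    return masked


def value_range(values: list) -> float:
    """Value variance: expose only the spread (highest minus lowest)."""
    return max(values) - min(values)


def substitute(values: list, pool: list, rng: random.Random) -> list:
    """Substitution: replace each value with a realistic fake from a pool."""
    return [rng.choice(pool) for _ in values]


def shuffle_column(records: list, field: str, rng: random.Random) -> list:
    """Shuffling: redistribute the real values of one field across records."""
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]


rng = random.Random(0)  # seeded only so the demo is repeatable
customers = [
    {"name": "Verbena", "salary": 52000},
    {"name": "Amina", "salary": 61000},
    {"name": "Jonas", "salary": 48000},
]

print(scramble("4111111111111111", rng))
print(null_out(customers[0], "salary", "support", {"finance"}))
print(value_range([c["salary"] for c in customers]))  # 13000
print(substitute([c["name"] for c in customers], ["Priya", "Mateo", "Ingrid"], rng))
print(shuffle_column(customers, "salary", rng))
```

Shuffling keeps column-level statistics intact because every real value still appears exactly once, while substitution breaks the link to real values entirely.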

Data masking best practices

Certain measures help ensure data masking processes are effective. For the best results, follow these precautions:

  • Plan ahead: An organization should identify the information that requires protection before beginning the masking process. It should also determine who will be authorized to view specific details, where the data will be stored, and which applications will be involved.
  • Prioritize referential integrity: Each type of information should be masked using one standard algorithm. Large businesses may not be able to use a single masking tool, but all masking tools should be synchronized so that data can be shared across departments without issue.
  • Secure the algorithms: Algorithms, alternative data sets, and keys must be secured to prevent unauthorized users from reverse engineering sensitive information.
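The last two practices can be illustrated together. In the sketch below, a keyed hash (HMAC-SHA256) masks a customer identifier the same way in two tables, so joins still work after masking, and the mapping cannot be reproduced without the key. The table layout, the mask_id helper, and the hard-coded demo key are assumptions for illustration. A real deployment would load the key from a secured secret store.

```python
import hashlib
import hmac


def mask_id(value: str, key: bytes) -> str:
    """Pseudonymize an identifier with a keyed hash, truncated for readability."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


# Assumption: demo key only. In practice, keep the key out of source code.
key = b"demo-key-not-for-production"

customers = [{"customer_id": "C-1001", "name": "Verbena"}]
orders = [
    {"order_id": "O-1", "customer_id": "C-1001"},
    {"order_id": "O-2", "customer_id": "C-1001"},
]

# Apply the same algorithm and key to every table that holds the identifier.
masked_customers = [{**c, "customer_id": mask_id(c["customer_id"], key)} for c in customers]
masked_orders = [{**o, "customer_id": mask_id(o["customer_id"], key)} for o in orders]

# The masked tables still join on customer_id.
matches = [
    o["order_id"]
    for o in masked_orders
    if o["customer_id"] == masked_customers[0]["customer_id"]
]
print(matches)  # ['O-1', 'O-2']
```

If two departments masked the same identifier with different algorithms or keys, the masked tables would no longer line up, which is why synchronizing masking tools matters.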

Martha Kendall Custard

Martha Kendall Custard is a former freelance writer for G2. She creates specialized, industry-specific content for SaaS and software companies. When she isn't freelance writing for various organizations, she is working on her middle grade WIP or playing with her two kitties, Verbena and Baby Cat.
