As software integration architects and developers, we are entrusted with data by our companies and clients. This data typically ranges from non-sensitive to top-secret. The sensitivity of a piece of data often determines which security practices a development team uses when building an integration solution.
Functional and non-functional requirements can also drive the decision to expose all, part, or none of a sensitive field while the data is in transit or at rest. One way to hide all or part of a field is masking.
Mule 4 offers the following ways to mask sensitive fields:
DataWeave Functions
JSON Logger - Mask fields
Anypoint Security (RTF) - Tokenization Service
This article is the third in a series that focuses on security around Mule applications. The previous article focused on validating the data integrity of a downloaded file. If you are interested in the previous post, feel free to check it out here. This article will focus on keeping data confidential using masking functions in DataWeave.
Video
If you are interested in the video version, feel free to watch the embedded video.
Article
Let's begin by introducing two DataWeave functions that can be of use, and then two scenarios in which we can use them.
Introduction to Mask and Replace DataWeave Functions
Mask: DataWeave offers the mask function, part of the dw::util::Values module. It replaces the value of the selected field with a masked version throughout the object or collection.
Replace: DataWeave offers the replace function, part of the dw::Core module. It replaces the portion (substring) of a String that matches a regular expression with another String.
Now let's look at two use cases where we can see these functions in action.
Use Case: Mask the Entire United States Social Security Number (SSN)
The Social Security number is a major identification number in the United States. It is very sensitive: if someone gets hold of a person's number, they can do significant damage to that person's identity. The following example shows how to mask an SSN.
Example Payload:
The demonstration provided in this post will come from the following payload.
[
  {
    "firstName": "Staci",
    "middleInitial": "A",
    "lastName": "Cane",
    "dateOfBirth": "02/03/1999",
    "ssn": "000-00-0000"
  },
  {
    "firstName": "Temi",
    "middleInitial": "O",
    "lastName": "Bukola",
    "dateOfBirth": "10/11/2020",
    "ssn": "111-11-1111"
  }
]
Solution: Mask Function
One way we can satisfy this use case is by using the mask function. To review, the mask function replaces the value of the selected field with a masked version throughout the object or collection.
To get started, place the following import statement above the "---" separator in the DataWeave script.
import * from dw::util::Values
There are multiple ways to locate the ssn fields in an object or collection. Option 1 is to use the field name. Option 2 is to use the field() path element. The following examples yield the same result: both replace the value of every field named "ssn" with "***-**-****".
Option 1: Mask by field name
DWL Script
%dw 2.0
output application/json
import * from dw::util::Values
---
payload mask "ssn" with "***-**-****"
Option 2: Mask by path element
DWL Script
%dw 2.0
output application/json
import * from dw::util::Values
---
payload mask field("ssn") with "***-**-****"
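If more than one field needs masking, the same approach can be repeated. The following is a minimal sketch, assuming we also want to hide dateOfBirth. That requirement and the "**/**/****" placeholder are made up for illustration and are not part of the original use case. The parentheses make it explicit that the second mask runs on the result of the first.

```dataweave
%dw 2.0
output application/json
import * from dw::util::Values
---
// First mask every "ssn" field, then mask every "dateOfBirth" field in that result.
// The "**/**/****" placeholder is an illustrative assumption.
(payload mask "ssn" with "***-**-****") mask "dateOfBirth" with "**/**/****"
```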
Now, what happens if you want to mask only a portion of the field?
Use Case: Mask a Portion of the United States Social Security Number
It's typical to mask an entire SSN except the last four digits. An employee helping a customer can use the last four digits to verify the customer's identity without divulging too much information. Let's take a deeper look at mask and introduce replace.
Example Payload
The demonstrations provided in this post will come from the following payload.
[
  {
    "firstName": "Staci",
    "middleInitial": "A",
    "lastName": "Cane",
    "dateOfBirth": "02/03/1999",
    "ssn": "000-00-0000"
  },
  {
    "firstName": "Temi",
    "middleInitial": "O",
    "lastName": "Bukola",
    "dateOfBirth": "10/11/2020",
    "ssn": "111-11-1111"
  }
]
Solution 1: Replace Function
To review, the replace function takes a portion (substring) of a String based on a regular expression and replaces it with another String.
Example
This example uses the map function to apply replace to the ssn field of each object in the collection, masking everything except the last four digits.
DWL Script
%dw 2.0
output application/json
---
payload map {
    firstName: $.firstName,
    middleInitial: $.middleInitial,
    lastName: $.lastName,
    dateOfBirth: $.dateOfBirth,
    ssn: $.ssn replace /^[0-9]{3}-[0-9]{2}/ with "***-**"
}
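Listing every field works, but the list has to be kept in sync with the payload. As a hedged alternative, assuming the application runs on DataWeave 2.3 or later (where the update operator is available), the sketch below rewrites only ssn and leaves the other fields untouched. The parameter name "person" is just an illustrative choice.

```dataweave
%dw 2.0
output application/json
---
payload map ((person) ->
    // update copies the object and rewrites only the field selected by ".ssn".
    person update {
        case ssn at .ssn -> ssn replace /^[0-9]{3}-[0-9]{2}/ with "***-**"
    }
)
```

This version still relies on map, so it is a variation on Solution 1, not a replacement for it.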
Solution 2: Mask + Replace Functions
Can we do this without applying the map function to our collection? This is where "$" comes in handy. Inside the "with" helper function, $ refers to the current value, so we can perform an operation on it.
Example
This example uses a combination of mask and replace: it selects all "ssn" values and masks everything except the last four digits. Instead of supplying the "with" helper function with a String value, we pass in an expression. The expression takes the current value ("$") and performs a replace based on the provided regular expression. The regular expression matches three digits at the start of the value, followed by a "-", followed by two more digits. If a match is found, the matching text is replaced by "***-**", leaving the last four digits exposed.
DWL Script
%dw 2.0
output application/json
import * from dw::util::Values
---
payload mask "ssn" with ($ replace /^[0-9]{3}-[0-9]{2}/ with "***-**")
We did this without using the map function.
Note: This example assumes validation is already completed on the payload and the ssn has a valid format.
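If that validation cannot be taken for granted, one possible safeguard is to fall back to a full mask whenever a value does not look like an SSN. The following is a minimal sketch, assuming every ssn value is a String and that the expected format is three digits, two digits, and four digits separated by "-". The maskSsn function is a hypothetical helper written for this example, not something DataWeave provides.

```dataweave
%dw 2.0
output application/json
import * from dw::util::Values

// Hypothetical helper: keeps the last four digits only when the whole value
// matches the assumed NNN-NN-NNNN format; otherwise hides the entire value.
fun maskSsn(value) =
    if (value matches /[0-9]{3}-[0-9]{2}-[0-9]{4}/)
        (value replace /^[0-9]{3}-[0-9]{2}/ with "***-**")
    else
        "***-**-****"
---
payload mask "ssn" with maskSsn($)
```

The matches function tests the regular expression against the entire String, so a malformed value such as one with extra characters would fall through to the full mask instead of leaking part of its content.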
Conclusion
Well that's all folks! Together we have looked at two functions to help us achieve confidentiality through masking. Let me know what you think.