Do you have an important file and want to know whether it has been tampered with or corrupted during transmission? You can find out by validating the file's data integrity. This post is the second in a series on security around Mule applications. The previous post focused on hash functions in the DataWeave crypto library; if you're interested, feel free to check it out here. This post discusses the two Checksum operations in Mule 4's Cryptography module.
What is a checksum?
A checksum is a value produced by running data through a cryptographic hash function. The function takes a string (or the contents of a file) and produces a fixed-length hash value. With a strong algorithm such as SHA-256, two different inputs are extremely unlikely to produce the same value. If the checksum of the original file is known, a copy of the file can be validated against it.
In practice, the sender runs the file through a hash function utility and obtains a checksum value. This value is provided to the receiver. Once the file is downloaded, the receiver runs it through a hash function utility as well and compares the result with the checksum of the original file.
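To make the idea of a hash value concrete, here is a minimal Python sketch that sits outside the Mule project. It uses the standard library's hashlib; the function name `sha256_hex` and the sample inputs are illustrative assumptions, not part of the original demo.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"abc"
long_input = b"abc" * 100_000

# The same input always yields the same value.
print(sha256_hex(short_input))
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

# The digest length is fixed (64 hex characters for SHA-256),
# regardless of how large the input is.
print(len(sha256_hex(short_input)), len(sha256_hex(long_input)))  # 64 64
```

Because the output is deterministic, the sender and the receiver can each compute it independently and compare results.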
Validation Outcomes
A successful validation occurs when the expected checksum (checksum of the original file) matches the actual checksum of the copy of the file that was sent to the receiver.
An unsuccessful validation occurs when the expected checksum (checksum of the original file) does not match the actual checksum of the copy of the file that was sent to the receiver.
The receiver will use the validation results to determine if the data integrity of the file has been upheld.
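The two outcomes can be sketched in a few lines of Python. This is only an illustration of the comparison, assuming SHA-256 and lowercase hex strings; `validate_checksum` and the sample byte strings are hypothetical names and data, not the Mule operation itself.

```python
import hashlib


def validate_checksum(content: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 of content equals expected_hex (lowercase hex)."""
    actual_hex = hashlib.sha256(content).hexdigest()
    return actual_hex == expected_hex


# The sender publishes the checksum of the original file.
original = b"original file contents"
expected = hashlib.sha256(original).hexdigest()

# Successful validation: the received copy is byte-for-byte identical.
print(validate_checksum(b"original file contents", expected))  # True

# Unsuccessful validation: the received copy differs by a single character.
print(validate_checksum(b"original file contentz", expected))  # False
```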
How does the receiver know the expected checksum?
The sender will provide the receiver with the expected checksum. In cases where the receiver downloads a file from a website, the sender may provide the expected checksum value as part of the content on the website. The screenshot below is the SHA-256 checksum that MuleSoft provides to developers downloading AnypointStudio-7.8.0-win64.zip. The highlighted portion of the screenshot represents the initial checksum value assigned to the file. In this scenario the checksum value is shown in plain sight.
If the receiver is downloading the file from an Application Programming Interface (API), the sender may include a checksum value in the response headers or body associated with the request. Check out these APIs:
How can we calculate and validate checksums in Mule 4?
As mentioned earlier, Mule offers a Cryptography module that lets developers build the ability to calculate and validate checksums into their flows. Let's get started!
Installation
To use the Cryptography module, first download the module from Exchange. Inside your project, locate the Mule Palette in Anypoint Studio. Click "Search in Exchange," search for "Cryptography Module," select "Add >," and then click "Finish."
Check the Mule Palette to ensure the "Crypto" module exists in the palette.
Notice that the following dependency snippet is added to the pom.xml file. Once the project is built, this dependency should be downloaded into your local .m2 repository.
<dependency>
    <groupId>com.mulesoft.modules</groupId>
    <artifactId>mule-cryptography-module</artifactId>
    <version>1.3.6</version>
    <classifier>mule-plugin</classifier>
</dependency>
Demo Time
During my research of checksums, I had the opportunity to view many great articles. My demo takes those articles into account and puts that knowledge into the context of Mule 4. Please review the last section entitled, "Want to learn more about checksums?" in this post to review my original sources. Also, you may wish to follow the steps below to recreate this project:
1. To start, create and save a simple Word document. This represents the original file that the sender will provide.
2. Then, duplicate that file and make a slight modification to the copy. Save the copy as your tampered file.
3. Outside of Mule, use a checksum utility to calculate the hash value of both files. This tutorial uses the Windows PowerShell utility Get-FileHash. To recreate what I did with Get-FileHash, use the following syntax:
Get-FileHash <absolute_file_path>
Notice that, despite only a slight difference in the color of the "IO" between the two files, the checksums (hash values) are completely different.
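If PowerShell is not available, the same value can be computed with a short script. The following is a minimal Python sketch, assuming SHA-256 (the Get-FileHash default) and uppercase hex output to match how Get-FileHash displays the hash. The function name `file_sha256` and the chunk size are illustrative choices, and the path is the one used in the properties file in this post.

```python
import hashlib
from pathlib import Path


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks and return uppercase hex, as Get-FileHash displays it."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


if __name__ == "__main__":
    # Path assumed from this post's properties file; point it at your own file.
    sample = Path("C:/Temp/sample.docx")
    if sample.exists():
        print(file_sha256(str(sample)))
```

Running it against the original and the tampered file should show two different values, just as Get-FileHash does.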
4. Place the following contents, including the original file's hash value, in a yaml (properties) file.
http:
  port: "8081"
  base_path: "/checksum"
  original_path: "/original"
  tampered_path: "/tampered"
doc:
  original_file_path: "C:/Temp/sample.docx"
  tampered_file_path: "C:/Temp/sample_tampered.docx"
  checksum: "2D0874D2E4829A66E9607FE3714C1776E585E3F6E4CD4C06662FF22A3B6CF0E6"
*NOTE: Depending on the location of your files, you may need to modify the file paths.
5. Create original copy file flow.
<flow name="original-copy-checksum-flow"
    doc:id="dc102ef8-6b6a-4fcd-8393-9b3047baa300">
    <http:listener doc:name="Listener"
        doc:id="ddf7a56d-101b-42a7-9850-d1c8b8cba80b"
        config-ref="HTTP_Listener_config" path="${http.original_path}" />
    <file:read doc:name="Original Copy"
        doc:id="3058aa9c-ebbf-4724-8892-e417ec6c84a2"
        path="${doc.original_file_path}" />
    <!-- Calculate the checksum of the file and store it in vars.actualChecksum -->
    <crypto:calculate-checksum
        doc:name="Calculate checksum"
        doc:id="2585404a-47a5-413d-9be2-209842bf413a" target="actualChecksum" />
    <logger level="INFO" doc:name="Actual File Checksum"
        doc:id="ed08a39d-7fb4-4693-8556-463aebe43d37"
        message='#["The received file actual checksum: " ++ vars.actualChecksum]' />
    <!-- Compare against the expected value from the properties file, lowercased to match the module's output -->
    <crypto:validate-checksum
        doc:name="Validate checksum"
        doc:id="ba1755f1-720f-46a1-b6df-d962e1b8b90c"
        expected="#[lower(p('doc.checksum'))]" />
    <!-- Log the expected value itself rather than the calculated one -->
    <logger level="INFO" doc:name="Expected File Checksum"
        doc:id="39d67f80-59f7-4181-b882-7cd791bf498e"
        message='#["The received file expected checksum from the original file: " ++ lower(p("doc.checksum"))]' />
    <error-handler>
        <on-error-propagate enableNotifications="true"
            logException="true" doc:name="On Error Propagate"
            doc:id="c0ef90b0-97b8-4f7f-9a78-70fa548cbfd5">
            <logger level="ERROR" doc:name="Error Message"
                doc:id="25e5e62c-6957-449f-952b-ecb9c5131e96"
                message='#["Invalid Checksum. This file has an error or was tampered with by a 3rd party"]' />
        </on-error-propagate>
    </error-handler>
</flow>
6. Create tampered file flow.
<flow name="tampered-checksum-flow"
    doc:id="52ede646-4a03-47d8-84ad-d973b0bbd4cb">
    <http:listener doc:name="Listener"
        doc:id="e338c9d8-6cdc-4776-b882-2920f33cf491"
        config-ref="HTTP_Listener_config" path="${http.tampered_path}" />
    <file:read doc:name="Tampered File"
        doc:id="ae732123-848e-47bf-a7d5-5eea0478c4ca"
        path="${doc.tampered_file_path}" />
    <!-- Calculate the checksum of the tampered file and store it in vars.actualChecksum -->
    <crypto:calculate-checksum
        doc:name="Calculate checksum"
        doc:id="fbb1e55c-9b48-44ec-81b0-3ec50afeeb8c" target="actualChecksum" />
    <logger level="INFO" doc:name="Actual File Checksum"
        doc:id="5df047c8-ba0e-487e-a292-d99da31146d7"
        message='#["The received file actual checksum: " ++ vars.actualChecksum]' />
    <!-- Validation fails here for the tampered file, so the logger below is not reached -->
    <crypto:validate-checksum
        doc:name="Validate checksum"
        doc:id="e15a1c2d-00a5-4280-b2be-8aa5b24d44f4"
        expected="#[lower(p('doc.checksum'))]" />
    <logger level="INFO" doc:name="Expected File Checksum"
        doc:id="e8144249-b4be-499e-a1a6-609b05dd255e"
        message='#["The received file expected checksum from the original file: " ++ lower(p("doc.checksum"))]' />
    <error-handler>
        <on-error-propagate enableNotifications="true"
            logException="true" doc:name="On Error Propagate"
            doc:id="a7631f3e-10db-494a-a3f4-8501e9b14479">
            <logger level="ERROR" doc:name="Error Message"
                doc:id="0a455358-3931-42cc-9e1c-5918a598e3f5"
                message='#["Invalid Checksum. This file has an error or was tampered with by a 3rd party"]' />
        </on-error-propagate>
    </error-handler>
</flow>
7. Run the Mule Application.
8. Now, let's look at the scenario where the checksums match. Invoke the original file flow.
curl http://localhost:8081/checksum/original
The "Calculate Checksum" operation calculates the checksum of the file and stores the value in the "actualChecksum" variable. The "Validate Checksum" operation validates the actual checksum against the expected checksum.
Notice that the two checksum values match except for letter case. This is because different hash utility tools may write their hexadecimal strings in uppercase or lowercase. Hexadecimal values are case insensitive, so the case has no bearing on the actual binary hash value (which must match). This is why the flow lowercases the expected value before validating. Check out a Stack Overflow post on this subject here.
Below are the log statements from the project to prove the checksum values match.
INFO 2021-02-12 09:34:27,774 [[MuleRuntime].uber.01: [waio-crypto-module-demo].original-copy-checksum-flow.CPU_INTENSIVE @49047c46] [processor: original-copy-checksum-flow/processors/2; event: 5d6d3f20-6d3f-11eb-9cee-5800e3cf50ad] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: The received file actual checksum: 2d0874d2e4829a66e9607fe3714c1776e585e3f6e4cd4c06662ff22a3b6cf0e6
INFO 2021-02-12 10:11:00,416 [[MuleRuntime].uber.02: [waio-crypto-module-demo].original-copy-checksum-flow.CPU_INTENSIVE @49047c46] [processor: original-copy-checksum-flow/processors/4; event: 5d6d3f20-6d3f-11eb-9cee-5800e3cf50ad] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: The received file expected checksum from the original file: 2d0874d2e4829a66e9607fe3714c1776e585e3f6e4cd4c06662ff22a3b6cf0e6
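The case difference between the properties file (uppercase, from Get-FileHash) and the log output (lowercase) can be checked outside of Mule as well. Below is a small Python sketch that compares the two strings by their underlying bytes. The helper name `same_checksum` is an illustrative assumption; the two values are copied from this post.

```python
def same_checksum(hex_a: str, hex_b: str) -> bool:
    """Compare two hex checksums by their underlying bytes, ignoring letter case."""
    # bytes.fromhex accepts both uppercase and lowercase hex digits.
    return bytes.fromhex(hex_a) == bytes.fromhex(hex_b)


# Value as stored in the properties file (uppercase, from Get-FileHash).
from_powershell = "2D0874D2E4829A66E9607FE3714C1776E585E3F6E4CD4C06662FF22A3B6CF0E6"
# Value as logged by the Mule flow (lowercase).
from_mule = "2d0874d2e4829a66e9607fe3714c1776e585e3f6e4cd4c06662ff22a3b6cf0e6"

print(from_powershell == from_mule)               # False: plain string comparison
print(same_checksum(from_powershell, from_mule))  # True: same 32 bytes
```

The flows in this post take the simpler route of normalizing the expected value with `lower(p('doc.checksum'))` before validation.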
9. Next, let's look at the case where a file is tampered with by a third party or downloaded with an error. Invoke the tampered flow.
curl http://localhost:8081/checksum/tampered
Similar to the original flow, this flow calculates the checksum of the file (this time the tampered file) and stores it in a variable called "actualChecksum". Then the flow validates the actual checksum against the expected checksum.
Notice that, prior to validating the checksum, the "Calculate Checksum" operation returns a checksum that is different from the expected checksum in the YAML file.
The "validate checksum" operation throws the following error.
org.mule.runtime.core.internal.message.ErrorBuilder$ErrorImplementation
{
description=Checksum does not match (SHA_256), expected '2d0874d2e4829a66e9607fe3714c1776e585e3f6e4cd4c06662ff22a3b6cf0e6' but was '25e9a5240a504cb9ae25a8a69d39b26dd277cecd94648e21d60fb7cb14d90e83'
detailedDescription=Checksum does not match (SHA_256), expected '2d0874d2e4829a66e9607fe3714c1776e585e3f6e4cd4c06662ff22a3b6cf0e6' but was '25e9a5240a504cb9ae25a8a69d39b26dd277cecd94648e21d60fb7cb14d90e83'
errorType=CRYPTO:VALIDATION
cause=org.mule.runtime.api.exception.MuleRuntimeException
errorMessage=-
childErrors=[]
}
The expected checksum as provided by the sender does not match the actual checksum received by the receiver. This could indicate there was an error in the file OR a 3rd party has tampered with the file prior to download.
This concludes the demonstration. Thank you for reading this post. Check out the source code here.
Interested in watching a presentation on this topic?
View my presentation entitled, "Has your file been tampered with? A practical way to ensure file integrity in Mule 4" at the Charlotte MuleSoft Meetup Group, presented in July 2021.
Want to learn more about checksums? Check out these resources: