However, sometime in April 2022, Microsoft changed the way the obfuscation worked, and the unobfuscation part of the parser stopped working.
What changed?
OneDrive now appears to encrypt the data, and ObfuscationStringMap.txt is no longer used. The file may still exist on older installations, but newer ones include a different file.
Figure 1 - Contents of \AppData\Local\Microsoft\OneDrive\logs\Business1 folder
As seen in Figure 1 above, there is a new file called general.keystore. The file is plain JSON, so it is easy to read, and it appears to hold the key needed to decrypt the encrypted content, stored as a base64-encoded string.
Figure 2 - Sample general.keystore contents
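As a rough illustration of reading such a file, here is a minimal Python sketch. It assumes the keystore is a JSON object (or a list of such objects) and that the base64 key sits under a field named "Key". That field name and the sample data are assumptions made for this example, so check them against your own general.keystore.

```python
import base64
import json


def load_keystore_key(keystore_text, key_field="Key"):
    """Return the raw key bytes from general.keystore JSON text.

    Assumptions (not confirmed by the text above): the JSON is an object,
    or a list whose first element is an object, and the base64 key lives
    under `key_field`. The default field name is hypothetical.
    """
    data = json.loads(keystore_text)
    entry = data[0] if isinstance(data, list) else data
    return base64.b64decode(entry[key_field])


# Made-up sample in the assumed layout; this is not real OneDrive data.
sample = json.dumps(
    [{"Key": base64.b64encode(bytes(range(16))).decode("ascii")}]
)
key = load_keystore_key(sample)
print(len(key) * 8)  # key size in bits; prints 128 for this sample
```

A 16-byte result would be consistent with a 128-bit AES key.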
Time for some Reverse Engineering
With a little bit of digging around with IDA Pro on the LoggingPlatform.dll file from OneDrive, we can see the BCrypt Windows APIs being used in this file. Note, this is not the bcrypt hash algorithm which bears the same name!
Figure 3 - BCrypt* Imports in LoggingPlatform.dll
Jumping to where these functions are used, it is quickly apparent that the encryption used is AES in CBC (Cipher Block Chaining) mode with a key size of 128 bits.
Figure 4 - IDA Pro Disassembly
In the above snippet, we can see the call to BCryptOpenAlgorithmProvider and then, if successful, a call to the BCryptSetProperty function, which has the following syntax:
NTSTATUS BCryptSetProperty([in, out] BCRYPT_HANDLE hObject, [in] LPCWSTR pszProperty, [in] PUCHAR pbInput, [in] ULONG cbInput, [in] ULONG dwFlags);
Without delving into too many boring assembly details, I'll skip to the relevant parts...
For each string to be encrypted, OneDrive initialises a new encryption object with the key stored in the general.keystore file, encrypts the string, and then disposes of the encryption object. The encrypted blob is then base64 encoded and written to the log as the obfuscated string. There are a few other quirks along the way, such as replacing the characters / and + with _ and - respectively. Both / and + can appear in base64 text but also have special meaning in URLs, so substituting them keeps the log parseable later.
Why the change?
In the previous iteration of ODL (when the ObfuscationStringMap was used), the same key (a three-word combination) was often repeated in the file, making it difficult or impossible to know which value to use as its replacement to recover the original string.
Encrypting in place instead of using a lookup table does appear to be a more robust scheme, and it eliminates the above issue. It does use more disk space, because AES is a block cipher and the encrypted blob will always be a multiple of 16 bytes (128 bits). In other words, it is inefficient for short strings (less than 10 bytes), which still occupy a full 16-byte block.
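To put rough numbers on that overhead, the following Python sketch estimates the stored size for a given plaintext length. It assumes PKCS#7-style padding, where 1 to 16 bytes are always added. The padding scheme is not stated above, so treat the figures for exact multiples of 16 as an assumption.

```python
BLOCK = 16  # AES block size in bytes


def logged_length(plain_len):
    """Return (ciphertext bytes, base64 characters) for a plaintext length.

    Assumption: PKCS#7-style padding, so the plaintext is always extended
    to the next multiple of 16, and an exact multiple gains a full block.
    """
    cipher_len = (plain_len // BLOCK + 1) * BLOCK
    base64_len = 4 * ((cipher_len + 2) // 3)  # padded base64 length
    return cipher_len, base64_len


for n in (5, 15, 16, 40):
    print(n, logged_length(n))
```

Under these assumptions, a 5-byte string becomes a 16-byte blob and 24 base64 characters in the log.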