{"id":13631,"date":"2026-06-16T04:03:32","date_gmt":"2026-06-16T04:03:32","guid":{"rendered":"https:\/\/serisec.com\/index.php\/2026\/06\/16\/33072\/"},"modified":"2026-06-16T04:03:32","modified_gmt":"2026-06-16T04:03:32","slug":"33072","status":"publish","type":"post","link":"https:\/\/serisec.com\/index.php\/2026\/06\/16\/33072\/","title":{"rendered":"Evil MSI Background: BASE64 Statistical Analysis, (Mon, Jun 15th)"},"content":{"rendered":"<div>\n<p>I like it when a fellow handler posts a diary entry about images with malicious content. The latest one is Xavier&#8217;s: &#8220;<a href=\"https:\/\/isc.sans.edu\/diary\/The%20Evil%20MSI%20Background%20is%20Back!\/33054\">The Evil MSI Background is Back!<\/a>&#8221;.<\/p>\n<p>I like to have a go at the sample with my tools, and see if there are any improvements I can make to my tools.<\/p>\n<p>Let&#8217;s take a look at the bytes present in this suspicious JPEG file, using my tool <a href=\"https:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/byte-stats.py\">byte-stats.py<\/a>:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-192621.png?ssl=1\" style=\"width: 993px; height: 637px;\"><\/p>\n<p>The results: almost half of the content (45.65%) is BASE64 characters, and the longest BASE64 string is 1000 characters.<\/p>\n<p>And the longest string is almost 1 million characters long.<\/p>\n<p>Let&#8217;s take a look with <a href=\"https:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a>:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-192927.png?ssl=1\" style=\"width: 993px; height: 491px;\"><\/p>\n<p>The longest BASE64 
string is indeed 1000 characters long but doesn&#8217;t seem to decode to anything recognizable.<\/p>\n<p>A special encoding must have been used, and this is something you typically figure out by looking at the script or program that extracts and decodes the payload from this JPEG file.<\/p>\n<p>But what if you don&#8217;t have that script, and you only have the JPEG file?<\/p>\n<p>Then you need a bit of skill and luck to figure out what encoding was used.<\/p>\n<p>You can try out all the encodings supported by <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a>:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260612-054205.png?ssl=1\" style=\"width: 993px; height: 163px;\"><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260612-054245.png?ssl=1\" style=\"width: 993px; height: 777px;\"><\/p>\n<p>We see long BASE85-encoded strings, but still no string close to 1 million characters. 
So this must be a custom encoding.<\/p>\n<p>To try to figure out what custom encoding is used, I&#8217;ve added a --stats option to <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a>:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193034.png?ssl=1\" style=\"width: 993px; height: 521px;\"><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193107.png?ssl=1\" style=\"width: 993px; height: 521px;\"><\/p>\n<p>We see that all BASE64 characters appear in the detected BASE64 strings, but that the letter A appears significantly less often than the other letters.<\/p>\n<p>If we use a minimum length for the detected BASE64 strings, the letter A is missing altogether:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193129.png?ssl=1\" style=\"width: 993px; height: 521px;\"><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193200.png?ssl=1\" style=\"width: 1000px; height: 521px;\"><\/p>\n<p>Notice that the = character is also missing, but the = character is a padding character in BASE64, not a normal character: it can only appear once or twice at the end of a BASE64 string.<\/p>\n<p>So this statistics feature of <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a> helps us detect that we might be dealing with a custom encoding, based on BASE64, where the letter A has been replaced with another character. Which character would that be? 
Let&#8217;s take another look at our first analysis:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-192621.png?ssl=1\" style=\"width: 993px; height: 637px;\"><\/p>\n<p>The character # is the most frequent, so A has probably been replaced with #.<\/p>\n<p>Let&#8217;s try that out:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193505.png?ssl=1\" style=\"width: 993px; height: 631px;\"><\/p>\n<p>Still no success.<\/p>\n<p>Let&#8217;s run <a href=\"https:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/byte-stats.py\">byte-stats.py<\/a> again:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193613.png?ssl=1\" style=\"width: 993px; height: 639px;\"><\/p>\n<p>This time we have a very long BASE64 string, almost 1 million characters long. But why isn&#8217;t <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a> detecting it?<\/p>\n<p><a href=\"https:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/byte-stats.py\">byte-stats.py<\/a> looks for longest strings, for example the longest string of consecutive BASE64 characters. But it doesn&#8217;t check whether that string&#8217;s length is a multiple of 4 (a requirement for BASE64), while <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a> does check this.<\/p>\n<p>So there must still be some kind of encoding we haven&#8217;t figured out. 
Let&#8217;s take a look at the string:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193710.png?ssl=1\" style=\"width: 993px; height: 777px;\"><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-193942.png?ssl=1\" style=\"width: 993px; height: 777px;\"><\/p>\n<p>If you are a bit familiar with BASE64 encoding, you will notice that the string has been reversed: == appears at the beginning, and not at the end. And the end is &#8230;qVT, which is TVq reversed, and TVq is the BASE64 encoding of MZ, i.e., the start of a Windows executable.<\/p>\n<p>So let&#8217;s reverse the encoded payload with <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/translate.py\">translate.py<\/a>:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-194011%281%29.png?ssl=1\" style=\"width: 993px; height: 213px;\"><\/p>\n<p>That&#8217;s indeed a PE file. And it has the same hash as the file Xavier extracted:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/isc.sans.edu\/diaryimages\/images\/20260611-194137.png?ssl=1\" style=\"width: 993px; height: 201px;\"><\/p>\n<p>This new feature of <a href=\"http:\/\/github.com\/DidierStevens\/DidierStevensSuite\/blob\/master\/base64dump.py\">base64dump.py<\/a>, --stats, can help with the reversing of custom encodings by providing statistics on the characters used in the encoding.<\/p>\n<p>\u00a0<\/p>\n<p>Didier Stevens<br \/>\nSenior handler<br \/>\n<a href=\"http:\/\/blog.didierstevens.com\/\">blog.DidierStevens.com<\/a><\/p>\n<p> (c) SANS Internet Storm Center. 
https:\/\/isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.<\/p><\/div>\n<p><a href=\"https:\/\/isc.sans.edu\/diary\/rss\/33072\">Go to isc.sans.edu<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Evil MSI Background: BASE64 Statistical Analysis, (Mon, Jun 15th) I like it when a fellow handler posts a diary entry about images with malicious content. Last one is Xavier: &#8220;The Evil MSI Background is Back!&#8220;. I like to have a go at the sample with my tools, and see if there are any improvements I [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[56],"tags":[69],"class_list":["post-13631","post","type-post","status-publish","format-standard","hentry","category-isc-sans-edu","tag-isc-sans-edu"],"_links":{"self":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/13631"}],"collection":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/comments?post=13631"}],"version-history":[{"count":0,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/posts\/13631\/revisions"}],"wp:attachment":[{"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/media?parent=13631"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/categories?post=13631"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/serisec.com\/index.php\/wp-json\/wp\/v2\/tags?post=13631"}],"curies":[{"name":"wp","href":"https:\/\/api
.w.org\/{rel}","templated":true}]}}