Static Malware Analysis of a PDF Document
In this blog, I discuss a static analysis approach for malicious PDF files. PDF files are a common attack vector: a malicious PDF can arrive as an email attachment or be served from a web page. The document carries a malicious payload that may execute when the file is opened and then call out to a secondary payload.

In this blog we will perform a step-by-step static analysis of a malware sample and work toward its root cause analysis (RCA).
Step: 1) First, we calculate the hash value of the sample and check its reputation on a threat intelligence (TI) platform. I have one malicious PDF for analysis.
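If you prefer to script the hashing step, here is a minimal sketch in Python. The function name file_hashes is my own choice for illustration, and it computes MD5, SHA-1 and SHA-256 together because TI platforms commonly accept any of these.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return MD5, SHA-1 and SHA-256 hex digests of the file at `path`."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


# Example use (assumes badpdf.pdf is in the current folder):
#   for name, value in file_hashes("badpdf.pdf").items():
#       print(name, value)
```

Any of the resulting digests can then be pasted into the TI platform's search box.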

Now check the reputation of the hash “99a84407ad137c16c54310ccf360f89999676520” on VirusTotal.

According to the TI result, the file is highly malicious and appears to have JavaScript embedded in it. Once the PDF is opened, the JavaScript will execute in the background.
Step: 2) For further investigation, I will check what strings the badpdf.pdf file contains. Go to the mal_pdf folder and run the command
$ strings -a badpdf.pdf | less

After pressing Enter, we get this output. At the top we see the header “%PDF-1.3”, which shows the PDF version. We also see the keyword OpenAction, which means that once the PDF is opened it will run the embedded JavaScript payload in the background.
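The same quick triage can be approximated in Python. This is only a rough sketch of what strings does (it treats runs of printable ASCII as strings); the names printable_strings and triage are hypothetical helpers, not part of any tool used in this blog.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def triage(data):
    """Report the PDF header line and whether /OpenAction appears anywhere."""
    strings = printable_strings(data)
    header = next((s for s in strings if s.startswith("%PDF-")), None)
    return {"header": header, "has_open_action": b"/OpenAction" in data}


# Example use (assumes badpdf.pdf is in the current folder):
#   with open("badpdf.pdf", "rb") as handle:
#       print(triage(handle.read()))
```

A hit on /OpenAction is only a lead for the next steps, since legitimate PDFs can use it too.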

Step: 3) Now we check the metadata of this sample. Nothing suspicious is found.


Step: 4) Collect information about the PDF using the command “peepdf badpdf.pdf”

The file has only a few objects, and peepdf flags a vulnerability along with its CVE number, which suggests the sample may target a PDF reader that is vulnerable to this CVE.
Step: 5) Now I use the pdfid command, which scans a file and counts occurrences of certain PDF keywords.

According to the result, three JavaScript objects are present and multiple suspicious keywords are detected. JavaScript is embedded in this PDF, and OpenAction is used to force the PDF to execute something when it is opened.
Step: 6) For further investigation, we look at the “OpenAction” content using the command
“pdf-parser.py --search openaction badpdf.pdf”
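To show the idea behind keyword counting, here is a simplified sketch in Python. It is not pdfid itself: the keyword list is a small subset I picked for illustration, and unlike the real tool this sketch does not handle names obfuscated with hex escapes.

```python
import re

# A small, illustrative subset of PDF names often treated as suspicious.
KEYWORDS = ["/JS", "/JavaScript", "/OpenAction", "/AA", "/Launch", "/EmbeddedFile"]


def count_keywords(data):
    """Count occurrences of each keyword in the raw PDF bytes."""
    counts = {}
    for keyword in KEYWORDS:
        # Require the name to end here, so /JS does not match a longer name.
        pattern = re.compile(re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])")
        counts[keyword] = len(pattern.findall(data))
    return counts


# Example use (assumes badpdf.pdf is in the current folder):
#   with open("badpdf.pdf", "rb") as handle:
#       print(count_keywords(handle.read()))
```

Non-zero counts for /JS, /JavaScript and /OpenAction together are the combination worth a closer look.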

As shown, this data is present under the OpenAction keyword. It opens JS and appears to call the zfnvkWYOKv() function.
Step: 7) For further investigation, we look at the JavaScript content. Using pdf-parser.py, we find the JavaScript objects and their content with the command
“pdf-parser.py --search javascript badpdf.pdf”


It is clear that there are three JavaScript objects (1, 7 and 12), all containing JS. Objects 7 and 12 reference objects 10 and 13.
Step: 8) Now we look at objects 10 and 13 for further investigation. First we check the content of object 10.

We see that objects 10 and 12 both ultimately lead to object 13, so the final JavaScript is inside object 13.
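Following references by hand works for a small file like this one. For larger samples, a sketch like the following can list which objects point their /JS entry at another object. It assumes uncompressed object bodies and simple "/JS N G R" references, and js_references is a hypothetical helper name.

```python
import re

# Matches "N G obj ... endobj" and captures the object number and body.
OBJ_RE = re.compile(rb"(\d+)\s+\d+\s+obj(.*?)endobj", re.DOTALL)
# Matches an indirect reference used as the value of /JS, e.g. "/JS 13 0 R".
JS_REF_RE = re.compile(rb"/JS\s+(\d+)\s+\d+\s+R")


def js_references(data):
    """Map each object number to the object numbers its /JS entries point at."""
    refs = {}
    for match in OBJ_RE.finditer(data):
        targets = [int(number) for number in JS_REF_RE.findall(match.group(2))]
        if targets:
            refs[int(match.group(1))] = targets
    return refs


# Example use (assumes badpdf.pdf is in the current folder):
#   with open("badpdf.pdf", "rb") as handle:
#       print(js_references(handle.read()))
```

Objects that embed the script inline, rather than by reference, will not appear in this map and still need to be read with pdf-parser.py.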

Step: 9) Since the malicious script is inside object 13, we now check the content of object 13.

FlateDecode means that zlib compression was used, and the stream length is 1183 bytes. By default, pdf-parser does not display the decoded stream content.
For further investigation, we check the decoded raw content of object 13.
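To make the FlateDecode point concrete, here is a minimal sketch of what decoding such a stream involves: find the bytes between the stream and endstream keywords and inflate them with zlib. It assumes a single FlateDecode filter with no predictor or filter chain, and inflate_streams is a name I made up for this example.

```python
import re
import zlib

# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(obj_bytes):
    """Return the zlib-inflated content of every stream found in `obj_bytes`."""
    decoded = []
    for match in STREAM_RE.finditer(obj_bytes):
        try:
            # decompressobj ignores the end-of-line bytes left after the data.
            decoded.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not zlib data, e.g. an uncompressed or differently filtered stream
    return decoded


# Example use (assumes badpdf.pdf is in the current folder):
#   with open("badpdf.pdf", "rb") as handle:
#       for content in inflate_streams(handle.read()):
#           print(content[:200])
```

In practice pdf-parser.py does this for you when asked to apply filters; the sketch only shows what happens underneath.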


As we can see, the content is obfuscated and needs proper formatting.
Step: 10) We will try to format this obfuscated content. Dump the data of object 13 to the file obj13.js using the command “pdf-parser.py --object 13 -f -w -d obj13.js badpdf.pdf”

Step: 11) Now I open the file obj13.js. As we can see, the JavaScript code is obfuscated using CharCode.

We will decode the shellcode of the function.
Step: 12) We use a modified SpiderMonkey to run the JavaScript file directly. Three new files are created: the ASCII, binary and Unicode representations of the JavaScript file.
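Before running anything, character-code obfuscation can often be partly undone statically. The sketch below assumes the script uses literal String.fromCharCode(...) calls, which is my assumption about what "CharCode" means here. It replaces each such call with the decoded text, and decode_charcodes is a hypothetical helper name.

```python
import re

# Matches String.fromCharCode(...) calls with a simple argument list.
CHARCODE_RE = re.compile(r"String\.fromCharCode\(([^)]*)\)")


def decode_charcodes(js_source):
    """Replace literal String.fromCharCode(...) calls with the decoded string."""

    def replace(match):
        codes = [part.strip() for part in match.group(1).split(",") if part.strip()]
        try:
            # Base 0 accepts both decimal (101) and hex (0x65) literals.
            text = "".join(chr(int(code, 0)) for code in codes)
        except ValueError:
            return match.group(0)  # variables or expressions: leave untouched
        return repr(text)

    return CHARCODE_RE.sub(replace, js_source)


# Example use (assumes obj13.js was dumped as in Step 10):
#   with open("obj13.js", encoding="latin-1") as handle:
#       print(decode_charcodes(handle.read()))
```

Calls whose arguments are computed at run time are left as they are, which is where executing the script in an instrumented engine becomes necessary.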


Now we look at the binary version of the shellcode to find IOCs, using hexdump to see the ASCII representation as well.

In the highlighted part we see something interesting: an indicator of compromise (IOC). An IOC is evidence that a machine has been infected by malware. We can also use the strings command to see the strings in this binary file.
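Pulling URL-like IOCs out of a binary dump can also be scripted. The sketch below assumes the URL is stored as plain ASCII in the shellcode, as it appears to be in the hexdump view. The function name extract_urls is hypothetical, and the file name in the usage comment is a placeholder, not the actual name SpiderMonkey produces.

```python
import re

# Matches http, https or ftp URLs made of printable, non-space ASCII bytes.
URL_RE = re.compile(rb"(?:https?|ftp)://[\x21-\x7e]+")


def extract_urls(blob):
    """Return unique URLs found in `blob`, in order of first appearance."""
    seen = []
    for match in URL_RE.finditer(blob):
        url = match.group().decode("ascii")
        if url not in seen:
            seen.append(url)
    return seen


# Example use ("shellcode.bin" is a placeholder for the binary output file):
#   with open("shellcode.bin", "rb") as handle:
#       for url in extract_urls(handle.read()):
#           print(url)
```

URLs that are encoded or built at run time will not be found this way, so an empty result does not mean the shellcode has no network IOC.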

Finally, this is the URL from the JavaScript in object 13 that the shellcode tries to reach in order to download a secondary payload.
Result:

In our organization, we will check whether anybody has reached out to this URL and also block this IOC on perimeter security devices.