Mafia DTA file format. Reversing the encryption and packing algorithm. Creating a DTA unpacker in C++.
Skills: Asm, C++, basics of OllyDbg and hex editing
Time: ….
Tools: OllyDbg, IDA (if you have IDA…), a hex editor (e.g. 010 Editor)
Affected: Mafia, Chameleon, Hidden & Dangerous 2
Content.
- Prologue
- Preparation
- Part I. Basics
- Part II. Code tracing. Unpacking data
- Part III. Structure analyzing. Coding the dta unpacker
Prologue.
This document is meant as a detailed guide to the basics of understanding an unknown file format: reversing the encryption and packing algorithms inside an executable, and then writing an unpacker in C++.
Second note: we will base our analysis on Chameleon. I don’t know much about the game itself, but its executable and DLL are quite small, so I can provide them with this tutorial. Mafia and Hidden & Dangerous 2 use the same (or nearly the same) DLLs, so it doesn’t matter which game is used.
Preparation.
I assume you have already taken a quick look at our target, Chameleon’s setup files, which consist of a .dta archive, the executable module ChameleonSetup.exe and rw_data.dll (there are other files, but they are not interesting in our case).
Examining the .dta in a hex editor gives nothing except the file signature, which is “ISD1”.
As we can see, all data inside the archive is packed or even encrypted, and without reversing the executable files in detail we cannot get anything valuable out of it.
Part I. Basics
So, shall we begin?
From here on I will explain the major ways (call them basic steps) of retrieving, catching and tracking the data an application loads. These steps apply to almost any application, and once you get the idea you will be able to use them on other targets in the future.
Of course, if you already know how to set breakpoints on a “CreateFile” call, how to search for strings and so on, just skip these steps and go straight to the sections below.
Open “ChameleonSetup.exe” in OllyDbg. Let’s try to find the functions responsible for working with the .dta archive. That means we should find code that initializes the archive, or any place where “.dta” appears.
Choose “Search for -> All referenced text strings” in the context menu,
then scroll the string list to the top and choose “Search for text” from the context menu.
Enter “.dta”, uncheck “Case sensitive” and press OK. Now set a breakpoint (BP) on each address that references “.dta”.
You can limit yourself to the following 4 BPs:
Code:
Address=00405749, Text string=UNICODE "\isdata.dta"
Address=00405764, Text string=UNICODE "isdata.dta"
Address=00405951, Text string=UNICODE "\isdata.dta"
Address=0040596C, Text string=UNICODE "isdata.dta"
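OllyDbg does the string search for us, but the idea is easy to reproduce outside the debugger. Below is a minimal sketch, assuming the whole executable has been loaded into a byte vector. The helper name findUtf16NoCase is hypothetical (it is not part of any tool mentioned here), and it only handles ASCII characters stored as UTF-16LE, which is how the UNICODE strings above are stored.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the offsets of every case-insensitive match of an ASCII needle
// stored as UTF-16LE (each character followed by a zero byte).
std::vector<std::size_t> findUtf16NoCase(const std::vector<std::uint8_t>& image,
                                         const std::string& needle)
{
    std::vector<std::size_t> hits;
    const std::size_t bytes = needle.size() * 2;
    if (needle.empty() || image.size() < bytes)
        return hits;

    for (std::size_t pos = 0; pos + bytes <= image.size(); ++pos) {
        bool match = true;
        for (std::size_t i = 0; i < needle.size() && match; ++i) {
            const unsigned char lo = image[pos + 2 * i];
            const unsigned char hi = image[pos + 2 * i + 1];
            match = hi == 0 &&
                    std::tolower(lo) ==
                        std::tolower(static_cast<unsigned char>(needle[i]));
        }
        if (match)
            hits.push_back(pos);
    }
    return hits;
}
```

Calling findUtf16NoCase(image, ".dta") returns file offsets, not virtual addresses, so the hits still have to be mapped to addresses before setting breakpoints.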
But to be sure of success, we need additional BPs on the kernel32 functions to catch all file I/O calls.
Using the command box, enter the following lines one by one:
- BP CreateFileW
- BP ReadFile
- BP SetFilePointer
A short remark about each function:
CreateFileW - Creates or opens an object, and returns a handle that can be used to access that object.
Code:
HANDLE CreateFileW
(
LPCWSTR filename, //[In] pointer to filename to be accessed.
DWORD access, //[In] access mode requested.
DWORD sharing, //[In] share mode.
LPSECURITY_ATTRIBUTES sa, //[In] pointer to security attributes.
DWORD creation, //[In] how to create the file.
DWORD attributes, //[In] attributes for newly created file.
HANDLE template //[In] handle to file with extended attributes to copy.
)
ReadFile - Reads data from the specified file or input/output (I/O) device. Reads occur at the position specified by the file pointer if supported by the device.
Code:
BOOL WINAPI ReadFile(
__in HANDLE hFile,
__out LPVOID lpBuffer,
__in DWORD nNumberOfBytesToRead,
__out_opt LPDWORD lpNumberOfBytesRead,
__inout_opt LPOVERLAPPED lpOverlapped
);
SetFilePointer - Moves the file pointer of the specified file.
Code:
DWORD WINAPI SetFilePointer(
__in HANDLE hFile,
__in LONG lDistanceToMove,
__inout_opt PLONG lpDistanceToMoveHigh,
__in DWORD dwMoveMethod
);
BPs on these functions give us ample opportunity to catch nearly everything, e.g. we can inspect the data right after it is read.
What we have at this point: breakpoints in the functions that manipulate .dta archive names. Even if these BPs never trigger, we will definitely catch the file access routines thanks to the BPs on the system functions.
Note: searching for strings with archive names or archive extensions is a common way of identifying “archive functions”. It works in about 70% of cases; in the other 30% we need breakpoints on the kernel functions.
Run the application by pressing F9. We immediately break on CreateFileW.
Code:
7C8107F0 > $ 8BFF mov edi, edi
7C8107F2 . 55 push ebp
7C8107F3 . 8BEC mov ebp, esp
7C8107F5 . 83EC 58 sub esp, 0x58
7C8107F8 . 8B45 18 mov eax, dword ptr [ebp+0x18]
7C8107FB . 48 dec eax
7C8107FC . 0F84 46FF0100 je 7C830748
On the stack we have all the parameters the function was called with.
Code:
0012D110 7C801A53 /CALL to CreateFileW from kernel32.7C801A4E
0012D114 7FFDFC00 |FileName = "Chameleon\ISdata.dta"
0012D118 80000000 |Access = GENERIC_READ
0012D11C 00000001 |ShareMode = FILE_SHARE_READ
0012D120 00000000 |pSecurity = NULL
0012D124 00000003 |Mode = OPEN_EXISTING
0012D128 10000080 |Attributes = NORMAL|RANDOM_ACCESS
0012D12C 00000000 \hTemplateFile = NULL
0012D130 003C06D8
0012D134 100021D7 RETURN to rw_data.100021D7 from kernel32.CreateFileA
Wonderful! The application tries to open “ISdata.dta”, and the original call was made from rw_data (RETURN to rw_data.100021D7 from kernel32.CreateFileA). As the stack shows, rw_data calls CreateFileA, which in turn calls CreateFileW inside kernel32, and that is why our BP on CreateFileW fired.
In the stack window, click the line at address 0012D134 and use the context menu to follow the return address in the disassembler (or simply press Enter):
I think we can now remove the BP on kernel32 CreateFileW and just set a BP on the CreateFileA call in rw_data.
Code:
100021AE |. 53 push ebx ; /hTemplateFile
100021AF |. 894D 08 mov dword ptr [ebp+0x8], ecx ; |
100021B2 |. 8B7D 0C mov edi, dword ptr [ebp+0xC] ; |
100021B5 |. 0BFA or edi, edx ; |
100021B7 |. 68 80000010 push 10000080 ; |Attributes = NORMAL|RANDOM_ACCESS
100021BC |. 6A 03 push 0x3 ; |Mode = OPEN_EXISTING
100021BE |. 897D 0C mov dword ptr [ebp+0xC], edi ; |
100021C1 |. 8BBC24 6C010000 mov edi, dword ptr [esp+0x16C] ; |
100021C8 |. 53 push ebx ; |pSecurity
100021C9 |. 6A 01 push 0x1 ; |ShareMode = FILE_SHARE_READ
100021CB |. 68 00000080 push 0x80000000 ; |Access = GENERIC_READ
100021D0 |. 57 push edi ; |FileName
100021D1 |. FF15 10000110 call dword ptr [<&KERNEL32.CreateFileA>] ; \CreateFileA
You can restart the application and see how it works (you will break again on 100021D1 when “ISdata.dta" is accessed).
Now, pressing F8, trace until the ReadFile call at 10002205 (you can also remove the BP from kernel32.ReadFile, because we have already found the call).
Code:
0012D140 00000070 |hFile = 00000070 (window)
0012D144 0012D168 |Buffer = 0012D168
0012D148 00000004 |BytesToRead = 4
0012D14C 0012D16C |pBytesRead = 0012D16C
0012D150 00000000 \pOverlapped = NULL
Follow the buffer address 0012D168 (Buffer = 0012D168) in the dump window and make one more step with F8. ReadFile has now been executed, and the dump shows the bytes 49 53 44 31, i.e. “ISD1”.
It’s logical to assume that whatever is read gets checked. Just below we stumble on the checking routine (it begins at 10002246).
Code:
10002246 |> \8B4424 14 mov eax, dword ptr [esp+0x14] ; move first dword to eax
1000224A |. C745 20 FFFFFFFF mov dword ptr [ebp+0x20], -0x1
10002251 |. 3D 49534430 cmp eax, 0x30445349 ; compare with ISD0
10002256 |. 75 05 jnz short 1000225D
10002258 |. 895D 20 mov dword ptr [ebp+0x20], ebx ; if ISD0, mov 0
1000225B |. EB 0E jmp short 1000226B
1000225D |> 3D 49534431 cmp eax, 0x31445349 ; compare with ISD1
10002262 |. 75 07 jnz short 1000226B
10002264 |. C745 20 01000000 mov dword ptr [ebp+0x20], 0x1 ; if ISD1, mov 1
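The constants in this check are just the signature text read as a little-endian dword: 0x31445349 is stored in memory as 49 53 44 31, i.e. “ISD1”. A minimal sketch of the same check, comparing bytes so it does not depend on byte order. The function name dtaVersionFromSignature is made up for illustration.

```cpp
#include <cstring>

// Mirrors the comparison at 10002246: returns 0 for "ISD0", 1 for "ISD1",
// and -1 for anything else (the value the routine stores before comparing).
int dtaVersionFromSignature(const unsigned char sig[4])
{
    if (std::memcmp(sig, "ISD0", 4) == 0)
        return 0;
    if (std::memcmp(sig, "ISD1", 4) == 0)
        return 1;
    return -1;
}
```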
We have ISD1… Now trace until the RET (at 100023E5) and leave this function. We are still in rw_data, more exactly in rw_data.dtaCreate. OK, leave rw_data.dtaCreate completely and trace Chameleon until 004052F6.
Code:
004052F2 . 33C7 xor eax, edi
004052F4 . 52 push edx
004052F5 . 50 push eax
004052F6 . FF53 0C call dword ptr [ebx+0xC]
Why this call? Of course, as far as possible we should check every call to know what happens in each routine, and you can do that manually, but only the instruction at 004052F6 leads us to useful content. You can press F7 to trace into 10002440, or press F8 if you want to break directly on the kernel function SetFilePointer.
This function calls SetFilePointer to move the file pointer to the second dword, then sets it back to zero… Never mind, because the call we should pay attention to is ReadFile (100025BB).
Code:
100025B5 . 57 push edi ; /pOverlapped
100025B6 . 51 push ecx ; |pBytesRead
100025B7 . 6A 18 push 0x18 ; |BytesToRead = 18 (24.)
100025B9 . 55 push ebp ; |Buffer
100025BA . 52 push edx ; |hFile
100025BB . FF15 08000110 call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
This reads 0x18 bytes from the beginning of “ISData.dta” into the buffer. Stack:
Code:
0012D240 0000004C |hFile = 0000004C (window)
0012D244 003C08D0 |Buffer = 003C08D0
0012D248 00000018 |BytesToRead = 18 (24.)
0012D24C 0012D264 |pBytesRead = 0012D264
0012D250 00000000 \pOverlapped = NULL
Instruction on 100025C3
Code:
100025C3 . /75 32 jnz short 100025F7 ; data read
means that the data has been read successfully, and the next instruction
Code:
cmp dword ptr [esp+0x10], 0x18
checks how many bytes have been read (in our case 0x18 bytes).
OK, we are probably close to the decrypting/unpacking routine. What we have right now is a piece of code that pushes some values onto the stack and calls a function. Let’s examine it more closely.
Code:
10002630 > \8B4424 74 mov eax, dword ptr [esp+0x74] ; mov some dword1
10002634 . 8D75 04 lea esi, dword ptr [ebp+0x4] ; esi: file in buffer + 0x4
10002637 . B9 05000000 mov ecx, 0x5
1000263C . 8D7C24 3C lea edi, dword ptr [esp+0x3C]
10002640 . F3:A5 rep movs dword ptr es:[edi], dword ptr [esi] ; copy 0x14 byte from second dword to the stack
10002642 . 8B4C24 70 mov ecx, dword ptr [esp+0x70] ; mov some dword2
10002646 . 50 push eax ; to stack: dword1
10002647 . 51 push ecx ; to stack: dword2
10002648 . 8D5424 44 lea edx, dword ptr [esp+0x44] ; get buffer address
1000264C . 6A 14 push 0x14 ; size
1000264E . 52 push edx ; to stack: buffer address
1000264F . E8 5C690000 call 10008FB0
Our buffer on the stack now contains 0x14 bytes from “ISData.dta” (starting at the second dword):
Code:
0012D290 F6 DD 75 DE F2 44 DC DE 82 DD 75 DE D2 21 DC DE цЭuЮтDЬЮ‚ЭuЮТ!ЬЮ
0012D2A0 4B D5 75 DE KХuЮ
Interesting. Let’s trace into “call 10008FB0”. Inside we have
Code:
10008FB0 /$ 8B4C24 08 mov ecx, dword ptr [esp+0x8]
10008FB4 |. 55 push ebp
10008FB5 |. 8BC1 mov eax, ecx
10008FB7 |. 56 push esi
10008FB8 |. C1E8 03 shr eax, 0x3
10008FBB |. 57 push edi
10008FBC |. 8D14C5 00000000 lea edx, dword ptr [eax*8]
10008FC3 |. 2BCA sub ecx, edx
10008FC5 |. 895424 14 mov dword ptr [esp+0x14], edx
10008FC9 |. 8BE9 mov ebp, ecx
10008FCB |. 8BC8 mov ecx, eax
10008FCD |. 48 dec eax
10008FCE |. 85C9 test ecx, ecx
10008FD0 |. 74 36 je short 10009008
10008FD2 |. 8B5424 10 mov edx, dword ptr [esp+0x10]
10008FD6 |. 8B7C24 1C mov edi, dword ptr [esp+0x1C]
10008FDA |. 53 push ebx
10008FDB |. 8B5C24 1C mov ebx, dword ptr [esp+0x1C]
10008FDF |. 8D14C2 lea edx, dword ptr [edx+eax*8]
10008FE2 |. 8D70 01 lea esi, dword ptr [eax+0x1]
10008FE5 |> 8B02 /mov eax, dword ptr [edx]
10008FE7 |. 8B4A 04 |mov ecx, dword ptr [edx+0x4]
10008FEA |. F7D0 |not eax
10008FEC |. F7D1 |not ecx
10008FEE |. 33C3 |xor eax, ebx
10008FF0 |. 33CF |xor ecx, edi
10008FF2 |. F7D0 |not eax
10008FF4 |. F7D1 |not ecx
10008FF6 |. 8902 |mov dword ptr [edx], eax
10008FF8 |. 894A 04 |mov dword ptr [edx+0x4], ecx
10008FFB |. 4E |dec esi
10008FFC |. 83EA 08 |sub edx, 0x8
10008FFF |. 85F6 |test esi, esi
10009001 |.^ 77 E2 \ja short 10008FE5
10009003 |. 8B5424 18 mov edx, dword ptr [esp+0x18]
10009007 |. 5B pop ebx
10009008 |> 8B4424 10 mov eax, dword ptr [esp+0x10]
1000900C |. 8D0C02 lea ecx, dword ptr [edx+eax]
1000900F |. 8BD5 mov edx, ebp
10009011 |. 4D dec ebp
10009012 |. 85D2 test edx, edx
10009014 |. 74 21 je short 10009037
10009016 |. 8D7C24 18 lea edi, dword ptr [esp+0x18]
1000901A |. 8D0429 lea eax, dword ptr [ecx+ebp]
1000901D |. 2BF9 sub edi, ecx
1000901F |. 8D75 01 lea esi, dword ptr [ebp+0x1]
10009022 |> 8A08 /mov cl, byte ptr [eax]
10009024 |. F6D1 |not cl
10009026 |. 8808 |mov byte ptr [eax], cl
10009028 |. 8A1407 |mov dl, byte ptr [edi+eax]
1000902B |. 32D1 |xor dl, cl
1000902D |. 4E |dec esi
1000902E |. F6D2 |not dl
10009030 |. 8810 |mov byte ptr [eax], dl
10009032 |. 48 |dec eax
10009033 |. 85F6 |test esi, esi
10009035 |.^ 77 EB \ja short 10009022
10009037 |> 5F pop edi
10009038 |. 5E pop esi
10009039 |. 5D pop ebp
1000903A \. C2 1000 retn 0x10
Voilà, a function with loops, XORs, NOTs… Also, this function is called from lots of places (OllyDbg lists many local calls to it)…
Now we should trace it and, on every loop iteration, check our incoming buffer with the 0x14 encrypted bytes.
The first loop takes the third and fourth dwords from the buffer, NOTs them, XORs them with the dwords DE75DDF2 and DEDC644B (which were passed to the function as arguments) and NOTs the results; it then moves back to the first and second dwords. We can assume that these strange dwords are keys. Following the push order at the call site, we will call them key2 (DE75DDF2, applied to the first dword of each pair) and key1 (DEDC644B, applied to the second).
Code:
10008FE5 |> /8B02 /mov eax, dword ptr [edx] ; 3rd dword from buffer
10008FE7 |. |8B4A 04 |mov ecx, dword ptr [edx+0x4] ; 4th dword from buffer
10008FEA |. |F7D0 |not eax ; not dword3
10008FEC |. |F7D1 |not ecx ; not dword4
10008FEE |. |33C3 |xor eax, ebx ; xor (not dword3) with key2
10008FF0 |. |33CF |xor ecx, edi ; xor (not dword4) with key1
10008FF2 |. |F7D0 |not eax ; not(xor (not dword3) with key2)
10008FF4 |. |F7D1 |not ecx ; not(xor (not dword4) with key1)
10008FF6 |. |8902 |mov dword ptr [edx], eax ; write result: dword3
10008FF8 |. |894A 04 |mov dword ptr [edx+0x4], ecx ; write result: dword4
10008FFB |. |4E |dec esi ; decrease counter
10008FFC |. |83EA 08 |sub edx, 0x8 ; step back to the previous 8-byte block
10008FFF |. |85F6 |test esi, esi
10009001 |.^\77 E2 \ja short 10008FE5
But our buffer isn’t completely decrypted yet.
The second loop in this function
Code:
10009022 |> /8A08 /mov cl, byte ptr [eax] ; mov byte from the end of buffer
10009024 |. |F6D1 |not cl ; not byte
10009026 |. |8808 |mov byte ptr [eax], cl ; write it back
10009028 |. |8A1407 |mov dl, byte ptr [edi+eax] ; get byte from the key
1000902B |. |32D1 |xor dl, cl
1000902D |. |4E |dec esi
1000902E |. |F6D2 |not dl
10009030 |. |8810 |mov byte ptr [eax], dl
10009032 |. |48 |dec eax
10009033 |. |85F6 |test esi, esi
10009035 |.^\77 EB \ja short 10009022
does the same thing as the previous loop, but decrypts the data byte by byte (it handles the bytes left over after the last full 8-byte block).
So, in the end our buffer contains
Code:
0012D290 04 00 00 00 B9 20 00 00 70 00 00 00 99 45 00 00 ...№ ..p...™E..
0012D2A0 B9 08 00 00 № ..
As a result, decryption boils down to a few simple steps:
- get encrypted byte
- NOT encrypted byte
- XOR by key
- NOT result
- save result
C++ example:
Code:
unsigned char keys[8];
((DWORD*)keys)[0] = key2; // bytes 0..3 of every 8-byte block
((DWORD*)keys)[1] = key1; // bytes 4..7 of every 8-byte block
for (unsigned int i = 0; i < size; i++)
    data[i] = (unsigned char)(~((~data[i]) ^ keys[i % 8]));
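One observation worth adding: for any byte, ~((~x) ^ k) equals x ^ k, so the two NOTs cancel out and the whole scheme is a plain XOR with a repeating 8-byte key. The sketch below uses that shortcut and builds the key bytes with shifts, so it does not depend on DWORD or on the host byte order. The name dtaDecrypt is made up. The key assignment (key2 = 0xDE75DDF2, key1 = 0xDEDC644B) is an assumption: it is the one that reproduces the decrypted dump shown earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decrypts `data` in place. key2 covers bytes 0..3 of every 8-byte block,
// key1 covers bytes 4..7, matching the order used in the example above.
void dtaDecrypt(std::vector<std::uint8_t>& data,
                std::uint32_t key1, std::uint32_t key2)
{
    std::uint8_t keys[8];
    for (int i = 0; i < 4; ++i) {
        // Keys are laid out in memory as little-endian dwords.
        keys[i]     = static_cast<std::uint8_t>(key2 >> (8 * i));
        keys[i + 4] = static_cast<std::uint8_t>(key1 >> (8 * i));
    }
    for (std::size_t i = 0; i < data.size(); ++i)
        data[i] ^= keys[i % 8]; // same result as ~((~data[i]) ^ keys[i % 8])
}
```

Note that the key index restarts at 0 at the beginning of every buffer passed in, so each block that the game decrypts with a separate call has to be decrypted separately here as well.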
From this point, we can fully describe parameters of our call on 1000264F
Code:
10002646 . 50 push eax ; to stack: key1
10002647 . 51 push ecx ; to stack: key2
10002648 . 8D5424 44 lea edx, dword ptr [esp+0x44] ; get buffer address
1000264C . 6A 14 push 0x14 ; size
1000264E . 52 push edx ; to stack: buffer address
1000264F . E8 5C690000 call 10008FB0 ; decrypt
Wait a minute! One reasonable question: how do we get the decryption keys? For Chameleon we can basically forget about them, because they do not change from archive to archive. For Mafia and H&D2, the keys for each archive can be picked from the stack right before the decryption function is called. You can find more details in the appendix.
Let’s get back to the subject.
After decryption we have 20 decrypted bytes of the DTA header, following the 4-byte signature:
Code:
struct DtaHeader
{
char signature[4]; // “ISD1”
DWORD d1; // 04 00 00 00 - 4
DWORD d2; // B9 20 00 00 - 8377
DWORD d3; // 70 00 00 00 - 112
DWORD d4; // 99 45 00 00 - 17817
DWORD d5; // B9 08 00 00 - 2233
};
Tracing further down, we stop at another SetFilePointer call at 10002892
Code:
1000288C . 6A 00 push 0x0 ; /Origin = FILE_BEGIN
1000288E . 6A 00 push 0x0 ; |pOffsetHi = NULL
10002890 . 51 push ecx ; |OffsetLo
10002891 . 52 push edx ; |hFile
10002892 . FF15 04000110 call dword ptr [<&KERNEL32.SetFilePoint>; \SetFilePointer
Stack
Code:
0012D244 00000070 |hFile = 00000070 (window)
0012D248 000020B9 |OffsetLo = 20B9 (8377.)
0012D24C 00000000 |pOffsetHi = NULL
0012D250 00000000 \Origin = FILE_BEGIN
The function moves the file pointer to 0x20B9 (dword d2 in the DTA header). Next, ReadFile at 100028B5 loads 0x70 bytes (dword d3 in the DTA header), starting from 0x20B9.
Stack
Code:
0012D240 00000070 |hFile = 00000070 (window)
0012D244 003C08D0 |Buffer = 003C08D0
0012D248 00000070 |BytesToRead = 70 (112.)
0012D24C 0012D264 |pBytesRead = 0012D264
0012D250 00000000 \pOverlapped = NULL
Switching to our hex editor, we notice that offset 0x20B9 + 0x70 = 0x2129 is exactly the end of the .dta file. If you don’t want to trace in Olly and wait for the decrypted data again, you can use a 010 Editor script (or make your own small tool) and decrypt this block yourself.
That’s all for this function. After returning to Chameleon, we have the DTA header and the block with the file records (the file table).
As we can see, the DTA header contains the following data:
Code:
struct DtaHeader
{
char signature[4];
DWORD numOfFiles; // Number of files in archive
DWORD ftOffset; // File table offset
DWORD ftSize; // File table size
DWORD extra1;
DWORD extra2; // fifth decrypted dword, purpose unknown
};
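To turn the decrypted bytes into this structure in an unpacker, something like the following sketch could be used. It assumes the caller has already read the first 0x18 bytes and decrypted bytes 4..23. The names parseDtaHeader and readLe32 are hypothetical, and extra2 simply stands for the fifth decrypted dword, whose meaning is still unknown.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

struct DtaHeader {
    char          signature[4];
    std::uint32_t numOfFiles; // number of files in the archive
    std::uint32_t ftOffset;   // file table offset
    std::uint32_t ftSize;     // file table size
    std::uint32_t extra1;     // unknown
    std::uint32_t extra2;     // unknown (fifth decrypted dword)
};

// Reads a little-endian 32-bit value without relying on host byte order.
static std::uint32_t readLe32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// `raw` holds the first 0x18 bytes of the archive, bytes 4..23 decrypted.
// Returns false if fewer than 0x18 bytes were supplied.
bool parseDtaHeader(const std::vector<std::uint8_t>& raw, DtaHeader& out)
{
    if (raw.size() < 0x18)
        return false;
    std::memcpy(out.signature, raw.data(), 4);
    out.numOfFiles = readLe32(&raw[4]);
    out.ftOffset   = readLe32(&raw[8]);
    out.ftSize     = readLe32(&raw[12]);
    out.extra1     = readLe32(&raw[16]);
    out.extra2     = readLe32(&raw[20]);
    return true;
}
```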
Let’s try to identify something in the file table. Each entry has a fixed length of 28 bytes:
50 00 01 00 18 00 00 00 3E 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
It’s clear that the last 16 bytes are reserved for the file name. By comparing with the other 3 entries, we can say that the 16-bit word at offset 2 (the third and fourth bytes) is the file name length.
Code:
typedef struct
{
ubyte unknown1;
ubyte unknown2;
WORD fileNameSize; //File name length
DWORD unknown3;
DWORD unknown4;
char fileName[16];
} FileTableEntry;
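With the header fields and the 28-byte record size known, an unpacker can sanity-check a header before trusting it. This is only a sketch: the function name fileTableLooksValid is made up, and the rule that the table ends exactly at the end of the file is what we observed for this one archive, so it may not hold for every game.

```cpp
#include <cstdint>

// Size of one file table record, as observed in the dump.
constexpr std::uint32_t kEntrySize = 28;

// True when the file table described by the header fits the archive:
// a whole number of records that ends exactly at the end of the file.
bool fileTableLooksValid(std::uint32_t numOfFiles, std::uint32_t ftOffset,
                         std::uint32_t ftSize, std::uint64_t archiveSize)
{
    const std::uint64_t expected =
        static_cast<std::uint64_t>(numOfFiles) * kEntrySize;
    if (ftSize != expected)
        return false;
    return static_cast<std::uint64_t>(ftOffset) + ftSize == archiveSize;
}
```

For our archive the numbers agree: 4 files times 28 bytes is 0x70, and 0x20B9 + 0x70 = 0x2129.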
Part II. Code tracing. Unpacking data
In Part I we finished with the decryption and the DTA file header, and recovered some of the internal structure of the archive.
Let’s continue.
Don’t forget that we are working with an installer: once the needed data has been decrypted, the installer itself reads files from the archive.
Code:
004052F4 . 52 push edx ; push key1
004052F5 . 50 push eax ; push key2
004052F6 . FF53 0C call dword ptr [ebx+0xC] ; read dta header and file list
004052F9 . 84C0 test al, al ; data decrypted
004052FB . 74 38 je short 00405335
004052FD . 6A 00 push 0x0
004052FF . 68 3CD74000 push 0040D73C ; ASCII "idata.txt"
00405304 . E8 DB390000 call <jmp.&rw_data.w_data.> ; rw_data.dtaOpen, open "idata.txt"
We jump into rw_data again (rw_data.dtaOpen). The first kernel function called is CreateFileA, with the following parameters
Code:
0012D028 0040D73C |FileName = "idata.txt"
0012D02C 80000000 |Access = GENERIC_READ
0012D030 00000001 |ShareMode = FILE_SHARE_READ
0012D034 00000000 |pSecurity = NULL
0012D038 00000003 |Mode = OPEN_EXISTING
0012D03C 00000080 |Attributes = NORMAL
0012D040 00000000 \hTemplateFile = NULL
If EAX = -1 (INVALID_HANDLE_VALUE) after the call, "idata.txt" will be read from the archive; otherwise it is read from the hard disk.
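In other words, a loose file on disk takes priority over the copy inside the archive. Here is a portable sketch of that decision, using a standard stream instead of CreateFileA. The enum and function names are invented for this example.

```cpp
#include <fstream>
#include <string>

enum class FileSource { Disk, Archive };

// Loose files on disk win. The archive is used only when the open fails,
// which corresponds to CreateFileA returning INVALID_HANDLE_VALUE (-1).
FileSource chooseSource(const std::string& path)
{
    std::ifstream probe(path, std::ios::binary);
    return probe.is_open() ? FileSource::Disk : FileSource::Archive;
}
```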
Trace down until 1000377C
Code:
10003774 |> \8B8C24 80020000 mov ecx, dword ptr [esp+0x280]
1000377B |. 51 push ecx ; /Arg1
1000377C |. E8 8F020000 call 10003A10 ; \rw_data.10003A10
and step into call 10003A10. It’s a really long function… If everything is OK (there are some checks at the beginning, such as whether the needed file is located in the archive), you will arrive at another SetFilePointer call
Code:
10003CFC |. A1 A4BC0110 ||mov eax, dword ptr [0x1001BCA4]
10003D01 |. 53 ||push ebx ; /Origin
10003D02 |. 53 ||push ebx ; |pOffsetHi
10003D03 |. 8B8C07 18010000 ||mov ecx, dword ptr [edi+eax+0x118] ; |
10003D0A |. 8B1407 ||mov edx, dword ptr [edi+eax] ; |
10003D0D |. 8B4C0E 04 ||mov ecx, dword ptr [esi+ecx+0x4] ; |
10003D11 |. 51 ||push ecx ; |OffsetLo
10003D12 |. 52 ||push edx ; |hFile
10003D13 |. FF15 04000110 ||call dword ptr [<&KERNEL32.SetFilePointer>] ; \SetFilePointer
which will set file pointer to 0x44 (68.)
Code:
0012CD00 00000070 |hFile = 00000070 (window)
0012CD04 00000044 |OffsetLo = 44 (68.)
0012CD08 00000000 |pOffsetHi = NULL
0012CD0C 00000000 \Origin = FILE_BEGIN
OK, why 0x44? Take a look at the record for "idata.txt", because this value was taken from the file table
Code:
91 02 09 00 44 00 00 00 72 00 00 00 49 44 41 54 41 2E 54 58 54 00 00 00 00 00 00 00
And we can name another field: the structure offset
Code:
typedef struct
{
ubyte unknown1;
ubyte unknown2;
WORD fileNameSize; //File name length
DWORD structOffset; //structure offset, contains additional data
DWORD unknown3;
char fileName[16];
} FileTableEntry;
Step over until ReadFile call at 10003E1E
Code:
10003E1E |. FF15 08000110 ||call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
Stack
Code:
0012CCFC 00000070 |hFile = 00000070 (window)
0012CD00 0012CDC0 |Buffer = 0012CDC0
0012CD04 00000020 |BytesToRead = 20 (32.)
0012CD08 0012CD28 |pBytesRead = 0012CD28
0012CD0C 00000000 \pOverlapped = NULL
We see that 32 bytes will be read from offset 68 (0x44) and then decrypted at 10003E77
Code:
10003E77 |. E8 34510000 ||call 10008FB0 ; decrypt
Now we have another structure, which adds some more information about the file "idata.txt" inside the archive.
Continue tracing until the ReadFile call at 10003F96
Code:
10003F96 |. FF15 08000110 ||call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
If you are attentive, you will notice that the BytesToRead parameter was calculated a few instructions earlier, and that the new data is read from the current file pointer (right after the previous 32 bytes). For our "idata.txt" we read 9 bytes (the value comes from offset 0x1C of the structure we have just decrypted) and decrypt them at 1000401B
Code:
1000401B |. E8 904F0000 ||call 10008FB0 ; decrypt
Now we have the full file name.
After that, the new file name is converted to uppercase and compared with the file name from the first data block (the file table).
A few instructions later we read one dword located after the file name, and then another byte at 100043AB. At this point we can’t guess the purpose of these values.
And… you may not believe it, but all of this was only preparation for unpacking.
We have decrypted the file metadata (file name, size, etc.) and some other values, and now we are getting closer to the unpacking routine.
Let’s summarize our data.
Now just trace and trace, and sooner or later you will land in this function
Code:
00403D60 /$ 6A FF push -0x1 ; read and unpack
00403D62 |. 68 5B8F4000 push 00408F5B ; SE handler installation
00403D67 |. 64:A1 0000000>mov eax, dword ptr fs:[0]
00403D6D |. 50 push eax
00403D6E |. 64:8925 00000>mov dword ptr fs:[0], esp
00403D75 |. 81EC 24080000 sub esp, 0x824
00403D7B |. 53 push ebx
00403D7C |. 56 push esi
00403D7D |. 57 push edi
00403D7E |. 68 48D74000 push 0040D748 ; UNICODE "Chameleon.exe"
00403D83 |. 6A 15 push 0x15
00403D85 |. E8 E6FBFFFF call 00403970
00403D8A |. 83C4 08 add esp, 0x8
00403D8D |. 33FF xor edi, edi
00403D8F |. 57 push edi
00403D90 |. 68 3CD74000 push 0040D73C ; ASCII "idata.txt"
00403D95 |. E8 4A4F0000 call <jmp.&rw_data.w_data.> ; check dta
Here we check and retrieve data from the dta again. Then we get the 5th dword from the “file extra data” block (for "idata.txt" it’s 0x00002BB4)
Code:
00403DB7 |> \55 push ebp
00403DB8 |. 57 push edi
00403DB9 |. 6A 02 push 0x2
00403DBB |. 56 push esi
00403DBC |. E8 1D4F0000 call <jmp.&rw_data.w_data.>
Subtract 2 from the received value and allocate a buffer of that size (0x00002BB4 - 2)
Code:
00403DC1 |. 83EB 02 sub ebx, 0x2 ; unpacked size - 2
00403DC4 |. 53 push ebx ; /size
00403DC5 |. FF15 B8A34000 call dword ptr [<&MSVCRT.malloc>] ; \malloc
This value is definitely the unpacked file size.
And now we go to the main unpacking routine at 00403DD3, passing it the buffer size, the buffer address and 1
Code:
00403DD0 |. 53 push ebx ; size
00403DD1 |. 55 push ebp ; buffer
00403DD2 |. 56 push esi
00403DD3 |. E8 004F0000 call <jmp.&rw_data.w_data.> ; read and unpack
We jump into rw_data.dtaRead. Blah-blah-blah, instruction after instruction… Continue tracing… We should stop at SetFilePointer (100053B3) and check the offset value
Code:
001249E4 00000070 |hFile = 00000070 (window)
001249E8 00000072 |OffsetLo = 72 (114.)
001249EC 00000000 |pOffsetHi = NULL
001249F0 00000000 \Origin = FILE_BEGIN
0x72… where have we seen 0x72? It was in the first data block, the file table, in the "idata.txt" entry.
Code:
typedef struct
{
ubyte b1;
ubyte flags;
WORD fileNameSize; //File name length
DWORD structOffset; //structure offset, contains additional data
DWORD dataOffset; //data offset
char fileName[16];
} FileTableEntry;
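The record layout is now complete enough to parse in code. Below is a minimal sketch that decodes one 28-byte record. The names parseEntry and readLe32 are hypothetical, and the meaning of b1 and flags is still a guess at this point.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

struct FileTableEntry {
    std::uint8_t  b1;           // unknown
    std::uint8_t  flags;        // unknown, presumably flags
    std::uint16_t fileNameSize; // length of the full file name
    std::uint32_t structOffset; // offset of the 32-byte per-file structure
    std::uint32_t dataOffset;   // offset of the file data
    std::string   fileName;     // up to 16 bytes, zero padded in the archive
};

// Reads a little-endian 32-bit value without relying on host byte order.
static std::uint32_t readLe32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// `rec` must point at one decrypted 28-byte file table record.
FileTableEntry parseEntry(const std::uint8_t* rec)
{
    FileTableEntry e;
    e.b1           = rec[0];
    e.flags        = rec[1];
    e.fileNameSize = static_cast<std::uint16_t>(rec[2] | (rec[3] << 8));
    e.structOffset = readLe32(rec + 4);
    e.dataOffset   = readLe32(rec + 8);
    std::size_t len = 0;
    while (len < 16 && rec[12 + len] != 0)
        ++len; // the name ends at the first zero byte or after 16 bytes
    e.fileName.assign(reinterpret_cast<const char*>(rec + 12), len);
    return e;
}
```

For the two records shown in this tutorial, dataOffset equals structOffset + 0x20 + fileNameSize + 5, which matches the trace: a 32-byte structure, the full file name, one dword and one byte. Whether that holds for every record has not been checked.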
Then the data is read from 0x72 onto the stack
Code:
10005434 |. 50 push eax ; |Buffer
10005435 |. 57 push edi ; |hFile
10005436 |. FF15 08000110 call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
Now the function has to decide what to do with the data: whether it is encrypted, packed or something else.
If the data is encrypted, it is decrypted before unpacking
Code:
10005629 |> \8A4424 1F mov al, byte ptr [esp+0x1F] ; get flag
1000562D |. 84C0 test al, al
1000562F |. 74 21 je short 10005652 ; skip decryption if the flag is 0 (data not encrypted)
10005631 |. A1 6CBC0110 mov eax, dword ptr [0x1001BC6C]
10005636 |. 8B4C24 10 mov ecx, dword ptr [esp+0x10] ; compressed size
1000563A |. 8B5428 44 mov edx, dword ptr [eax+ebp+0x44] ; key2
1000563E |. 8B4428 40 mov eax, dword ptr [eax+ebp+0x40] ; key1
10005642 |. 52 push edx
10005643 |. 50 push eax
10005644 |. 8D9424 89000000 lea edx, dword ptr [esp+0x89]
1000564B |. 51 push ecx
1000564C |. 52 push edx
1000564D |. E8 EE390000 call 10009040 ; decrypt before unpacking
At this moment all the data is stored on the stack (the size of the “extra data” is 0x80 bytes)
and is ready for unpacking.
The main unpacking loop begins at 100056F6. I think this is some kind of dictionary coder:
A dictionary coder, also sometimes known as a substitution coder, is a class of lossless data compression algorithms which operate by searching for matches between the text to be compressed and a set of strings contained in a data structure (called the 'dictionary') maintained by the encoder. When the encoder finds such a match, it substitutes a reference to the string's position in the data structure.
It may be an LZ77 variation; either way, the algorithm works quite well.
Code:
100056F6 |> /8B4424 30 /mov eax, dword ptr [esp+0x30]
100056FA |> |8A5424 17 mov dl, byte ptr [esp+0x17]
100056FE |. |84D2 |test dl, dl
10005700 |. |75 23 |jnz short 10005725
10005702 |. |66:0FB68434 8000>|movzx ax, byte ptr [esp+esi+0x80]
1000570B |. |66:0FB69434 8100>|movzx dx, byte ptr [esp+esi+0x81]
10005714 |. |C1E0 08 |shl eax, 0x8
10005717 |. |03C2 |add eax, edx
10005719 |. |C64424 17 10 |mov byte ptr [esp+0x17], 0x10
1000571E |. |894424 30 |mov dword ptr [esp+0x30], eax
10005722 |. |83C6 02 |add esi, 0x2
10005725 |> |F6C4 80 |test ah, 0x80
10005728 |. |75 15 |jnz short 1000573F ; just copy bytes to the result buffer
1000572A |. |8B5424 38 |mov edx, dword ptr [esp+0x38]
1000572E |. |8A8434 80000000 |mov al, byte ptr [esp+esi+0x80] ; extra data + 0x80 + counter
10005735 |. |46 |inc esi ; inc counter
10005736 |. |880411 |mov byte ptr [ecx+edx], al ; write byte
10005739 |. |41 |inc ecx
1000573A |. |E9 CD000000 |jmp 1000580C
1000573F |> |8A8434 81000000 |mov al, byte ptr [esp+esi+0x81] ; extra data + 0x81 + counter
10005746 |. |33DB |xor ebx, ebx
10005748 |. |8A9C34 80000000 |mov bl, byte ptr [esp+esi+0x80] ; extra data + 0x80 + counter
1000574F |. |8BD0 |mov edx, eax
10005751 |. |81E2 FF000000 |and edx, 0xFF ; get only byte
10005757 |. |C1E3 04 |shl ebx, 0x4
1000575A |. |C1EA 04 |shr edx, 0x4
1000575D |. |03DA |add ebx, edx
1000575F |. |895C24 64 |mov dword ptr [esp+0x64], ebx
10005763 |. |75 55 |jnz short 100057BA
10005765 |. |66:0FB69434 8200>|movzx dx, byte ptr [esp+esi+0x82]
1000576E |. |66:0FB6C0 |movzx ax, al
10005772 |. |C1E0 08 |shl eax, 0x8
10005775 |. |8D5C02 0F |lea ebx, dword ptr [edx+eax+0xF]
10005779 |. |33D2 |xor edx, edx
1000577B |. |81E3 FFFF0000 |and ebx, 0xFFFF
10005781 |. |895C24 4C |mov dword ptr [esp+0x4C], ebx
10005785 |. |8D43 01 |lea eax, dword ptr [ebx+0x1]
10005788 |. |85C0 |test eax, eax
1000578A |. |7E 25 |jle short 100057B1
1000578C |. |33C0 |xor eax, eax
1000578E |> |8B7C24 38 |/mov edi, dword ptr [esp+0x38]
10005792 |. |8A9C34 83000000 ||mov bl, byte ptr [esp+esi+0x83]
10005799 |. |03C1 ||add eax, ecx
1000579B |. |42 ||inc edx
1000579C |. |881C38 ||mov byte ptr [eax+edi], bl
1000579F |. |8B5C24 4C ||mov ebx, dword ptr [esp+0x4C]
100057A3 |. |8BC2 ||mov eax, edx
100057A5 |. |25 FFFF0000 ||and eax, 0xFFFF
100057AA |. |8D7B 01 ||lea edi, dword ptr [ebx+0x1]
100057AD |. |3BC7 ||cmp eax, edi
100057AF |.^|7C DD |\jl short 1000578E
100057B1 |> |83C6 04 |add esi, 0x4
100057B4 |. |8D4C19 01 |lea ecx, dword ptr [ecx+ebx+0x1]
100057B8 |. |EB 52 |jmp short 1000580C
100057BA |> |24 0F |and al, 0xF
100057BC |. |33D2 |xor edx, edx
100057BE |. |66:0FB6C0 |movzx ax, al
100057C2 |. |83C0 02 |add eax, 0x2
100057C5 |. |25 FFFF0000 |and eax, 0xFFFF
100057CA |. |894424 4C |mov dword ptr [esp+0x4C], eax
100057CE |. |8D78 01 |lea edi, dword ptr [eax+0x1]
100057D1 |. |85FF |test edi, edi
100057D3 |. |7E 2C |jle short 10005801
100057D5 |. |33C0 |xor eax, eax
100057D7 |. |EB 04 |jmp short 100057DD
100057D9 |> |8B5C24 64 |/mov ebx, dword ptr [esp+0x64]
100057DD |> |8BE9 | mov ebp, ecx
100057DF |. |2BEB ||sub ebp, ebx
100057E1 |. |03E8 ||add ebp, eax
100057E3 |. |03C1 ||add eax, ecx ; inc counter
100057E5 |. |8BDD ||mov ebx, ebp
100057E7 |. |8B6C24 38 ||mov ebp, dword ptr [esp+0x38] ; outbuffer address
100057EB |. |42 ||inc edx
100057EC |. |8A1C2B ||mov bl, byte ptr [ebx+ebp]
100057EF |. |881C28 ||mov byte ptr [eax+ebp], bl ; write byte
100057F2 |. |8BC2 ||mov eax, edx
100057F4 |. |25 FFFF0000 ||and eax, 0xFFFF
100057F9 |. |3BC7 ||cmp eax, edi
100057FB |.^|7C DC |\jl short 100057D9
100057FD |. |8B4424 4C |mov eax, dword ptr [esp+0x4C]
10005801 |> |8B6C24 60 |mov ebp, dword ptr [esp+0x60]
10005805 |. |83C6 02 |add esi, 0x2
10005808 |. |8D4C01 01 |lea ecx, dword ptr [ecx+eax+0x1]
1000580C |> |8B7C24 30 |mov edi, dword ptr [esp+0x30]
10005810 |. |8A5424 17 |mov dl, byte ptr [esp+0x17]
10005814 |. |8B4424 54 |mov eax, dword ptr [esp+0x54] ; size + 1
10005818 |. |D1E7 |shl edi, 1
1000581A |. |FECA |dec dl
1000581C |. |3BF0 |cmp esi, eax
1000581E |. |897C24 30 |mov dword ptr [esp+0x30], edi
10005822 |. |885424 17 |mov byte ptr [esp+0x17], dl
10005826 |.^\0F82 CAFEFFFF \jb 100056F6
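Read as C++, the loop above looks roughly like the sketch below. It is a transliteration of this single loop only, under several assumptions: the control-bit counter starts at 0, the input starts at the first control word, the end of the input vector stands in for the limit kept at [esp+0x54], and any block framing around the loop is left out. The name lzDecodeBlock and the bounds checks are additions of this sketch; the original code has no such checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decodes one packed block following the loop at 100056F6.
std::vector<std::uint8_t> lzDecodeBlock(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint8_t> out;
    std::size_t pos = 0;       // esi: read position in the packed data
    std::uint32_t flags = 0;   // [esp+0x30]: 16 control bits, used MSB first
    unsigned bitsLeft = 0;     // [esp+0x17]: control bits not yet consumed

    auto need = [&](std::size_t n) {
        if (pos + n > in.size())
            throw std::runtime_error("packed data truncated");
    };

    while (pos < in.size()) {
        if (bitsLeft == 0) {   // reload the control word (high byte first)
            need(2);
            flags = (static_cast<std::uint32_t>(in[pos]) << 8) | in[pos + 1];
            bitsLeft = 16;
            pos += 2;
        }
        if ((flags & 0x8000u) == 0) {   // bit clear: copy one literal byte
            need(1);
            out.push_back(in[pos++]);
        } else {
            need(2);
            const std::uint32_t offset =
                (static_cast<std::uint32_t>(in[pos]) << 4) | (in[pos + 1] >> 4);
            if (offset == 0) {          // offset 0: run of one repeated byte
                need(4);
                const std::uint32_t count =
                    ((static_cast<std::uint32_t>(in[pos + 1]) << 8) +
                     in[pos + 2] + 0x0Fu) & 0xFFFFu;
                out.insert(out.end(), count + 1, in[pos + 3]);
                pos += 4;
            } else {                    // copy from data already unpacked
                const std::size_t length = (in[pos + 1] & 0x0Fu) + 3u;
                if (offset > out.size())
                    throw std::runtime_error("reference before start of output");
                const std::size_t from = out.size() - offset;
                for (std::size_t i = 0; i < length; ++i) {
                    const std::uint8_t b = out[from + i]; // ranges may overlap
                    out.push_back(b);
                }
                pos += 2;
            }
        }
        flags <<= 1;
        --bitsLeft;
    }
    return out;
}
```

So each control bit selects between a literal byte and a two-byte reference holding a 12-bit distance and a 4-bit length (3 to 18 bytes), while a distance of 0 marks a four-byte run-length record (16 or more copies of one byte).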
Finally, after the unpacking loop finishes, the function checks how many bytes have been unpacked and then copies everything into the output buffer.
Code:
100058E6 |. F3:A5 rep movs dword ptr es:[edi], dword ptr [esi] ; copy data from temp buffer to the normal
We are finished with direct code tracing, and from now on we will concentrate on coding the dta unpacker.
P.S. Parts of this tutorial were written at different times. If you find mismatches or mistakes, let me know.
P.P.S. Wait for Part 3 and Part 4... :)
Chameleon DTA files format.
Hello :)
In the Czech and Russian versions of this game, the file "tables.dta" is normally found in the "DTA" directory.
In my Polish edition of Chameleon there is no "tables.dta" file in the DTA directory of the installed game.
The contents of "tables.dta" (627 files) are located in the file "protect0.dat", which is part of the StarForce 3 protection system. I was able to extract all 627 files from "protect0.dat" (with appropriate tools); normally they should be in a single "tables.dta" file.
Would someone be able to pack my 627 files into one "tables.dta" file, or just write a packer for that file format?
hxxp://rgho.st/6vDtL2rGG
Regards,
UrBi