### 0x00 - start point:

In this article, the main topic is "what are MBA expressions?". Along the way we will also talk about related techniques such as opaque predicates, so that everything fits together. It might be a long post, so if you're determined to read it, you'd better make your coffee before you start.

### 0x01 - introduction:

MBA stands for Mixed Boolean-Arithmetic, and MBA expressions are one of the most common building blocks used by obfuscators. They obscure the data flow of a program by mixing boolean operators (e.g., ∧, ∨, ¬, ⊕) with integer arithmetic operators (e.g., **+** (ADD), **\*** (IMUL), **-** (SUB)).

A simple way in is to rewrite the expression x + y, which is one of the most common MBA examples.

Here we rewrite x + y, the sum of two unknown values, with a mix of arithmetic and boolean operators: (x ^ y) + 2 * (x & y). The idea is that x ^ y adds the bits without carrying, while x & y marks the positions that produce a carry, and multiplying by 2 shifts those carries one bit to the left. For every pair of values you give to these two expressions, the results are equal. In obfuscation terms, the two expressions are equivalent; the only difference is that one is in its simplest form and the other has been made a bit more complicated with an MBA rewrite.
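To convince ourselves that the rewrite really is equivalent, here is a minimal sketch that compares both forms for every pair of inputs. The 8-bit width and the helper names `plain_add` and `mba_add` are assumptions made for this example; they are not part of the original material.

```python
from itertools import product

BITS = 8                 # assumed width, small enough to test exhaustively
MASK = (1 << BITS) - 1   # wrap results to 8 bits, like a machine register


def plain_add(x, y):
    """The simple form: x + y (mod 2^8)."""
    return (x + y) & MASK


def mba_add(x, y):
    """The MBA form: x ^ y is the carry-less sum, 2 * (x & y) the carries."""
    return ((x ^ y) + 2 * (x & y)) & MASK


# Collect every input pair where the two forms disagree.
mismatches = [(x, y) for x, y in product(range(1 << BITS), repeat=2)
              if plain_add(x, y) != mba_add(x, y)]
print(len(mismatches))  # 0
```

The identity also holds without the mask; the mask is only there to mimic fixed-width registers.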

### 0x02 - a little math:

MBA expressions often use affine functions. The basic formula is usually f(e) = (a * e) + b.

*Note: a and b are n-bit constants and e is our MBA subexpression.*

As an example, we take the affine function f and its inverse f⁻¹ *(f: x -> 39x + 23 / f⁻¹: x -> 151x + 111)* and apply them to *E₂*, the expression we obtained above by rewriting *E₁* (x + y):

*E₂* = (x ⊕ y) + 2 × (x ∧ y)

- In **step 1**, we substituted the *E₂* subexpression wherever x appears in f(x).
- In **steps 2 and 3**, we substituted that result wherever x appears in f⁻¹(x), in the same way as above, and obtained the *E₃* expression.
- In **step 4**, we expanded the *E₃* expression and arrived at an example of an obfuscated MBA expression.
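The steps above can be sketched in a few lines of Python. This is only an illustration under the assumption that we work on 8-bit values (the constants 39/23 and 151/111 are inverses of each other modulo 2⁸); the function names `e1`, `e2`, `e3`, `f` and `f_inv` are made up for this example, and the expansion from step 4 is not reproduced.

```python
BITS = 8
MASK = (1 << BITS) - 1


def f(x):
    """f(x) = 39x + 23 (mod 2^8)."""
    return (39 * x + 23) & MASK


def f_inv(x):
    """f^-1(x) = 151x + 111 (mod 2^8)."""
    return (151 * x + 111) & MASK


def e1(x, y):
    """E1: the original expression x + y."""
    return (x + y) & MASK


def e2(x, y):
    """E2: the MBA rewrite of E1."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def e3(x, y):
    """E3: E2 wrapped in f and then in f^-1 (steps 1 to 3, not expanded)."""
    return f_inv(f(e2(x, y)))


all_equal = all(e1(x, y) == e3(x, y)
                for x in range(256) for y in range(256))
print(all_equal)  # True
```

Since f⁻¹ undoes f, wrapping *E₂* this way never changes the result; it only adds noise that an obfuscator can then expand and mix into the expression.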

The function we used here is **39x + 23**. Let's try to explain a bit where it comes from (as far as my math knowledge allows).

*f(x) = ax + b (mod 2ⁿ)*

So, to summarize simply:

*f(x) = ax + b (mod 2ⁿ)*

*g(x) = cx + d (mod 2ⁿ)*

*f(g(x)) == x // for every x value*

In that case g is the inverse of f, written f⁻¹, so that **f⁻¹(f(x)) = x**.

For such an inverse to exist, the coefficient a of f(x) must be coprime with 2ⁿ. Since the only prime factor of 2ⁿ is 2, this simply means that a must be an odd number.

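A well-known pair that satisfies this is *(f: x -> 39x + 23 / f⁻¹: x -> 151x + 111)*, which works modulo 2⁸ because 39 × 151 = 5889 ≡ 1 (mod 256). It seems to be used in every article on the subject, and we have used it too :=)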
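If you want to build your own pair instead of reusing this one, the inverse can be computed directly. The sketch below is a minimal example; `affine_inverse` is a hypothetical helper name, and it relies on Python 3.8+ where `pow()` accepts a negative exponent together with a modulus.

```python
def affine_inverse(a, b, bits):
    """Return (c, d) such that c * (a*x + b) + d == x (mod 2**bits)."""
    modulus = 1 << bits
    if a % 2 == 0:
        # An even a shares the factor 2 with 2**bits, so no inverse exists.
        raise ValueError("a must be odd to be invertible modulo 2**bits")
    c = pow(a, -1, modulus)   # modular inverse of a (Python 3.8+)
    d = (-c * b) % modulus    # cancels the constant term c * b
    return c, d


c, d = affine_inverse(39, 23, 8)
print(c, d)  # 151 111
```

The constant d follows from c(ax + b) + d = x + (cb + d), so d has to be -cb modulo 2ⁿ.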

### 0x04 - Opaque Predicates:

A normal control-flow graph is easy to follow: the branches are meaningful and short. When opaque predicates are used, however, the control-flow graph becomes horrible.

I can hear you cursing. If we look a little closer, everything becomes clear.

Take a single opaque predicate block as an example. In this block, arithmetic operations are applied to two variables, and at the end of the block the two variables are compared for equality. But since these two variables can never be equal, the code always branches in the **same direction**. Let's make this a little more concrete.

Using the **z3-solver** module in Python, we checked the equation model we obtained from the assembly code to see whether it has any solution, and the program returned "**unsat**" (unsatisfiable). If you get the answer unsat, it means that there is no model that solves the equation.

If there were any model that solves it, the solver would return the answer **sat** to us instead. Let's make our equation solvable and see the positive output.

Since we removed the subtraction of 1, both equations evaluate to 0 when x is 0, so the two equations have a common solution and the program returned the **sat** response from the check() function. Using the **model()** function, we printed the model that satisfies both equations.
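The same unsat/sat idea can be tried without a solver when the bit width is small. The sketch below is a stand-in rather than the z3 script described above: it uses brute force over all 8-bit values instead of z3, and the two expressions (2x - 1 versus 4x) are a hypothetical opaque predicate invented for this example, not the one taken from the assembly code.

```python
BITS = 8
MASK = (1 << BITS) - 1


def solutions(lhs, rhs):
    """Return every 8-bit x for which lhs(x) == rhs(x) after wrap-around."""
    return [x for x in range(1 << BITS)
            if (lhs(x) & MASK) == (rhs(x) & MASK)]


# Hypothetical opaque predicate: 2x - 1 is always odd and 4x is always even,
# so the equality can never hold and the branch always goes the same way.
opaque = solutions(lambda x: 2 * x - 1, lambda x: 4 * x)

# Without the "- 1" both sides are 0 when x = 0, so a solution exists.
solvable = solutions(lambda x: 2 * x, lambda x: 4 * x)

print("sat" if opaque else "unsat")              # unsat
print("sat" if solvable else "unsat", solvable)  # sat [0, 128]
```

Brute force only works for tiny widths like this one; for 32-bit or 64-bit registers a solver such as z3 is the practical choice.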

*(We have made a small introduction to **opaque predicate simplification**. I will explain it in more detail in another article; for now, that's enough to know.)*

### 0x05 - MBA Simplification:

Figure: droidguard lib-\*.so library graph view

In this article, we will use the *mba_challenge* file shared by Tim Blazytko.

*{((((((((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32]) ^ 0xFFFFFFFF) & RDX[0:32]) + ((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32]) & (RDX[0:32] ^ 0xFFFFFFFF)) + -(((((((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32]) ^ 0xFFFFFFFF) & RDX[0:32]) + ((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32]) | (((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32])) + ({RDI[0:32] & ({RDI[0:32] & RSI[0:32] 0 32, 0x0 32 64} * 0x2 + {RDI[0:32] ^ RSI[0:32] 0 32, 0x0 32 64})[0:32] 0 32, 0x0 32 64} * 0x2 + {((((RSI[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + RSI[0:32]) & RSI[0:32]) + (((RDI + {(RDI[0:32] ^ 0xFFFFFFFF) | RDX[0:32] 0 32, 0x0 32 64} + 0x1)[0:32] ^ 0xFFFFFFFF) & RDX[0:32]) + (RDI[0:32] ^ ({RDI[0:32] & RSI[0:32] 0 32, 0x0 32 64} * 0x2 + {RDI[0:32] ^ RSI[0:32] 0 32, 0x0 32 64})[0:32]) + ((RDI[0:32] ^ 0xFFFFFFFF) | RDX[0:32]) + (RDI + RDX + 0x1)[0:32] 0 32, 0x0 32 64})[0:32]) * 0x2 0 32, 0x0 32 64}*

First, we created a location database (*loc_db*) from the LocationDB class and opened the *mba_challenge* file in a Container. Then we instantiated the Machine() class, passing it the architecture of our sample file (*mba_challenge*), to get a Machine of the matching type.

After these definitions, we initialize a disassembly engine with the *dis_engine* function and disassemble the block at the start address of the target function with *dis_block*.

Then we create a new IR (intermediate representation) variable (*ira_cfg*) using the lifter object that we built from our symbol table variable. With the lifter's add_asmblock_to_ircfg() function, we add the variable holding the code we disassembled at the target address (*asm_block*) to this IR variable (*ira_cfg*).

Then we create an object of the *SymbolicExecutionEngine* class and use it to symbolically execute the code from the start address we defined.

Finally, since we want the resulting expression, we return the value held in the RAX register, which we look up through the architecture of our lifter variable.

well, now that we've explained the code, let's simplify this expression a bit.

Using the *Simplifier* class of the Msynth code deobfuscation framework, we found the small expression that the huge expression we have just seen is actually equal to :=)

All materials used in the article can be found at https://github.com/Ahmeth4n/ahmeth4n.github.io/tree/master/materials/mba. I hope it was a useful article :=). See you later.

