Reverse engineering is the process of recovering the structure of an object, such as a program or one of its parts. It is typically done when the original source code of a program has been lost or is not available to the public. Software reverse engineering recovers the principles and algorithms behind a program in order to understand how the program works and, possibly, to reproduce its mechanism. It may serve the following goals:
- Analyzing viruses, trojans, and other malware to create countermeasures;
- Searching for vulnerabilities in closed-source software in order to create viruses or exploits;
- Describing the data formats or protocols used by a program;
- Analyzing closed-source drivers to create open-source replacements;
- Recovering lost information;
- Identifying side effects.
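The goal of describing a data format can be sketched in a few lines of Python. The record layout below (a 4-byte magic value `DEMO`, a 2-byte version, and a 4-byte payload length, all little-endian) is entirely hypothetical and stands in for whatever an analyst might infer from sample files. The function name `parse_record` is likewise an assumption made for illustration.

```python
import struct

# Hypothetical layout, assumed for illustration only:
# 4-byte magic, 2-byte version, 4-byte payload length, all little-endian.
HEADER = struct.Struct("<4sHI")


def parse_record(blob):
    """Decode one record according to the assumed layout above."""
    magic, version, length = HEADER.unpack_from(blob, 0)
    if magic != b"DEMO":
        raise ValueError("unexpected magic bytes")
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return {"version": version, "payload": payload}


# Build a sample record and decode it again to check the description.
sample = HEADER.pack(b"DEMO", 2, 5) + b"hello"
record = parse_record(sample)
print(record)
```

Writing the recovered layout down as an executable parser is one way to check a format description against real samples: any record that fails to decode points to a gap in the description.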
However, software hacking should not be confused with reverse engineering. Hacking covers actions such as re-enabling debugging functions or modules that were disabled by default because of vulnerabilities or other problems, or working out how a license check operates and then disabling it. Reverse engineering, in turn, is a complete, brick-by-brick analysis of a program and its behavior.
In general, reverse engineering can be divided into software and hardware reverse engineering. This article is concerned only with the former. Consider a program written in C++ or Java: its source code can be read and understood by other programmers, but before it can run on a computer it must be translated into binary code by another program called a compiler. Compiled code is incomprehensible to most programmers, but it can be converted back into a more human-friendly form with a software tool called a decompiler. Reverse engineering consists of several stages:
- Information gathering. This step collects all available information about the software (e.g., the initial design documentation);
- Study of the information. The information gathered in the previous step is studied so that the analyst becomes familiar with the system;
- Structuring. This step identifies the structure of the program in the form of a flowchart, where each node corresponds to a specific procedure;
- Functional description. This step records the processing details of each module in the structure, using a structured notation such as a decision table;
- Recording data flows. From the information obtained in the structuring and functional description steps, data flow diagrams are derived to show the flow of data between processes;
- Recording control flow. The high-level control structure of the software is recorded;
- Review of the extracted design. The extracted design document is reviewed several times to ensure consistency and correctness;
- Creation of documentation. Finally, all documentation, including the software requirements specification (SRS), project documentation, history, and validation records, is written down for future reference.
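As a rough illustration of the structuring stage, the sketch below works on compiled Python bytecode rather than native machine code, since the standard library's `dis` module can disassemble it without extra tools. It compiles a small made-up program, discards the source, and recovers which procedures refer to which other procedures. The sample program and the helper names (`function_code_objects`, `reference_graph`) are assumptions made for this example. The result is only an approximation of a call graph, because a reference to a function is not necessarily a call.

```python
import dis
import types

# Made-up sample program; in practice only the compiled form would be available.
SOURCE = """
def read_input(path):
    return open(path, "rb").read()

def checksum(data):
    return sum(data) % 256

def main(path):
    data = read_input(path)
    return checksum(data)
"""


def function_code_objects(module_code):
    """Yield the code objects stored directly in a compiled module's constants."""
    for const in module_code.co_consts:
        if isinstance(const, types.CodeType):
            yield const


def reference_graph(module_code):
    """Map each function name to the sibling functions its bytecode refers to."""
    functions = {code.co_name: code for code in function_code_objects(module_code)}
    graph = {}
    for name, code in functions.items():
        referenced = {
            ins.argval
            for ins in dis.get_instructions(code)
            # Global name lookups that match a known function are treated as edges.
            if ins.opname == "LOAD_GLOBAL" and ins.argval in functions
        }
        graph[name] = sorted(referenced)
    return graph


module_code = compile(SOURCE, "<sample>", "exec")
graph = reference_graph(module_code)
for name, callees in graph.items():
    print(name, "->", callees)
```

Each key of `graph` corresponds to a node of the structure chart and each list entry to an edge. Real tools working on native binaries follow the same idea but need a disassembler for the target instruction set.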