Keeping up with the ever-growing, constantly changing list of applications is a constant challenge. As of January 2018, there are almost 6 million applications in Google Play and the Apple App Store combined. Popular applications such as Facebook or Telegram frequently release feature updates, and tech-savvy criminals and even terrorists sometimes resort to niche applications to hide their communications.

Although Cellebrite and other large vendors invest in quality parsers and decoders for popular applications, most applications are not supported at all. So how does one overcome the obstacle of finding critical evidence in the darkness of unknown applications?
Cellebrite’s Forensic Research Group offers examiners a suite of tools and methods to successfully navigate through the ‘unknown’. These tools and methods improve over time and construct a platform for handling unknown arbitrary applications at scale without compromising quality, accuracy, and performance.
Two assumptions lie at the core of the approach to unknown applications. The first is that application developers use common libraries, methods, and practices, which share recognizable patterns that can be leveraged. Carving tools and Cellebrite’s Fuzzy engine are based on this assumption, and both are covered in more detail later in this article. The second assumption is that any application can decode its own database. Cellebrite’s new Virtual Analyzer capability puts this assumption into practice.
Before focusing on specific tools, here is Cellebrite’s full suite of tools for harvesting the ‘unknown’:
  • SQLite Wizard. UI tool for manually mapping artifacts from a database into modeled evidence.
  • Artifact Carvers. Automatic tools based on patterns that highlight possible locations and text strings within both intact and unallocated space regardless of structure.
  • Fuzzy Model Engine. Automatic tool, based on heuristics and machine learning, that produces modeled evidence from unfamiliar structured data.
  • Virtual Analyzer. Industry’s first forensic Android emulator, enabling dynamic analysis of evidence from the image.
Let’s review some of these tools to understand their power in lighting the darkness of unknown databases and applications.
Carving is a common practice used for data recovery and forensic investigation. In most cases, carving refers to file or object carving based on pattern matching of a header and/or a footer of known file types.
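As an illustration of header/footer carving in general, here is a minimal sketch that pulls JPEG candidates out of a raw byte buffer. It is not Cellebrite's implementation: the function name `carve_jpegs` and the `max_size` limit are hypothetical, and it assumes the file is stored contiguously. The JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers are part of the format itself.

```python
# JPEG files start with the SOI marker (FF D8) followed by another marker (FF ..)
# and end with the EOI marker (FF D9).
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"


def carve_jpegs(blob: bytes, max_size: int = 10 * 1024 * 1024):
    """Return a list of (offset, data) for each header..footer span in blob.

    Hypothetical helper: assumes contiguous files and takes the first footer
    after each header, so a JPEG with an embedded thumbnail may be cut short.
    """
    results = []
    pos = 0
    while True:
        start = blob.find(JPEG_HEADER, pos)
        if start == -1:
            break
        # Look for the footer within max_size bytes of the header.
        end = blob.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range, keep scanning
            continue
        end += len(JPEG_FOOTER)
        results.append((start, blob[start:end]))
        pos = end
    return results


if __name__ == "__main__":
    sample = b"junk" + b"\xff\xd8\xff\xe0abc\xff\xd9" + b"more"
    for offset, data in carve_jpegs(sample):
        print(f"candidate at offset {offset}, {len(data)} bytes")
```

The same loop works for any file type with a recognizable header and footer; only the two byte patterns change.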
Cellebrite Physical Analyzer contains file carvers that search through unallocated space and also find files embedded within other files. However, carving unstructured artifacts is a different ballgame. Single artifacts do not have any header to anchor on; any 8 bytes can be decoded as two single-precision floating point numbers, possibly representing a location if they translate into a valid coordinate.
The challenge, in addition to performance, is to harvest true locations with a minimum of false detections and missed detections. At Cellebrite we use a set of heuristics to produce highly probable location artifacts based on known intact locations. These additional locations highlight applications and databases that help examiners locate significant pieces of evidence amongst a sea of clutter.
Artifact carvers can search through intact, deleted and unallocated spaces. They do not assume any a priori structure, which is both a strength and a weakness. On the one hand, carvers are resilient to proprietary structures. On the other hand, fragmented artifacts cannot be reconstructed and harvested.
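To make the idea concrete, here is a rough sketch of location carving with one simple heuristic. It is an illustration only, not Cellebrite's heuristics: it assumes coordinates are stored as two consecutive little-endian 32-bit floats (latitude, then longitude), and the `carve_locations` name and `max_delta_deg` threshold are made up for this example. Candidates are kept only if they fall in the valid coordinate range and lie close to a location that is already known from intact data.

```python
import struct


def carve_locations(blob: bytes, known, max_delta_deg: float = 0.5):
    """Scan every byte offset for a plausible (lat, lon) float32 pair.

    known is a list of (lat, lon) tuples recovered from intact data.
    Returns a list of (offset, lat, lon). Hypothetical helper for illustration.
    """
    hits = []
    for offset in range(len(blob) - 7):  # 8 bytes needed for two float32 values
        lat, lon = struct.unpack_from("<ff", blob, offset)
        # Range check; NaN fails every comparison and is rejected here too.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue
        # Heuristic: keep only candidates near an already known location.
        near_known = any(
            abs(lat - k_lat) <= max_delta_deg and abs(lon - k_lon) <= max_delta_deg
            for k_lat, k_lon in known
        )
        if near_known:
            hits.append((offset, lat, lon))
    return hits


if __name__ == "__main__":
    raw = b"\x00" * 5 + struct.pack("<ff", 32.08, 34.78) + b"\xff" * 6
    for offset, lat, lon in carve_locations(raw, known=[(32.1, 34.8)]):
        print(f"offset {offset}: lat={lat:.4f}, lon={lon:.4f}")
```

The range check alone passes a large share of random data, since zero-filled regions and tiny values all decode to valid coordinates near (0, 0). Anchoring on known locations is one way to trade missed detections for fewer false ones.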
A complementary approach to carving, designed and implemented by Cellebrite, is the Fuzzy Model Engine. The basic hypothesis of the Fuzzy engine is that applications store data in a structured database: SQLite, Realm, LevelDB or even JSON, plist, or bplist. Once a mobile database has been recognized, the Fuzzy engine tries to automatically unravel the type and structure of evidence that might reside inside of it.
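The recognition step can be approximated with simple signature checks. The sketch below is an assumption about how such a step might look, not a description of the Fuzzy engine: the `sniff_format` function is hypothetical, and it covers only single-file formats with well-known signatures (SQLite, binary plist, XML plist, JSON). Realm and LevelDB are left out to keep the example short.

```python
import json


def sniff_format(data: bytes) -> str:
    """Guess the storage format of a file from its content.

    Hypothetical helper covering only a few formats.
    """
    # Every SQLite 3 database file begins with this 16-byte header string.
    if data.startswith(b"SQLite format 3\x00"):
        return "sqlite"
    # Binary property lists begin with the ASCII magic "bplist".
    if data.startswith(b"bplist"):
        return "bplist"
    # XML property lists: an XML prolog followed by a <plist> root element.
    head = data[:1024].lstrip()
    if head.startswith(b"<?xml") and b"<plist" in head:
        return "plist"
    # JSON has no magic number, so try to parse it.
    try:
        parsed = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return "unknown"
    return "json" if isinstance(parsed, (dict, list)) else "unknown"


if __name__ == "__main__":
    print(sniff_format(b"SQLite format 3\x00" + b"\x00" * 84))
    print(sniff_format(b'{"contacts": []}'))
```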
The engine relies on heuristics and machine learning. For instance, if a field named “lon” was found, there should be a “lat” somewhere. If both were related to an object, most probably these fields represent the location of that object.
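The “lat”/“lon” rule can be sketched against a SQLite schema as follows. This is a toy version of a single heuristic, assuming the data lives in SQLite and that columns use common English names. The name sets and the `find_location_tables` function are illustrative and not taken from the Fuzzy engine.

```python
import sqlite3

# Illustrative name sets; a real engine would know many more variants.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}


def find_location_tables(conn: sqlite3.Connection) -> dict:
    """Return {table: (lat_column, lon_column)} for tables holding both."""
    found = {}
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    ]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        lat = next((c for c in columns if c.lower() in LAT_NAMES), None)
        lon = next((c for c in columns if c.lower() in LON_NAMES), None)
        # Both fields on the same table suggest the row describes a located object.
        if lat and lon:
            found[table] = (lat, lon)
    return found


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE places (id INTEGER, title TEXT, lat REAL, lon REAL)")
    conn.execute("CREATE TABLE notes (id INTEGER, body TEXT)")
    print(find_location_tables(conn))
```

A table with only one of the two fields is ignored here, which mirrors the reasoning that a “lon” without a matching “lat” is weak evidence of a location.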
The engine’s core and heuristics effectively represent the knowledge behind the common practices of storing information. This body of knowledge is constantly being updated and optimized to improve the accuracy of the modeled evidence harvested from any arbitrary database.
Running the Fuzzy model tool on a phone image reveals possible contacts, locations, messages, passwords and other artifacts hidden in all structured databases in the image. These artifacts are exposed on the spot, saving examiners time and effort.
The final approach is the Virtual Analyzer, the forensic industry’s first and only tool to offer dynamic analysis of forensic Android data. At its core, the Virtual Analyzer runs the application with its private user data in an emulated environment, utilizing the vendor’s code to decode the database in question. The application runs in a quarantined environment, preventing it from compromising the account or the phone image.
As described, the underlying assumption is that all the data needed for the decoding process is present in the image: the artifacts, encryption keys, etc. The application can decrypt, unlock, and decode the information, presenting it to the user on demand and displaying it exactly the way it would have appeared on the mobile device.
As a general approach, this can work with any Android application, provided that the data is in the image and no specific user input is needed. The main shortcoming of this approach is that it can only show data that was meant to be displayed. For instance, it cannot present deleted artifacts.
The main challenges of implementing the Virtual Analyzer stem from injecting code and data into a mobile device different from the one it was created on. Problems range from missing identifiers to differences in operating systems. Cellebrite’s Forensic Research Group has produced a system that mitigates most of the differences and does not compromise the integrity of the data. However, some challenges remain.
For example, although emulation engines for Android are in abundance, there is no public true emulation of iOS. There are plenty of simulators mimicking the look-and-feel of iOS, but the application’s binary code is compiled for a specific operating system and architecture and cannot run on another. Still, the Virtual Analyzer is a powerful tool able to decode almost any arbitrary Android application, which empowers examiners to follow and explore any investigative lead in no time.
To conclude, Cellebrite’s Forensic Research Group is constantly improving existing tools and methods, and developing new ones, for harvesting significant and valuable evidence from the dark side of the mobile device. In this article, we have explored complementary techniques and tools that can illuminate this darkness.
One approach is to intelligently automate the analysis process and produce quality assumed evidence, providing the examiner with new sources of potential evidence. The other approach is enabling the examiner to effectively browse through the phone without compromising the integrity of the accounts or the evidence. 
For the answer to other questions about the latest in digital intelligence, stay tuned.  