Searching Tips and Tricks in BlackLight

Starting in BlackLight 2019 R1, customers have two methods for searching for words or patterns in case data.  This post will walk you through the basics of both types of searching, as well as tips and techniques for how and when to use each method. Both methods are powerful tools to narrow in on what you need to find quickly and efficiently, but only if you understand their options and limitations.

Content vs. Index Searches in BlackLight

Searching functionality is available in the Component List with options to Add Content Searches or Index Searches.

Content Searching has moved to the Component List along with the new Index Searches

BlackLight users might be most familiar with Content Searching, also referred to as keyword or raw searching. Content Searching is the longest-established searching technique in digital forensics and was previously the only searching mechanism available in BlackLight.  It is the most comprehensive method to search for characters or patterns on disk, in both allocated and unallocated areas.  A Content Search is built from specific keywords or character patterns, which BlackLight then looks for on the drive or within a file.  BlackLight’s content search can limit the scope of the search and provides additional time-saving options so that searches complete quickly and investigators can review search hits.

Index Searching is a two-part process. First, an index is built when data is processed. Then the index can be quickly searched for any words the investigator thinks will produce insights.  The primary advantage of indexing is that all the words are stored during the creation of the index and tied to the documents or files containing them.  So, if additional terms become important, examiners can immediately check whether they are present in the dataset.
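
To make the two-part process concrete, here is a minimal sketch of an inverted index in Python. It illustrates the general technique only and is not BlackLight's implementation; the function names and the sample file names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of file names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def lookup(index, word):
    """Return the sorted file names containing the word (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data standing in for extracted document text.
documents = {
    "memo.txt": "Wire the funds on Friday",
    "notes.txt": "Friday meeting moved to Monday",
}
index = build_index(documents)   # part one: built once, during processing
print(lookup(index, "friday"))   # part two: each later query is a dictionary lookup
```

Because every word is recorded up front, a term that only becomes interesting later costs one lookup rather than another pass over the drive.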

Tips for Content Searching

When creating a Content Search, users click the Add button next to Content Searches in the Component List.  The Content Search interface prompts for a name. Then the user chooses which partitions, or defines specific paths, to search for keywords.  Multiple partitions can be selected.  If no paths are specified for a selected partition, the search will run across the whole partition.

Create a new search and select where to search. Limit which partitions and paths to search.

It is important when reviewing the content search options to consider turning on the Deep Search option.  This checkbox tells BlackLight to extract the text out of documents, like Office Documents, that are typically stored inside of containers.

Use the Deep Search option to search inside Office Documents and other complex data types that need text extraction with Content Searches

The following options are provided to speed up searching, either by limiting the files searched or by stopping the search of a file once a single hit is found:

  • Skip files larger than a specific size
  • Search only specific file types
  • Use a filter to search only files that match specific criteria
  • Ignore extensions to skip certain files like executables or jpgs
  • Report only first hit on file
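
The effect of these options can be modeled in Python as a pre-filter plus an early exit. This is an illustrative sketch only: `should_search`, `count_hits`, and their parameters are hypothetical names and do not reflect how BlackLight is implemented.

```python
import os


def should_search(path, size_bytes, max_size=None, ignored_exts=()):
    """Decide whether a file is worth searching under the given limits."""
    if max_size is not None and size_bytes > max_size:
        return False  # skip files larger than the size limit
    ext = os.path.splitext(path)[1].lower()
    return ext not in ignored_exts  # skip ignored extensions


def count_hits(data, keyword, first_hit_only=False):
    """Count occurrences of keyword in data, optionally stopping at the first."""
    hits = 0
    start = data.find(keyword)
    while start != -1:
        hits += 1
        if first_hit_only:
            break  # one hit is enough to flag the file for review
        start = data.find(keyword, start + 1)
    return hits
```

Both ideas save time the same way: less data is read, and less of each file is scanned once it is already known to be relevant.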

Special content search techniques for patterns

In certain instances, examiners aren’t looking for specific words but instead are looking for patterns, like email addresses, credit card numbers, or ID numbers.  RegEx allows users to specify their own patterns or to use BlackLight’s built-in pattern searches.

Content Searches can contain patterns using RegEx keywords

Since the search engine must evaluate more text to determine if there is a hit, and must consider operators like wildcards, RegEx searching takes longer than traditional keyword searching.  To account for this time difference, some examiners choose to run RegEx patterns after they have kicked off their traditional keyword searches.  Every RegEx pattern that is searched for will increase searching time.

For example, if a user would like to search for Social Security Numbers, they could add it from the Preset Menu.

Default RegEx patterns are provided in BlackLight for content search keywords

This adds the pattern for Social Security numbers to the Keyword list and adds word bounding for that item.  As you can see, RegEx patterns use special characters to indicate number ranges, repeating frequencies, and other pattern specifiers.  Want to learn more about RegEx?  See our user guide or the history of RegEx.
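
As a rough illustration of how such a pattern works, the Python sketch below uses a common textbook pattern for Social Security numbers. The exact pattern in BlackLight's preset is not shown in this post, so treat this one as an assumption; `find_ssn_like` is a hypothetical helper name.

```python
import re

# Assumed pattern: three digits, two digits, four digits, separated by hyphens.
# \b on each side is the word bounding that keeps longer digit runs from matching.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text):
    """Return every substring of text shaped like NNN-NN-NNNN."""
    return SSN_PATTERN.findall(text)


sample = "Applicant 123-45-6789 called from 5551234567; ref 9123-45-67890."
print(find_ssn_like(sample))
```

Here `\d{3}` is a repeating frequency (exactly three digits), and the word boundaries reject the longer reference number at the end of the sample.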

Searching for Social Security numbers using a RegEx pattern

Tips for Index Searching

Creating an index of text documents on a device allows an examiner to quickly find if a particular topic or subject is mentioned within the evidence set. While the process of creating an index has historically been time-consuming and resulted in bloated case sizes, advancements in this field allow BlackLight to now provide users with a quick and efficient process to build the index. Once built, investigators can follow where the leads take them by making fast sequential queries of the index for words without waiting for a traditional search of the drive contents. Index searching also includes operations like proximity and Boolean logic to define which files are most relevant.

With the initial release, BlackLight 2019 R1 provides index capabilities only for allocated documents on the file system.  These are the files most relevant and likely to be useful for prosecution. Data that BlackLight extracts from container files during processing, such as internet history, PST email containers, or archives like zips, is not included in the index in this initial release but will follow shortly.

Creating an Index

Indexing is a processing option that can be selected when the Evidence is added or after initial processing from the Evidence Status page.

Choose Smart Indexing under Ingestion Options to create the index during processing

Searching the index

Once an index is created, users can search it by creating a new Index Search in the left-hand pane.  An Index Search allows the examiner to search for specific words, combinations of words, pathnames, file size, date created, date modified, date accessed, and date changed, combining criteria with the AND, OR, and NOT operators.
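
Conceptually, the Boolean operators combine the sets of files that match each term. The Python sketch below shows that idea with set operations; the file names and matches are made-up sample data, not BlackLight output.

```python
# Hypothetical results: which files contain each word.
matches = {
    "invoice": {"a.docx", "b.pdf", "c.txt"},
    "payment": {"b.pdf", "c.txt", "d.xlsx"},
    "draft": {"c.txt"},
}

both = matches["invoice"] & matches["payment"]    # invoice AND payment
either = matches["invoice"] | matches["payment"]  # invoice OR payment
not_draft = both - matches["draft"]               # (invoice AND payment) NOT draft

print(sorted(both), sorted(either), sorted(not_draft))
```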

When the index is created you can go to the new Index Search area to add a new search.

Index Searches are entered into the Query field. Note that you can name your query using the Query Name field, and the operators are listed above the Query field.

Creating an index query

BlackLight uses an implementation of Elasticsearch for smart indexing. By default, searching the index, also called querying the index, will look across all the fields and documents that have been indexed.  Additional functionality has been built in to allow examiners to further specify which files they are interested in. Common searching techniques and examples are summarized below.

When searching the index, users enter a query string that is interpreted by the index engine into a series of terms and operators.

A term can be a single word (quick or brown) or a phrase surrounded by double quotes ("quick brown"), which searches for all the words in the phrase, in the same order. By default, entering only terms will search the index for any items that contain one or more of those words, as you enter them in the search field.  Index searching is not case sensitive.
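
The default behavior described above (any term may match, phrases must match in order, case is ignored) can be sketched in Python as follows. This is a simplified model for illustration, not the index engine's actual parser; `split_query` and `matches_any` are hypothetical names.

```python
import re


def split_query(query):
    """Split a query string into lowercased terms and quoted phrases."""
    parts = re.findall(r'"([^"]+)"|(\S+)', query)
    # Each match fills exactly one group: a quoted phrase or a bare term.
    return [(phrase or term).lower() for phrase, term in parts]


def matches_any(text, query):
    """True if text contains at least one term or phrase from the query."""
    joined = " ".join(re.findall(r"\w+", text.lower()))
    for item in split_query(query):
        needle = " ".join(re.findall(r"\w+", item))
        # Pad with spaces so only whole words, in order, count as a hit.
        if needle and f" {needle} " in f" {joined} ":
            return True
    return False
```

For example, `matches_any("The brown quick fox", "quick brown")` is true because either word is enough, while the quoted form `'"quick brown"'` is false for the same text because the words are not in that order.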

Operators allow you to customize the search; the available options are explained below.

Date and Number Ranges

Ranges can be specified for date, numeric or string fields. Inclusive ranges are specified with square brackets [min TO max] and exclusive ranges with curly brackets {min TO max}.
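
As a small illustration, the Python helper below builds range clauses in both bracket styles. The `field:` prefix follows Elasticsearch query string syntax; whether BlackLight exposes fields this way, and the field names `size` and `modified`, are assumptions made for the example.

```python
def range_clause(field, low, high, inclusive=True):
    """Build a range clause: [low TO high] inclusive, {low TO high} exclusive."""
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_b}{low} TO {high}{close_b}"


# "size" and "modified" are placeholder field names for illustration only.
print(range_clause("size", 1024, 4096))
print(range_clause("modified", "2019-01-01", "2019-03-31", inclusive=False))
```

With square brackets the endpoints themselves match; with curly brackets only values strictly between them do.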

Top 10 Takeaways for Searching

For each case, there are constraints on time, scope, and what is required to be produced.  If you are limited to only specific keywords and you want to complete the most exhaustive search, content searching is a powerful tool.  If you are reviewing a device to identify leads and triage to determine if it contains references to the matter you are investigating, index searching provides flexibility and performance.  Below are 10 key points to remember when deciding how to configure your next search.

Content Searching

  1. Most comprehensive option: allows users to search everywhere on a drive
  2. Can choose to search for keywords that are a word, a partial word, or a pattern
  3. Especially helpful in unallocated space and memory
  4. Can be combined with filtering options to limit the scope of a search
  5. Use Deep Search to locate words in Office documents

Index Searching

  1. Designed to look for whole words or terms; can narrow the search using metadata fields
  2. Smart indexing focuses on the files that contain words and metadata
  3. Create the index as part of processing and then run queries instantly as needed
  4. Easily locate words near each other or in a specific order
  5. Not all areas of the drive are indexed; if a term becomes critical, you can run it through a content search for a comprehensive review

Looking forward, BlackBag will be providing additional index support for items like email, internet history, and messages that are currently not indexed.  We will be providing a Part 2 of this post when those features are available later this year.  If you have questions or want to provide feedback please use the product feedback form.
