Data hiding in the NTFS file system

In this paper we examine the methods of hiding data in the NTFS file system. We further discuss the analysis techniques which can be applied to detect and recover data hidden using each of these methods. We focus on sophisticated data hiding where the goal is to prevent detection by forensic analysis. Obvious data hiding techniques, for example setting the hidden attribute of a file, are not included. Hidden data can be further obfuscated by file system independent approaches such as data encryption and steganography. This paper is concerned only with the methods made possible by the structure of the NTFS file system, and with the recovery of hidden data, not its interpretation.


Introduction

This paper discusses the file system dependent methods that can be used to hide data in the NTFS file system, and the analysis techniques that can be applied to detect and recover this hidden data. Target readers for this paper are computer forensic analysts and examiners, but the content may also be useful for system administrators and managers.

We restrict the discussion to those data hiding methods in the NTFS file system which meet the criteria of security and capacity as specified by Provos and Honeyman:

A standard file system check with a utility such as chkdsk should not return any errors.

Hidden data will not be overwritten, or the possibility of it being overwritten is low.

Hidden data will not be revealed when using a standard GUI file system interface.

A reasonable amount of hidden data can be stored.

A method which fails to meet any one of the above criteria is not effective in the long term. If data can be easily detected with standard utilities by an ordinary user, then it is not properly hidden. And if the data can be overwritten by unrelated file system activity, it can never be retrieved by the hiding party or the forensic examiner.

To order our discussion, we classify the effective hiding methods into two main categories:

Specific methods based on unique NTFS data structures.

Generic slack space based methods in relation to NTFS.

We further explore how NTFS specific methods can exploit the metadata structures, user data and directories in NTFS. For the sake of completeness we will also briefly discuss some methods which we would not, under general circumstances, consider effective according to the definition given above.

Section snippets

Methodology and tools

We used a number of software tools to first hide data using each of the discussed methods, and then recover the hidden data and restore it to its original format. Runtime's DiskExplorer for NTFS v2.31 (DiskExplorer) was used to manually create the hidden data for testing purposes for all but one of the methods. The only exception was hidden data in alternate data streams (ADS), which was created by a standard DOS command. The test data were created on a system with Windows XP version 5.1.2600. Throughout …
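The ADS method mentioned above is easy to reproduce without specialist tools. The following is a minimal Python sketch (the function names are our own illustration): on an NTFS volume, opening host.txt:streamname addresses a named $DATA stream of host.txt which standard directory listings do not show. On non-NTFS file systems the colon is simply part of an ordinary file name, so the sketch only illustrates the naming convention.

```python
# Sketch: hiding data in an NTFS alternate data stream (ADS).
# On an NTFS volume, "host.txt:hidden" names a second $DATA stream of
# host.txt; dir and Explorer show only host.txt and its primary stream.
# On non-NTFS file systems the colon is part of an ordinary file name,
# so this demonstrates the naming convention rather than true streams.

def hide_in_ads(host: str, stream: str, payload: bytes) -> None:
    """Write payload into an alternate data stream of host (NTFS only)."""
    with open(f"{host}:{stream}", "wb") as f:
        f.write(payload)

def read_from_ads(host: str, stream: str) -> bytes:
    """Read the hidden payload back from the alternate data stream."""
    with open(f"{host}:{stream}", "rb") as f:
        return f.read()
```

A standard GUI file browser shows only the host file and the size of its unnamed stream; forensic tools which walk the MFT attributes, such as The Sleuth Kit, will still reveal the extra $DATA attribute.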

NTFS background

In the NTFS file system, every object is a file, and as such it inherits all the characteristics of a file. This includes file system metadata which defines the structure of the file system itself. NTFS organizes structures on a disk volume in four logical blocks (How NTFS Works) as shown in Fig. 1.

The most important feature of NTFS is the Master File Table (MFT), which is implemented as an array of records (Russinovich and Solomon, 2005). Every file and every directory has at least one entry …
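To make the MFT record structure concrete, the sketch below parses the fixed header of a FILE record. The field offsets follow the commonly documented on-disk layout (see e.g. Carrier, 2005); this is an illustration only, and in particular it does not apply the update sequence (fixup) values that a real parser must handle.

```python
import struct

# Sketch: parsing the fixed header of an NTFS MFT FILE record.
# Offsets follow the commonly documented on-disk layout; this minimal
# illustration does not apply the update sequence (fixup) values.

def parse_mft_header(record: bytes) -> dict:
    sig = record[0:4]
    if sig != b"FILE":
        raise ValueError("not a FILE record")
    usa_off, usa_count = struct.unpack_from("<HH", record, 4)
    (lsn,) = struct.unpack_from("<Q", record, 8)
    seq, links, attr_off, flags, used, alloc = struct.unpack_from("<HHHHII", record, 16)
    return {
        "signature": sig,
        "sequence": seq,
        "link_count": links,
        "first_attr_offset": attr_off,
        "in_use": bool(flags & 0x01),        # bit 0: record in use
        "is_directory": bool(flags & 0x02),  # bit 1: record is a directory
        "bytes_used": used,
        "bytes_allocated": alloc,            # typically 1024 per MFT record
    }
```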

NTFS specific data hiding methods

As stated before, any object in NTFS is a file, and as such it can have any file attribute. Some attributes, for example $DATA, $INDEX_ROOT or $INDEX_ALLOCATION, may legitimately appear multiple times in a single file, in which case they are given unique names. Naturally, depending on the file type, certain attributes are required and expected, but NTFS does not appear to be sensitive to a file including unnecessary attributes. This feature can be exploited for hiding data in such a way …
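As an illustration of how such extra attributes can be found, the sketch below walks the attribute list of a raw MFT record and reports every $DATA attribute (type 0x80) together with its name; a named $DATA attribute corresponds to an alternate data stream. Offsets follow the commonly documented attribute header layout, and the record is assumed to have had its fixup values already applied.

```python
import struct

# Sketch: listing $DATA attributes (type 0x80) in a raw MFT record to
# reveal named alternate data streams. attrs_offset is the first-attribute
# offset from the record header; the attribute list ends with the type
# marker 0xFFFFFFFF. Attribute names are stored as UTF-16LE.

def list_data_streams(record: bytes, attrs_offset: int) -> list:
    streams = []
    pos = attrs_offset
    while pos + 8 <= len(record):
        attr_type, attr_len = struct.unpack_from("<II", record, pos)
        if attr_type == 0xFFFFFFFF or attr_len == 0:
            break
        if attr_type == 0x80:                    # $DATA attribute
            name_len = record[pos + 9]           # name length in characters
            (name_off,) = struct.unpack_from("<H", record, pos + 10)
            raw = record[pos + name_off : pos + name_off + 2 * name_len]
            streams.append(raw.decode("utf-16-le") or "(unnamed)")
        pos += attr_len
    return streams
```

More than one $DATA attribute in a single record, or an attribute name that matches no documented convention, is a direct indicator that this hiding method may be in use.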

Slack space based hiding methods in NTFS

Slack space refers to all areas on the disk surface which cannot be utilised by the file system because of the discrete nature of space allocation. The existence of slack space is characteristic of all file systems, not just NTFS. A discussion of data hiding methods in NTFS would not be complete without it, and we will highlight those aspects of the methods which are specific to NTFS.
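The capacity offered by file slack follows directly from this discrete allocation, as the short sketch below shows (the 4096-byte default cluster size is an assumption for illustration).

```python
# Sketch: file slack available for hiding, a consequence of discrete
# cluster allocation. A file occupies whole clusters, so the tail of its
# last cluster is unused by the file system. On NTFS, small resident
# files stored entirely inside the MFT record leave no cluster slack.

def file_slack(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes left unused in the file's last allocated cluster."""
    if file_size == 0:
        return 0
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder
```

For example, a file one byte longer than a cluster wastes almost an entire second cluster, all of it invisible to ordinary file operations.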

Ineffective data hiding methods

As stated in the introduction, effective data hiding techniques in a file system should meet the goals of security and capacity (Provos and Honeyman). Below we list some uncommon and ineffective techniques possible with NTFS which do not satisfy all of these goals.

Set the allocation bit of certain unallocated clusters to 1 and hide data in these clusters: a simple chkdsk validation would detect this.

Hide data in unallocated clusters: there is a high possibility of the hidden data being erased …
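The first of these ineffective techniques fails precisely because allocation bookkeeping in NTFS is redundant. The sketch below shows, with plain Python sets standing in for the on-disk $Bitmap and the clusters referenced by MFT records, the consistency check that a utility such as chkdsk effectively performs.

```python
# Sketch: why flipping $Bitmap allocation bits is detectable. A consistency
# check compares clusters marked allocated in the bitmap against clusters
# actually referenced by some file; "orphaned" allocated clusters are
# exactly where such hidden data would live. The inputs here are simplified
# Python sets, not the on-disk structures themselves.

def orphaned_clusters(bitmap_allocated: set, referenced: set) -> set:
    """Clusters marked in use that no file accounts for."""
    return bitmap_allocated - referenced
```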

NTFS integrity

A check of general integrity should be performed on NTFS before any analysis starts. The chkdsk command should be run, and if it returns any error, this indicates that the file system may have been manipulated and left in an unstable state. Next, the standard Windows utility Fsutil or the Sleuth Kit command fsstat should be run to obtain general information about the NTFS system under investigation.

The cluster size of the system should be checked against the default size matching the file …
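This cluster-size check is easy to automate. The sketch below encodes the default NTFS cluster sizes documented for Windows systems of this period, keyed by volume size; a formatted cluster size above the default is legal, but it enlarges slack space and is therefore worth noting in an examination.

```python
# Sketch: default NTFS cluster size by volume size, per the format
# defaults documented for Windows 2000/XP-era systems. A cluster size
# larger than the default is legal but enlarges slack space, so a
# mismatch merits attention during an examination.

_MB = 1024 * 1024

def default_ntfs_cluster_size(volume_bytes: int) -> int:
    if volume_bytes <= 512 * _MB:
        return 512
    if volume_bytes <= 1024 * _MB:
        return 1024
    if volume_bytes <= 2048 * _MB:
        return 2048
    return 4096  # volumes larger than 2 GB default to 4 KB clusters
```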

Conclusion

In general, the analysis of hidden data in the NTFS file system is divided into three phases. The first phase is to determine whether any data is hidden. This process can either systematically check for all hiding methods discussed in this paper, or search the system for anomalies. For example, it is abnormal for an operating system to detect bad sectors before the hard disk does, so the very presence of any bad clusters is suspicious and worth further analysis. Similarly, if the integrity of the …
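The bad-cluster anomaly mentioned above lends itself to a mechanical check. Modern disks remap failing sectors internally, so the file system should rarely see any; the sketch below simply flags a volume whose $BadClus data runs (assumed to have been extracted beforehand by other means) claim any clusters at all.

```python
# Sketch: flagging the bad-cluster anomaly. Modern drives remap failing
# sectors internally, so the file system should rarely, if ever, record
# any; each cluster claimed by $BadClus therefore merits examination of
# its actual content. The run list, given as (start, length) pairs, is
# assumed to have been extracted already.

def suspicious_bad_clusters(badclus_runs: list) -> bool:
    """True if $BadClus claims any clusters at all."""
    return any(length > 0 for (start, length) in badclus_runs)
```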

References (28)

  • D. Bem et al.

    Alternate data streams in forensic investigations of file systems backups

    (2006)

  • B. Carrier

    File system forensic analysis

    (2005)

  • Carrier B. The Sleuth Kit. Available from:...
  • H. Carvey

    Data hiding on a live system

    (2004)

  • H. Carvey

    Windows forensics and incident recovery

    (2004)

  • A. Chuvakin

    LinuxSecurity.com

  • K. Eckstein et al.

    Data hiding in journaling file systems

  • D. Farmer et al.

    Forensic discovery

    (2005)

  • Foremost,...
  • How NTFS Works, Windows Server 2003 Technical Reference. Available from:...

  • Incident handling/forensics FAQs....
  • W.G. Kruse et al.

    Computer forensics: incident response essentials

    (2002)

  • K. Li

    A guide to programming Intel IA32 PC architecture

  • J.R. Mallery

    Secure file deletion, fact or fiction?


    Cited by (40)

    • Characteristics and detectability of Windows auto-start extensibility points in memory forensics

      2019, Digital Investigation

      Citation excerpt:

      Hence, the tool analyzes the string and marks as suspicious those paths that contain any reference to the folders %APPDATA% (i.e., C:\Users\[username]\AppData\Roaming), %TMP% or %TEMP% (i.e., C:\Users\[username]\AppData\Local\Temp), or the common parent folder AppData. Additionally, the tool also marks as suspicious any string command that contains operations regarding NTFS Alternate Data Streams (ADS) (Huebner et al., 2006) or well-known shell commands that indirectly launch programs (Mosuela, 2016; Reichel, 2017). For instance, the command rundll32.exe shell32.dll,ShellExecute_RunDLL [filename] will load the system library shell32.dll and will execute the function ShellExecute_RunDLL with the parameter filename.


      Computer forensics is performed during a security incident response process on disk devices or on the memory of the compromised system. The latter case, known as memory forensics, consists in dumping the memory to a file and analyzing it with the appropriate tools. Many security incidents are caused by malware that targets and persists as long as possible in a Windows system within an organization. The persistence is achieved using Auto-Start Extensibility Points (ASEPs), the subset of OS and application extensibility points that allow a program to auto-start without any explicit user invocation. In this paper, we propose a taxonomy of the Windows ASEPs, considering the features that are used or abused by malware to achieve persistence. This taxonomy splits into four categories: system persistence mechanisms, program loader abuse, application abuse, and system behavior abuse. We detail the characteristics of each extensibility point (namely, write permissions, execution privileges, detectability in memory forensics, freshness of system requirements, and execution and configuration scopes). Many of these ASEPs rely on the Windows Registry. We also introduce the tool Winesap, a Volatility plugin that analyzes the registry-based Windows ASEPs in a memory dump. Furthermore, we state the order of execution of some of these registry-based extensibility points and evaluate the effectiveness of our tool in memory dumps taken from a Windows OS where extensibility points were used. Winesap was successful in marking all the registry-based Windows ASEPs as suspicious registry keys.

    • Forensic Readiness: A Case Study on Digital CCTV Systems Antiforensics

      2017, Contemporary Digital Forensic Investigations of Cloud and Mobile Applications


      Digital closed-circuit television (CCTV) systems are deployed by government agencies, the private sector, and in private households for surveillance and to deter crime. These systems use digital video recorders (DVRs) to record events with timestamps on storage media and are capable of streaming the captured scenes to smartphones via mobile applications for mobile monitoring. With digital CCTV systems widely integrated into society, the role of digital forensics is more important than ever and the concept central to digital forensics is digital evidence. There are, however, challenges faced by digital forensic practitioners in acquiring the digital evidence from digital CCTV systems due to a range of reasons. These include damaged systems (intentionally or by accident, electronically or physically), and the use of antiforensic techniques by the suspect to obstruct the forensic data recovery of the multimedia evidence. In this book chapter we present an antiforensics framework for digital CCTV systems that allows permanent deletion of multimedia files based on selected timestamp, and an antiforensic prototype tool. Two DVRs of digital CCTV systems and an iPhone were used as case studies to demonstrate the utility of our framework and tool. Findings from this research provide digital forensic practitioners to have an in-depth understanding of residual evidence pertaining to the investigation of CCTV systems. This chapter also provides insights on the importance for manufacturers to include forensic readiness in the design of their digital CCTV systems.

    • Forensic Readiness: A Case Study on Digital CCTV Systems Antiforensics

      2016, Contemporary Digital Forensic Investigations of Cloud and Mobile Applications

    • Data loss recovery for power failure in flash memory storage systems

      2015, Journal of Systems Architecture

      Citation excerpt:

      If the host system cannot find the termination masked flag in the metadata area, it recognizes the power crash state and then implements PLR mode operations. The FTL recovers valid data depending on the recovery coverage and returns them to the host [32,48–50]. The implementation of PLR schemes should be affected by what mapping algorithm is used in FTL.


      Due to the rapid development of flash memory technology, NAND flash has been widely used as a storage device in portable embedded systems, personal computers, and enterprise systems. However, flash memory is prone to performance degradation due to the long latency in flash program operations and flash erasure operations. One common technique for hiding long program latency is to use a temporal buffer to hold write data. Although DRAM is often used to implement the buffer because of its high performance and low bit cost, it is volatile; thus, the data may be lost on power failure in the storage system. As a solution to this issue, recent operating systems frequently issue flush commands to force storage devices to permanently move data from the buffer into the non-volatile area. However, the excessive use of flush commands may worsen the write performance of the storage systems. In this paper, we propose two data loss recovery techniques that require fewer write operations to flash memory. These techniques remove unnecessary flash writes by storing storage metadata along with user data simultaneously by utilizing the spare area associated with each data page.

    • A Comprehensive Survey and Analysis on Multi-Domain Digital Forensic Tools, Techniques and Issues

      2022, ResearchSquare

    • A Generic Taxonomy for Steganography Methods

      2022, TechRxiv


    • Research article

      File system anti-forensics – types, techniques and tools

      Computer Fraud & Security, Volume 2020, Issue 3, 2020, pp. 14-19


      Forensics paved the way for the growth of anti-forensics, and the time has come for anti-forensics to return the favour. For that purpose, it is imperative that forensic investigators and practitioners are armed with the knowledge of contemporary anti-forensics types, techniques and tools. This article aims to provide technical information and a comprehensive understanding of file system anti-forensics types, techniques and tools so as to facilitate investigators' ability to collect technically credible and legally admissible digital evidence from crime scenes.

      By Mohamad Ahtisham Wani, Ali AlZahrani and Wasim Ahmad Bhat.

    • Research article

      Forensic analysis of B-tree file system (Btrfs)

      Digital Investigation, Volume 27, 2018, pp. 57-70


      This paper identifies forensically important artifacts of B-tree file system (Btrfs), analyses changes that they incur due to node-balancing during file and directory operations, and based on the observed file system state-change proposes an evidence-extraction procedure. The findings suggested that retrieving forensic evidence in a fresh B-tree file system is difficult, the probability of evidence-extraction increases as the file system ages, internal nodes are the richest sources of forensic data, degree of evidence-extraction depends upon whether nodes are merged or redistributed, files with size less than 1 KB and greater than 4 KB have highest chances of recovery, and files with size 3–4 KB have least chances of recovery.

    • Research article

      HDFS file operation fingerprints for forensic investigations

      Digital Investigation, Volume 24, 2018, pp. 50-61


      Understanding the Hadoop Distributed File System (HDFS) is currently an important issue for forensic investigators because it is the core of most Big Data environments. The HDFS requires more study to understand how forensic investigations should be performed and what artifacts can be extracted from this framework. The HDFS framework encompasses a large amount of data; thus, in most forensic analyses, it is not possible to gather all of the data, resulting in metadata and logs playing a vital role. In a good forensic analysis, metadata artifacts could be used to establish a timeline of events, highlight patterns of file-system operation, and point to gaps in the data.

      This paper provides metadata observations for HDFS operations based on fsimage and hdfs-audit logs. These observations draw a roadmap of metadata changes that aids in forensic investigations in an HDFS environment. Understanding metadata changes assists a forensic investigator in identifying what actions were performed on the HDFS.

      This study focuses on executing day-to-day (regular) file-system operations and recording which file metadata changes occur after each operation. Each operation was executed, and its fingerprints were detailed. The use of those fingerprints as artifacts for file-system forensic analysis was elaborated via two case studies. The results of the research include a detailed study of each operation, including which system entity (user or service) performed this operation and when, which is vital for most analysis cases. Moreover, the forensic value of examined observations is indicated by employing these artifacts in forensic analysis.

    • Research article

      De-Wipimization: Detection of data wiping traces for investigating NTFS file system

      Computers & Security, Volume 99, 2020, Article 102034


      Data wiping is used to securely delete unwanted files. However, the misuse of data wiping can destroy pieces of evidence in a digital forensic investigation. To cope with the misuse of data wiping, we proposed an anti-anti-forensic method based on NTFS transaction features and a machine learning algorithm. This method allows investigators to obtain information regarding ‘which files are wiped’ and ‘which data wiping tools and data sanitization standards were used’. In this study, we achieved good identification of data wiping traces in the NTFS file system. Leveraging the efficiency of machine learning models, our method effectively recognizes wiped partitions and files in the NTFS file system and identifies tools used in data sanitization.

    • Research article

      A RAM triage methodology for Hadoop HDFS forensics

      Digital Investigation, Volume 18, 2016, pp. 96-109


      This paper discusses the challenges of performing a forensic investigation against a multi-node Hadoop cluster and proposes a methodology for examiners to use in such situations. The procedure's aim of minimising disruption to the data centre during the acquisition process is achieved through the use of RAM forensics. This affords initial cluster reconnaissance which in turn facilitates targeted data acquisition on the identified DataNodes. To evaluate the methodology's feasibility, a small Hadoop Distributed File System (HDFS) was configured and forensic artefacts simulated upon it by deleting data originally stored in the cluster. RAM acquisition and analysis was then performed on the NameNode in order to test the validity of the suggested methodology. The results are cautiously positive in establishing that RAM analysis of the NameNode can be used to pinpoint the data blocks affected by the attack, allowing a targeted approach to the acquisition of data from the DataNodes, provided that the physical locations can be determined. A full forensic analysis of the DataNodes was beyond the scope of this project.

    • Research article

      AFEIC: Advanced forensic Ext4 inode carving

      Digital Investigation, Volume 20, Supplement, 2017, pp. S83-S91


      In forensic computing, especially in the field of postmortem file system forensics, the reconstruction of lost or deleted files plays a major role. The techniques that can be applied to this end strongly depend on the specifics of the file system in question. Various file systems are already well-investigated, such as FAT16/32, NTFS for Microsoft Windows systems and Ext2/3 as the most relevant file system for Linux systems. There also exist tools, such as the famous Sleuthkit (Carrier), that provide file recovery features for those file systems by interpreting the file system internal data structures. In case of an Ext file system, the interpretation of the so-called superblock is essential to interpret the data. The Ext4 file system can mainly be analyzed with the tools and techniques that have been developed for its predecessor Ext3, because most principles and internal structures remained unchanged. However, a few innovations have been implemented that have to be considered for file recovery. In this paper, we investigate those changes with respect to forensic file recovery and develop a novel approach to identify files in an Ext4 file system even in cases where the superblock is corrupted or overwritten, e.g. because of a re-formatting of the volume. Our approach applies heuristic search patterns for utilizing methods of file carving and combines them with metadata analysis. We implemented our approach as a proof of concept and integrated it into the Sleuthkit framework.
