File carving is a forensic technique used to recover files based solely on their content, without relying on file system metadata. It is especially useful when partitions are damaged, file tables are wiped, or metadata is missing due to corruption, formatting, or deliberate tampering. Carving scans raw disk data to identify file signatures and reconstruct files from the underlying binary structure.
Unlike standard recovery methods that depend on file system records, file carving focuses entirely on patterns found within the data itself.
What Is File Carving?
File carving is the process of extracting files from unallocated or partially damaged space by identifying file headers, footers, and internal structures.
Since file system metadata might be missing or overwritten, carving looks for recognizable binary patterns to locate and rebuild data.
This technique is commonly used when:
- File system is corrupted
- Metadata is missing or deleted
- Device has been formatted
- Malware or attackers have wiped entries
- Only fragments of the file remain
How File Carving Works
At its core, file carving follows these steps:
- Scan storage media for known file signatures
- Identify file boundaries (header and footer)
- Extract the data between these boundaries
- Reconstruct the file in a usable format
- Validate the file using internal structure
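The first step, scanning for known signatures, can be sketched in a few lines of Python. This is a minimal illustration, not how any particular tool works: the `SIGNATURES` table and the `scan_signatures` function are hypothetical names, the buffer is assumed to fit in memory, and the signature bytes are the header examples given later in this article.

```python
from typing import Dict, List, Tuple

# Header signatures taken from the examples in this article.
SIGNATURES: Dict[str, bytes] = {
    "jpeg": bytes.fromhex("FFD8FF"),
    "png": bytes.fromhex("89504E470D0A1A0A"),
    "pdf": b"%PDF",
    "zip": bytes.fromhex("504B0304"),
}


def scan_signatures(data: bytes) -> List[Tuple[int, str]]:
    """Return (offset, file_type) for every header signature found in data."""
    hits = []
    for name, magic in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(magic, pos + 1)
    # Sort by offset so hits appear in on-disk order.
    return sorted(hits)
```

Each hit is only a candidate start of a file. Short signatures such as `%PDF` can also appear inside unrelated data, which is why the later validation step matters.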
Carving can work even when:
- There are no filenames
- No directory structure exists
- Timestamps are missing
- File allocation tables are destroyed
Header and Footer Based Carving
Most file carving is done using header and footer analysis.
Header
A header is a characteristic sequence of bytes (often called a magic number) that appears at the start of a file.
Examples:
- JPEG: FF D8 FF
- PNG: 89 50 4E 47 0D 0A 1A 0A
- PDF: %PDF
- ZIP: 50 4B 03 04
Footer
A footer marks the end of the file.
Examples:
- JPEG ends with: FF D9
- PDF ends with: %%EOF
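When both a header and a footer are found and the file is stored contiguously, the data between them can usually be extracted intact. Footer bytes can also occur inside a file (for example, in a JPEG's embedded thumbnail), so the result should still be validated.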
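As a rough sketch of header and footer carving, the following Python function extracts JPEG candidates from a raw buffer using the signatures listed above. The function name `carve_jpegs` and the `max_size` cap are assumptions made for illustration. The sketch assumes each file is contiguous and simply stops at the first footer after each header.

```python
from typing import List

JPEG_HEADER = bytes.fromhex("FFD8FF")
JPEG_FOOTER = bytes.fromhex("FFD9")


def carve_jpegs(data: bytes, max_size: int = 20 * 1024 * 1024) -> List[bytes]:
    """Carve JPEG candidates: each header up to the first footer after it."""
    carved = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        # Look for a footer no further than max_size bytes from the header.
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), pos + max_size)
        if end == -1:
            # No footer within the size cap: skip this header.
            pos = data.find(JPEG_HEADER, pos + 1)
            continue
        end += len(JPEG_FOOTER)
        carved.append(data[pos:end])
        # Resume the search after the carved candidate.
        pos = data.find(JPEG_HEADER, end)
    return carved
```

Because `FF D9` can occur before the true end of the image, a candidate produced this way may be truncated. Opening it with an image decoder is a reasonable follow-up check.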
Header-Only Carving
Some files do not have a clear footer.
In these cases, carvers use:
- Expected file size
- Internal structures
- Statistical patterns
- Block boundaries
For example, many video formats lack fixed footers.
The algorithm guesses boundaries based on:
- Consistent block sequences
- Container metadata
- File-type-specific rules
This makes header-only carving more complex and less accurate.
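One concrete case of using an expected file size is the BMP format, which has no footer but stores the total file size in its header (a little-endian 32-bit value right after the `BM` signature). The sketch below relies on that field. The function name `carve_bmps`, the `max_size` cap, and the plausibility checks are illustrative assumptions, and the file is assumed to be contiguous.

```python
import struct
from typing import List


def carve_bmps(data: bytes, max_size: int = 50 * 1024 * 1024) -> List[bytes]:
    """Carve BMP candidates using the file size stored in the BMP header."""
    carved = []
    pos = data.find(b"BM")
    while pos != -1:
        if pos + 14 <= len(data):
            # Bytes 2-5: file size, 6-9: reserved, 10-13: pixel data offset.
            size, reserved, pixel_offset = struct.unpack_from("<III", data, pos + 2)
            # "BM" is only two bytes, so reject implausible headers.
            plausible = reserved == 0 and 14 <= pixel_offset < size <= max_size
            if plausible and pos + size <= len(data):
                carved.append(data[pos:pos + size])
                pos = data.find(b"BM", pos + size)
                continue
        pos = data.find(b"BM", pos + 1)
    return carved
```

A two-byte signature matches often by chance, so the header field checks do most of the filtering here. Formats without any size field need the structural or statistical approaches described in this section.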
Fragmentation Issues in File Carving
File carving works best when files are stored contiguously.
However, modern file systems often store files in fragments.
Challenges include:
- Carved files may be incomplete
- Reconstructed files may mix different fragments
- Large videos, archives, and databases are harder to carve
- Fragmentation reduces accuracy significantly
Advanced carving tools attempt to rebuild fragmented files using:
- File structure heuristics
- Statistical analysis
- Entropy analysis
- Machine learning models
- Known file format specifications
But success is not guaranteed.
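Entropy analysis is the easiest of these to illustrate. Compressed or encrypted data tends toward 8 bits of entropy per byte, while text and sparse data score much lower, so a sharp change in entropy between neighboring blocks can hint that a block belongs to a different file. The following is a minimal sketch; the function names and the 4096-byte block size are assumptions, not values taken from any specific tool.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_profile(data: bytes, block_size: int = 4096) -> List[float]:
    """Entropy of each consecutive block, in on-disk order."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Entropy alone cannot tell two compressed files apart, so in practice it would be combined with the structural checks listed above.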
Types of File Carving Techniques
1. Header–Footer Carving
Uses both header and footer to identify file boundaries.
Most accurate method for well-defined formats.
2. Header-Only Carving
Used when footers are missing.
The carver predicts the end of the file using format rules.
3. Content-Based Carving
Uses file structure instead of signatures.
Examples:
- JPEG segments
- ZIP directory structure
- MP4 box structure (atoms)
This is useful when signatures are damaged.
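To illustrate structure-based carving, the sketch below walks the top-level boxes of an MP4 file. Each box begins with a 4-byte big-endian size and a 4-byte type; a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the data. The function name `walk_mp4_boxes` and the sanity checks are assumptions chosen for this example.

```python
import struct
from typing import List, Tuple


def walk_mp4_boxes(data: bytes, start: int = 0) -> List[Tuple[str, int, int]]:
    """Return (type, offset, size) for consecutive top-level boxes.

    Walking stops at the first box that fails a sanity check.
    """
    boxes = []
    pos = start
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header_len = 16
        elif size == 0:
            # The box extends to the end of the available data.
            size = len(data) - pos
        # Sanity checks: printable four-character type, size within bounds.
        if not all(32 <= b < 127 for b in box_type):
            break
        if size < header_len or pos + size > len(data):
            break
        boxes.append((box_type.decode("ascii"), pos, size))
        pos += size
    return boxes
```

The end of the last valid box gives an estimate of where the file ends, which is useful because MP4 has no footer. On a raw disk image a size of 0 only says "until the end of the data", so that case would need a separate size limit.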
4. Statistical Carving
Analyzes byte patterns to detect file types.
Often used for text files, logs, and unknown formats.
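A very simple statistical test for text is the share of printable bytes in a block. The sketch below is only an illustration: the function names and the 0.95 threshold are assumptions, and it handles ASCII only, so UTF-8 text with many multi-byte characters would score lower.

```python
def printable_ratio(block: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, LF, or CR."""
    if not block:
        return 0.0
    printable = sum(1 for b in block if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(block)


def looks_like_text(block: bytes, threshold: float = 0.95) -> bool:
    """Guess whether a block holds plain text (threshold is an assumption)."""
    return printable_ratio(block) >= threshold
```

Runs of consecutive blocks that pass this test can then be grouped and exported as a candidate text or log file.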
5. Fragment Reconstruction Carving
Advanced methods to reassemble broken files using:
- Sequence matching
- Graph-based analysis
- Similarity scoring
- Machine learning techniques
Often used in large-scale forensic labs.
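The methods listed above are too involved for a short example, but a simpler relative, known in the literature as bifragment gap carving, shows the general idea: when a file is split into exactly two fragments with foreign blocks in between, try removing each possible run of blocks and keep the first result that validates. The sketch below uses PNG because every PNG chunk carries a CRC, which makes validation cheap. The function names, the 512-byte block size, and the assumption that the region starts on a block boundary are all illustrative.

```python
import struct
import zlib
from typing import Optional

PNG_MAGIC = bytes.fromhex("89504E470D0A1A0A")


def is_valid_png(blob: bytes) -> bool:
    """Check the PNG signature and every chunk CRC, ending exactly at IEND."""
    if not blob.startswith(PNG_MAGIC):
        return False
    pos = len(PNG_MAGIC)
    while pos + 12 <= len(blob):
        (length,) = struct.unpack_from(">I", blob, pos)
        end = pos + 12 + length  # length field + type + data + CRC
        if end > len(blob):
            return False
        chunk_type = blob[pos + 4:pos + 8]
        body = blob[pos + 4:pos + 8 + length]  # type + data, covered by the CRC
        (crc,) = struct.unpack_from(">I", blob, pos + 8 + length)
        if zlib.crc32(body) != crc:
            return False
        if chunk_type == b"IEND":
            return end == len(blob)
        pos = end
    return False


def gap_carve(region: bytes, block_size: int = 512) -> Optional[bytes]:
    """Try to repair a two-fragment PNG by removing one run of foreign blocks.

    region is assumed to start at the PNG header and end at the end of IEND.
    """
    if is_valid_png(region):
        return region
    for gap_start in range(block_size, len(region), block_size):
        for gap_end in range(gap_start + block_size, len(region), block_size):
            candidate = region[:gap_start] + region[gap_end:]
            if is_valid_png(candidate):
                return candidate
    return None
```

The search is quadratic in the number of blocks, so it is only practical for small regions, and it cannot help when a file is split into three or more fragments.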
Common File Types Suitable for Carving
- Images (JPEG, PNG, BMP, GIF)
- Documents (PDF, DOCX, TXT)
- Archives (ZIP, RAR)
- Videos (MP4, AVI)
- Audio files (MP3, WAV)
- Databases (SQLite)
Images and PDFs carve very well; fragmented videos carve poorly.
Tools Used for File Carving
Popular carving tools include:
- PhotoRec
- Scalpel
- Foremost
- Autopsy (carving modules)
- Sleuth Kit (blkls, icat; typically used to extract unallocated space that is then passed to a carver)
- X-Ways Forensics
- bulk_extractor
- FTK and EnCase (built-in carving engines)
These tools automate scanning, extraction, and reconstruction.
When File Carving Is Most Effective
Carving works best when:
- Files are small
- File system is simple (like FAT32)
- Disk has low fragmentation
- Metadata is missing but content is intact
- Media is USB, SD card, or old HDD
It becomes harder when:
- SSDs are used (TRIM tells the drive which blocks are no longer in use, and the drive may erase them soon after deletion)
- Encryption is active (for example, APFS encryption)
- High fragmentation exists
- Partial overwrites have occurred
Limitations of File Carving
- No filenames or original paths
- Missing timestamps
- Fragmented files may not recover fully
- Similar file headers can cause false positives
- Sensitive to overwritten blocks
- Cannot recover encrypted files without keys
- SSD TRIM makes many deletions unrecoverable
Despite limitations, carving often recovers crucial evidence not available through standard recovery.
Summary
File carving is a powerful forensic technique used to recover files when metadata is missing, file systems are corrupted, or partitions are damaged. It works by identifying file signatures and reconstructing files directly from raw disk data. While fragmentation and modern storage technologies introduce challenges, carving remains an essential skill for forensic investigators and frequently leads to the recovery of critical evidence.