JPEG Forensics in Forensically
In this brave new world of alternative facts the people need the tools to tell true from false.
Well, either that, or I was just playing with JPEG encoding and some of that crossed over into my little web-based photo forensics tool in the form of some new tools. ;)
JPEG Comments
The JPEG file format contains a section for comments marked by 0xFFFE (COM). These exist in addition to the usual Exif, IPTC and XMP data. In some cases they contain interesting information that is either not available in the other metadata or has been stripped from it.
For instance, images from Wikipedia contain a link back to the image:
File source: https://commons.wikimedia.org/wiki/File:...
Older versions of Photoshop also seem to leave a JPEG comment:
File written by Adobe Photoshop 4.0
Some versions of libgd (commonly used in PHP web applications) seem to leave comments indicating the version of the library used and the quality the image was saved at:
CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90
The JPEG Analysis in Forensically allows you to view these.
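If you want to poke at this outside of Forensically, here is a minimal Python sketch of how comments can be read. This is not Forensically's code; the function name `jpeg_comments` is made up for this example, and it only walks the segments in front of the image data, so a comment stored later in the file would be missed.

```python
import struct


def jpeg_comments(data: bytes) -> list[bytes]:
    """Return the payload of every COM (0xFFFE) segment before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the segment structure, give up
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker, skip it
            pos += 1
            continue
        if marker in (0xD9, 0xDA):  # EOI or SOS: no more header segments
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:
            comments.append(data[pos + 4:pos + 2 + length])
        pos += 2 + length
    return comments
```

Calling `jpeg_comments(open("photo.jpg", "rb").read())` (with `photo.jpg` standing in for your own file) returns the raw comment bytes, which you can then decode as text.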
Quantization Tables
This is probably the most interesting bit of information revealed by this new tool in Forensically.
A basic understanding of how JPEG works helps in understanding this tool, so I will try to give you some intuition using the noble art of hand waving.
If you already understand JPEG you should probably skip over this gross oversimplification.
JPEG is in general a lossy image compression format. It achieves good compression rates by discarding some of the information contained in the original image.
For this compression the image is divided into 8x8 pixel blocks. Rather than storing the values of the 64 pixels in a block directly, JPEG stores how strongly each of 64 fixed “patterns” is present in the block (these 64 weights are the coefficients). If the patterns are chosen in the right way, this transform is still essentially lossless (except for rounding errors), meaning you can get back the original image by combining the patterns.
JPEG Patterns
JPEG DCT Coefficients by Devcore (Public Domain)
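For the curious, the hand waving above can be sketched in a few lines of Python. The patterns are the basis functions of the discrete cosine transform (DCT). The code below is a slow, unoptimized illustration and not what a real encoder does: it skips colour handling and the subtraction of 128 that JPEG applies first, and the gradient block is made-up sample data.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def _scale(k: int) -> float:
    # Normalisation that makes the transform orthonormal, and thus invertible.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(i: int, k: int) -> float:
    # Value of the k-th cosine pattern at sample position i.
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2D DCT: 64 pixel values -> 64 pattern weights (coefficients)."""
    return [
        [
            _scale(u) * _scale(v) * sum(
                block[x][y] * _basis(x, u) * _basis(y, v)
                for x in range(N) for y in range(N)
            )
            for v in range(N)
        ]
        for u in range(N)
    ]


def idct2(coeffs):
    """Inverse 2D DCT: combine the weighted patterns back into pixel values."""
    return [
        [
            sum(
                _scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                for u in range(N) for v in range(N)
            )
            for y in range(N)
        ]
        for x in range(N)
    ]


# A simple gradient block (hypothetical pixel values).
block = [[8 * x + y for y in range(N)] for x in range(N)]
restored = idct2(dct2(block))
max_error = max(
    abs(block[x][y] - restored[x][y]) for x in range(N) for y in range(N)
)
```

Here `max_error` ends up being a tiny floating point rounding error, which is the "essentially lossless" part. The loss only comes in with the next step.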
Now that the image is expressed in terms of these patterns JPEG can selectively discard some of the detail in the image.
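How much information is discarded for each pattern is defined in a set of tables stored inside each JPEG image. These tables are called quantization tables. Each table holds one divisor per pattern: the larger the divisor, the more detail of that pattern is thrown away.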
Example quantization table for quality 95
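As a rough sketch of what "discarding" means here: every coefficient is divided by its table entry and rounded, and the decoder multiplies it back. The coefficient and table values below are made up for illustration and only cover four of the 64 entries.

```python
import math


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [math.floor(c / q + 0.5) for c, q in zip(coeffs, table)]


def dequantize(levels, table):
    """What the decoder does: multiply the stored integers back up."""
    return [level * q for level, q in zip(levels, table)]


coeffs = [-415.0, 30.0, 7.0, 1.0]   # hypothetical DCT coefficients
gentle = [16, 11, 10, 16]           # smaller divisors keep more detail
harsh = [32, 22, 20, 32]            # larger divisors discard more

after_gentle = dequantize(quantize(coeffs, gentle), gentle)
after_harsh = dequantize(quantize(coeffs, harsh), harsh)
```

With the gentle table the values come back as `[-416, 33, 10, 0]`, with the harsh one as `[-416, 22, 0, 0]`: the small coefficients are simply gone.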
The JPEG standard suggests example tables, and most encoders derive the table for a given quality value (1-99) by scaling them. As it turns out, not everyone is using these same tables and quality values.
This is good for us as it means that by looking at the quantization tables used in a JPEG image we can learn something about the device that created the JPEG image.
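To look at the tables yourself: they are stored in DQT (0xFFDB) segments, with the 64 values written in zigzag order rather than row by row. Here is a small Python sketch that decodes the payload of one such segment (the bytes after the marker and length field). The function names are my own for this example and it does no more validation than a length check.

```python
import struct


def zigzag_positions(n: int = 8):
    """(row, col) positions of an n x n block in JPEG zigzag scan order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (diagonal, row if diagonal % 2 else col)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def parse_dqt(payload: bytes) -> dict[int, list[list[int]]]:
    """Decode a DQT payload into {table id: 8x8 table in row/column order}."""
    order = zigzag_positions()
    tables = {}
    pos = 0
    while pos < len(payload):
        # High nibble: precision (0 = 8 bit, 1 = 16 bit). Low nibble: table id.
        precision, table_id = payload[pos] >> 4, payload[pos] & 0x0F
        pos += 1
        size = 64 * (2 if precision else 1)
        raw = payload[pos:pos + size]
        if len(raw) < size:
            raise ValueError("truncated DQT segment")
        values = struct.unpack(">64H" if precision else "64B", raw)
        table = [[0] * 8 for _ in range(8)]
        for value, (row, col) in zip(values, order):
            table[row][col] = value
        tables[table_id] = table
        pos += size
    return tables
```

A single segment can carry several tables, which is why the function loops and returns a dictionary. Table 0 is usually the one used for brightness (luminance).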
Identifying manipulated images using JPEG quantization tables
Most computer software and internet services use the standard quantization tables. A very notable exception to this rule is Adobe software, namely Photoshop. This means that we can detect images that were last saved with Photoshop just by looking at their quantization tables.
Many digital camera manufacturers also have their own secret sauce for creating quantization tables. This means that by comparing the quantization tables of images taken with the same type of camera and settings, we can tell whether an image could have been created by that camera.
Automatic identification of quantization tables
Forensically currently automatically identifies quantization tables that have been created according to the standard. In that case it will display “Standard JPEG Table Quality=95”.
It also automatically recognizes some of the quantization tables used by Photoshop. In this case it will display “Photoshop quality=85”.
I’m missing a complete set of sample images for older Photoshop versions using the 0-12 quality scale. If you happen to have one and would be willing to share it, please let me know.
If the quantization table is not recognized, it will output “Non Standard JPEG Table, closest quality=82” or “Unknown Table”.
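Here is a rough Python sketch of how such an identification could work for the standard tables. It uses the example luminance table from Annex K of the JPEG standard and the quality scaling of the IJG libjpeg library. Two things are assumptions on my part: that "closest" means the smallest sum of absolute differences, and the function names. It does not cover the Photoshop tables or the “Unknown Table” case.

```python
# Example luminance table from Annex K of the JPEG standard, row by row.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def standard_table(quality: int) -> list[int]:
    """Scale the base table the way IJG libjpeg does for a given quality."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Entries are clamped to 1..255 so they fit a baseline (8 bit) table.
    return [max(1, min(255, (q * scale + 50) // 100)) for q in BASE_LUMA]


def describe_table(table: list[int]) -> str:
    """Label a 64-entry luminance table in the style of the messages above."""
    best_quality, best_distance = 0, None
    for quality in range(1, 101):
        candidate = standard_table(quality)
        # Assumed distance measure: sum of absolute differences.
        distance = sum(abs(a - b) for a, b in zip(table, candidate))
        if best_distance is None or distance < best_distance:
            best_quality, best_distance = quality, distance
    if best_distance == 0:
        return f"Standard JPEG Table Quality={best_quality}"
    return f"Non Standard JPEG Table, closest quality={best_quality}"
```

Quality 50 reproduces the base table unchanged, and the first row at quality 95 comes out as 2, 1, 1, 2, 2, 4, 5, 6.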
Summary
JPEG images contain tables that specify how the image was compressed. Different software and devices use different quantization tables, so by looking at them we can learn something about the device or software that saved the image.
Additional Resources
- Presentation Using JPEG Quantization Tables to Identify Imagery Processed by Software by Jesse Kornblum
- Paper Using JPEG Quantization Tables to Identify Imagery Processed by Software by Jesse Kornblum
- Digital Image Ballistics from JPEG Quantization by Hany Farid
Structural Analysis
In addition to the quantization tables, the order of the different sections (markers) of a JPEG image also reveals details about its creation. In short, images that were created in the same way should in general have the same structure. If they don’t, it’s an indication that the image may have been tampered with.
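A minimal Python sketch of this idea: list the markers in the order they appear and compare that list between two images. The name table only covers a handful of common markers, the function name is made up for this example, and it stops at the start of the image data (SOS).

```python
import struct

# A few common marker codes; anything else is shown as a hex value.
MARKER_NAMES = {
    0xC0: "SOF0", 0xC2: "SOF2", 0xC4: "DHT", 0xDB: "DQT",
    0xDD: "DRI", 0xDA: "SOS", 0xD9: "EOI", 0xFE: "COM",
}


def marker_sequence(data: bytes) -> list[str]:
    """Return the names of the header segments in file order, up to SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    names = ["SOI"]
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if 0xE0 <= marker <= 0xEF:
            names.append(f"APP{marker - 0xE0}")
        else:
            names.append(MARKER_NAMES.get(marker, f"0x{marker:02X}"))
        if marker in (0xDA, 0xD9):
            break  # compressed image data (or the end of the file) follows
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return names
```

Two files from the same camera and settings would then be expected to give the same list, so `marker_sequence(a) != marker_sequence(b)` is a hint worth a closer look, not proof of anything.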
String Extraction
Sometimes images contain (meta) data in odd places. A simple way to find these is to scan the image for sequences of sensible characters. A traditional tool to do this is the strings program in Unix-like operating systems.
For example, I’ve found images edited with Lightroom that contained a complete XML description of all the edits done to the image, hidden in the XMP metadata.
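The basic idea behind strings fits in a few lines of Python. This sketch only looks for runs of printable ASCII characters and uses a minimum length of 4, which is also the default of the Unix tool. How Forensically implements it may well differ.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return every run of at least min_length printable ASCII characters."""
    # 0x20 (space) to 0x7e (~) is the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Running it over the raw bytes of an image gives a lot of noise from the compressed data, so raising `min_length` to 8 or more usually makes the output easier to scan.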
Facebook Metadata
When using this tool on an image downloaded from Facebook, one will often find a string like
FBMD01000a9...
From what I can tell, this string is present in images that are uploaded via the web interface. A quick Google search does not reveal much about its contents, but its presence is a good indicator that an image came from Facebook.
I might add a ‘Facebook detector’ that looks for the presence and structure of these fields in the future.
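A first stab at the presence check could look like the Python sketch below. It only checks for presence, not structure. The assumption that the marker is followed by hexadecimal digits is a guess based purely on the visible start of the sample above, and the function name is made up.

```python
import re

# Assumption: "FBMD" is followed by a run of hexadecimal digits.
FBMD_PATTERN = re.compile(rb"FBMD[0-9a-fA-F]+")


def find_fbmd(data: bytes) -> list[str]:
    """Return every FBMD-style string found in the raw bytes of a file."""
    return [match.group().decode("ascii") for match in FBMD_PATTERN.finditer(data)]
```

An empty list means no such string was found, which says nothing about where the image came from.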
Poke around using these new tools and see what you can find! :)