In this brave new world of alternative facts the people need the tools to tell
true from false.
Well, either that, or maybe I was just playing with JPEG encoding and some of that crossed
over into my little web-based photo forensics tool in the form of some new tools. ;)
The JPEG file format contains a section for comments marked by 0xFFFE (COM).
These exist in addition to the usual Exif, IPTC and XMP data.
In some cases they can contain interesting information that is either not available
in the other metadata or has been stripped.
For instance, images from Wikipedia contain a link back to the image:
File source: https://commons.wikimedia.org/wiki/File:...
Older versions of Photoshop also seem to leave a JPEG comment:
File written by Adobe Photoshop 4.0
Some versions of libgd (commonly used in PHP web applications) seem to leave
comments indicating the version of the library used and the quality the image was saved at:
CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90
The JPEG Analysis in Forensically allows you to view these.
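If you want to poke at these comments outside of Forensically, here is a minimal Python sketch of how they could be read. It is not the code Forensically uses: the function name `jpeg_comments` is made up for this example, it assumes a well-formed file, and it only looks at the segments before the start of the compressed image data (SOS), which is where comments usually live.

```python
def jpeg_comments(data: bytes) -> list:
    """Return the text of all COM (0xFFFE) segments found before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker, skip it
            pos += 1
            continue
        # The 2-byte big-endian length counts itself but not the marker bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xFE:  # COM segment
            comments.append(payload.decode("latin-1"))
        if marker == 0xDA:  # SOS: compressed image data follows, stop here
            break
        pos += 2 + length
    return comments


# Usage (the file name is a placeholder):
# with open("photo.jpg", "rb") as f:
#     print(jpeg_comments(f.read()))
```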
The quantization tables are probably the most interesting bit of information revealed by this new tool.
A basic understanding of how JPEG works can help in understanding this tool, so I will try to give you some intuition using the noble art of hand waving.
If you already understand JPEG you should probably skip over this gross oversimplification.
JPEG is in general a lossy image compression format. It achieves good compression rates
by discarding some of the information contained in the original image.
For this compression the image is divided into 8x8 pixel blocks.
Rather than storing the individual pixel values for each of the 64 pixels in the block directly, JPEG saves how much the block is like each of 64 fixed “patterns”. These 64 weights are called coefficients.
If these patterns are chosen in the right way this
transform is still essentially lossless (except for rounding errors),
meaning you can get back the original image by combining these patterns.
JPEG DCT Coefficients by Devcore (Public Domain)
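To make the hand waving slightly more concrete: the patterns are the basis functions of the 8x8 discrete cosine transform (DCT). The sketch below is a deliberately slow, straightforward Python version of the transform and its inverse. The function names are made up for this example, and real encoders also shift the pixel values by 128 first and use much faster algorithms.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def _alpha(u):
    # Scaling factor that makes the transform orthonormal.
    return math.sqrt(0.5) if u == 0 else 1.0


def _pattern(u, x):
    # One-dimensional cosine pattern number u, sampled at position x.
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Express an 8x8 block as 64 coefficients (how much of each pattern it contains)."""
    return [[0.25 * _alpha(u) * _alpha(v) * sum(
                block[y][x] * _pattern(u, x) * _pattern(v, y)
                for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]


def idct2(coeffs):
    """Rebuild the 8x8 block by adding up the weighted patterns."""
    return [[0.25 * sum(
                _alpha(u) * _alpha(v) * coeffs[v][u] * _pattern(u, x) * _pattern(v, y)
                for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]


# A simple gradient block survives the round trip once the result is rounded.
block = [[x * 8 + y * 4 for x in range(N)] for y in range(N)]
coeffs = dct2(block)
restored = [[round(value) for value in row] for row in idct2(coeffs)]
```

The compression only becomes lossy in the next step, when the coefficients are divided by the entries of a quantization table and rounded.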
Now that the image is expressed in terms of these patterns JPEG can selectively discard some of the detail in the image.
How much information about which pattern is discarded is defined in a set of tables that is stored inside of each JPEG image.
These tables are called quantization tables.
Example quantization table for quality 95
The JPEG standard contains example tables, and the widely used IJG libjpeg library scales them according to a quality value (1-100). As it turns out, not everyone is using these same tables and quality values.
This is good for us as it means that by looking at the quantization tables used in a JPEG image we can learn something about the device that created the JPEG image.
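As a rough illustration of what "standard" tables look like, the sketch below scales the example luminance table from Annex K of the JPEG standard the way IJG libjpeg does, and searches for the quality whose table is closest to a given one. This is an assumption about how such a match could be done, not a description of Forensically's code; the function names are made up. Tables read from a file are stored in zigzag order and would need to be reordered before comparing.

```python
# Example luminance quantization table from Annex K of the JPEG standard,
# in natural (row by row) order. libjpeg uses it as its base table.
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaled_table(quality, base=BASE_LUMINANCE):
    """Scale a base table like IJG libjpeg does for a quality of 1..100."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Baseline JPEG limits table entries to the range 1..255.
    return [max(1, min(255, (value * scale + 50) // 100)) for value in base]


def closest_quality(table, base=BASE_LUMINANCE):
    """Return (quality, total absolute difference) of the best matching standard table."""
    def distance(quality):
        return sum(abs(a - b) for a, b in zip(table, scaled_table(quality, base)))

    best = min(range(1, 101), key=distance)
    return best, distance(best)
```

A difference of 0 means the table is exactly a standard one; anything else is a non-standard table with a "closest quality".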
Identifying manipulated images using JPEG quantization tables
Most computer software and internet services use the standard quantization tables. The most notable exceptions to this rule are Adobe products, namely Photoshop. This means that we can detect images that were last saved using Photoshop just by looking at their quantization tables.
Many digital camera manufacturers also have their own secret sauce for creating quantization tables.
This means that by comparing the quantization tables of different images taken with the same type of camera and settings, we can identify whether an image was potentially created by that camera or not.
Automatic identification of quantization tables
Forensically currently automatically identifies quantization tables that have been created according to the standard. In that case it will display
Standard JPEG Table Quality=95.
It does also automatically recognize some of the quantization tables used by Photoshop.
In this case it will display
I’m missing a complete set of sample images for older Photoshop versions using the 0-12 quality scale. If you happen to have one and would be willing to share it, please let me know.
If the quantization table is not recognized it will output
Non Standard JPEG Table, closest quality=82 or
JPEG images contain tables that specify how the image was compressed. Different software and devices use different quantization tables, so by looking at the quantization tables we can learn something about the device or software that saved the image.
In addition to the quantization tables, the order of the different sections (markers) of a JPEG image also reveals details about its creation. In short, images that were created in the same way should in general have the same structure. If they don’t, it’s an indication that the image may have been tampered with.
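A simple way to compare the structure of two files is to list their markers in order. The following Python sketch does that for the header part of a file (everything up to the start of the compressed data). The function name and the small table of marker names are made up for this example, and it assumes a well-formed file.

```python
MARKER_NAMES = {
    0xC0: "SOF0", 0xC2: "SOF2", 0xC4: "DHT", 0xDA: "SOS",
    0xDB: "DQT", 0xDD: "DRI", 0xFE: "COM",
}


def marker_sequence(data: bytes) -> list:
    """List the markers of a JPEG file in order, from SOI up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    names = ["SOI"]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker, skip it
            pos += 1
            continue
        if 0xE0 <= marker <= 0xEF:
            names.append(f"APP{marker - 0xE0}")
        else:
            names.append(MARKER_NAMES.get(marker, f"0x{marker:02X}"))
        if marker == 0xDA:  # compressed image data follows, stop here
            break
        # The segment length counts its own two bytes but not the marker.
        pos += 2 + int.from_bytes(data[pos + 2:pos + 4], "big")
    return names
```

Two images saved by the same software with the same settings would be expected to produce the same list, so `marker_sequence(a) != marker_sequence(b)` is a cheap first hint that something differs.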
Sometimes images contain (meta) data in odd places.
A simple way to find these is to scan the image for sequences of sensible characters. A traditional tool to do this is the strings program in Unix-like operating systems.
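For reference, the core of such a scan fits in a few lines of Python. This is only a sketch of the idea, not the implementation used in Forensically. The function name is made up, and the minimum length of 4 mirrors the default of the `strings` program.

```python
import re


def find_strings(data: bytes, min_length: int = 4) -> list:
    """Return all runs of printable ASCII characters that are at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Usage (the file name is a placeholder):
# with open("photo.jpg", "rb") as f:
#     for text in find_strings(f.read()):
#         print(text)
```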
For example, I’ve found images edited with Lightroom that contained a complete XML description of all the edits done to the image hidden in the XMP metadata.
When using this tool on an image downloaded from Facebook one will often find a string like
From what I can tell this string is present in images that are uploaded via the web interface.
A quick Google search does not reveal much about its contents. But its presence is a good indicator that an image came from Facebook.
I might add a ‘Facebook detector’ that looks for the presence & structure of these fields in the future.
Poke around using these new tools and see what you can find! :)
As mentioned earlier I have been playing around with Principal Component Analysis (PCA)
for photo forensics. The results of this have now landed in my Photo Forensics Tool.
In essence PCA offers a different perspective on the data which allows us to find outliers more easily.
For instance colors that just don’t quite fit into the image will often be more apparent
when looking at the principal components of an image.
Compression artifacts do also tend to be far more visible, especially in the
second and third principal components. Now before you fall asleep, let me give you an example.
This is a photo that I recently took:
To the naked eye this photo does not show any clear signs of manipulation.
Let’s see what we can find by looking at the principal components.
First Principal Component
Still nothing suspicious, let’s check the second one:
Second Principal Component
And indeed this is where I have removed an insect flying in front of the lens
using the inpainting algorithm (content aware fill in Photoshop speak) provided by G’MIC.
If you are interested Pat David has a nice tutorial on how to use this
in the GIMP.
Resistance to Compression
This technique does still work with more heavily compressed images.
To illustrate this I have run the same analysis I did above on the smaller & more compressed
version of the photo used in this article rather than the original.
As you can clearly see the anomaly caused by the manipulation is still present
and quite clear but not as clear as when analyzing a less compressed version of the image.
You can also see that the PCA is quite good at revealing the artifacts caused by (re)compression.
If you found this interesting you should consider reading my article Black and White Conversion using PCA
which introduces a tool which applies the very same techniques to create beautiful black and white conversions of photographs.
If you want another image to play with try the one in this
by Neal Krawetz is interesting. It can be quite revealing. :)
While experimenting with Black and White Conversion using PCA
I also investigated dithering algorithms and played with those.
I found that Stucki Dithering would yield rather pleasant results.
So I created a little application for just that:
Photo by Tuncay (CC BY)
I hope you enjoy playing with it. :)
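For the curious, here is a minimal Python sketch of Stucki dithering on a grayscale image given as a list of rows with values from 0 to 255. It is not the code of the application above: the function name is made up, and details such as the fixed threshold and the left-to-right scan order are assumptions of this example. Each pixel is rounded to black or white, and the rounding error is spread over twelve neighbouring pixels that have not been processed yet.

```python
# (dx, dy, weight) of the Stucki error diffusion kernel; the weights sum to 42.
STUCKI_KERNEL = [
    (1, 0, 8), (2, 0, 4),
    (-2, 1, 2), (-1, 1, 4), (0, 1, 8), (1, 1, 4), (2, 1, 2),
    (-2, 2, 1), (-1, 2, 2), (0, 2, 4), (1, 2, 2), (2, 2, 1),
]


def stucki_dither(image, threshold=128):
    """Convert a grayscale image (rows of 0..255 values) to pure black (0) and white (255)."""
    height, width = len(image), len(image[0])
    work = [[float(value) for value in row] for row in image]  # copy that collects the errors
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= threshold else 0
            result[y][x] = new
            error = old - new
            # Push the rounding error onto the neighbours that are still to come.
            for dx, dy, weight in STUCKI_KERNEL:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    work[ny][nx] += error * weight / 42
    return result
```

Error that would fall outside the image is simply dropped here, which is the simplest of several possible choices.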
I have been hacking on my photo forensics tool lately.
I found a
that suggested that performing PCA on the colors of an image might reveal interesting information hidden to the naked eye.
When implementing this feature I noticed that it did quite a good job of converting photos to black & white.
Thinking about this, it does actually make some sense: the first principal component
maximizes the variance of the values. So it should result in a wide tonal range in the resulting photograph.
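The variance argument can be checked on a toy example. The Python sketch below finds the first principal component of a list of RGB pixels with a simple power iteration and compares the variance of the resulting gray values with a plain luminosity conversion. It is an illustration under assumptions, not the tool's code: the function names, the Rec. 601 luminosity weights and the two sample colors are choices made for this example. The sign of a principal component is arbitrary, and the raw values would still need to be rescaled to 0..255 for display.

```python
import math


def first_principal_component(pixels):
    """Return (mean color, unit vector) for the direction of largest variance in RGB space."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    centered = [[p[c] - mean[c] for c in range(3)] for p in pixels]
    cov = [[sum(p[i] * p[j] for p in centered) / n for j in range(3)] for i in range(3)]
    axis = [1 / math.sqrt(3)] * 3  # arbitrary starting direction
    for _ in range(200):  # power iteration converges to the dominant eigenvector
        product = [sum(cov[i][j] * axis[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0:  # flat image or unlucky starting direction
            break
        axis = [x / norm for x in product]
    return mean, axis


def pca_gray(pixels):
    """Project every pixel onto the first principal component."""
    mean, axis = first_principal_component(pixels)
    return [sum((p[c] - mean[c]) * axis[c] for c in range(3)) for p in pixels]


def luminosity_gray(pixels):
    """Straight luminosity conversion using the Rec. 601 weights."""
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# A saturated orange and a saturated blue with nearly the same luminosity.
pixels = [(200, 60, 20)] * 3 + [(20, 100, 220)] * 3
pca_variance = variance(pca_gray(pixels))
luminosity_variance = variance(luminosity_gray(pixels))
```

In this example the two colors end up almost the same gray in the luminosity conversion, while the PCA conversion keeps them far apart.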
This led me to develop a tool to explore this idea in more detail.
This experimental tool is now available for you to play with:
To give you a quick example let’s start with one of my own photographs:
While the composition with so much empty space is debatable,
I find this photo a fairly good example of an image where a straight luminosity conversion fails.
This is because the really saturated colors in the sky look bright/intense even if the straight
up luminosity values do not suggest that.
Hover over it to see the results of a straight luminosity conversion instead.
In this case the PCA conversion does (in my opinion) a better job at reflecting the tonality in the sky.
I’d strongly suggest that you experiment with the tool yourself.
If you want a bit more detail on how exactly the conversions work please have a look at the help page.
Do I think this is the best technique for black and white conversions? No.
You will always be able to get better results by manually tweaking the conversion
to fit your vision. Is it an interesting result? I’d say so.
I’ve just released version 1.0 of smartcrop.js.
mainly for generating good thumbnails.
The new version includes much better support for node.js by dropping the canvas dependency
as well as support for face detection by providing annotations.
The API has been cleaned up a little bit and is now using Promises.
Another little takeaway from this release is that I should set up CI even for my little
open source projects. I came to this conclusion after having created a
dependency mess using
npm link locally which led to everything working
fine on my machine but the published modules being broken. I’ve already set
up travis for smartcrop-gm,
More of my projects are likely to follow.
Back in 2010 I did a little experiment with normal mapping and the canvas element.
The normal mapping technique makes it possible to create interactive lighting effects based on textures.
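The basic idea fits in a few lines. In the Python sketch below, each texel of a normal map stores a surface direction in its RGB channels, and the brightness of a pixel is the dot product between that direction and the direction towards the light. This is plain Lambertian (diffuse) shading with made-up function names, shown only to illustrate the principle; it is not the shader code or the material model of the demo.

```python
import math


def decode_normal(rgb):
    """Map an RGB texel (0..255 per channel) to a unit length surface normal."""
    normal = [channel / 255 * 2 - 1 for channel in rgb]
    length = math.sqrt(sum(c * c for c in normal))
    return [c / length for c in normal]


def lambert(rgb, light_direction):
    """Diffuse brightness (0..1) of a texel; light_direction points towards the light."""
    normal = decode_normal(rgb)
    length = math.sqrt(sum(c * c for c in light_direction))
    light = [c / length for c in light_direction]
    return max(0.0, sum(n * l for n, l in zip(normal, light)))


# The typical bluish color of a normal map, (128, 128, 255), is a surface facing the viewer.
facing_light = lambert((128, 128, 255), (0, 0, 1))
lit_from_side = lambert((128, 128, 255), (1, 0, 0))
```

Moving the light and recomputing this value for every pixel is what makes the lighting interactive.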
Looking for an excuse to dive into computer graphics again,
I created a new version of this demo.
This time I used WebGL Shaders and a more advanced physically inspired material
system based on publications by
I also implemented FXAA 3.11 to smooth out some of the aliasing produced by the normal maps.
The results of this experiment are now available as a library called normalmap.js. Check out the demos.
It’s a lot faster and better looking than the old canvas version. Maybe you will find a use for it. :)
You can view larger and sharper versions of these demos on 29a.ch/sandbox/2016/normalmap.js/.
You can get the source code for this library on GitHub.
I plan to create some more demos as well as tutorials on creating normal maps in the future.
I migrated this website to HTTPS using certificates by Let’s Encrypt.
This has several benefits. The one I’m most excited about is being able to use
Service Workers to provide offline support
for my little apps.
Let’s Encrypt is an amazing new certificate authority which
allows you to install an SSL/TLS certificate automatically and for free.
This means getting a certificate installed on your server can be as little work as running a command on your server:
The service is currently still in beta, but as you can hopefully see, the certificates it produces are working just fine.
I encourage you to give it a try.
If anything on this website got broken because of the move to HTTPS, please let me know!
Seven years ago I wrote a piece of software called Play it Slowly.
It allows the user to change the speed and pitch of an audio file independently.
This is useful for example for practicing an instrument or doing transcriptions.
Now I created a new web based version of Play it Slowly called TimeStretch Player.
Open TimeStretch Player
It features (in my opinion) much better audio quality for larger time stretches as well as a 3.14159 × cooler looking user interface. But please note that this is beta stage software; it’s still far from perfect and polished.
How the time stretching works
The time stretching algorithm that I use is based on a Phase Vocoder with some simple improvements.
It works by cutting the audio input into overlapping chunks, decomposing those into their individual components using an FFT, adjusting their phase and then resynthesizing them with a different overlap.
Suppose we have a simple wave like this:
We can cut it into overlapping pieces like this:
By changing the overlap between the pieces we can do the time stretching:
This messes up the phases so we need to fix them up:
Now we can just combine the pieces to get a longer version of the original:
In practice things are of course a bit more complicated and there are a lot of compromises to be made. ;)
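The steps above can be sketched in plain Python. This is a textbook style phase vocoder and not the code behind TimeStretch Player: it has none of the improvements mentioned, it uses a slow naive DFT instead of an FFT to stay free of dependencies, and the function names, frame size and hop sizes are assumptions of this example.

```python
import cmath
import math


def dft(samples):
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def time_stretch(signal, stretch, frame_size=64, synthesis_hop=16):
    """Make signal roughly `stretch` times longer without changing its pitch."""
    if len(signal) < frame_size:
        raise ValueError("signal is shorter than one frame")
    # Cut the input into overlapping pieces spaced analysis_hop samples apart.
    analysis_hop = max(1, round(synthesis_hop / stretch))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_size) for t in range(frame_size)]
    starts = range(0, len(signal) - frame_size + 1, analysis_hop)
    length = (len(starts) - 1) * synthesis_hop + frame_size
    output, weight = [0.0] * length, [0.0] * length
    previous_phase, synthesis_phase = None, None
    for i, start in enumerate(starts):
        spectrum = dft([signal[start + t] * window[t] for t in range(frame_size)])
        phase = [cmath.phase(c) for c in spectrum]
        if previous_phase is None:
            synthesis_phase = phase[:]
        else:
            for k in range(frame_size):
                bin_frequency = 2 * math.pi * k / frame_size  # radians per sample
                # How far the measured phase advance is from the one expected for this bin.
                deviation = phase[k] - previous_phase[k] - bin_frequency * analysis_hop
                deviation = (deviation + math.pi) % (2 * math.pi) - math.pi
                true_frequency = bin_frequency + deviation / analysis_hop
                # Fix up the phase for the new spacing between the pieces.
                synthesis_phase[k] += true_frequency * synthesis_hop
        previous_phase = phase
        fixed = [cmath.rect(abs(c), p) for c, p in zip(spectrum, synthesis_phase)]
        # Recombine the pieces with the new overlap (overlap-add).
        for t, value in enumerate(idft(fixed)):
            output[i * synthesis_hop + t] += value * window[t]
            weight[i * synthesis_hop + t] += window[t] ** 2
    return [o / w if w > 1e-6 else 0.0 for o, w in zip(output, weight)]
```

For example, `time_stretch(samples, 2.0)` returns a list of samples that is close to twice as long as the input, with the same frequencies in it.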
Much better explanation
If you want a more precise description of the phase vocoder technique than the handwaving above, I recommend reading the paper Improved phase vocoder time-scale modification of audio by Jean Laroche and Mark Dolson.
It explains the basic phase vocoder as well as some simple improvements.
If you do read it and are wondering what the eff the angle sign used in an expression like means:
It denotes the phase - in practice the argument of the complex number of the term.
I think it took me about an hour to figure that out with any certainty.
YAY mathematical notation.
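In code the angle is much less mysterious. A small Python illustration using the standard `cmath` module (the variable names are just for this example):

```python
import cmath
import math

z = complex(0, 2)           # the complex number 2i
magnitude = abs(z)          # |z|, the length of the number
angle = cmath.phase(z)      # the angle sign: the argument of z in radians

# Magnitude and angle together describe the same number.
rebuilt = cmath.rect(magnitude, angle)
```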
Pure CSS User Interface
I had some fun creating all the user interface elements using pure CSS; no images are used except for the logo.
It’s all gradients, borders and shadows.
Do I recommend this approach?
Not really, but it sure is a lot of fun!
Feel free to poke around with your browser’s dev tools to see how it was done.
While developing the time stretching functionality I also experimented with
some other features like a karaoke mode that can cancel individual parts of an audio file while keeping the rest of the stereo field intact.
This can be useful to remove or isolate parts of songs, for instance the vocals or a guitar solo.
However, the quality of the results was not comparable to the time stretching so I decided to remove the feature for now.
But you might get to see that in another app in the future. ;)
I might release the phase vocoder code in a standalone node library in the future but it needs a serious cleanup before that can happen.