29a.ch experiments by Jonas Wagner


Recent Articles

JPEG Forensics in Forensically

Written by Jonas Wagner

In this brave new world of alternative facts, people need tools to tell true from false.

screenshot

Well, either that or maybe I was just playing with JPEG encoding, and some of that crossed over into my little web-based photo forensics tool in the form of some new tools. ;)

JPEG Comments

The JPEG file format contains a section for comments marked by 0xFFFE (COM). These comments exist in addition to the usual Exif, IPTC and XMP data. In some cases they contain interesting information that is either not available in the other metadata or has been stripped from it.

For instance, images from Wikipedia contain a link back to the source file:

File source: https://commons.wikimedia.org/wiki/File:...

Older versions of Photoshop also seem to leave a JPEG comment:

File written by Adobe Photoshop 4.0

Some versions of libgd (commonly used in PHP web applications) seem to leave comments indicating the version of the library used and the quality the image was saved at:

CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90

The JPEG Analysis tool in Forensically allows you to view these comments.
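
If you are curious how these comments can be pulled out of a file, here is a minimal sketch in Node.js that walks the JPEG markers and collects the COM (0xFFFE) segments. This is just an illustration of the format, not the code Forensically uses.

    // Minimal sketch: extract JPEG comment (COM, 0xFFFE) segments from a file.
    const fs = require('fs');

    function extractJpegComments(path) {
      const buf = fs.readFileSync(path);
      const comments = [];
      let i = 2; // skip the SOI marker (0xFFD8)
      while (i + 4 <= buf.length) {
        if (buf[i] !== 0xff) break;             // every marker starts with 0xFF
        const marker = buf[i + 1];
        if (marker === 0xd9) break;             // EOI: end of image
        if (marker === 0x01 || (marker >= 0xd0 && marker <= 0xd7)) {
          i += 2;                               // markers without a length field
          continue;
        }
        if (marker === 0xda) break;             // SOS: entropy-coded data follows
        const length = buf.readUInt16BE(i + 2); // includes the two length bytes
        if (marker === 0xfe) {                  // COM segment
          comments.push(buf.slice(i + 4, i + 2 + length).toString('latin1'));
        }
        i += 2 + length;
      }
      return comments;
    }

    console.log(extractJpegComments(process.argv[2]));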

Quantization Tables

This is probably the most interesting bit of information revealed by this new tool in Forensically.

A basic understanding of how JPEG works can help in understanding this tool, so I will try to give you some intuition using the noble art of hand waving.

If you already understand JPEG you should probably skip over this gross oversimplification.

JPEG is in general a lossy image compression format. It achieves good compression rates by discarding some of the information contained in the original image.

For this compression the image is divided into 8x8 pixel blocks. Rather than storing the individual values of the 64 pixels in each block directly, JPEG stores how much the block resembles each of 64 fixed “patterns” (the coefficients). If these patterns are chosen in the right way, this transform is still essentially lossless (except for rounding errors), meaning you can get back the original image by combining the patterns.

JPEG Patterns


JPEG DCT Coefficients by Devcore (Public Domain)

Now that the image is expressed in terms of these patterns JPEG can selectively discard some of the detail in the image.

How much information about each pattern is discarded is defined in a set of tables stored inside each JPEG image. These tables are called quantization tables.

Example quantization table for quality 95

example jpeg quantization table
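
In code, applying such a table to one block boils down to a division and a rounding step per coefficient. A minimal sketch, assuming coefficients and qtable are flat arrays of 64 numbers (hypothetical inputs):

    // How a quantization table discards detail: each DCT coefficient is divided
    // by the matching table entry and rounded. Large entries mean coarse rounding,
    // i.e. more of that pattern's detail is thrown away.
    function quantize(coefficients, qtable) {
      return coefficients.map((c, i) => Math.round(c / qtable[i]));
    }

    function dequantize(quantized, qtable) {
      // The decoder can only multiply back; the rounding error is lost for good.
      return quantized.map((q, i) => q * qtable[i]);
    }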

There are some suggestions in the JPEG standard on how to calculate these tables for a given quality value (1-99). As it turns out, not everyone uses the same tables and quality values.

This is good for us: it means that by looking at the quantization tables used in a JPEG image, we can learn something about the device that created it.
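
For reference, the scaling used by the IJG libjpeg library, which derives a concrete table from the example tables in the standard for a given quality, looks roughly like this. This is a sketch based on my reading of the libjpeg source, not Forensically's detection code.

    // Scale one of the example quantization tables from the JPEG standard (Annex K)
    // to a given quality setting, following the IJG libjpeg convention.
    function scaleQuantTable(baseTable, quality) {
      const scale = quality < 50 ? 5000 / quality : 200 - quality * 2;
      return baseTable.map(v => {
        const q = Math.floor((v * scale + 50) / 100);
        return Math.min(255, Math.max(1, q)); // clamp to the valid range
      });
    }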

Identifying manipulated images using JPEG quantization tables

Most computer software and internet services use the standard quantization tables. A very notable exception to this rule is Adobe's software, namely Photoshop. This means that we can detect images that were last saved with Photoshop just by looking at their quantization tables.

Many digital camera manufacturers also have their own secret sauce for creating quantization tables. This means that by comparing the quantization tables of different images taken with the same type of camera and settings, we can check whether an image could have been created by that camera or not.

Automatic identification of quantization tables

Forensically currently automatically identifies quantization tables that have been created according to the standard. In that case it will display Standard JPEG Table Quality=95.

It also automatically recognizes some of the quantization tables used by Photoshop. In this case it will display Photoshop quality=85.

I’m missing a complete set of sample images for older Photoshop versions using the 0-12 quality scale. If you happen to have one and would be willing to share it, please let me know.

If the quantization table is not recognized it will output Non Standard JPEG Table, closest quality=82 or Unknown Table.

Summary

JPEG images contain tables that specify how the image was compressed. Different software and devices use different quantization tables, so by looking at these tables we can learn something about the device or software that last saved the image.


Structural Analysis

In addition to the quantization tables, the order of the different sections (markers) of a JPEG image also reveals details about its creation. In short, images that were created in the same way should in general have the same structure. If they don’t, it’s an indication that the image may have been tampered with.
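
A rough sketch of what such a structural fingerprint could look like: the sequence of markers up to the start of the image data. This reuses the same marker walk as the comment extractor above and is again just an illustration.

    // List the marker sequence of a JPEG file (up to the start of scan).
    const fs = require('fs');

    function markerSequence(path) {
      const buf = fs.readFileSync(path);
      const markers = [];
      let i = 2; // skip SOI (0xFFD8)
      while (i + 4 <= buf.length && buf[i] === 0xff) {
        const marker = buf[i + 1];
        markers.push('FF' + marker.toString(16).toUpperCase().padStart(2, '0'));
        if (marker === 0xda) break;        // SOS: compressed image data starts here
        i += 2 + buf.readUInt16BE(i + 2);  // skip over the segment payload
      }
      // Two images produced the same way should typically yield the same sequence.
      return markers.join(' ');
    }

    console.log(markerSequence(process.argv[2]));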

String Extraction

Sometimes images contain (meta) data in odd places. A simple way to find it is to scan the image for sequences of sensible characters. A traditional tool for this is the strings program found on Unix-like operating systems.

For example, I’ve found images edited with Lightroom that contained a complete XML description of all the edits done to the image, hidden in the XMP metadata.
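
Something similar to strings can be done in a few lines of JavaScript. A rough sketch; the minimum run length of 8 printable characters is an arbitrary choice to cut down on noise:

    // Find runs of printable ASCII characters in a file, strings(1)-style.
    const fs = require('fs');

    function extractStrings(path, minLength = 8) {
      const buf = fs.readFileSync(path);
      const results = [];
      let current = '';
      for (const byte of buf) {
        if (byte >= 0x20 && byte <= 0x7e) { // printable ASCII
          current += String.fromCharCode(byte);
        } else {
          if (current.length >= minLength) results.push(current);
          current = '';
        }
      }
      if (current.length >= minLength) results.push(current);
      return results;
    }

    console.log(extractStrings(process.argv[2]));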

Facebook Metadata

When using this tool on an image downloaded from Facebook, one will often find a string like

FBMD01000a9...

From what I can tell this string is present in images that are uploaded via the web interface. A quick Google search does not reveal much about its contents. But its presence is a good indicator that an image came from Facebook.

I might add a ‘Facebook detector’ that looks for the presence & structure of these fields in the future.

Poke around using these new tools and see what you can find! :)


Principal Component Analysis for Photo Forensics

Written by Jonas Wagner

As mentioned earlier I have been playing around with Principal Component Analysis (PCA) for photo forensics. The results of this have now landed in my Photo Forensics Tool.

In essence, PCA offers a different perspective on the data, which allows us to find outliers more easily. For instance, colors that just don’t quite fit into the image will often be more apparent when looking at its principal components. Compression artifacts also tend to be far more visible, especially in the second and third principal components. Now, before you fall asleep, let me give you an example.
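
If you are curious what this looks like in code, here is a rough sketch of PCA on the colors of an image using power iteration. The pixels input is assumed to be an array of [r, g, b] values, for example read from a canvas; this is an illustration, not the code Forensically actually uses.

    // Compute the principal components of the colors of an image and project
    // every pixel onto them, yielding the three "component images".
    function colorPCA(pixels) {
      const n = pixels.length;

      // 1. center the data around the mean color
      const mean = [0, 0, 0];
      for (const p of pixels) for (let i = 0; i < 3; i++) mean[i] += p[i] / n;
      const centered = pixels.map(p => p.map((v, i) => v - mean[i]));

      // 2. 3x3 covariance matrix of the color channels
      const cov = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
      for (const p of centered)
        for (let i = 0; i < 3; i++)
          for (let j = 0; j < 3; j++) cov[i][j] += p[i] * p[j] / n;

      // 3. find the principal directions by power iteration with deflation
      const components = [];
      let m = cov.map(row => row.slice());
      for (let c = 0; c < 3; c++) {
        let v = [1, 1, 1];
        for (let iter = 0; iter < 100; iter++) {
          const w = m.map(row => row[0] * v[0] + row[1] * v[1] + row[2] * v[2]);
          const len = Math.hypot(...w);
          v = w.map(x => x / len);
        }
        const lambda = v.reduce((s, vi, i) =>
          s + vi * m[i].reduce((t, mij, j) => t + mij * v[j], 0), 0);
        components.push(v);
        // deflate: remove the found component so the next pass finds the next one
        m = m.map((row, i) => row.map((x, j) => x - lambda * v[i] * v[j]));
      }

      // 4. project each pixel onto each component
      const project = (p, v) => p[0] * v[0] + p[1] * v[1] + p[2] * v[2];
      return components.map(v => centered.map(p => project(p, v)));
    }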

Example

This is a photo that I recently took:

Photo of a Sunset

To the naked eye this photo does not show any clear signs of manipulation. Let’s see what we can find by looking at the principal components.

First Principal Component

First principal component

Still nothing suspicious. Let’s check the second one:

Second Principal Component

Second principal component

And indeed this is where I have removed an insect flying in front of the lens using the inpainting algorithm (content aware fill in Photoshop speak) provided by G’MIC. If you are interested, Pat David has a nice tutorial on how to use this in GIMP.

Resistance to Compression

Second principal component

This technique still works with more heavily compressed images. To illustrate this, I have run the same analysis as above on the smaller, more heavily compressed version of the photo used in this article rather than on the original. As you can see, the anomaly caused by the manipulation is still clearly visible, although not as clear as when analyzing a less compressed version of the image. You can also see that PCA is quite good at revealing the artifacts caused by (re)compression.

Further Reading

If you found this interesting, you should consider reading my article Black and White Conversion using PCA, which introduces a tool that applies the very same technique to create beautiful black and white conversions of photographs.

If you want another image to play with, try the one in this post by Neal Krawetz. It can be quite revealing. :)


Black and White Conversion using PCA

Written by Jonas Wagner

I have been hacking on my photo forensics tool lately. I found a few references suggesting that performing PCA on the colors of an image might reveal interesting information hidden from the naked eye. When implementing this feature I noticed that it did quite a good job at black & white conversions of photos. Thinking about it, this actually makes some sense: the first principal component maximizes the variance of the values, so it should result in a wide tonal range in the resulting photograph. This led me to develop a tool to explore this idea in more detail.

This experimental tool is now available for you to play with:
29a.ch/sandbox/2016/monochrome-photo-pca/.

To give you a quick example, let’s start with one of my own photographs:

original photo

While the composition with so much empty space is debatable, I find this photo a fairly good example of an image where a straight luminosity conversion fails. This is because the really saturated colors in the sky look bright and intense even if their plain luminosity values do not suggest that.

pca conversion
Hover it to see the results of a straight luminosity conversion instead.

In this case the PCA conversion does (in my opinion) a better job at reflecting the tonality in the sky. I’d strongly suggest that you experiment with the tool yourself.

If you want a bit more detail on how exactly the conversions work, please have a look at the help page.
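
For a rough idea of the core of such a conversion: once the direction of the first principal component of the image colors is known (for example from a power iteration over the 3x3 color covariance matrix), the monochrome value of a pixel is simply its projection onto that direction, rescaled to the display range. A sketch with hypothetical pixels and pc1 inputs; the actual tool does more than this.

    // Project each [r, g, b] pixel onto the first principal component direction
    // pc1 and rescale the result to 0..255 to get the monochrome values.
    function toMonochrome(pixels, pc1) {
      const values = pixels.map(p => p[0] * pc1[0] + p[1] * pc1[1] + p[2] * pc1[2]);
      let min = Infinity, max = -Infinity;
      for (const v of values) {
        if (v < min) min = v;
        if (v > max) max = v;
      }
      return values.map(v => Math.round(255 * (v - min) / (max - min)));
    }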

Do I think this is the best technique for black and white conversions? No. You will always be able to get better results by manually tweaking the conversion to fit your vision. Is it an interesting result? I’d say so.


Smartcrop.js 1.0

Written by Jonas Wagner

smartcrop illustration

I’ve just released version 1.0 of smartcrop.js. Smartcrop.js is a JavaScript library I wrote to perform smart image cropping, mainly for generating good thumbnails. The new version includes much better support for Node.js by dropping the canvas dependency (via smartcrop-gm and smartcrop-sharp) as well as support for face detection by providing annotations. The API has been cleaned up a little bit and is now using Promises.
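
For a quick idea of what using it looks like, going from memory of the README, so please check the project documentation for the exact details:

    // img could be an HTMLImageElement in the browser.
    smartcrop.crop(img, { width: 100, height: 100 }).then(function(result) {
      // result.topCrop contains x, y, width and height of the suggested crop
      console.log(result.topCrop);
    });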

Another little takeaway from this release is that I should set up CI even for my little open source projects. I came to this conclusion after having created a dependency mess using npm link locally, which led to everything working fine on my machine but the published modules being broken. I’ve already set up Travis CI for smartcrop-gm, smartcrop-sharp and simplex-noise.js. More of my projects are likely to follow.


Normalmap.js JavaScript Lighting Effects

Written by Jonas Wagner

Logo

Back in 2010 I did a little experiment with normal mapping and the canvas element. The normal mapping technique makes it possible to create interactive lighting effects based on textures. Looking for an excuse to dive into computer graphics again, I created a new version of this demo.

This time I used WebGL shaders and a more advanced, physically inspired material system based on publications by Epic Games and Disney. I also implemented FXAA 3.11 to smooth out some of the aliasing produced by the normal maps. The results of this experiment are now available as a library called normalmap.js. Check out the demos. It’s a lot faster and better looking than the old canvas version. Maybe you’ll find a use for it. :)
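
To give an idea of what normal mapping boils down to, here is a toy sketch of just the diffuse term: each pixel’s normal is read from the normal map and dotted with the light direction. This is nothing like the physically inspired shading the library actually uses.

    // Shade a normal map with a single directional light (Lambert's cosine law).
    // normalMap is assumed to be ImageData with the normal encoded in the RGB
    // channels, lightDir a normalized [x, y, z] direction.
    function shade(normalMap, lightDir) {
      const out = new ImageData(normalMap.width, normalMap.height);
      const [lx, ly, lz] = lightDir;
      for (let i = 0; i < normalMap.data.length; i += 4) {
        // decode the normal from 0..255 into -1..1
        const nx = normalMap.data[i] / 127.5 - 1;
        const ny = normalMap.data[i + 1] / 127.5 - 1;
        const nz = normalMap.data[i + 2] / 127.5 - 1;
        const diffuse = Math.max(0, nx * lx + ny * ly + nz * lz);
        const v = Math.round(diffuse * 255);
        out.data[i] = out.data[i + 1] = out.data[i + 2] = v;
        out.data[i + 3] = 255;
      }
      return out;
    }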

Demos

You can view larger and sharper versions of these demos on 29a.ch/sandbox/2016/normalmap.js/.

You can get the source code for this library on github.

Future

I plan to create some more demos as well as tutorials on creating normalmaps in the future.


Let's encrypt 29a.ch

Written by Jonas Wagner

I migrated this website to HTTPS using certificates from Let’s Encrypt. This has several benefits. The one I’m most excited about is being able to use Service Workers to provide offline support for my little apps.

Let's Encrypt

Let’s Encrypt is an amazing new certificate authority which allows you to install an SSL/TLS certificate automatically and for free. This means getting a certificate installed can be as little work as running a single command on your server:

./letsencrypt-auto --apache

The service is currently still in beta, but as you can hopefully see, the certificates it issues work just fine. I encourage you to give it a try.

If anything on this website got broken because of the move to HTTPS, please let me know!


Time Stretching Audio in JavaScript

Written by Jonas Wagner

Seven years ago I wrote a piece of software called Play it Slowly. It allows the user to change the speed and pitch of an audio file independently. This is useful, for example, for practicing an instrument or doing transcriptions.

Now I have created a new web-based version of Play it Slowly called TimeStretch Player. It’s written in JavaScript and uses the Web Audio API.

Screenshot

Open TimeStretch Player

It features (in my opinion) much better audio quality for larger time stretches as well as a 3.14159 × cooler looking user interface. But please note that this is beta-stage software; it’s still far from perfect and polished.

How the time stretching works

The time stretching algorithm that I use is based on a Phase Vocoder with some simple improvements.

It works by cutting the audio input into overlapping chunks, decomposing those into their individual frequency components using an FFT, adjusting their phases and then resynthesizing them with a different overlap.

Oversimplified Explanation

Suppose we have a simple wave like this:

wave1

We can cut it into overlapping pieces like this:

wave1

By changing the overlap between the pieces we can do the time stretching:

wave1

This messes up the phases so we need to fix them up:

wave1

Now we can just combine the pieces to get a longer version of the original:

wave1

In practice things are of course a bit more complicated and there are a lot of compromises to be made. ;)
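
In code, the core loop might look roughly like this. It is heavily simplified and assumes hypothetical fft and ifft helpers (for example from an FFT library) that convert a windowed frame into magnitudes and phases and back; it is not the code TimeStretch Player actually uses.

    // Stretch `samples` by `stretchFactor` using a bare-bones phase vocoder.
    function stretch(samples, stretchFactor, frameSize = 2048) {
      const analysisHop = frameSize / 4;
      const synthesisHop = Math.round(analysisHop * stretchFactor);
      const window = hann(frameSize);
      const output = new Float32Array(Math.ceil(samples.length * stretchFactor) + frameSize);
      const lastPhase = new Float32Array(frameSize);
      const outPhase = new Float32Array(frameSize);

      for (let pos = 0; pos + frameSize <= samples.length; pos += analysisHop) {
        const frame = samples.slice(pos, pos + frameSize).map((s, i) => s * window[i]);
        const { magnitudes, phases } = fft(frame); // hypothetical helper

        for (let k = 0; k < frameSize; k++) {
          const omega = 2 * Math.PI * k / frameSize;                // bin frequency
          // deviation of the measured phase advance from the expected one
          let delta = phases[k] - lastPhase[k] - omega * analysisHop;
          delta -= 2 * Math.PI * Math.round(delta / (2 * Math.PI)); // wrap to [-pi, pi]
          const trueFreq = omega + delta / analysisHop;
          outPhase[k] += trueFreq * synthesisHop;                   // advance at the new hop
          lastPhase[k] = phases[k];
        }

        const resynth = ifft(magnitudes, outPhase); // hypothetical helper
        const outPos = Math.round(pos * stretchFactor);
        for (let i = 0; i < frameSize; i++) output[outPos + i] += resynth[i] * window[i];
      }
      return output;
    }

    function hann(n) {
      return Float32Array.from({ length: n }, (_, i) => 0.5 - 0.5 * Math.cos(2 * Math.PI * i / n));
    }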

Much better explanation

If you want a more precise description of the phase vocoder technique than the hand waving above, I recommend reading the paper Improved phase vocoder time-scale modification of audio by Jean Laroche and Mark Dolson. It explains the basic phase vocoder as well as some simple improvements.

If you do read it and are wondering what the eff the angle sign in an expression like ∠X(t_u^a, Ω_k) means: it denotes the phase, in practice the argument of the complex number of the term. I think it took me about an hour to figure that out with any certainty. YAY mathematical notation.

Pure CSS User Interface

I had some fun creating all the user interface elements using pure CSS; no images are used except for the logo. It’s all gradients, borders and shadows. Do I recommend this approach? Not really, but it sure is a lot of fun! Feel free to poke around with your browser’s dev tools to see how it was done.

Future Features

While developing the time stretching functionality I also experimented with some other features, like a karaoke mode that can cancel individual parts of an audio file while keeping the rest of the stereo field intact. This can be useful to remove or isolate parts of songs, for instance the vocals or a guitar solo. However, the quality of the results was not comparable to that of the time stretching, so I decided to remove the feature for now. But you might get to see it in another app in the future. ;)

Library Release

I might release the phase vocoder code as a standalone Node.js library in the future, but it needs a serious cleanup before that can happen.

