29a.ch by Jonas Wagner


Recent Articles

I made myself a guitar tuner

I first learned about the Fourier transform at about the same time I started to play guitar. So obviously the first idea that came to my mind at the time was to build a tuner for my new guitar. While I eventually got it to work, the accuracy was terrible, so it never ended up seeing the light of day.

Screenshot Try the chromatic tuner

Fast forward a bit over a decade. It’s 2020 and we are fighting a global pandemic using social distancing. I obviously tried to find ways to directly address the issue with code, but in the end there is only so much that can be done on that front, and there are already a lot of really clever people working on it.

So rather than coming up with another well-intentioned but flawed design for a mechanical ventilator, I decided to revisit this old project of mine. :)

So what’s in it?

The tuner is built with a whole lot of web tech: getUserMedia and WebAudio to access the audio data from the microphone, as well as web workers to make it a bit faster. Framework-wise I used React and TypeScript.
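
To give you an idea of the plumbing, here is a minimal sketch (not necessarily how the tuner itself is wired up) of pulling raw samples from the microphone with getUserMedia and an AnalyserNode:

async function captureAudio(onSamples) {
  // Ask for the microphone and route it into the Web Audio graph.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const context = new AudioContext();
  const source = context.createMediaStreamSource(stream);
  const analyser = context.createAnalyser();
  analyser.fftSize = 8192; // window size in samples
  source.connect(analyser);

  const buffer = new Float32Array(analyser.fftSize);
  function tick() {
    // Copy the most recent time-domain samples and hand them to the pitch detector.
    analyser.getFloatTimeDomainData(buffer);
    onSamples(buffer, context.sampleRate);
    requestAnimationFrame(tick);
  }
  tick();
}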

With that out of the way, the rest of the article will focus on the algorithm that makes the whole thing tick.

Disclaimer

In the following sections I will oversimplify a lot of things for the sake of accessibility and brevity.

If you already have a solid understanding of the subject, please excuse my oversimplifications.

If you don’t, keep in mind that there is a lot more to learn and understand than what I will touch on in this description.

If you want to go deeper I highly recommend reading the papers by Philip McLeod et al. mentioned at the end. They formed the basis of this tuner.

Going beyond the Fourier transform

Initially I decided to resume this project from where I had stopped years ago: doing a straight Fourier transform on the input and then selecting the first significant peak (by magnitude).

Fourier Transform of A played on guitar

Not quite up to speed with the Fourier transform? The video “But what is the Fourier Transform? A visual introduction” illustrates it beautifully.

The naive spectrum approach of course still works as badly as it did back then. In slightly oversimplified terms, the frequency resolution of the discrete short-time Fourier transform is the sample rate divided by the window size.

So taking a realistic sample rate of 48000 Hz and a (comparatively large) window size of 8192 samples we arrive at a frequency resolution of about 6 Hz.

The low E of a guitar in standard tuning is at ~82 Hz. Add 6 Hz and you are already past F.

We need at least 10x that to build something resembling a tuner. In practice we should aim for a resolution of approximately 1 cent, or about 100x the resolution we’d get from the straight Fourier transform approach.
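
A quick back-of-the-envelope calculation makes the gap obvious:

const sampleRate = 48000;
const windowSize = 8192;
const resolution = sampleRate / windowSize; // ≈ 5.86 Hz per bin

// One cent above the low E (82.41 Hz) is only about 0.05 Hz higher.
const lowE = 82.41;
const oneCent = lowE * (Math.pow(2, 1 / 1200) - 1); // ≈ 0.048 Hz

// Window size a plain FFT would need for ~1 cent resolution at the low E:
const neededWindow = sampleRate / oneCent; // ≈ 1'000'000 samples, over 20 s of audio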

There are ways to improve the accuracy of this approach a bit; in fact we’ll meet one of them a bit later on in a different context. For now let’s focus on something a bit simpler.

Auto correlation

Compared to the Fourier transform, autocorrelation is fairly simple to explain. In essence it’s a measure of how similar a signal is to a shifted version of itself. This nicely reflects the period, and therefore the frequency, of the signal we are trying to determine.

In a bit more concrete terms, for each lag it’s the sum of the products of the signal and a time-shifted version of the signal. In simplistic JavaScript that could look a bit like this:

function autoCorrelation(signal) {
  const output = [];
  for (let lag = 0; lag < signal.length; lag++) {
    // Correlate the signal with a copy of itself shifted by `lag` samples.
    output[lag] = 0;
    for (let i = 0; i + lag < signal.length; i++) {
      output[lag] += signal[i] * signal[i + lag];
    }
  }
  return output;
}

The result will look a bit like this: Auto Correlation Function of A played on guitar

Just by eyeballing it you can tell that it’s going to be easier to find the first significant peak of the autocorrelation compared to the spectrum yielded by the Fourier transform above. Our resolution also changed a bit: this time we are measuring the period of the signal, and our resolution is limited by its sample rate. To take the example above, the period of an 82 Hz signal is 48000/82 or about 585 samples. Being off by one sample we’d end up at 82.19 Hz. Not great, but at least it’s still an E. At higher frequencies things will start to look different of course, but for our purposes that’s a good point to start.

The actual algorithm used in the tuner is based on McLeod & Wyvill (2005), A smarter way to find pitch, but the straight autocorrelation above is enough to understand what’s going on.

Picking a peak

Now that we have the graph above we’ll need a robust way of determining the first significant peak in it, which hopefully will also be the perceived fundamental frequency of the tone we are analysing.

We’ll do this in two steps. First we will find all the peaks after the initial zero crossing. We can do this by just looping over the signal, keeping track of the highest value we’ve seen and its offset. Once the current value drops below 0 we add that maximum to the list of peaks and reset it.

From this list we’ll now pick the first peak which is bigger than the highest peak multiplied by some tolerance factor like 0.9.
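
A rough sketch of that peak picking, with some details glossed over, could look like this:

function pickPeak(ac, tolerance = 0.9) {
  const peaks = [];
  let bestValue = 0;
  let bestLag = -1;
  let pastFirstZeroCrossing = false;
  for (let lag = 1; lag < ac.length; lag++) {
    if (ac[lag] < 0) {
      // Dropped below zero: store the maximum of the lobe we just left and reset.
      if (pastFirstZeroCrossing && bestLag >= 0) {
        peaks.push({ lag: bestLag, value: bestValue });
        bestValue = 0;
        bestLag = -1;
      }
      pastFirstZeroCrossing = true;
    } else if (pastFirstZeroCrossing && ac[lag] > bestValue) {
      bestValue = ac[lag];
      bestLag = lag;
    }
  }
  if (bestLag >= 0) peaks.push({ lag: bestLag, value: bestValue });
  if (peaks.length === 0) return null;
  // Pick the first peak that comes close enough to the overall maximum.
  const highest = Math.max(...peaks.map((p) => p.value));
  return peaks.find((p) => p.value >= highest * tolerance);
}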

At this point we have a basic tuner. It’s not very robust, fast or accurate, but it should work.

Improving accuracy

The autocorrelation algorithm mentioned above is evaluated at discrete steps matching the samples of the audio input, which limits our accuracy. We can easily improve on this a bit by interpolating. I use parabolic interpolation in my tuner.

parabolic interpolation illustration

The implementation of this is also extremely simple:

function parabolicPeakInterpolation(a, b, c) {
  // a, b, c are the values just before, at and just after the discrete peak.
  // The return value is the fractional offset of the true peak relative to b,
  // roughly in the range of -0.5 to 0.5 samples.
  const denominator = a - 2 * b + c;
  if (denominator === 0) return 0;
  return (a - c) / denominator / 2;
}
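
Putting the pieces together (simplified): refine the integer lag of the chosen peak with the parabolic interpolation above and convert it into a frequency.

function lagToFrequency(ac, peakLag, sampleRate) {
  // peakLag is the index of the chosen autocorrelation peak (1 <= peakLag < ac.length - 1).
  const offset = parabolicPeakInterpolation(ac[peakLag - 1], ac[peakLag], ac[peakLag + 1]);
  const period = peakLag + offset; // fractional period in samples
  return sampleRate / period; // frequency in Hz
}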

Improving reliability

So far everything went smoothly. I had a reasonably accurate tuner as long as I fed it a clean signal (electric guitar straight into a nice interface).

For some reason I also wanted to get this to work with much dirtier signals from something like a smartphone microphone.

At this stage I spent quite a bit of time implementing and evaluating various noise reduction techniques like simple filters and variations on spectral subtraction. In the end their main benefit was in being able to reduce 50/60 Hz hum; the results were still miserable.

So after banging my head against the wall for a little while I embraced a bit of a paradigm shift and gave up on trying to find a magical filter that would give me a clean signal to feed the pitch detection algorithm.

Onset Locking

I now use the brief moment right after the note has been plucked to get a decent initial guess of the note being played. This is possible because the initial attack of the note is fairly loud, resulting in a decent signal-to-noise ratio.

I then use this initial guess to limit the window in which I look for the peak in the autocorrelation caused by the note, and combine the various measurements using a simple Kalman filter.

I named the scheme onset locking in my code, but I’m certain it’s not a new idea.
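
To illustrate the idea of combining noisy pitch measurements, here is a tiny scalar Kalman filter. The actual filter used in the tuner and its tuning may well differ.

class ScalarKalmanFilter {
  constructor(initialEstimate, initialVariance, processVariance, measurementVariance) {
    this.x = initialEstimate;     // current pitch estimate
    this.p = initialVariance;     // uncertainty of that estimate
    this.q = processVariance;     // how much the pitch is expected to drift between measurements
    this.r = measurementVariance; // how noisy a single measurement is
  }
  update(measurement) {
    // Predict: the pitch is assumed to be roughly constant, only the uncertainty grows.
    this.p += this.q;
    // Update: blend the prediction with the new measurement, weighted by the Kalman gain.
    const k = this.p / (this.p + this.r);
    this.x += k * (measurement - this.x);
    this.p *= 1 - k;
    return this.x;
  }
}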

Making it fast

I hope the O(n²) loop in the autocorrelation section made you cringe a bit. Don’t do it that way. Both basic autocorrelation and McLeod’s take on it (after applying a bit of basic algebra) can be accelerated using the fast Fourier transform.

Goodbye n squared, hello n log n. :)
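
For the curious, here is a sketch of autocorrelation via the FFT. The fft and ifft functions stand in for whatever FFT library you use; they are not part of the platform, and normalization conventions differ between libraries.

function autoCorrelationFFT(signal) {
  const n = signal.length;
  // Zero-pad to 2n to avoid the circular wrap-around of the FFT.
  const padded = new Float32Array(2 * n);
  padded.set(signal);
  const { re, im } = fft(padded); // complex spectrum
  for (let i = 0; i < re.length; i++) {
    // Multiply the spectrum by its complex conjugate -> power spectrum.
    re[i] = re[i] * re[i] + im[i] * im[i];
    im[i] = 0;
  }
  // Back to the time domain; the first n values match the naive loop
  // (up to the library's scaling factor).
  const result = ifft(re, im);
  return result.slice(0, n);
}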

Even with the relatively slow FFT implementation I’m using, the speed-up is between 10 and 100x. So the optimization is definitely worth doing in practice as well.

I’m also using web workers to get the calculations off the main thread and, while at it, to parallelize them.

The result is that the tuner runs fast enough even on my aging Galaxy S7.

What is left to do

Performance in noisy environments is still bad. Using the microphone of a MacBook the tuner barely works; if the fan spins up a bit too loudly it will fail completely.

I’d definitely like to improve this in the future but I also have the suspicion that it won’t be trivial, at least without making additional assumptions about the instrument being tuned.

Another front would be to add alternative tunings, and maybe even allow custom tunings. That should be relatively easy to do but I don’t currently have any use for it.

Further reading

McLeod, Philip & Wyvill, Geoff. (2005). A smarter way to find pitch.

McLeod, Philip (2008). Fast, Accurate Pitch Detection Tools for Music Analysis.


Lap Timer and Analyser for GoPro Videos

During the past two years I’ve had the occasional pleasure of riding my motorcycles on race tracks. While I’m still far away from being fast I really enjoy working on my riding.

This led me to consider different data recording and analysis solutions. A bit down the line I realized that the GoPro cameras I already owned actually contain a surprisingly good GPS unit recording data at 18 Hz. To make things even better, GoPro also documented their metadata format and even published a parser for it on GitHub.

I was very curious how far I could get with that data and started to play around with it. Many hours and experiments later I somehow ended up with my own analysis software and filters tuned for the camera data.

Screenshot: Try the lap analyser

It loads, processes and displays the data, all in a web browser.

Data Extraction and Filtering

The data extraction is performed using the gopro-telemetry library by Juan Irache.

For filtering the data I wrote a little library for Kalman filters in TypeScript. The Kalman Filter book by Roger Labbe helped a lot in learning about the subject. I highly recommend it if you want to dive into the topic yourself.

In addition to the obvious line plots of speed, acceleration and lean angles I also implemented a detailed map and what I call a G Map.

Lap Map

Lap Map

The lap map shows the line, speeds, braking points and g-forces in a spatial context. I especially like the shaded areas in the corners. They show the lateral (distance from the line) and total (color of the area) forces at play.

Looking at the map is a quick way to gauge how close to the limit one is potentially riding in each of the corners.

G-Map

G Map

The G-Map is a histogram of the g-forces acting on the vehicle over time. It can be used to gauge how close to the limit a rider is riding.

Assuming a perfect world where the vehicle has isotropic grip and sufficient power, and the rider is always operating it at the limit, it would trace out a perfect circle, resembling the circle of forces.

As you can see my example above is far away from that. It shows conservative riding, always staying within the 1 g that warm track tires can easily handle. It also shows that I still have a lot to learn with regards to consistency.

It also shows that the track in question has more right turns than left.

Future Plans

early draft

There is of course a lot more that can be done here.

The GoPro also has an accelerometer and gyro which could be integrated into the filters to yield more accurate results.

The additional data would yield the actual lean angle which could then be used in combination with the lateral acceleration to gauge the effectiveness of the body position/hangoff of the rider.

I also have another version of this running on a Raspberry Pi coupled to an external GPS receiver. This combination results in an all-in-one integrated data recording solution. One can simply connect to the WiFi hotspot of the little computer onboard the vehicle and view the most recent sessions.

It’s rather nice because it doesn’t require the camera to be running all the time and is quite simple to use. The drawback is that it requires fiddling together a bunch of hardware which I guess most people don’t want to deal with.

Disclaimer

There are two things that need to be said here.

First off, this is not a perfect solution, and even if it were, it couldn’t definitively answer the question of how close to the limit the rider is. Factors like the track surface, weight transfer and other shenanigans are not accounted for.

Secondly, this product and/or service is not affiliated with, endorsed by or in any way associated with GoPro Inc. or its products and services. GoPro, HERO and their respective logos are trademarks or registered trademarks of GoPro, Inc.


I made myself a noise generator

It’s been a while since the last release but I finally finished something again.

Noise tends to eject me from my focus and flow and sometimes noise canceling headphones just aren’t enough to prevent it. In those instances I often mask the remaining noise with less distracting pure noise.

There already are various tools for this purpose, so there isn’t really a strict need for another one. I just wanted to have some fun and build something that does exactly what I want and looks pretty while doing it. As a nice bonus it gave me an opportunity to play with some more recent web technologies.

screenshot

I don’t expect this to be useful to particularly many people other than myself but that’s why it’s a spare time project. :)

If you want to learn a bit more about it there is a little info on the noise generator help page.


Urban Astrophotography

The Milky Way core over Zurich.

One of the first lessons in astrophotography is that you had better find a dark place, far away from the lights of civilization, if you want to take good pictures of the night sky.

Wouldn’t it be beautiful if it was possible to photograph the Milky Way in the middle of a city?

I wanted to try.

Step by Step

I packed my camera onto my bike and rode into the night to take a few photos. This is what they looked like after I developed them using RawTherapee.

Straight out of camera

When you take a picture of the night sky in a city this is about what you will get. At least we can see Saturn and a few stars. Let’s try to peek through the haze.

The first step is to collect more light. The more light we capture with our camera the easier it will be to separate the photons coming from the nebulae in the galactic center from the noise. We can gather more light by capturing more photographs. The only problem is of course that the stars are moving.


The stars are moving

We can fix this problem by aligning the images based on the stars. I used Hugin for this job.


The earth slowly turning

The next step is to combine (stack) all of the images into one. The ground will look blurry because it moves but the stars will remain sharp. I used Siril for this task.

Sharp night sky with blurred ground

Now this is where the magic happens. We remove the ground and stars from the image and then blur it a lot.

Light pollution

All this image now contains is the light pollution. Let’s subtract1 it.

Night sky with light pollution removed

With all of the light pollution gone darkness remains.

Now we can amplify the faint light in the image, increase contrast and denoise.

Faint light amplified

Finally we add the recovered light back to one of the original images and apply some final tweaks.

Milkyway Core over Zurich

Why this is possible

This is possible for two main reasons. Light pollution is the result of light being scattered in the air (light bouncing off particles). Unlike, for instance, dense smoke, light pollution does not block the light from the glowing gas clouds of the Milky Way. This means that the signal is still there, just very weak compared to the city lights.

The other reason is that the light pollution, especially higher above the horizon, becomes more and more even. That’s the property that allows us to separate it from the more focused light of the stars and nebulae using a high-pass filter.
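
Expressed as code the core idea is tiny. This is only an illustration: gaussianBlur is a hypothetical helper, image is a flat array of grayscale values, and in practice these steps were done with image editing tools rather than code.

function removeLightPollution(image, blurRadius) {
  // A heavy blur keeps only the low-frequency light pollution
  // (ground and stars are assumed to have been masked out beforehand).
  const pollution = gaussianBlur(image, blurRadius);
  // Subtracting it acts as a high-pass filter; keep negative values instead of clipping them.
  return image.map((value, i) => value - pollution[i]);
}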

Settings & Equipment

In case you are curious about the equipment and settings used: Nikon D810, Samyang 24/1.4 @ 2.8, ISO 100, 9 pictures @ 20s, combined using winsorized sigma clipping.

Conclusions

The result is definitely noisy and not of the highest quality, but it still amazes me that this is even possible.

A consumer grade camera and free software can reveal the center of our home galaxy behind the bright haze of city lights, showing us our place in our galaxy and the universe beyond.

I’m curious how much farther I can push this technique with deliberately chosen framing, tweaked settings, more exposures and maybe a Didymium filter.

Further Reading

If you want to learn about astrophotography in general I recommend reading lonelyspeck.com. Ian is a much better writer than I will ever be and he has written a lot of great articles.

1: In practice you want to use grain extract/merge here since subtraction in most graphics software clips negative values to zero.


JPEG Forensics in Forensically

In this brave new world of alternative facts the people need the tools to tell true from false.

screenshot

Well either that or maybe I was just playing with JPEG encoding and some of that crossed over into my little web based photo forensics tool in the form of some new tools. ;)

JPEG Comments

The JPEG file format contains a section for comments marked by 0xFFFE (COM). These exist in addition to the usual Exif, IPTC and XMP data. In some cases they can contain interesting information that is either not available in the other meta data or has been stripped.

For instance, images from Wikipedia contain a link back to the image:

File source: https://commons.wikimedia.org/wiki/File:...

Older versions of Photoshop also seem to leave a JPEG comment:

File written by Adobe Photoshop 4.0

Some versions of libgd (commonly used in PHP web applications) seem to leave comments indicating the version of the library used and the quality the image was saved at:

CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90

The JPEG Analysis in Forensically allows you to view these.

Quantization Tables

This is probably the most interesting bit of information revealed by this new tool in Forensically.

A basic understanding of how JPEG works can help in understanding this tool so I will try to give you some intuition using the noble art of hand waving.

If you already understand JPEG you should probably skip over this gross oversimplification.

JPEG is in general a lossy image compression format. It achieves good compression rates by discarding some of the information contained in the original image.

For this compression the image is divided into 8x8 pixel blocks. Rather than storing the individual values of each of the 64 pixels in a block directly, JPEG stores how much the block is like each of 64 fixed “patterns” (the coefficients). If these patterns are chosen in the right way this transform is still essentially lossless (except for rounding errors), meaning you can get back the original image by combining these patterns.

JPEG Patterns

JPEG DCT Coefficients
JPEG DCT Coefficients by Devcore (Public Domain)

Now that the image is expressed in terms of these patterns JPEG can selectively discard some of the detail in the image.

How much information about which pattern is discarded is defined in a set of tables that is stored inside of each JPEG image. These tables are called quantization tables.

Example quantization table for quality 95

example jpeg quantization table

There are some suggestions in the JPEG standard on how to calculate these tables for a given quality value (1-99). As it turns out, not everyone is using the same tables and quality values.

This is good for us, as it means that by looking at the quantization tables used in a JPEG image we can learn something about the device that created it.
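
For reference, this is roughly how the IJG reference implementation (libjpeg) derives a quantization table from the example luminance table in the standard for a given quality value. Comparing a file’s tables against tables generated like this is one way to recognize a “standard” table.

// Example luminance quantization table from the JPEG standard (Annex K).
const STANDARD_LUMINANCE_TABLE = [
  16, 11, 10, 16, 24, 40, 51, 61,
  12, 12, 14, 19, 26, 58, 60, 55,
  14, 13, 16, 24, 40, 57, 69, 56,
  14, 17, 22, 29, 51, 87, 80, 62,
  18, 22, 37, 56, 68, 109, 103, 77,
  24, 35, 55, 64, 81, 104, 113, 92,
  49, 64, 78, 87, 103, 121, 120, 101,
  72, 92, 95, 98, 112, 100, 103, 99,
];

function scaleQuantizationTable(baseTable, quality) {
  // Quality 50 uses the base table as-is; lower qualities scale it up, higher ones down.
  const scale = quality < 50 ? 5000 / quality : 200 - quality * 2;
  return baseTable.map((value) =>
    Math.min(255, Math.max(1, Math.floor((value * scale + 50) / 100)))
  );
}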

Identifying manipulated images using JPEG quantization tables

Most computer software and internet services use the standard quantization tables. The most notable exception to this rule is Adobe products, namely Photoshop. This means that we can detect images that were last saved with Photoshop just by looking at their quantization tables.

Many digital camera manufacturers also have their own secret sauce for creating quantization tables. This means that by comparing the quantization tables between different images taken with the same type of camera and settings, we can identify whether an image could have been created by that camera or not.

Automatic identification of quantization tables

Forensically currently automatically identifies quantization tables that have been created according to the standard. In that case it will display Standard JPEG Table Quality=95.

It also automatically recognizes some of the quantization tables used by Photoshop. In this case it will display Photoshop quality=85.

I’m missing a complete set of sample images for older Photoshop versions using the 0-12 quality scale. If you happen to have one and would be willing to share it, please let me know.

If the quantization table is not recognized it will output Non Standard JPEG Table, closest quality=82 or Unknown Table.

Summary

JPEG images contain tables that specify how the image was compressed. Different software and devices use different quantization tables, so by looking at those tables we can learn something about the device or software that saved the image.


Structural Analysis

In addition to the quantization tables, the order of the different sections (markers) of a JPEG image also reveals details about its creation. In short, images that were created in the same way should in general have the same structure. If they don’t, it’s an indication that the image may have been tampered with.
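
A sketch of what such a structural scan can look like (Forensically’s actual implementation may differ): walk the JPEG segment markers and record their order.

function listJpegMarkers(bytes) {
  const markers = [];
  let offset = 2; // skip the SOI marker (0xFFD8) at the start of the file
  while (offset + 4 <= bytes.length) {
    if (bytes[offset] !== 0xff) break; // not at a marker, bail out
    const marker = bytes[offset + 1];
    markers.push('0xff' + marker.toString(16));
    if (marker === 0xda) break; // SOS: entropy-coded image data follows
    // Every other segment before SOS carries a 2 byte big-endian length (including itself).
    const length = (bytes[offset + 2] << 8) | bytes[offset + 3];
    offset += 2 + length;
  }
  return markers;
}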

String Extraction

Sometimes images contain (meta) data in odd places. A simple way to find these is to scan the image for sequences of sensible characters. A traditional tool to do this is the strings program in Unix-like operating systems.
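
A minimal JavaScript take on the same idea, scanning for runs of printable ASCII characters:

function extractStrings(bytes, minLength = 6) {
  const found = [];
  let current = '';
  for (const byte of bytes) {
    if (byte >= 0x20 && byte <= 0x7e) {
      current += String.fromCharCode(byte); // printable ASCII, keep collecting
    } else {
      if (current.length >= minLength) found.push(current);
      current = '';
    }
  }
  if (current.length >= minLength) found.push(current);
  return found;
}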

For example, I’ve found images edited with Lightroom that contained a complete XML description of all the edits done to the image, hidden in the XMP metadata.

Facebook Meta Data

When using this tool on an image downloaded from Facebook one will often find a string like

FBMD01000a9...

From what I can tell this string is present in images that are uploaded via the web interface. A quick Google search does not reveal much about its contents. But its presence is a good indicator that an image came from Facebook.

I might add a ‘Facebook detector’ that looks for the presence & structure of these fields in the future.

Poke around using these new tools and see what you can find! :)


Principal Component Analysis for Photo Forensics

As mentioned earlier I have been playing around with Principal Component Analysis (PCA) for photo forensics. The results of this have now landed in my Photo Forensics Tool.

In essence PCA offers a different perspective on the data, which allows us to find outliers more easily. For instance, colors that just don’t quite fit into the image will often be more apparent when looking at the principal components of an image. Compression artifacts also tend to be far more visible, especially in the second and third principal components. Now before you fall asleep, let me give you an example.

Example

This is a photo that I recently took:

Photo of a Sunset

To the naked eye this photo does not show any clear signs of manipulation. Let’s see what we can find by looking at the principal components.

First Principal Component

First principal component

Still nothing suspicious, let’s check the second one:

Second Principal Component

Second principal component

And indeed, this is where I removed an insect flying in front of the lens using the inpainting algorithm (content aware fill in Photoshop speak) provided by G’MIC. If you are interested, Pat David has a nice tutorial on how to use this in GIMP.

Resistance to Compression

Second principal component

This technique still works with more heavily compressed images. To illustrate this I ran the same analysis as above on the smaller and more compressed version of the photo used in this article, rather than on the original. As you can see, the anomaly caused by the manipulation is still present and quite clear, though not as clear as when analyzing a less compressed version of the image. You can also see that PCA is quite good at revealing the artifacts caused by (re)compression.

Further Reading

If you found this interesting you should consider reading my article Black and White Conversion using PCA which introduces a tool which applies the very same techniques to create beautiful black and white conversions of photographs.

If you want another image to play with, the one in this post by Neal Krawetz is interesting. It can be quite revealing. :)


Ditherlicious - 1 Bit Image Dithering

While experimenting with Black and White Conversion using PCA I also investigated dithering algorithms and played with them. I found that Stucki dithering yields rather pleasant results, so I created a little application for just that: Ditherlicious.

screenshot

Open Ditherlicious

I find that it works really nicely on high key photographs like this: example

Photo by Tuncay (CC BY)

I hope you enjoy playing with it. :)
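
In case you are curious how Stucki dithering works, here is a rough sketch of the error diffusion on a grayscale image (values 0-255). It is not necessarily the exact implementation behind Ditherlicious.

// Stucki error diffusion weights: [dx, dy, weight], divided by 42.
const STUCKI_KERNEL = [
  [1, 0, 8], [2, 0, 4],
  [-2, 1, 2], [-1, 1, 4], [0, 1, 8], [1, 1, 4], [2, 1, 2],
  [-2, 2, 1], [-1, 2, 2], [0, 2, 4], [1, 2, 2], [2, 2, 1],
];

function stuckiDither(gray, width, height) {
  const buffer = Float32Array.from(gray);
  const output = new Uint8Array(gray.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      const newValue = buffer[i] < 128 ? 0 : 255; // quantize to 1 bit
      const error = buffer[i] - newValue;
      output[i] = newValue;
      // Push the quantization error onto neighbours that haven't been visited yet.
      for (const [dx, dy, weight] of STUCKI_KERNEL) {
        const nx = x + dx;
        const ny = y + dy;
        if (nx >= 0 && nx < width && ny < height) {
          buffer[ny * width + nx] += (error * weight) / 42;
        }
      }
    }
  }
  return output;
}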


Black and White Conversion using PCA

I have been hacking on my photo forensics tool lately. I found a few references suggesting that performing PCA on the colors of an image might reveal interesting information hidden to the naked eye. When implementing this feature I noticed that it did quite a good job at black & white conversions of photos. Thinking about it, this actually makes some sense: the first principal component maximizes the variance of the values, so it should result in a wide tonal range in the resulting photograph. This led me to develop a tool to explore this idea in more detail.

This experimental tool is now available for you to play with:
29a.ch/sandbox/2016/monochrome-photo-pca/.
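
If you are wondering what the core of such a conversion can look like in code, here is a rough sketch that projects the pixels onto their first principal component. It illustrates the idea rather than the exact implementation used by the tool.

function firstPrincipalComponentGray(pixels) {
  // pixels: flat RGBA data, e.g. from canvas getImageData().data
  const n = pixels.length / 4;
  // Mean of each channel.
  let mr = 0, mg = 0, mb = 0;
  for (let i = 0; i < pixels.length; i += 4) {
    mr += pixels[i]; mg += pixels[i + 1]; mb += pixels[i + 2];
  }
  mr /= n; mg /= n; mb /= n;
  // 3x3 covariance matrix of the RGB values.
  let crr = 0, crg = 0, crb = 0, cgg = 0, cgb = 0, cbb = 0;
  for (let i = 0; i < pixels.length; i += 4) {
    const r = pixels[i] - mr, g = pixels[i + 1] - mg, b = pixels[i + 2] - mb;
    crr += r * r; crg += r * g; crb += r * b;
    cgg += g * g; cgb += g * b; cbb += b * b;
  }
  const cov = [
    [crr / n, crg / n, crb / n],
    [crg / n, cgg / n, cgb / n],
    [crb / n, cgb / n, cbb / n],
  ];
  // Power iteration to find the dominant eigenvector (the first principal component).
  let v = [1, 1, 1];
  for (let iter = 0; iter < 100; iter++) {
    const w = [0, 0, 0];
    for (let row = 0; row < 3; row++) {
      for (let col = 0; col < 3; col++) w[row] += cov[row][col] * v[col];
    }
    const len = Math.hypot(w[0], w[1], w[2]);
    v = [w[0] / len, w[1] / len, w[2] / len];
  }
  // Project every pixel onto that direction; the projections become the gray values.
  const gray = new Float32Array(n);
  for (let i = 0, j = 0; i < pixels.length; i += 4, j++) {
    gray[j] = (pixels[i] - mr) * v[0] + (pixels[i + 1] - mg) * v[1] + (pixels[i + 2] - mb) * v[2];
  }
  return gray; // still needs to be shifted/rescaled into 0-255 for display
}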

To give you a quick example let’s start with one of my own photographs: original photo

While the composition with so much empty space is debatable, I find this photo a fairly good example of an image where a straight luminosity conversion fails. This is because the really saturated colors in the sky look bright/intense even if the plain luminosity values do not suggest that.

PCA conversion vs. luminosity conversion
Hover it to see the results of a straight luminosity conversion instead.

In this case the PCA conversion does (in my opinion) a better job at reflecting the tonality in the sky. I’d strongly suggest that you experiment with the tool yourself.

If you want a bit more detail on how exactly the conversions work please have a look at the help page.

Do I think this is the best technique for black and white conversions? No. You will always be able to get better results by manually tweaking the conversion to fit your vision. Is it an interesting result? I’d say so.

