Tuesday, August 7, 2012

How to choose an image format

Choosing a file format for scientific images can be tricky. It seems simple enough on the outside, but the details turn out to be full of nuance and gotchas. Plenty of papers and presentations are spoiled by low quality images. Don't let yours be one! Get to know your image editor (I recommend GIMP), and your formats.

What determines quality?

The factors determining the quality of an image are:

  • The number of pixels in the image (aim for 1 million)
  • The physical size at which the image is displayed or printed (large images need more pixels)
  • If the image is compressed, e.g. a JPG, the fidelity of the compression (use a quality setting of 90% or more)
  • If the image is indexed, e.g. a GIF, the number of colours available (the bit-depth)
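
To put rough numbers on the first two factors, here is a small Python sketch. The function names are my own, and the 300 dpi default is an assumed rule of thumb for print, not a figure from the list above.

```python
def megapixels(width_px, height_px):
    """Total pixel count, in millions."""
    return width_px * height_px / 1e6


def max_print_width_in(width_px, target_dpi=300):
    """Widest print, in inches, that still meets the target pixel density.

    target_dpi=300 is an assumed rule of thumb, not a value from the post.
    """
    return width_px / target_dpi


# A 1200 x 800 image is just under the 1 million pixel target.
print(megapixels(1200, 800))      # 0.96
print(max_print_width_in(1200))   # 4.0
```

Doubling the printed width at the same pixel density needs twice as many pixels across, which means four times as many in total.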

Beware: what really matters is the lowest-quality version of the image file over its entire history. In other words, it doesn't matter if you have a 1200 × 800 TIF today, if this same file was previously saved as a 600 × 400 GIF with 16 colours. You will never get the lost pixels or bit-depth back, though you can try to mitigate the quality loss with filters and careful editing. This seems obvious, but I have seen it catch people out.
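
A toy illustration of why the loss is permanent, assuming a single row of 8-bit grey values stands in for the image. The helper names are hypothetical, and real editors resample and quantize more cleverly than this, but the information is gone all the same.

```python
def reduce_levels(values, levels):
    """Quantize 8-bit values (0-255) down to `levels` evenly spaced grey levels."""
    step = 256 // levels
    # Snap every value to the bottom of its bin; detail within a bin is gone.
    return [(v // step) * step for v in values]


def downsample(values, factor):
    """Keep every `factor`-th sample and discard the rest."""
    return values[::factor]


def upsample(values, factor):
    """Nearest-neighbour upsampling: repeat each sample `factor` times."""
    return [v for v in values for _ in range(factor)]


original = list(range(256))                  # a smooth ramp with 256 distinct values
degraded = reduce_levels(downsample(original, 2), 16)
restored = upsample(degraded, 2)             # same length as the original again

print(len(restored) == len(original))            # True
print(len(set(original)), len(set(restored)))    # 256 16
print(restored == original)                      # False
```

The restored row has the right number of samples, but only 16 distinct values survive, no matter what format it is saved in afterwards.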

JPG is only for photographs

The problem with JPG is that the lossy compression can bite you, even if you're careful. What is lossy compression? The JPEG algorithm makes files much smaller by throwing some of the data away. It transforms small blocks of the image into the wavenumber (spatial frequency) domain, where smooth images are sparse: most of the information sits in a few low-wavenumber coefficients, so the rest can be stored coarsely or dropped. Once discarded, the data cannot be recovered. In discontinuous data (images with lots of variance or hard edges) you might see artifacts (e.g. see How to cheat at spot the difference). Bottom line: only use JPG for photographs with lots of pixels.
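
The idea can be sketched in a few lines of Python. This is not the JPEG codec: it uses one 8-sample row instead of an 8 × 8 block, and it simply zeroes the high-wavenumber coefficients as a crude stand-in for JPEG's quantization step. All the names here are my own.

```python
import math

N = 8  # JPEG works on 8 x 8 blocks; one 8-sample row keeps the sketch short


def scale(k):
    """Normalization that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """DCT-II: samples -> wavenumber coefficients."""
    return [scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                           for n, x in enumerate(block))
            for k in range(N)]


def idct(coeffs):
    """Inverse of dct(): wavenumber coefficients -> samples."""
    return [sum(scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]


def lossy_roundtrip(block, keep=2):
    """Keep the `keep` lowest-wavenumber coefficients and zero the rest."""
    kept = [c if k < keep else 0.0 for k, c in enumerate(dct(block))]
    return idct(kept)


def max_error(block, keep=2):
    """Largest absolute difference between a block and its lossy copy."""
    return max(abs(a - b) for a, b in zip(block, lossy_roundtrip(block, keep)))


# A smooth half-cosine ramp and a hard edge, both spanning roughly 0 to 255.
smooth = [127.5 - 127.5 * math.cos(math.pi * (n + 0.5) / N) for n in range(N)]
edge = [0, 0, 0, 0, 255, 255, 255, 255]

print(max_error(smooth))  # essentially zero: two coefficients describe it fully
print(max_error(edge))    # large: tens of grey levels, worst right at the edge
```

The smooth row is sparse in the wavenumber domain, so discarding most coefficients costs nothing. The hard edge needs all of them, and the error shows up exactly where text and line work live.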

Formats in a nutshell

Rather than list advantages and disadvantages exhaustively, I've tried to summarize everything you need to know in the table below. There are lots of other formats, but you can do almost anything with the ones I've listed... except BMP, which you should just avoid completely. A couple of footnotes: PGM is strictly for geeks only; GIF is alone in supporting animation (animations are easy to make in GIMP). 

All this advice could have been much shorter: use PNG for everything. Unless file size is your main concern, or you need special features like animation or georeferencing, you really can't go wrong.
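
As a hedged summary, the rules of thumb in this post could be written as a tiny Python helper. The function and its parameters are hypothetical, and the choice of TIF for georeferenced images assumes GeoTIFF, which the post does not name explicitly.

```python
def choose_format(photograph=False, animated=False, georeferenced=False,
                  file_size_critical=False):
    """Suggest a raster format, following the rules of thumb in this post."""
    if animated:
        return "GIF"   # the only format in the post's list that supports animation
    if georeferenced:
        return "TIF"   # assumption: GeoTIFF, not named in the post body
    if photograph and file_size_critical:
        return "JPG"   # lossy; the post advises a quality of 90% or more
    return "PNG"       # the default: lossless, hard to go wrong


print(choose_format())                                          # PNG
print(choose_format(photograph=True, file_size_critical=True))  # JPG
```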

There's a version of this post on SubSurfWiki. Feel free to edit it!


Reader Comments (9)

Well, actually I prefer and suggest vector formats (PDF, EPS, etc.) due to their picture quality. Even if you need to convert to a non-vector format, the result will be very good quality. Vector format is preferred for LaTeX documents. Above all, if one is saving images for a particular scientific paper to be compiled by LaTeX, they should be in either vector or non-vector format.

August 7, 2012 | Unregistered Commenter Toqeer

@Toqeer: Great point! I did not get into vector 'images'. I guess in my head 'image' means the same as 'raster' — but I am not sure why I've made that the case. I think of PDF, EPS, etc, as 'graphics files'. It's a pretty thin distinction, I know. I do find that people often create graphics in vector format, then export them as rasters for publishing. That's where things can go wrong.

I agree, if you are creating artwork (drawings, illustrations, composite figures, charts, etc.), then vector-based formats are best. With vector formats (I use SVG created in Inkscape for drawings), there is no discretization (there are no pixels), so you get scalable resolution. That's why text in PDFs looks so nice when you zoom. I don't know a lot about how rasters are stored inside graphics files like PDFs... I gather they are kept in their original format, unless you choose to compress them, but I could be wrong there. Does anyone know?

Thanks for the comment — I'm really glad you brought this up.

August 7, 2012 | Registered Commenter Matt Hall

Rasters inside PDF are compressed, typically using a lossy JPEG-like compression. However, PDF also supports lossless compression (the LZWDecode filter).

August 7, 2012 | Unregistered Commenter Sergey Fomel

I kinda like the term Layout Format for PDFs, considering its history.

The thing with PDFs and TIFFs is you never know what to expect.
- TIF files can be pretty bad; usually the first thing you do when you get them is convert to a usable format like PNG, or PSD if you have money :P The issue is you don't know if a file has compression, or what type of compression, until you start working with it. Some software might think it supports TIFF, but maybe not certain TIFF extensions. This can be a problem when you use a licensed compression algorithm, like LZW was, or JPEG 2000.
- With PDFs you end up with a similar problem, lessened because there are fewer implementations and you are targeting one reader. I have had issues with third-party tools; for instance, I've compressed JPGs with Photoshop and thrown them in a PDF and they look great in Acrobat Reader, but when you open the PDF in OpenOffice or GIMP it's clear it reverts to the default PDF compression. Another issue is things like links in PDFs; I've seen some implementations that allow dynamic content in PDFs. This is obviously a security issue, and it may work sometimes but not always.

Also, on the topic of vectors, why do people in this industry seem to like CGM so much?

You know JPEG 2000 supports georeferencing in the headers, like GeoTIFFs?

It's hard to argue with TL;DR: use PNG

August 7, 2012 | Unregistered Commenter Toastar

@Toastar: It makes me happy when I hear that I'm not the only one that struggles with things like reading TIFFs and getting things like links to work in PDFs. Well, OK, not 'happy'... less sad.

As for CGM... I have absolutely no idea why on earth those things exist. I bet Sergey would know what's so great about them. I'm amazed to read in Wikipedia that the format is alive and well and even being ported to a web-friendly version. Ugh.

Slightly off topic, but related to CGM, is plotting. I have yet to visit a plotter room that isn't full of postage-stamp-sized seismic sections printed on 8-foot-long pieces of paper, entire User Manuals printed out on a continuous roll, and six weeping geologists trying to change the 36-inch roll.

August 7, 2012 | Registered Commenter Matt Hall

Thanks for the great summary; I've had to give similar tutorials here in the office recently, but now I can just send a link to this article!

I've found recently that PNG seems to be emerging as the format of choice for some of the graphics programs I use. It's a good format, and the file sizes are not much bigger than jpegs in most cases. I do tend to prefer jpegs still though as they seem to have better support across different systems, although that gap is closing too. Mostly I'm just annoyed that PNGs want to open with the Quicktime Player, and QT asks me each and every time if I want to "go pro" (I don't) and if I want to associate all file types with it (I don't). If I spent the time to change the file type association I'd probably like PNGs better...

I don't know what other sort of metadata you can bury in certain file types, but I know that that can be extremely useful. I made my own filetype for animated 3-D objects a long time ago, and it included everything - model data, texture data, normals, inverse kinematic skeleton, physics data, etc.

A couple points from my own experience:

- you can vary the degree of compression for jpegs, so you can have lower lossiness with a bigger file size. These days when size is less of an issue I tend to keep my files as big as possible, and only reduce them if needed.

- I'm a Photoshop guy myself (I really want to like Gimp - I've tried, but I just can't) so I tend to work with PSDs and their gigantic file sizes (and layers! vector object support! masking! etc). If I am sending something to print, I tend to save my PSD as a TIF to prevent compression, and if I am using it on-line or if the image is staying digital I'll save as a jpeg for the file size savings.

- Also, it's worth noting that TIFs support layers and text and stuff like that, whereas jpegs and some simpler formats do not.

- In general, I tend to keep "working files" and "production/finals" separate. I keep the working files in an editable format (.psd, .ai, .dwg, et cetera) and the production files in a less-editable format (.tif, .jpg/.png, .pdf). I don't like sending out my working files because I want to know exactly what the file is going to look like when the receiver gets it, and plus the working files are usually more program specific and not everyone will be able to view them.

- I don't use gifs because of the low bit-depth. They were great when you were making webpages in 1997 so you could have dancing babies and flashing amber lights for your "Page Under Construction" sign that didn't change for three and a half years, but I don't think they have any place in the modern computing world.

- In defense of bitmaps, as a programmer who started in the pre-windows world, bitmaps were your friend. They are super easy to work with, you can edit them in a text file, and you can do a lot of neat tricks with them. They are almost useless now, granted, but they'll always have a special place in my heart.

August 8, 2012 | Unregistered Commenter Reid

@Reid: Thanks for the fantastic insights, as always. It's a long time since I had weird 'QuickTime' issues with images, so I reckon there's something funny going on there. 'Dancing babies', lol.

On the subject of Photoshop, I can't resist a quick plug for Adobe's amazing Creative Cloud program. We use InDesign for layout work, so I was looking for an inexpensive and legal way to get it. Adobe's deal is so amazing I feel a bit like I'm stealing it: I get every application for about $45/month (InDesign, Illustrator, Photoshop, Dreamweaver, the lot). This is more or less what it cost to lease InDesign on its own. You can install the applications on two machines.

August 8, 2012 | Registered Commenter Matt Hall

OK, so you have said a lot that I agree with, and thank you for explaining this decision-making process. In cartography, it is done daily until you get a good handle on a standard to use.

A few questions though. Size of image is big for us. What is your reasoning for using a quality of 90% or higher?
And with the same thought in mind, is PNG still your preferred format for a complex image that includes images, text and drawing?

More and more I choose JPEG because of file size vs quality. The quality difference in my opinion is noticeable but very slight at 300 dpi.

August 9, 2012 | Unregistered Commenter Michael Wallace

@Michael: Thanks for the comment. I am wary of lossy compression, because the lost data is unrecoverable: the process is not reversible. If you have plenty of pixels, I guess you can compensate for most purposes. For continuous data, like satellite images or other photos, JPEG is not too bad; the artifacts are very subtle. For images with text and drawings too, JPEG is potentially harmful, because the artifacts are quite strong around abrupt edges (e.g. the edge of text objects). PNG is best for those images... if they have to be raster images, that is. As I mention in the next post, the very best approach for these composite graphics is to use a vector format like SVG, or a layout format like PDF.

August 9, 2012 | Registered Commenter Matt Hall
