Introducing Striplog

Last week I mentioned we'd been working on a project called striplog. I told you it was "a new Python library for manipulating well data, especially irregularly sampled, interval-based, qualitative data like cuttings descriptions"... but that's all. I thought I'd tell you a bit more about it — why we built it, what it does, and how you can use it yourself.

The problem we were trying to solve

The project was conceived with the Nova Scotia Department of Energy, who had a lot of cuttings and core descriptions that they wanted to digitize, visualize, and archive. They also had some hand-drawn striplog images — similar to the one on the right — that needed to be digitized in the same way. So there were a few problems to solve:

  • Read a striplog image and a legend, turn the striplog into tops, bases, and 'descriptions', and finally save the data to an archive-friendly LAS file.
  • Parse natural language 'descriptions', converting them into structured data via an arbitrary lexicon. The lexicon determines how we interpret the words 'sandstone' or 'fine grained'.
  • Plot striplogs with minimal effort, and keep plotting parameters separate from data. It should be easy to globally change the appearance of a particular lithology.
  • Make all of this completely agnostic to the data type, so 'descriptions' might be almost anything you can think of: special core analyses, palaeontological datums, chronostratigraphic intervals...
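To make the lexicon idea a bit more concrete, here is a minimal sketch of how a description might be turned into structured data. It is not striplog's actual implementation: the `LEXICON` dictionary and the `parse_description` function are hypothetical names invented for this illustration, and a real lexicon would need many more patterns.

```python
import re

# Hypothetical lexicon: each attribute maps to the regex patterns that signal it.
# This is an illustration only, not striplog's real lexicon format.
LEXICON = {
    'lithology': [r'sandstone', r'siltstone', r'shale', r'limestone'],
    'colour': [r'red(?:dish)?', r'gr[ae]y(?:ish)?', r'green(?:ish)?'],
    'grainsize': [r'fine(?:[ -]grained)?', r'coarse(?:[ -]grained)?'],
}


def parse_description(text, lexicon=LEXICON):
    """Return a dict holding the first match in the text for each attribute."""
    result = {}
    for attribute, patterns in lexicon.items():
        pattern = r'\b(?:' + '|'.join(patterns) + r')\b'
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            result[attribute] = match.group(0).lower()
    return result


rock = parse_description("Fine red sandstone with greenish shale flakes")
print(rock)
```

Swapping in a different lexicon (species names, say, or facies codes) changes what the same description yields, which is the sense in which the parsing is agnostic to the data type.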

The usual workaround, I mean solution, to this problem is to convert the descriptions into some sort of code, e.g. sandstone = 1, siltstone = 2, shale = 3, limestone = 4. Then you make a log, and plot it alongside your other curves or make your crossplots. But this is rather clunky, and if you lose the mapping, the log is useless. And we still have the other problems: reading images, parsing descriptions, plotting...

What we built

One of the project requirements was a Python library, so don't look for a pretty GUI or fancy web app. (This project took about 6 person-weeks; user interfaces take much longer to craft.) Our approach is always to try to cope with chaos, not fix it. So we tried to design something that would let the user bring whatever data they have: XLS, CSV, LAS, images.

The library has tools to, for example, read a bunch of cuttings descriptions (e.g. "Fine red sandstone with greenish shale flakes"), and convert them into Rocks — structured data with attributes like 'lithology' and 'colour', or whatever you like: 'species', 'sample number', 'seismic facies'. Then you can gather Rocks into Intervals (basically a list of one or more Rocks, with a top and base depth, height, or age). Then you can gather Intervals into a Striplog, which can, with the help of a Legend if you wish, plot itself or write itself to a CSV or LAS file.

The Striplog object has some useful features. For example, it's iterable in Python, so it's trivial to step over every unit and perform some query or analysis. Some tasks are built-in: Striplogs can summarize their own statistics, for example, and searching for 'sandstone' returns another Striplog object containing only those units matching the query.

  >>> striplog.find('sandstone')
  Striplog(4 Intervals, start=230.328820116, stop=255.435203095)

We can also do a reverse lookup, and see what's at some arbitrary depth:

  >>> striplog.depth(260).primary  # 'primary' gives the first component
  Rock("colour":"grey", "lithology":"siltstone")
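If it helps to see the shape of the idea, here is a toy version of the data model. The `ToyInterval` and `ToyStriplog` classes are hypothetical stand-ins written for this sketch, not striplog's real classes or API, and the depths and rocks are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInterval:
    """A depth range holding one or more components (plain dicts here)."""
    top: float
    base: float
    components: list = field(default_factory=list)

    @property
    def primary(self):
        # The first component is treated as the dominant one.
        return self.components[0]


class ToyStriplog(list):
    """A list of intervals with two convenience queries."""

    def find(self, lithology):
        # Return a new ToyStriplog holding only the matching intervals.
        return ToyStriplog(
            iv for iv in self
            if any(c.get('lithology') == lithology for c in iv.components)
        )

    def depth(self, z):
        # Reverse lookup: the interval containing depth z, or None.
        for iv in self:
            if iv.top <= z < iv.base:
                return iv
        return None


log = ToyStriplog([
    ToyInterval(230.0, 240.0, [{'lithology': 'sandstone', 'colour': 'red'}]),
    ToyInterval(240.0, 255.0, [{'lithology': 'shale', 'colour': 'green'}]),
    ToyInterval(255.0, 270.0, [{'lithology': 'siltstone', 'colour': 'grey'},
                               {'lithology': 'sandstone', 'colour': 'grey'}]),
])

sands = log.find('sandstone')
print(len(sands), 'intervals contain sandstone')
print(log.depth(260).primary)

# Being a list, the log is iterable, so ad hoc analysis is a one-liner.
print(sum(iv.base - iv.top for iv in sands), 'm of sandstone-bearing section')
```

Subclassing `list` is the cheapest way to get iteration, slicing, and `len()` for free; a real library would likely want more control than that.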

You can read more in the documentation. And here's Striplog in a picture:

An attempt to represent striplog's objects, more or less arranged according to a workflow.

Where to get it

For the time being, the tool is only available as a Python library, for you to use on the command line, or in IPython Notebooks (follow along here). You can install striplog very easily:

  pip install striplog

Or you can clone the repo on GitHub. 

As a new project, it has some rough edges. In particular, the Well object is rather rough. The natural language processing could be much more sophisticated. The plotting could be cuter. If and when we unearth more use cases, we'll be hacking some more on it. In the meantime, we would welcome code or docs contributions of any kind, of course.

And if you think you have a use for it, give us a call. We'd love to help.


I think it's awesome that the government reached out to a small, Nova Scotia-based company to do this work, keeping tax dollars in the province. But even more impressive is that they had the conviction not only to allow but even to encourage us to open source it. This is exactly how it should be. In contrast, I was contacted recently by a company that is building a commercial plug-in for Petrel. They had received funding from the federal government to do this. I find this... odd.

The perfect storm

Since starting Agile late in 2010, I have never not been busy. Like everyone else... there's always a lot going on. But March was unusual. Spinning plates started wobbling. One or three fell. One of those that fell was the blog. (Why is it always your favourite plate that smashes?)

But I'm back, feeling somewhat refreshed after my accidental quadrennial sabbatical and large amounts of Easter chocolate. And I thought a cathartic way to return might be to share with you what I've been up to.

Writing code for other people

We've always written code to support our consulting practice. We've written seismic facies algorithms, document transformation routines (for AAPG Wiki), seismic acquisition tools, and dozens of other things besides. But until January we'd never been contracted to build software as an end in itself.

Unfortunately for my sanity, the projects had to be finished by the end of March. The usual end-of-project crunch came along, as we tried to add features, fix bugs, remove cruft, and compile documentation without breaking anything. And we just about survived it, thanks to a lot of help from long-time Agile contributor, Ben Bougher. One of the products was striplog, a new Python library for manipulating well data, especially irregularly sampled, interval-based, qualitative data like cuttings descriptions. With some care and feeding, I think it might be really useful one day.

The HUB is moving

Alongside the fun with geoscience, we're in the midst of a fairly huge renovation. As you may know, I co-founded The HUB South Shore in my town in 2013. It's where I do my Agile work, day-to-day. It's been growing steadily and last year we ran out of space to accept new members. So we're moving down to the Main Street in Mahone Bay, right under the town's only pub. It's a great space, but it turns out that painting a 200 m² warehouse takes absolutely ages. Luckily, painting is easy for geologists, since it's basically just a lot of arm-waving. Anyway, that's where I'm spending my free time these days. [Pics.]

Mader's Wharf, by the frozen ocean.

The ship's knees

Co-founder Dave painting trim

Shovelling snow

What my house has looked like for the last 8 weeks.

Seriously, it just will. Not. Stop. It's snowing now, for goodness sake. I'm pretty sure we have glaciers.

What does this have to do with work? Well, we're not talking about Calgary-style pixie dust here. We ain't nipping out with the shovel for a few minutes of peaceful exercise. We're talking about 90 minutes of the hardest workout you've ever endured, pointlessly pushing wet snow around because you ran out of places to put it three weeks ago. At the end, when you've finished and/or given up, Jack Frost tosses a silver coin to see if your reward will be a hot shower and a course of physiotherapy, or sudden cardiac arrest and a ride in the air ambulance.


There is lots of good techno-geophysics to look forward to. We're running the Geoscience Hackathon in Calgary at the beginning of May. You can sign up here... If you're not sure, sign up anyway: I guarantee you'll have fun. There's a bootcamp too, if you're just starting out or want some tips for hacking geophysics. Thank you to our awesome sponsors:

There's also the geophysics mini-symposium at SciPy in Austin in July (deadline approaching!). That should be fun. And I'm hoping the hackathon right before SEG in New Orleans will be even more epic than last year's event. The theme: Games.

Evan is out there somewhere

Normally when things at Agile World Headquarters get crazy, we can adapt and cope. But it wasn't so easy this time: Evan is on leave and in the middle of an epic world tour with his wife Tara. I don't actually know where he is right now. He was in Bali a couple of weeks ago... If you see him say Hi!

As I restart the engines on All The Things, I want to thank anyone who's been waiting for an email reply, or — in the case of the 52 Things... Rock Physics authors — a book, for their patience. Sometimes it all hits at once.

Onwards and upwards!

The hackathon is coming to Calgary

Before you stop reading and surf away thinking hackathons are not for you, stop. They are most definitely for you. If you still read this blog after me wittering on about Minecraft, anisotropy, and Python practically every week, then I'm convinced you'll have fun at a hackathon. And we're doing a new event this year for newbies.

For its fourth edition, the hackathon is coming to Calgary. The city is home to thousands of highly motivated and very creative geoscience nuts, so it should be just as epic as the last edition in Denver. The hackathon will be the weekend before the GeoConvention — 2 and 3 May. The location is the Global Business Centre, which is part of the Telus Convention Centre on 8th Avenue. The space is large and bright; it should be perfect, once it smells of coffee...

Now's the time to carpe diem and go sign up. You won't regret it. 

On the Friday before the hackathon, 1 May, we're trying something new. We'll be running a one-day bootcamp. You can sign up for the bootcamp here on the site. It's an easy, low-key way to experience the technology and goings-on of a hackathon. We'll be doing some gentle introductions to scientific computing for those who want it, and for the more seasoned hackers, we'll be looking at some previous projects, useful libraries, and tips and tricks for building a software tool in less than 2 days.

The event would definitely not be possible without the help of progressive people who want to see more creativity and invention in our industry and our science. These companies and the people that work there deserve your attention. 

Last quick thing: if you know a geeky geoscientist in Calgary, I'd love it if you forwarded this post to them right now. 

UPDATE later on 2 March

Great news: Ikon Science are joining our existing sponsors, dGB Earth Sciences and OpenGeoSolutions (both long-time supporters of the hackathon events), to help make something awesome happen. We're grateful for the support!

February linkfest

The linkfest is back! All the best bits from the news feed. Tips? Get in touch.

The latest QGIS — the free and open-source GIS we use — dropped last week. QGIS v2.8 'Wien' has lots of new features like expressions in property fields, better legends, and colour palettes.

On the subject of new open-source software, I've mentioned Wayne Mogg's OpendTect plug-ins before. This time he's outdone himself, with an epic new plug-in providing an easy way to write OpendTect attributes in Python. This means we can write seismic attribute algorithms in Python, using OpendTect for I/O, project management, visualization, and interpretation.

It's not open source, but Google Earth Pro is now free! The free version was pretty great, but Pro has a few nice features, like better measuring tools, higher resolution screen-grabs, movies, and ESRI shapefile import. Great for scoping field areas.

Speaking of fieldwork, is this the most amazing outcrop you've ever seen? Those are house-sized blocks floating around in a mass-transport deposit. If you want to know more, you're in luck, because Zane Jobe blogged about it recently.  (You do follow his blog, right?)

By the way, if sedimentology is your thing, for some laboratory eye-candy, follow SedimentExp on Twitter. (Zane's on Twitter too!)

If you like to look after your figures, Rougier et al. recently offered 10 simple rules for making them better. Not only is the article open access (more amazing: it's public domain), the authors provide Python code for all their figures. Inspiring.

It's clear that open, even interactive, code will be de rigueur before the decade is out. Even Nature is at it. (Well, I shouldn't say 'even', because Nature is a progressive publishing house, at the same time as being part of 'the establishment'.) Take a few minutes to play with it... it's pretty cool. We have published lots of static notebooks, as has SEG; interactivity is coming!

A question came up recently on the Earth Science Stack Exchange that made me stop and think: why do geophysicists use \(V_\mathrm{P}/V_\mathrm{S}\) ratio, and not \(V_\mathrm{S}/V_\mathrm{P}\) ratio, which is naturally bounded? (Or is it? Are there any materials for which \(V_\mathrm{S} > V_\mathrm{P}\)?) I think it's tradition, but maybe you have a better answer?
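One way to look at the bound, as a sketch only: assume an isotropic, linearly elastic solid with bulk modulus \(K > 0\), shear modulus \(\mu \geq 0\), and density \(\rho\). Those assumptions are mine, not part of the Stack Exchange question, and they exclude anisotropic or otherwise exotic materials.

\[
V_\mathrm{P}^2 = \frac{K + \tfrac{4}{3}\mu}{\rho}, \qquad
V_\mathrm{S}^2 = \frac{\mu}{\rho}
\quad\Longrightarrow\quad
\left(\frac{V_\mathrm{S}}{V_\mathrm{P}}\right)^2 = \frac{\mu}{K + \tfrac{4}{3}\mu} < \frac{3}{4}
\]

Under those assumptions \(V_\mathrm{S}/V_\mathrm{P}\) sits between 0 (a fluid, with \(\mu = 0\)) and \(\sqrt{3}/2 \approx 0.866\), whereas \(V_\mathrm{P}/V_\mathrm{S}\) runs from about 1.155 up to infinity.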

On the subject of geophysics, I think this is the best paper title I've seen for a while: A current look at geophysical detection of illicit tunnels (Steve Sloan in The Leading Edge, February 2015). Rather topical just now too.

At the SEG Annual Meeting in Denver, I recorded an interview with SEG's Isaac Farley about wikis and knowledge sharing...

OK, well if this is just going to turn into blatant self-promotion, I might as well ask you to check out Pick This, now with over 600 interpretations! Please be patient with it, we have a lot of optimization to do...

Rock property catalog


One of the first things I do on a new play is to start building a Big Giant Spreadsheet. What goes in the big giant spreadsheet? Everything — XRD results, petrography, geochemistry, curve values, elastic parameters, core photo attributes (e.g. RGB triples), and so on. If you're working in the Athabasca or the Eagle Ford then one thing you have is heaps of wells. So the spreadsheet is Big. And Giant. 

But other people's spreadsheets are hard to use. There's no documentation, no references. And how to share them? Email just generates obsolete duplicates and data chaos. And while XLS files are not hard to put on the intranet or Internet,  it's hard to do it in a way that doesn't involve asking people to download the entire spreadsheet — duplicates again. So spreadsheets are not the best choice for collaboration or open science. But wikis might be...

The wiki as database

Regular readers will know that I'm a big fan of MediaWiki. One of the most interesting extensions for the software is Semantic MediaWiki (SMW), which essentially turns a wiki into a database — I've written about it before. Of course we can read any wiki page over the web, but you can query an SMW-powered wiki, which means you can, for example, ask for the elastic properties of a rock, such as this Mesaverde sandstone from Thomsen (1986). And the wiki will send you this JSON string:

{u'exists': True,
 u'fulltext': u'Mesaverde immature sandstone 3 (Kelly 1983)',
 u'fullurl': u'',
 u'namespace': 0,
 u'printouts': {
    u'Lithology': [{u'exists': True,
      u'fulltext': u'Sandstone',
      u'fullurl': u'',
      u'namespace': 0}],
    u'Delta': [0.148],
    u'Epsilon': [0.091],
    u'Rho': [{u'unit': u'kg/m\xb3', u'value': 2460}],
    u'Vp': [{u'unit': u'm/s', u'value': 4349}],
    u'Vs': [{u'unit': u'm/s', u'value': 2571}]}}

This might look horrendous at first, or even at last, but it's actually perfectly legible to Python. A little bit of data wrangling and we end up with data we can easily plot. It takes no more than a few lines of code to read the wiki's data, and construct this plot of \(V_\text{P}\) vs \(V_\text{S}\) for all the rocks I have so far put in the wiki — grouped by gross lithology:

A page from the Rock Property Catalog. Very much an experiment, rocks contain only a few key properties today.

If you're interested in seeing how to make these queries, have a look at this IPython Notebook. It takes you through reading the data from my embryonic catalogue on Subsurfwiki, processing the JSON response from the wiki, and making the plot. Once you see how easy it is, I hope you can imagine a day when people are publishing open data on the web, and sharing tools to query and visualize it.
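As a taste of the data wrangling involved, here is a minimal sketch that flattens one result like the response shown above into a plain row. It starts from an already-parsed, abridged copy of that response rather than calling the wiki, and the helper names `first_value` and `flatten` are hypothetical; the notebook may well do this differently.

```python
# One result, abridged from the response shown above (already parsed into Python).
result = {
    'fulltext': 'Mesaverde immature sandstone 3 (Kelly 1983)',
    'printouts': {
        'Lithology': [{'fulltext': 'Sandstone'}],
        'Delta': [0.148],
        'Epsilon': [0.091],
        'Rho': [{'unit': 'kg/m\xb3', 'value': 2460}],
        'Vp': [{'unit': 'm/s', 'value': 4349}],
        'Vs': [{'unit': 'm/s', 'value': 2571}],
    },
}


def first_value(printout):
    """Return the first entry of a printout as a plain value, or None if empty."""
    if not printout:
        return None
    item = printout[0]
    if isinstance(item, dict):
        # Quantities carry 'value'; page links carry 'fulltext'.
        return item.get('value', item.get('fulltext'))
    return item


def flatten(result):
    """Turn one query result into a flat dict of property name to value."""
    row = {'name': result['fulltext']}
    for key, printout in result['printouts'].items():
        row[key] = first_value(printout)
    return row


row = flatten(result)
print(row['Lithology'], row['Vp'], row['Vs'], round(row['Vp'] / row['Vs'], 3))
```

A list of such rows drops straight into a table or a scatter plot, grouped by the `Lithology` value.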

Imagine it, then figure out how you can help build it!


Thomsen, L (1986). Weak elastic anisotropy. Geophysics 51 (10), 1954–1966. DOI 10.1190/1.1442051.

Pick This! Social interpretation

Pick This is a new web app for social image interpretation. Sort of Stack Exchange or Quora (both awesome Q&A sites) meets Flickr. You look for an interesting image and offer your interpretation with a quick drawing. Interpretations earn reputation points. Once you have enough rep, you can upload images and invite others to interpret them. Find out how others would outline that subtle brain tumour on the MRI, or pick that bifurcated fault...

A section from the Penobscot 3D, offshore Nova Scotia, Canada. Overlain on the seismic image is a heatmap of interpretations of the main fault by 26 different interpreters. The distribution of interpretations prompts questions about what is 'the' answer. Pick this image yourself at

The app was born at the Geophysics Hackathon in Denver last year. The original team consisted of Ben Bougher, a UBC student and long-time Agile collaborator, Jacob Foshee, a co-founder of Durwella, Chris Chalcraft, a geoscientist at OpenGeoSolutions, Agile's own Evan Bianco of course, and me ordering pizzas and googling domain names. By demo time on Sunday afternoon, we had a rough prototype, good enough for the audience to provide the first seismic interpretations.

Getting from prototype to release

After the hackathon, we were very excited about Pick This, with lots of ideas for new features. We wanted it to be easy to upload an image, being clear about its provenance, and extremely easy to make an interpretation, right in the browser. After some great progress, we ran into trouble bending the drawing library, Raphael.js, to our will. The app languished until Steve Purves, an affable geoscientist–programmer who lives on a volcano in the middle of the Atlantic, came to the rescue a few days ago. Now we have something you can use, and it's fun! For example, how would you pick this unconformity?

This data is proprietary to MultiKlient Invest AS. Licensed CC-BY-SA. 

This beautiful section is part of this month's Tutorial in SEG's The Leading Edge magazine, and was the original inspiration for the app. The open access essay is by Don Herron, the creator of Interpreter Sam, and describes his approach to interpreting unconformities, using this image as the partially worked example. We wanted a way for readers to try the interpretation themselves, without having to download anything — it's always good to have a use case before building something new. 

What's next for Pick This?

I'm really excited about the possibilities ahead. Apart from the fun of interpreting other people's data, I'm especially excited about what we could learn from the tool — how long do people spend interpreting? How many edits do they make before submitting? And we'd love to add other modes to the tool, like choosing between two image enhancement results, or picking multiple features. And these possibilities only multiply when you think about applications outside earth science, in medical imaging, remote sensing, or astronomy. So much to do, so little time! 

We trust your opinion. Maybe you can help us:

  • Is Pick This at all interesting or fun or useful to you? Is there a use case that occurs to you? 
  • Making the app better will take time and therefore money. If your organization is interested in image enhancement, subjectivity in interpretation, or machine learning, then maybe we can work together. Get in touch!

Whatever you do, please have a look at Pick This and let us know what you think.

Minecraft for geoscience

The Isle of Wight, complete with geology. ©Crown copyright. 

You might have heard of Minecraft. If you live with any children, then you definitely have. It's a computer game, but it's a little unusual — there isn't really a score, and the gameplay has no particular goal or narrative, leaving everything to the player or players. It's more like playing with Lego than, say, playing chess or tennis or paintball. The game was created by Swede Markus Persson and then marketed by his company Mojang. Microsoft bought Mojang in September last year for $2.5 billion. 

What does this have to do with geoscience?

Apart from being played by 100 million people, the game has attracted a lot of attention from geospatial nerds over the last 12–18 months. Or rather, the Minecraft environment has. The game chiefly consists of fabricating, placing and breaking 1-m-cubed blocks of various materials. Even in normal use, people create remarkable structures, and I don't just mean 'big' or 'cool', I mean truly remarkable. Hence the attention from the British Geological Survey and the Danish Geodata Agency. If you've spent any time building geocellular models, then the process of constructing elaborate digital models is familiar to you. And perhaps it's not too big a leap to see how the virtual world of Minecraft could be an interesting way to model the subsurface.

Still I was surprised when, chatting to Thomas Rapstine at the Geophysics Hackathon in Denver, he mentioned Joe Capriotti and Yaoguo Li, fellow researchers at Colorado School of Mines. Faced with the problem of building 3D earth models for simulating geophysical experiments (a problem we've faced too), they hit on the idea of adapting Minecraft models. This is not just a gimmick, because Minecraft is specifically designed for simulating and manipulating landscapes.

The Minecraft model (left) and synthetic gravity data (right). Image ©2014 SEG and Capriotti & Li. Used in accordance with SEG's permissions.

If you'd like to dabble in geospatial Minecraft yourself, the FME software from Safe now has a standardized way to get Minecraft data into and out of the environment. Essentially they treat the blocks as point clouds (e.g. as you might get from Lidar or a laser scan), so they can do conventional operations, such as differences or filtering, with the software. They recorded a webinar on the subject yesterday.
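To see why the point-cloud view is convenient, here is a small sketch using NumPy. It says nothing about how FME works internally; the tiny block model, the block IDs, and the `to_points` helper are all hypothetical.

```python
import numpy as np

# Hypothetical 3D block model: 0 = air, 1 = stone, 2 = ore. Axes are (x, y, z).
before = np.zeros((4, 4, 3), dtype=int)
before[:, :, 0] = 1          # a floor of stone
before[1:3, 1:3, 1] = 2      # a small ore body sitting on the floor

after = before.copy()
after[1, 1, 1] = 0           # someone mined one ore block


def to_points(blocks):
    """Return an (n, 4) array of x, y, z, block_id for every non-air block."""
    occupied = blocks > 0
    xyz = np.argwhere(occupied)      # integer coordinates of occupied cells
    ids = blocks[occupied]           # matching block IDs, in the same order
    return np.column_stack([xyz, ids])


points = to_points(before)
changed = np.argwhere(before != after)   # a 'difference' between two models
ore_only = points[points[:, 3] == 2]     # a simple attribute filter

print(points.shape, changed.tolist(), len(ore_only))
```

Once the blocks are rows of coordinates plus an attribute, the usual point-cloud operations are just array operations.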

Minecraft is here to stay

There are two other important angles to Minecraft, both good reasons why it will probably be around for a while, and probably both something to do with why Microsoft bought Mojang...

  1. It is a programming gateway drug. Like web coding, and image processing, Minecraft might be another way to get people, especially young people, interested in computing. The tiny Linux machine Raspberry Pi comes with a version of the game with a full Python API, so you can control the game programmatically.  
  2. It has potential beyond programming, as a STEM teaching aid and engagement tool. Here's another example. Indeed, the United Nations is involved in Block By Block, an effort around collaborative public space design echoing the Blockholm project, an early attempt to explore social city planning in the tool.

All of which is enough to make me more curious about the crazy-sounding world my kids have built, with its Houston-like city planning: house, school, house, Home Sense, house, rocket launch pad...


Capriotti, J and Yaoguo Li (2014) Gravity and gravity gradient data: Understanding their information content through joint inversions. SEG Technical Program Expanded Abstracts 2014: pp. 1329-1333. DOI 10.1190/segam2014-1581.1 

The thumbnail image is from an image by Terry Madeley.

UPDATE: Thank you to Andy for pointing out that Yaoguo Li is a prof, not a student.

What is anisotropy?


Geophysicists often assume that the earth is isotropic. This word comes from 'iso', meaning same, and 'tropikos', meaning something to do with turning. The idea is that isotropic materials look the same in all directions — they have no orientation, and we can make measurements in any direction and get the same result. Note that this is different from homogeneous, which is the quality of uniformity of composition. You can think of anisotropy as a directional (not just spatial) variation in homogeneity. 

In the illustration, I may have cheated a bit. The lower-left image shows a material that is homogeneous but anisotropic. The thin lines are supposed to indicate microfractures, say, or the alignment of clay flakes, or even just stress. So although the material has uniform composition, at least at this scale, it has an orientation.

The recognition of the earth's anisotropy is a dominant theme among papers in our forthcoming 52 Things book on rock physics. It's not exactly a new thing: it was an emerging trend 10 years ago when Larry Lines at U of C reviewed Milo Backus's famous 'challenges' (Lines 2005). And even then, the spread of anisotropic processing and analysis had been underway for almost 20 years since Leon Thomsen's classic 1986 paper, Weak elastic anisotropy. This paper introduced three parameters that we need, alongside the usual \(V_\text{P}\), \(V_\text{S}\), and \(\rho\), to describe anisotropy. They are \(\delta\) (delta), \(\epsilon\) (epsilon), and \(\gamma\) (gamma), collectively referred to as Thomsen's parameters:

  • \(\delta\) or delta — the short offset effect — captures the relationship between the velocity required to flatten gathers (the NMO velocity) and the zero-offset average velocity as recorded by checkshots. It's easy to measure, but perhaps hard to understand in physical terms.
  • \(\epsilon\) or epsilon — the long offset effect — is, according to Thomsen himself:  "the fractional difference between vertical and horizontal P velocities; i.e., it is the parameter usually referred to as 'the' anisotropy of a rock". Unfortunately, the horizontal velocity is rather hard to measure. 
  • \(\gamma\) or gamma — the shear wave effect — relates, as rock physics meister Colin Sayers put it on Twitter, a horizontal shear wave with horizontal polarization to a vertical shear wave. He added, "\(\gamma\) can be determined in a single well using sonic. So the correlation with \(\epsilon\) and \(\delta\) is of great interest."
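To get a feel for the size of these effects, here is a minimal sketch using the weak-anisotropy approximations from Thomsen (1986): \(V_\text{P}(\theta) \approx V_\text{P0}\,(1 + \delta \sin^2\theta \cos^2\theta + \epsilon \sin^4\theta)\), with \(\theta\) measured from the symmetry axis, and \(V_\text{NMO} = V_\text{P0}\sqrt{1 + 2\delta}\) for a flat reflector. The inputs are the Mesaverde sandstone values quoted in the rock property catalog post; the function names are my own, and the approximation only holds when the parameters are small.

```python
import math


def vp_weak(theta_deg, vp0, delta, epsilon):
    """Approximate P-wave phase velocity at angle theta from the symmetry axis."""
    s2 = math.sin(math.radians(theta_deg)) ** 2
    c2 = 1.0 - s2
    return vp0 * (1.0 + delta * s2 * c2 + epsilon * s2 * s2)


def v_nmo(vp0, delta):
    """Short-offset NMO velocity for a horizontal reflector in a VTI medium."""
    return vp0 * math.sqrt(1.0 + 2.0 * delta)


# Vertical Vp in m/s, delta, and epsilon for one Mesaverde sandstone sample.
vp0, delta, epsilon = 4349.0, 0.148, 0.091

for angle in (0, 30, 60, 90):
    print(angle, round(vp_weak(angle, vp0, delta, epsilon)))
print('NMO velocity:', round(v_nmo(vp0, delta)))
```

At 90 degrees the expression collapses to \(V_\text{P0}(1 + \epsilon)\), which is the 'fractional difference between vertical and horizontal P velocities' in Thomsen's description, while \(\delta\) alone controls the NMO velocity.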

Sidenote to aspiring authors: Thomsen's seminal paper, which has been cited over 2800 times, is barely 13 pages long. Three and a half of those pages are taken up by... data! A huge table containing the elastic parameters of almost 60 samples. And this is from a corporate scientist at Amoco. So no more excuses: publish your data! </rant>

Vertical transverse what now?

The other bit of jargon you will come across is the concept of transverse isotropy, which is a slightly perverse (to me) way of expressing the orientation of the anisotropy effect. In vertical transverse isotropy, the horizontal velocity is different from the vertical velocity. Think of flat-lying shales with gravity dominating the stress field. Usually, the velocity is faster along the beds than it is across the beds. This manifests as nonhyperbolic moveout in the far offsets, in particular a pull-up or 'hockey stick' effect in the gathers: the arrivals are unexpectedly early at long offsets. Clearly, this will also affect AVO analysis.
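The hockey stick is easy to reproduce numerically. This sketch uses a commonly quoted nonhyperbolic moveout approximation for VTI media (due to Alkhalifah and Tsvankin), \(t^2 = t_0^2 + x^2/V_\text{NMO}^2 - 2\eta x^4 / \{V_\text{NMO}^2\,[t_0^2 V_\text{NMO}^2 + (1 + 2\eta)x^2]\}\), where \(\eta = (\epsilon - \delta)/(1 + 2\delta)\). The reflector time, velocity, and \(\eta\) below are hypothetical numbers chosen only to show the effect.

```python
import math


def traveltime(x, t0, vnmo, eta=0.0):
    """Reflection traveltime at offset x; eta = 0 gives the usual hyperbola."""
    hyperbolic = t0 ** 2 + x ** 2 / vnmo ** 2
    correction = (2.0 * eta * x ** 4
                  / (vnmo ** 2 * (t0 ** 2 * vnmo ** 2 + (1.0 + 2.0 * eta) * x ** 2)))
    return math.sqrt(hyperbolic - correction)


# Hypothetical reflector: 1 s zero-offset time, 2500 m/s NMO velocity, eta = 0.1.
t0, vnmo, eta = 1.0, 2500.0, 0.1

for offset in (0, 1000, 2000, 4000):
    t_iso = traveltime(offset, t0, vnmo)
    t_vti = traveltime(offset, t0, vnmo, eta)
    print(offset, round(t_iso, 3), round(t_vti, 3))
```

The two columns agree at short offsets and separate at long ones, with the anisotropic arrival coming in earlier whenever \(\eta\) is positive.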

There's more jargon. If the rocks are dipping, we call it tilted transverse isotropy, or TTI. But if the anisotropies, so to speak, are oriented vertically — as with fractures, for example, or simply horizontal stress — then it's horizontal transverse isotropy, or HTI. This causes azimuthal (compass directional) travel-time variations. We can even venture into situations where we encounter orthorhombic anisotropy, as in the combined VTI/HTI model shown above. It's easy to imagine how these effects, if not accounted for in processing, can (and do!) result in suboptimal seismic images. Accounting for them is not easy though, and trying can do more harm than good.

If you have handy rules of thumb or ways of conceptualizing anisotropy, I'd love to hear about them. Some time soon I want to write about thin-layer anisotropy, which is where this post was going until I got sidetracked...


Lines, L (2005). Addressing Milo's challenges with 25 years of seismic advances. The Leading Edge 24 (1), 32–35. DOI 10.1190/1.2112389.

Thomsen, L (1986). Weak elastic anisotropy. Geophysics 51 (10), 1954–1966. DOI 10.1190/1.1442051.

The (bad) stuff of legend

What is a legend? Merriam–Webster says:

  1. A story from the past that is believed by many people but cannot be proved to be true.
  2. An explanatory list of the symbols on a map or chart.

I think we can combine these:

An explanatory list from the past that is believed by many to be useful but which cannot be proved to be.

Maybe that goes too far; sometimes you need a legend. But often, very often, you don't. At the very least, you should always try hard to make the legend irrelevant. Why, and how, can you do this?

A case study

On the right is a non-scientific caricature of a figure from a paper I just finished reviewing for Geophysics. I won't give any more details because I don't want to pick on it unduly — lots of authors make the same mistakes.

Here are some of the things I think are confusing about this figure, detracting from the science in the paper. 

  • Making the reader cross-reference the line decoration with the legend makes it harder to make the comparison you're asking them to make. Just label the lines directly. 
  • Using unhelpful, generic names like 1, 2, and 3 for the models leads the reader into cross-reference Inception. The models were shown and explained on the previous page. 
  • Inception again: the models 1, 2, and 3 were shown in the previous figure parts (a), (b), and (c) respectively. So I had to cross-reference deeper still to really find out about them. 
  • The paper used colour elsewhere, so the use of black and white line decoration here seems unnecessary. There are other ways to ensure clarity if the paper is photocopied.
  • Everything is on the same visual plane, so to speak, so the chart cannot take any more detail, such as gridlines.

Getting better

I have tried to fix some of this in the version of the figure shown here. It's the same size as the original. The legend, such as it is, is now a visual key to the models. Careful juxtaposition of figures could obviate the need even for this extra key. The idea would be to use the colours and names of the models in every figure, to link them more intuitively.

The principles at work:

  • Reduce the fatigue of reading by labeling things directly.
  • Avoid using 'a' and 'b' or other generic names. Call the parts before and after, or 8 ms gate and 16 ms gate.
  • Put things you want people to compare next to each other: models with data, output with input, etc. 
  • Use less ink for decoration, more ink for data. Gently direct the reader's attention. 
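Here is one way the first and last principles might look in matplotlib, as a rough sketch rather than the code behind the figure in this post. The curves are made-up decay functions, the axis names are placeholders, and the figure is written to an in-memory buffer so nothing lands on disk.

```python
import io
import math

import matplotlib
matplotlib.use('Agg')  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data: two curves named for what they are, not 'a' and 'b'.
x = [i / 10 for i in range(0, 101)]
curves = {
    '8 ms gate': [math.exp(-xi / 4) for xi in x],
    '16 ms gate': [math.exp(-xi / 8) for xi in x],
}

fig, ax = plt.subplots(figsize=(5, 3))
for name, y in curves.items():
    (line,) = ax.plot(x, y, lw=2)
    # Put the name at the end of the line, in the line's colour, instead of in a legend.
    ax.annotate(name, xy=(x[-1], y[-1]), xytext=(5, 0),
                textcoords='offset points', va='center',
                color=line.get_color())

# Less ink for decoration: drop the box around the plot.
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)
ax.set_xlabel('Offset (placeholder units)')
ax.set_ylabel('Amplitude')

buffer = io.BytesIO()
fig.savefig(buffer, format='png', dpi=150, bbox_inches='tight')
```

The reader's eye never has to leave the data to find out which line is which, and there is no legend box competing for space.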

I'm sure there are other improvements we could make. Do you have any tips to share for making better figures? Leave them in the comments. 

Update, 30 Jan 2015

Some great comments came in today, and the point about black and white is well taken. Indeed, our 52 Things books are all black and white, and I end up transforming most images and figures to (I hope) make them clearer without colour. Here's how I'd do this figure in black and white.

On breaking rules

Humans have a complicated relationship with rules. 

One of the mantras of the 21st century economy is 'first, break all the rules'. If the rules are merely stale conventions, then yes: break away. But it's tempting to go too far and scoff at all rules, and even laws, as the petty creations of boring bureaucrats, declaring, "Rules? Pah! We won't be tied down by your rules!"

But it's not that simple. We like some rules, like the rule about not smoking in aeroplanes, or parking in your reserved parking place. When others break those rules, it's annoying. And rules that define boundaries can heighten, not hinder, creativity and impact: look at code golf, Yves Klein, haiku (though the 5–7–5 thing is a myth), and Twitter.

So what to do about a rule we don't like? There are usually a few options:

  1. Obey it. The rule worked! But maybe not for you.
  2. Change it. This might work, but it might take a while. Good luck!
  3. Break it. Easy! Just pretend it's not there. There's no need to feel bad: everyone else is doing it.

Is that it? Be boring, be brave, or stick it to the man? No, it's a false trichotomy. There is a fourth option:

  4. Make the rule irrelevant. Build or contribute to a new version of reality where the rule no longer applies.

In other words, don't break stupid rules — that doesn't change anything. Better to make your point by subverting the entire foundation of stupid rules. For example:

  • When lawyer Larry Lessig decided he'd had enough of copyright restrictions, he didn't say 'screw you guys' and start downloading movies on BitTorrent. He started Creative Commons and transformed the way the sharing economy functions. Result: not just reduced revenue, but reduced impact of traditional media — far more important.
  • The local government will partly fund training for small businesses from a marketing consultant. Apparently, it's common to game this system by hiring a consultant under this program, then simply having them do work for hire — website, branding, and so on. But these are normal business expenses; instead of coercing a broken system to channel public money into private enterprise, we'd all be better off beating a new path to small-scale investment and collaboration. 
  • There's a young would-be Robin Hood in the geoscience publishing world, hosting copyrighted textbook PDFs for free download. He believes he's helping to rid the world of the tyranny of over-priced technical literature, but he's going about it the wrong way. Better to promote open-access literature, and be a champion of legal re-use. This denies 'the establishment' their impact, instead of lauding it, and helps spread truly shareable content.

Next time you come across a rigid rule you don't like, don't break it. Ask instead how you can make the rule not matter.

No Trespassing image CC-BY-SA by Michael Dorausch on Flickr.