News
Tuesday
Jul 29, 2014

Graphics that repay careful study

The Visual Display of Quantitative Information by Edward Tufte (2nd ed., Graphics Press, 2001) celebrates communication through data graphics. The book provides a vocabulary and practical theory for data graphics, and Tufte pulls no punches — he suggests why some graphics are better than others, and even condemns failed ones as lost opportunities. The book outlines empirical measures of graphical performance, and describes the pursuit of graphic-making as one of sequential improvement through revision and editing. I see this book as a sort of moral authority on visualization, and as the reference book for developing graphical taste.

Through design, the graphic artist allows the viewer to enter into a transaction with the data. High-performance graphics, according to Tufte, 'repay careful study'. They support discovery, probing questions, and a deeper narrative. These kinds of graphics take a lot of work, but they do a lot of work in return. In later books Tufte writes, 'To clarify, add detail.'

A stochastic AVO crossplot

Consider this graphic from the stochastic AVO modeling section of modelr. Its elements are constructed with code, and since it is a program, it is completely reproducible.

Let's dissect some of the conceptual high points. This graphic shows all the data simultaneously across 3 domains, one in each panel. The data points are sampled from probability density estimates of the physical model. It is a large dataset from many calculations of angle-dependent reflectivity at an interface. The data is revealed with a semi-transparent overlay, so that areas of certainty are visually opaque, and areas of uncertainty are harder to see.

At the same time, you can still see every data point that makes up the graphic, which gives a broad overview (the range and additive intensity of the lines and points) as well as the finer structure. We place the two modeled dimensions with templates in the background, alongside the physical model histograms. We can see, for instance, how likely we are to see a phase reversal, or a Class 3 response, subject to the physical probability estimates. The statistical and site-specific nature of subsurface modeling is represented in spirit. All the data has context, and all the data has uncertainty.
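To make the idea concrete, here is a minimal sketch of how the points in such a crossplot might be generated. It is not modelr's code. It assumes Gaussian distributions for the rock properties (the means and standard deviations are made up), uses the two-term Shuey approximation for intercept and gradient, and treats "negative intercept and negative gradient" as a crude stand-in for a Class 3 response. The names `shuey_terms`, `sample_layer` and `simulate` are hypothetical.

```python
import random

def shuey_terms(vp1, vs1, rho1, vp2, vs2, rho2):
    """Intercept and gradient from the two-term Shuey approximation."""
    vp, vs, rho = (vp1 + vp2) / 2, (vs1 + vs2) / 2, (rho1 + rho2) / 2
    dvp, dvs, drho = vp2 - vp1, vs2 - vs1, rho2 - rho1
    intercept = 0.5 * (dvp / vp + drho / rho)
    gradient = 0.5 * dvp / vp - 2 * (vs / vp) ** 2 * (drho / rho + 2 * dvs / vs)
    return intercept, gradient

def sample_layer(rng, means, stds):
    """Draw one (Vp, Vs, density) realization from independent Gaussians."""
    return tuple(rng.gauss(m, s) for m, s in zip(means, stds))

def simulate(n=5000, seed=42):
    """Return n (intercept, gradient) pairs for a randomly perturbed interface."""
    rng = random.Random(seed)
    # Assumed properties, for illustration only: (Vp m/s, Vs m/s, density kg/m3)
    upper = ((2900.0, 1300.0, 2400.0), (100.0, 60.0, 40.0))
    lower = ((2500.0, 1500.0, 2100.0), (120.0, 70.0, 50.0))
    return [
        shuey_terms(*sample_layer(rng, *upper), *sample_layer(rng, *lower))
        for _ in range(n)
    ]

points = simulate()
# Crude proxy for a Class 3 response: both intercept and gradient negative
class3 = sum(1 for r0, g in points if r0 < 0 and g < 0) / len(points)
print(f"Fraction with negative intercept and gradient: {class3:.2f}")
```

Drawing these points with a low opacity in any plotting library should give the additive-intensity effect described above: dense, likely regions go opaque, and the unlikely fringes stay faint.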

Rules for graphics that work

Tufte summarizes that excellent data graphics should:

  • Show all the data.
  • Provoke the viewer into thinking about meaning.
  • Avoid distorting what the data have to say.
  • Present many numbers in a small space.
  • Make large data sets coherent.
  • Encourage the eye to compare different pieces of the data.
  • Reveal the data at several levels of detail, from a broad overview to the fine structure.
  • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
  • Be closely integrated with the statistical and verbal descriptions of a data set.

The data density and the data-ink ratio both look reasonably high in my crossplot, but they could likely still be optimized. What would you remove? What would you add? What elements need revision?
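For reference, Tufte defines these as two separate measures: the data-ink ratio is the share of a graphic's ink that presents data, and the data density is the number of entries in the data matrix divided by the area of the graphic. A small sketch of both follows. The function names are mine, and the pixel counts are invented purely to show how removing non-data ink changes the ratio.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that presents data (between 0 and 1)."""
    if total_ink <= 0 or not 0 <= data_ink <= total_ink:
        raise ValueError("need total_ink > 0 and 0 <= data_ink <= total_ink")
    return data_ink / total_ink

def data_density(n_entries, area):
    """Entries in the data matrix per unit area of the graphic."""
    if area <= 0:
        raise ValueError("area must be positive")
    return n_entries / area

# Hypothetical ink measurements, counted in pixels
before = data_ink_ratio(data_ink=60_000, total_ink=100_000)
after = data_ink_ratio(data_ink=60_000, total_ink=75_000)  # same data, less non-data ink
print(before, after)
```

Erasing gridlines, boxes, or redundant labels raises the ratio without touching a single data point, which is one way to approach the questions above.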

Wednesday
Jul 23, 2014

Whither technical books?

Leafing through our pile of new books on seismic analysis got me thinking about technical books and the future of technical publishing. In particular:

  • Why are these books so expensive? 
  • When will we start to see reproducibility?
  • Does all this stuff just belong on the web?

Why so expensive?

Should technical books really cost several times what ordinary books cost? Professors often ask us for discounts for modelr, our $9/mo seismic modeling tool. Students pay 10% of what pros pay in our geocomputing course. Yet academic books cost three times what consumer books cost. I know it's a volume game — but you're not going to sell many books at $100 a go! And unlike consumer books, technical authors usually don't make any money — a star writer may score 6% of net sales... once 500 books have been sold (see Handbook for Academic Authors).

Where's the reproducibility?

Compared to the amazing level of reproducibility we saw at SciPy — where the code to reproduce virtually every tutorial, talk, and poster was downloadable — books are still rather black box. For example, the figures are often drafted, not generated. A notable (but incomplete) exception is Chris Liner's fantastic (but ridiculously expensive) volume, Elements of 3D Seismology, in which most of the figures seem to have been generated by Mathematica. The crucial final step is to share the code that generated them, and he's exploring this in recent blog posts (e.g. right).

I can think of three examples of more reproducible geophysics in print:

  1. Gary Mavko has shared a lot of MATLAB code associated with Quantitative Seismic Interpretation and The Rock Physics Handbook. The code to reproduce the figures is not provided, and MATLAB is not really open, but it's a start.
  2. William Ashcroft's excellent book, A Petroleum Geologist's Guide to Seismic Reflection, contains (proprietary, Windows only) code on a CD, so you could in theory make some of the figures yourself. But it wouldn't be easy.
  3. The series of tutorials I'm coordinating for The Leading Edge has, so far, included all the code to reproduce figures, exclusively written in open languages and using open or synthetic data. Kudos to SEG!

Will the web win?

None of this comes close to Sergey Fomel's brand of fully reproducible geophysics. He is a true pioneer in this space, up there with Jon Claerbout. (You should definitely read his blog!) One thing he's been experimenting with is 'live' reproducible documents in the cloud. If we don't see an easy way to publish live, interactive notebooks in the cloud this year, we'll see them next year for sure.

So imagine being able to read a technical document, a textbook say, with all the usual features you get online: links, hover-over, clickable images, etc. But then add the ability to not only see the code that produced each figure, but to edit and re-run that code. Or add slider widgets for parameters: "What happens to the gather if I change Poisson's ratio?" Now, since you're on the web, you can share your modification with your colleagues, or the world.
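As a taste of what such a 'live' figure might compute behind a slider, here is a rough sketch of the Poisson's ratio question. It assumes a single interface, the two-term Shuey approximation, and made-up layer properties. The function names are hypothetical, and a real gather would also need a wavelet and moveout.

```python
import math

def vs_from_poisson(vp, nu):
    """S-wave velocity from P-wave velocity and Poisson's ratio."""
    return vp * math.sqrt((1 - 2 * nu) / (2 * (1 - nu)))

def reflectivity(theta_deg, upper, lower):
    """Two-term Shuey reflectivity, R0 + G sin^2(theta).

    upper and lower are (vp, poisson_ratio, density) tuples.
    """
    vp1, nu1, rho1 = upper
    vp2, nu2, rho2 = lower
    vs1, vs2 = vs_from_poisson(vp1, nu1), vs_from_poisson(vp2, nu2)
    vp, vs, rho = (vp1 + vp2) / 2, (vs1 + vs2) / 2, (rho1 + rho2) / 2
    r0 = 0.5 * ((vp2 - vp1) / vp + (rho2 - rho1) / rho)
    g = 0.5 * (vp2 - vp1) / vp - 2 * (vs / vp) ** 2 * (
        (rho2 - rho1) / rho + 2 * (vs2 - vs1) / vs
    )
    return r0 + g * math.sin(math.radians(theta_deg)) ** 2

# Assumed properties, for illustration only: (Vp m/s, Poisson's ratio, density kg/m3)
upper = (2900.0, 0.35, 2400.0)
angles = [0, 10, 20, 30]
for nu in (0.1, 0.2, 0.3):  # the values a slider might step through
    lower = (2500.0, nu, 2100.0)
    trace = [round(reflectivity(a, upper, lower), 3) for a in angles]
    print(f"Poisson's ratio {nu}: {trace}")
```

In this sketch the zero-offset value does not depend on Poisson's ratio at all. Only the change with angle does, and that is exactly the sort of thing a reader could discover by dragging a slider.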

Now that's a book I'd be glad to pay double for.

Some questions for you

We'd love to know what you think of technical books. Leave a comment below, or get in touch.

  • Do you purchase technical books regularly? What prompts you to buy a book?
  • What book keeps getting pulled off your shelf, and which ones collect dust?
  • What's missing from the current offerings? Workflows, regional studies, atlases,...?
  • Would you rather just consume everything online? Do you care about reproducibility?

400 posts

The last post was our 400th on this blog. At an average of 500 words, that's about 200,000 words since we started at the end of 2010. Enough for a decent-sized novel, but slightly less likely to win a Pulitzer. In that time, according to Google, almost exactly 100,000 individuals have stopped by agilegeoscience.com — most of them lots of times — thank you readers for keeping us going! The most popular posts: Shale vs tight, Rock physics cheatsheet, and Well tie workflow. We hope you enjoy reading at least half as much as we enjoy writing.

Friday
Jul 18, 2014

Six books about seismic analysis

Last year, I did a round-up of six books about seismic interpretation. A raft of new geophysics books recently, mostly from Cambridge, prompts this look at six volumes on seismic analysis — the more quantitative side of interpretation. We seem to be a bit hopeless at full-blown book reviews, and I certainly haven't read all of these books from cover to cover, but I thought I could at least mention them, and give you my first impressions.

If you have read any of these books, I'd love to hear what you think of them! Please leave a comment. 

Observation: none of these volumes mention compressive sensing, borehole seismic, microseismic, tight gas, or source rock plays. So I guess we can look forward to another batch in a year or two, when Cambridge realizes that people will probably buy anything with 3 or more of those words in the title. Even at $75 a go.


Quantitative Seismic Interpretation

Per Avseth, Tapan Mukerji and Gary Mavko (2005). Cambridge University Press, 408 pages, ISBN 978-0-521-15135-1. List price USD 91, $81.90 at Amazon.com, £45.79 at Amazon.co.uk

You have this book, right?

Every seismic interpreter that's thinking about rock properties, AVO, inversion, or anything beyond pure basin-scale geological interpretation needs this book. And the MATLAB scripts.

Rock Physics Handbook

Gary Mavko, Tapan Mukerji & Jack Dvorkin (2009). Cambridge University Press, 511 pages, ISBN 978-0-521-86136-6. List price USD 100, $92.41 at Amazon.com, £40.50 at Amazon.co.uk

If QSI is the book for quantitative interpreters, this is the book for people helping those interpreters. It's the Aki & Richards of rock physics. So if you like sums, and QSI left you feeling unsatisfied, buy this too. It also has lots of MATLAB scripts.

Seismic Reflections of Rock Properties

Jack Dvorkin, Mario Gutierrez & Dario Grana (2014). Cambridge University Press, 365 pages, ISBN 978-0-521-89919-2. List price USD 75, $67.50 at Amazon.com, £40.50 at Amazon.co.uk

This book seems to be a companion to The Rock Physics Handbook. It feels quite academic, though it doesn't contain too much maths. Instead, it's more like a systematic catalog of log models, exploring the full range of seismic responses to rock properties.

Practical Seismic Data Analysis

Hua-Wei Zhou (2014). Cambridge University Press, 496 pages, ISBN 978-0-521-19910-0. List price USD 75, $67.50 at Amazon.com, £40.50 at Amazon.co.uk

Zhou is a professor at the University of Houston. His book leans towards imaging and velocity analysis — it's not really about interpretation. If you're into signal processing and tomography, this is the book for you. Mostly black and white, the book has lots of exercises (no solutions though).

Seismic Amplitude: An Interpreter's Handbook

Rob Simm & Mike Bacon (2014). Cambridge University Press, 279 pages, ISBN 978-1-107-01150-2 (hardback). List price USD 80, $72 at Amazon.com, £40.50 at Amazon.co.uk

Simm is a legend in quantitative interpretation and the similarly lauded Bacon is at Ikon, the pre-eminent rock physics company. These guys know their stuff, and they've filled this superbly illustrated book with the essentials. It belongs on every interpreter's desk.

Seismic Data Analysis Techniques...

Enwenode Onajite (2013). Elsevier, 256 pages, ISBN 978-0124200234. List price USD 130, $113.40 at Amazon.com, £74.91 at Amazon.co.uk

This is the only book of the collection I don't have. From the preview I'd say it's aimed at undergraduates. It starts with a petroleum geology primer, then covers seismic acquisition, and seems to focus on processing, with a little on interpretation. The figures look rather weak, compared to the other books here. Not recommended, not at this price.

NOTE These prices are Amazon's discounted prices and are subject to change. The links contain a tag that gets us commission, but does not change the price to you. You can almost certainly buy these books elsewhere. 

Tuesday
Jul 15, 2014

The event that connects like the web

Last week, Matt, Ben, and I attended SciPy 2014, the 13th annual scientific computing with Python conference. On a superficial level, it was just another conference. But there were other elements, brought forth by the organizers and participants (definitely not just attendees) and slowly revealed over the week. Together, the community created the conditions for a truly remarkable experience.

Immutable accessibility

By design, the experience starts before the event, and continues after it is over. Before each of the four half-day tutorials I attended, the instructors posted their teaching materials, code, and setup instructions. Most oral presentations did the same. Most code and content was served through GitHub or Bitbucket and instructions were posted using Mozilla's Etherpad. Ultimately the tools don't matter — it's the intention that is important. Instructors and speakers plan to connect.

Enhancing the being there

Beyond talks and posters, here are some examples of other events that were executed with engagement in mind:

  • Keynote presentations. If a keynote is truly key, design the schedule so that everyone can show up — they're a great way to start the day on a high note.
  • Birds of a Feather sessions are better than a panel discussion or Q&A. Run around with a microphone, and record notes in Etherpad.
  • Lightning talks at the end of the day. Anyone can request 5 minutes on a show & tell. It was the first time I've heard applause erupt in the middle of a talk, and it happened several times.
  • Developer sprints. Take an hour to teach newbies how to become active members of your community or your project. Then spend two days showing them how you work.

Record all the things

SciPy is not a conference, it's a hypermedia stream that connects networks across organizational boundaries. And it happens in real time: I overheard several people remarking in astonishment that the video of so-and-so's talk earlier that same morning was already posted online. My trained habit of frantic note-taking was redundant, freeing my concentration for more active listening. Instructors and presenters published their media online, and the majority of presenters pulled up interactive IPython notebooks in the browser and executed code on the fly.

As an example of this, here's Karl Schleicher of Sergey Fomel's group at UT, talking about reproducing the results from a classic paper in The Leading Edge, Spitz (1999)

We need this

On Friday evening Matt remarked to one of the sponsors, "This is the closest thing I have seen to what a conference should be". I think what he meant is that it should be about connecting. It should be about pushing our work out to the largest possible scope. It should be open by default, and designed to support ideas and conversations long after it is over. Those are all things the web is for, too.

Our question: Can we help SEG, AAPG, or EAGE deliver this to our community? Or do we have to go and build it? 

Friday
Jul 11, 2014

Geophysics at SciPy 2014

Wednesday was geophysics day at SciPy 2014, the conference for scientific Python in Austin. We had a mini-symposium in the afternoon, with 4 talks and 2 lightning talks about posters.

All the talks

Here's what went on in the session...

The talks should all be online eventually. For now, you can watch my talk and Joe's (awesome) talk right here...

And also...

There have been so many other highlights at this amazing conference that I can't resist sharing a couple of the non-geophysical gems...

Last thing... If you use the scientific Python stack in your work, please consider giving as generously as you can to the NumFOCUS Foundation. Support open source!