My StrataConf highlights

Lots went on at the geologically named, but not geologically inclined, Strata Conference in London. Here are my highlights:

George Dyson was one of the keynote speakers on the first morning. The son of the British–American mathematician Freeman Dyson, George is an author and historian of science and computing. He talked about the history of storage, starting with tally sticks, through the 53kB of global digital storage in 1953, to today. His talk was fascinating. 

Simon Rogers was one of several speakers from the Guardian newspaper, one of the most progressive and online-friendly news outlets in the world. The paper has a host of strategies for putting data first:

  • Their data and viz geeks sit in the middle of the newsroom
  • They built their own software library for data viz, Miso
  • They share the data behind every story on their Datablog

Duncan Irving from Teradata gave the audience a glimpse of the big data geoscientists wield, as I alluded to yesterday. Teradata does data warehousing, but with high-technology extras like distributed storage and level-of-detail layers. I was intrigued by one of the technologies he talked about: SQL on Hadoop. This sounds like gobbledygook, but here's the (possibly horribly misunderstood) gist: store statistical attributes of a massive seismic volume in a database, then you can query them. "Show me all the traces with such-and-such seismic facies."
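
To make that gist a bit more concrete, here is a minimal sketch of the idea in Python. It uses the standard library's sqlite3 module as a small stand-in for a SQL-on-Hadoop engine, which is an assumption on my part and not what Teradata showed. The table name, column names, attribute values, and facies labels are all hypothetical.

```python
import sqlite3

# Hypothetical per-trace attributes:
# (inline_no, crossline_no, rms_amplitude, facies).
# The values and facies labels are made up for illustration.
TRACES = [
    (100, 200, 0.82, "channel sand"),
    (100, 201, 0.15, "shale"),
    (101, 200, 0.77, "channel sand"),
    (101, 201, 0.31, "levee"),
]


def build_catalogue(rows):
    """Load per-trace attributes into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trace_attributes ("
        "inline_no INTEGER, crossline_no INTEGER, "
        "rms_amplitude REAL, facies TEXT)"
    )
    conn.executemany("INSERT INTO trace_attributes VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def traces_with_facies(conn, facies):
    """Return (inline_no, crossline_no) for every trace with the given facies."""
    cursor = conn.execute(
        "SELECT inline_no, crossline_no FROM trace_attributes "
        "WHERE facies = ? ORDER BY inline_no, crossline_no",
        (facies,),
    )
    return cursor.fetchall()


conn = build_catalogue(TRACES)
# "Show me all the traces with such-and-such seismic facies."
print(traces_with_facies(conn, "channel sand"))
```

The point is only that the seismic samples stay where they are, and the query runs against a compact table of attributes computed from them.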

Hjalmar Gislason from Datamarket, whose recent products include Energy Portal, gave us his best practices for publishing data:

  • Use simple formats, like CSV
  • Aim for at least 3 stars in Tim Berners-Lee's system
  • Be consistent across the datasets you publish
  • Put unique IDs everywhere, especially on tables and columns
  • Provide FAQs and clear feedback channels for users
  • Be clear about the license terms of the data

Ben Goldacre, author and bad science crimefighter, gave a keynote on the second day. Almost vibrating with energy, he described how the most basic bias-fighting tool in medicine — randomized controlled trials — might be applied to improving government services (Haynes et al., 2012, Test, learn, adapt). 

At the end of the two days, I had the usual feeling of fullness, fatigue, and anticlimax... but also the inspired, impatient, creative energy that I hope for from events. The consistency of the themes was encouraging — data wants to be free, visualization is necessary but insufficient, reproducibility is core, stories drive us — these are ideas we embrace. They're at the heart of the quiet revolution going on in the world, but perhaps not yet at the heart of our subsurface professional communities. 

Photo by flickr user bjelkeman.


Big data in geoscience

Big data is what we got when the decision cost of deleting data became greater than the cost of storing it.
George Dyson, at Strata London

I was looking for something to do in London this week. Tempted by the Deep-water continental margins meeting in Piccadilly, I instead took the opportunity to attend a different kind of conference. The media group O'Reilly, led by the inspired Tim O'Reilly, organizes conferences. They're known for being energetic, quirky, and small-company-friendly. I wanted to see one, so I came to Strata.

Strata is the conference for big data, one of the woolliest buzzwords in computer science today. Some people are skeptical that it's anything other than a new way to provoke fear and uncertainty in IT executives, the only known way to make them spend money. Indeed, Google "big data" and the top 5 hits are: Wikipedia (obvsly), IBM, McKinsey, Oracle, and EMC. It might be hype, but all this attention might lead somewhere good. 

We're all big data scientists

Geoscientists, especially geophysicists, are unfazed by the concept of big data. The acquisition data from a 3D survey can easily require 10TB (10,240GB) or even 100TB of storage. The data must be written, read, processed, and re-written dozens of times during processing, then delivered, loaded, and interpreted. In geoscience, big data is normal data.

So it's great that big data problems are being hacked on by thousands of developers, researchers, and companies that, until about a year ago, were only interested in games and the web. About 99% of them are not working on problems in geophysics or petroleum, but there will be insight and technology that will benefit our industry.

It's not just about data management. Some of the most creative data scientists in the world are at this conference. People are showing dense, and sometimes beautiful, visualizations of giant datasets, like the transport displays by James Cheshire's research group at UCL (right). I can't wait to show some of these people a SEG-Y or LAS file and, unencumbered by our curmudgeonly tradition of analog display metaphors, see how they would display it.

Would the wiggle display pass muster?


News of the month

Our more-or-less regular news round-up is here again. News tips?

Geophysics giant

On Monday the French geophysics company CGGVeritas announced a deal to buy most of Fugro's Geoscience division for €1.2 billion (a little over $1.5 billion). What's more, the two companies will enter into a joint venture in seabed acquisition. Fugro, based in the Netherlands, will pay CGGVeritas €225 million for the privilege. CGGVeritas also pick up commercial rights to Fugro's data library, which Fugro will retain. Over 2500 people are involved in the deal, and CGGVeritas are now officially Really Big.

Big open data?

As Evan mentioned in his reports from the SEG IQ Earth Forum, Statoil is releasing some of their Gullfaks dataset through the SEG. This dataset is already 'out there' as the Petrel demo data, though there has not yet been an announcement of exactly what's in the package. We hope it includes gathers, production data, core photos, and so on. The industry needs more open data! What legacy dataset could your company release to kickstart innovation?

Journal innovation

Again, as Evan reported recently, SEG is launching a new peer-reviewed, quarterly journal — Interpretation. The first articles will appear in early 2013. The journal will be open access... but only till the end of 2013. Perhaps they will reconsider if they get hundreds of emails asking for it to remain open access! Imagine the impact on the reach and relevance of the SEG that would have. Why not email the editorial team?

In another dabble with openness, The Leading Edge has opened up its latest issue on reserves estimation, so you don't need to be an SEG member to read it. Why not forward it to your local geologist and reservoir engineer?

Updating a standard

It's all about SEG this month! The SEG is appealing for help revising the SEG-Y standard, for its revision 2. If you've ever whined about the lack of standardness in the existing standard, now's your chance to help fix it. If you haven't whined about SEG-Y, then I envy you, because you've obviously never had to load seismic data. This is a welcome step, though I wonder if the real problems are not in the standard itself, but in education and adoption.

The SEG-Y meeting is at the Annual Meeting, which is coming up in November. The technical program is now online, a fact which made me wonder why on earth I paid $15 for a flash drive with the abstracts on it.

Log analysis in OpendTect

We've written before about CLAS, a new OpendTect plug-in for well logs and petrophysics. It's now called CLAS Lite, and is advertised as being 'by Sitfal', though it was previously 'by Geoinfo'. We haven't tried it yet, but the screenshots look very promising.

This regular news feature is for information only. We aren't connected with any of these organizations, and don't necessarily endorse their products or services. Except OpendTect, which we definitely do endorse.


L is for Lambda

Hooke's law says that the force F exerted by a spring depends only on its displacement x from equilibrium, and the spring constant k of the spring:

F = −kx

We can think of k—and experience it—as stiffness. The spring constant is a property of the spring. In a sense, it is the spring. Rocks are like springs, in that they have some elasticity. We'd like to know the spring constant of our rocks, because it can help us predict useful things like porosity. 

Hooke's law is the basis for elasticity theory, in which we express the law as

stress [force per unit area] is equal to strain [deformation] times a constant

This time the constant of proportionality is called the elastic modulus. And there isn't just one of them. Why is it more complicated? Well, rocks are like springs, but they are three-dimensional.

In three dimensions, assuming isotropy, the shear modulus μ plays the role of the spring constant for shear waves. But for compressional waves we need λ+2μ, a quantity called the P-wave modulus. So λ is one part of the term that tells us how rocks get squished by P-waves.

These mysterious quantities λ and μ are Lamé's first and second parameters. They are intrinsic properties of all materials, including rocks. Like all elastic moduli, they have units of force per unit area, or pascals [Pa].

So what is λ?

Matt and I have spent several hours discussing how to describe lambda. Unlike Young's modulus E, or Poisson's ratio ν, our friend λ does not have a simple physical description. Young's modulus just determines how much longer something gets when I stretch it. Poisson's ratio tells how much fatter something gets if I squeeze it. But lambda... what is lambda?

  • λ is sometimes called incompressibility, a name best avoided because it's sometimes also used for the bulk modulus, K.  
  • If we apply stress σ1 along the 1 direction to this linearly elastic isotropic cube (right), then λ represents the 'spring constant' that scales the strain ε along the directions perpendicular to the applied stress.
  • The derivation of Hooke's law in 3D requires tensors, which we're not getting into here. The point is that λ and μ help give the simplest form of the equations (right, shown for one dimension).
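
For readers who want to see where λ sits, the normal-stress part of isotropic Hooke's law can be written out without tensor notation. This is the standard textbook form, and may not match the figure exactly. The subscripts 1, 2, 3 are assumed to follow the axes of the cube, and the last line relates λ to the bulk modulus K mentioned above.

```latex
\begin{aligned}
\sigma_1 &= (\lambda + 2\mu)\,\varepsilon_1 + \lambda\,\varepsilon_2 + \lambda\,\varepsilon_3 \\
\sigma_2 &= \lambda\,\varepsilon_1 + (\lambda + 2\mu)\,\varepsilon_2 + \lambda\,\varepsilon_3 \\
\sigma_3 &= \lambda\,\varepsilon_1 + \lambda\,\varepsilon_2 + (\lambda + 2\mu)\,\varepsilon_3 \\
K &= \lambda + \tfrac{2}{3}\mu
\end{aligned}
```

Read this way, λ is the coefficient that couples a normal stress in one direction to the strains in the other two, while λ + 2μ couples it to the strain in its own direction.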

The significance of elastic properties is that they determine how a material is temporarily deformed by a passing seismic wave. Shear waves propagate by displacements orthogonal to the propagation direction; this deformation is determined by μ. In contrast, P-waves propagate by displacements parallel to the propagation direction, and this deformation is inversely proportional to M, which is λ + 2μ.
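
To make the roles of λ and μ concrete, here is a minimal sketch that recovers both from quantities we can measure. It assumes an isotropic, linearly elastic rock and uses the standard relations μ = ρVs² and M = λ + 2μ = ρVp², which are not spelled out in the text above. The function name and the velocity and density values are made up for illustration.

```python
def lame_parameters(vp, vs, rho):
    """Return (lambda, mu) in pascals.

    vp, vs : P- and S-wave velocities in m/s
    rho    : bulk density in kg/m^3
    Assumes an isotropic, linearly elastic medium.
    """
    mu = rho * vs**2              # shear modulus
    p_wave_modulus = rho * vp**2  # M = lambda + 2*mu
    lam = p_wave_modulus - 2 * mu
    return lam, mu


# Illustrative values only, not measurements from any real rock.
lam, mu = lame_parameters(vp=3000.0, vs=1500.0, rho=2400.0)
print(f"lambda = {lam / 1e9:.1f} GPa, mu = {mu / 1e9:.1f} GPa")
```

Notice that λ is the only one of the three quantities we never measure directly: it is what is left of the P-wave modulus once the shear contribution 2μ is removed.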

Lambda rears its head in seismic petrophysics and AVO inversion, and is the first letter in the acronym of Bill Goodway's popular LMR inversion method (Goodway, 2001). Even though it is fundamental to seismic, there's no doubt that λ is not intuitively understood by most geoscientists. Have you ever tried to explain lambda to someone? What description of λ do you find useful? I'm open to suggestions.

Goodway, B., 2001, AVO and Lamé constants for rock parameterization and fluid detection: CSEG Recorder, 26, no. 6, 39-60.


On being the world's smallest technical publishing company

Four months ago we launched our first book, 52 Things You Should Know About Geophysics. This little book contains 52 short essays by 37 amazing geoscientists. And me and Evan. 

Since it launched, we've been having fun hearing from people who have enjoyed it:

Yesterday's mail brought me an envelope from Stavanger — Matteo Niccoli sent me a copy of 52 Things. In doing so he beat me to the punch as I've been meaning to purchase a copy for some time. It's a good thing I didn't buy one — I'd like to buy a dozen. [a Calgary geoscientist]

A really valuable collection of advice from the elite in Geophysics to help you on your way to becoming a better more competent Geophysicist. [a review on]

We are interested in ordering 50 to 100 copies of the book 52 Things You Should Know About Geophysics [from an E&P company. They later ordered 100.]

The economics

We thought some people might be interested in the economics of self-publishing. If you want to know more, please ask in the comments — we're happy to share our experiences. 

We didn't approach a publisher with our book. We knew we wanted to bootstrap and learn — the Agile way. Before going with Amazon's CreateSpace platform, we considered Lightning Source (another print-on-demand provider), and an ordinary 'web press' printer in China. The advantages of CreateSpace are Amazon's obvious global reach, and not having to carry any inventory. The advantages of a web press are the low printing cost per book and the range of options — recycled paper, matte finish, gatefold cover, and so on.

So, what does a book cost?

  • You could publish a book this way for $0. But, unless you're an editor and designer, you might be a bit disappointed with your results. We spent about $4000 making the book: interior design about $2000, cover design about $650, indexing about $450. We lease the publishing software (Adobe InDesign) for about $35 per month.
  • Each book costs $2.43 to manufacture. Books are printed just in time — Amazon's machines must be truly amazing. I'd love to see them in action. 
  • The cover price is $19, about €15 at Amazon's European stores, and £12 in the UK. Amazon are free to offer whatever discounts they like, at their expense (currently 10%). And of course you can get free shipping. Amazon charges a 40% fee, so after we pay for the manufacturing, we're left with about $8 per book.
  • We also sell through our own estore, at $19. This is just a slightly customizable Amazon page. This channel is good for us because Amazon only charges 20% of the sale price as their fee. So we make about $12 per book this way. We can give discounts here too — for large orders, and for the authors.
  • Amazon also sells the book through a so-called expanded distribution channel, which puts the book on other websites and even into bookstores (highly unlikely in our case). Unfortunately, it doesn't give us access to universities and libraries. Amazon's take is 60% through this channel.
  • We sell a Kindle edition for $9. This is a bargain, by the way—making an attractive and functional ebook was not easy. The images and equations look terrible, ebook typography is poor, and it just feels less like a book, so we felt $9 was about right. The physical book is much nicer. Kindle royalties are complicated, but we make about $5 per sale. 
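
The per-book arithmetic for the print channels above can be sketched as follows. This is a simplified model: it assumes Amazon's fee is a plain percentage of the $19 cover price and that the manufacturing cost comes out afterwards, which may not match the real fee schedule. The function and variable names are made up for illustration.

```python
COVER_PRICE = 19.00         # dollars, from the list above
MANUFACTURING_COST = 2.43   # dollars per printed book, from the list above

# Amazon's fee as a fraction of the sale price, per channel.
CHANNEL_FEES = {
    "Amazon retail": 0.40,
    "estore": 0.20,
    "expanded distribution": 0.60,
}


def margin_per_book(cover_price, fee_fraction, manufacturing_cost):
    """What is left after the channel fee and the printing cost."""
    return cover_price * (1 - fee_fraction) - manufacturing_cost


margins = {
    channel: margin_per_book(COVER_PRICE, fee, MANUFACTURING_COST)
    for channel, fee in CHANNEL_FEES.items()
}

for channel, margin in margins.items():
    print(f"{channel}: ${margin:.2f} per book")
```

Under these assumptions the margins come out a little higher than the "about $8" and "about $12" quoted above, which suggests either that the figures in the list are rounded down or that the real fees include something this model leaves out.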

By the numbers

It doesn't pay to fixate on metrics—most of the things we care about are impossible to measure. But we're quantitative people, and numbers are hard to resist. To recoup our costs, not counting the time we lovingly invested, we need to sell 632 books. (Coincidentally, this is about how many people visit every week.) As of right now, there are 476 books out there 'in the wild', 271 of which were sold for actual money. That's a good audience of people — picture them, sitting there, reading about geophysics, just for the love of it.

The bottom line

My wife Kara is an experienced non-fiction publisher. She's worked all over the world in editorial and production. So we knew what we were getting into, more or less. The print-on-demand stuff was new to her, as was the retail side of things. We already knew we suck at marketing. But the point is, we knew we weren't in this for the money, and it's about relevant and interesting books, not marketing.

And now we know what we're doing. Sorta. We're in the process of collecting 52 Things about geology, and are planning others. So we're in this for one or two more whatever happens, and we hope we get to do many more.

We can't say this often enough: Thank You to our wonderful authors. And Thank You to everyone who has put down some hard-earned cash for a copy. You are awesome.