Issue 69

Thar she blows! A whale of a metaphor.

April 24, 2019

Hello friend! Welcome to Scrap Facts.

I'm a reporter covering health and science with insatiable curiosity. I love everything I learn, not all of which gets its own story. Each week, I'll bring you some of my favorite facts that I picked up on the job or while out living life.

Archives from Tinyletter can be found here.

Today in Gene Reading: This easy tool will help you decide if you should take an at-home genetic test

And

Moby Dick: Or, the imprecision of genetic testing

Both of these stories are interactive. The first is a chatbot I helped come up with that will tell you if you should take a direct-to-consumer genetic test. The second is an amazing look at how these companies barely read your genome. Choice quote:

If reading Moby Dick front to back is a complete decoding of your genome, then these ancestry tests would be like skimming—not even reading!—the CliffNotes.

You’ll notice that these are both creations largely by my colleague Daniel Wolfe, who is a data reporter here at Quartz.

Today, I’ve got a short Q&A with Daniel about what he learned while reporting these stories. I’ve lightly edited and condensed it for clarity.


Katherine Ellen Foley: Where did the idea of using Moby Dick to explain how much DNA direct-to-consumer genetic tests examine come from?

Wolfe: When you understand the scale of the human genome, you start to understand why a direct-to-consumer test has an original sin of how it’s run: Only a very small fraction of your genome is actually looked at. So you know from the beginning that the visualization technique is one that’s looking at quantifying scale.

My dad was a molecular biologist for the National Institutes of Health, and I remember growing up and following the Human Genome race. I had heard many times how my dad would describe it and describe the sheer length of information and how difficult it would be.

I also went to school for English literature and writing in my undergrad, and my senior seminar was in Melville. Moby Dick has got to be one of my favorite works of American fiction, and it came to mind because I know the human genome is not a set of instructions. People characterize it as that, but it’s more than that—it’s a lot to do with interpretation.

Foley: Did you have a favorite part of working on this?

Wolfe: I remember one of our meetings where you, me and Elijah [Wolfson] were talking about how we were going to get the ancestry part across. I said, “Oh we should just find common phrases and sentences [in Moby Dick] and see where this would appear in other literature.”

As it often happens, I said this without knowing how I would do it.

I reached out to Jeremy Merrill, another Quartz journalist, because I know he’s familiar with machine learning. He gave me advice on how to sort and manage the text. I wrote a bunch of functions that took Moby Dick, stripped it down to its sentences, and removed punctuation, capitalization and all that stuff. They then made an ordered list, like a csv or Excel spreadsheet, of all the words and phrases that are said in Moby Dick.

I did that to maybe a couple dozen other books that were all in the public domain in the Gutenberg library.

Nothing beats that sensation of like writing a bunch of code and then firing a single execution script that fires it up and then it prints out a return statement and it works.
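For the curious, the text processing Daniel describes can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not his actual code: the function names, the naive sentence splitter, and the choice of three-word phrases are all hypothetical, and the two sample strings stand in for full books.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence):
    """Lowercase a sentence, drop everything except ASCII letters, digits and whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", sentence.lower())
    return cleaned.split()


def phrases(text, n=3):
    """Count every run of n consecutive words, never crossing a sentence boundary."""
    counts = Counter()
    for sentence in split_sentences(text):
        words = normalize(sentence)
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


def shared_phrases(source_text, other_text, n=3):
    """Return the phrases that appear in both texts, in alphabetical order."""
    return sorted(set(phrases(source_text, n)) & set(phrases(other_text, n)))


# Tiny stand-ins for two books; a real run would read full text files instead.
moby = "Call me Ishmael. It is a way I have of driving off the spleen."
other = "She said it is a way of life! Call me tomorrow."
print(shared_phrases(moby, other))  # prints ['is a way', 'it is a']
```

Running the same `phrases` function over each book and comparing the results is one plausible way to see which Moby Dick phrases turn up elsewhere. A real version would need a sturdier sentence splitter, since this one trips on abbreviations like "Mr."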

Foley: Was there anything particularly unique about this code that you wrote?

Wolfe: Because I’m such a Melville geek, I couldn’t help myself when naming the variables and functions in my code. When the page loads, I have a function that initializes all my listeners. This is how we know where you are on the page and what you click on, hence “listener.” The function is called `fastenYeListeners` because it’s like I’m battening down the hatches. Timers on the page are named after ships that appear in Moby Dick, Pequod being the most famous. My function that loads the entire page is called `tharSheBlows`, which is also the kicker of the article.

Foley: Was there anything you learned about the genome that surprised you?

Wolfe: The idea of variants of unknown significance. I had learned that from talking to a genetic counselor. We were talking about BRCA1/2. I had known that there were a lot of variations, but this counselor said, “People don’t understand within a single gene how many tens of thousands of variations that we have no idea what they do.”

The knowing and not knowing part is wild to me—I was hoping to show that with the little animation that shows you how many different ways this paragraph could be written, and the idea that we don’t know if it could be cancer causing or not.

Foley: What about direct-to-consumer genetic tests?

Wolfe: We spent weeks getting up to speed and understanding these concepts. It just really blows my mind that this is how complicated it is. And yet, a company like 23andMe presents it so simply and plainly. It’s really made me think about design ethics more than I had before.

I think it’s one thing to have well-designed products that make a user feel good and give a sense of certainty. When you take those same designers and put them on a product team that’s promoting that it’s doing science, you enter an ethical quandary. I think it’s pretty alarming that because everything looks so clean and neat on the sites, it gives you the sense that it is scientific and it is believable. If you’re telling people their risk of cancer, it’s not okay. A product team is going to think about what makes marketing look good, and not “is this dangerous?”

If you like Daniel’s work, check out his other stuff for Quartz (including a great story on the correct way to use port-a-potties) and his Twitter.


Curious about other direct-to-consumer genetic tests? Be sure to check out the state of play, and be on the lookout for more stories we’ll publish later this week. You can sign up for a free trial of Quartz membership here, or, if that is not feasible for you, email me at scrapfacts@gmail.com and I can get you a PDF.

In tomorrow’s issue I’ll bring you a story Daniel and I worked on together about the flaws with ancestry direct-to-consumer genetic tests that have larger social consequences. This story inspired the entire series, and I really think you’ll like it.

Additionally, on Friday, April 25, at 11 am US Eastern time, I’ll be on a members-only video call with editor Elijah Wolfson to talk about more of what I learned while reporting this series. Got questions you’d like me to answer then, or in an issue? Send them my way at scrapfacts@gmail.com.

That’s all for now. Stay curious, friend! <3

If you love Scrap Facts, consider sending it to a friend. Wanna keep in touch outside of this newsletter? Follow me on Twitter and Instagram. Top image by E. Y. Smith, headshot drawing by Richard Howard.