What I've Learned From Writing 100 Weekly Data Science Newsletters

Dataiku Company Alivia Smith

Almost two years ago, we got together with the marketing team at Dataiku and decided that if we wanted to be more cutting edge in our content, we should launch a weekly newsletter of the coolest data articles we were reading. I did not realize when I said, "Oh that sounds fun, I want to do it!" how much I would learn from this newsletter (and how much work it would be to scour the internet weekly in search of those gems).

Thinking back on the past 100 issues of Banana Data News, here's what I've learned.

There's Tons of Exciting Stuff Happening in Data Science

It’s crazy how rich the field is. You never have to look very far to find an amazing new discovery or a great project every week. And there’s more and more happening. When we started, I would share five articles in one newsletter, and it took about half a day to make a selection from all the articles I found.

Now we share about 50 links a week between our intro blurb, our top four articles, our top video and infographic, and our other links. And I still spend half a day selecting these articles.

What’s interesting, though, is that we haven't seen much change in the technologies from when we first started Banana Data until today. People are doing cool projects using mostly natural language processing and neural networks (or deep neural networks), and these technologies have been around for a while now even though they’re becoming more available. Arguably, scholars and students are just getting more inventive with old tech.

AI Issues Are Never Black or White

Artificial intelligence (AI) in and of itself isn’t black or white, of course. On any given week, there will for sure be an article on how AI will eventually take all of our jobs, but there will also be an article on how AI can help diagnose heart failure better than doctors — making AI the largest gray area in science these days.

But even beyond AI, for any technology or use case, you’ll see one article followed by another taking the exact opposite stance. You’ll see one article on how you shouldn’t do deep learning if you don’t have massive amounts of data, and another stating that you can actually do it without loads of data.

Data People Are Interested in Content About Making Their Job Easier

When you listen to or read tech news these days, you hear about a lot of new research and about a lot of smart people giving their vision of what a future AI world would look like. But the data is in: the articles that work best in Banana Data are articles about making your day-to-day work with data more efficient and how other tech companies are handling their data. We actually took a look at those trends a little under a year ago, and they haven't changed.

Among the top-clicked articles of all time, you’ll find this article on how Python makes working with data more difficult in the long term and this one on democratizing data at Airbnb.

The rest of our top positions are held by awesome data visualization projects, like this one on the color scheme of Wes Anderson movies and this project by Flowing Data illustrating the "firsts" of major relationship milestones.

And the articles that work the best on Twitter, you ask? Well, unsurprisingly, data visualization projects and infographics are always a hit. And articles on learning new data tricks get the most click-throughs!

Tech Hype Cycles Are a Thing

When I started writing about data and AI, the big thing was chatbots. Everyone was launching their chatbot company and talking about a new AI-fueled era of automated customer success. The opportunities seemed endless. And nothing really came out of it (other than Microsoft Tay, and we should always be thankful for Microsoft Tay). Blockchain was big six months ago, and we’re not hearing that much about it anymore. And these days, it seems like we’re making so much progress with self-driving cars that it’s a matter of months until we get to rate our robot Uber driver.

Gartner Hype Cycle

In the world of data science, the hype cycles are real and expectations become high before trends dip away.

Most interesting, though, is how many of these trends, and of the global trends of the tech world in general, are powered by data and AI. Arguably, all of them are. You just can’t come up with something new in tech without putting that data to good use!

The Opposition to an Algorithmic World Is Getting More Concrete (and Convincing)

When I started working on Banana Data (and at Dataiku for that matter), I had a very optimistic view of data science and a world where algorithms helped people get better service or more interesting jobs. I was largely unconvinced by articles and reports about a future where robots will take our jobs. To me, they sounded like Luddites holding on to a less efficient way of doing things. And I truly believed that, through recommendations (and automation), people’s lives were getting better.

Recent articles are altering that point of view. Why? Because they have become more concrete, pointing to extreme situations that we are actually seeing now and exposing what they could mean for the future (instead of making vague threats about what a supreme AI could, in theory, do to humanity).

I now worry that the opaque algorithms behind social media can shape democracy, and I worry about what kids get exposed to. I also realize that GAFA (Google, Apple, Facebook, and Amazon) are using my data not only to provide a service, but mostly to maintain an unbeatable competitive advantage. And since they represent 70% of online traffic, they’re making me increasingly uncomfortable.
