When we were looking ahead to 2020 at the end of last year, everything was about scale and becoming bigger, better, and bolder. After a tumultuous year, teams are undoubtedly more focused on retooling, finding creative ways to collaborate efficiently, and working in a new (and ever-evolving) world. When things return to the status quo, that need to collaborate isn't going to go away.
What else is going to stay and what will go? Have teams reached a tipping point between the collection of data and doing something with it? In the final episode of Season 4 of the Banana Data Podcast, hosts Triveni Gandhi and Christopher Peter Makris discuss topics like the normalization and accelerated adoption of AI, the continued rise of citizen data science (and the importance of bridging technical and domain expertise), the concept of Responsible AI, and more.
Read the Episode
Not a podcast person? Here's the full transcript of the 2021 AI trends episode for you to read at your leisure.
Triveni: This is the Banana Data Podcast, a podcast hosted by Dataiku, I'm Triveni.
Chris: And I'm Chris.
Triveni: And in our bi-weekly episodes, we'll discuss the good, the great, and the ugly of AI. If you're craving even more, check out our 15-minute Banana Bytes discussions on Dataiku's LinkedIn and Twitter every Wednesday at 2:00 p.m. EST. For our final episode of the season, we're looking at the biggest trends in data science and AI that will lead us into 2021, from latency and normalized AI to citizen data scientists and actualized Responsible AI. Hey Chris.
Chris: Hey Triveni.
Triveni: So we are coming up on the end of the year 2020, which probably felt longer than 12 months to a lot of us, but we're here, we made it, and usually at the end of the year, the podcast likes to do a looking forward, thinking about what are the trends that we're going to see coming up in the next 12 to 18 months.
Chris: So we're using our data science expertise now to actually do a meta-analysis on data science itself.
Triveni: That's exactly what we're doing. We were doing some research for this episode and we saw list after list of this trend and that trend, and if you list out 30 things as a trend, statistically speaking, you're going to get some of those things right.
Chris: We're going to need some multiple testing corrections here, so don't get me started.
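To put rough numbers on the "30 trends" joke, here is a minimal Python sketch. It assumes each prediction is independent and has the same chance of coming true, and the 10% chance and 0.05 significance level are illustrative values, not figures from the episode. The second function shows the Bonferroni correction, one common form of the multiple testing correction Chris mentions.

```python
def prob_at_least_one_hit(n_predictions: int, p_each: float) -> float:
    """Chance that at least one of n independent predictions comes true.

    Assumes every prediction has the same probability p_each (illustrative).
    """
    return 1.0 - (1.0 - p_each) ** n_predictions


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # 30 trend predictions, each with a hypothetical 10% chance of being right.
    print(f"P(at least one hit): {prob_at_least_one_hit(30, 0.10):.3f}")
    # Testing 30 hypotheses at an overall 0.05 level.
    print(f"Bonferroni per-test alpha: {bonferroni_threshold(0.05, 30):.5f}")
```

Even with long-shot individual predictions, a long enough list is very likely to contain a hit, which is why a correction tightens the bar for each individual claim.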
Triveni: Anyway, so I think all that said, we decided today for our trends episode to talk about the trends that we see coming up in 2021. We wanted to focus on big ideas. And instead of giving you a list of 30 big ideas or even 30 small ideas, we're going to just talk about some bigger concepts, maybe around four or five bigger concepts that we think are going to be really critical moving forward. So I think, Chris, you had something to kick us off.
Chris: So I think we're starting to see that adopting AI is the norm. We've been talking about this all the time, about how companies and different people are finding the value within having data and how they've collected all of their data and they've stored all their data, because they weren't able to do that before, and that was the rush over the last couple of years. But now we're at this tipping point where the starting point is no longer the collection of the data, but actually doing something with it.
Triveni: I think that's the interesting bit here is that, in 2021, we're going to see companies actually doing it. You might argue that well, no, my organization or most organizations I work with have been doing this pretty consistently, and that might be true, but I think what's going to happen is that that normalization of AI is actually going to happen across an organization.
Chris: It's time to finally actually put rubber to road, where, we've been talking about this all along, but if you're not doing it now, then you're actually going to be left behind. And the race is going to start without you.
Triveni: The past few years have been an experimentation phase or a trial and seeing what is this stuff all about? And now a lot of the benefits and actual bottom line that's improving because of AI is coming to the forefront.
Chris: Oftentimes we're on calls with clients and you hear them say something to the effect of, "We have all this data and we've been sitting on it for a while, but what do we do with it?" So my question to you, Triveni, is I think you've got some good ideas about another trend that we are seeing for 2021 about how we answer that question.
Triveni: I actually think the best way to enable this kind of giant transformation is something that I've been seeing a lot happening this year with my clients, and that's the idea of a citizen data science program. So you think about most industries and even think about what it is to be a data scientist, there are now Master's programs and boot camps and all these things. But the first data scientists were really just domain experts that learned important data science skills and how to apply data to their daily questions or how to better manipulate the data that they're working with.
So I'm seeing a lot more of these data science programs pop up across large organizations, and small, where the idea really is, "Okay, you guys, the chemist or the physicist or the researcher in a lab, are the experts here. We don't need to outsource the data science to someone else. You can be the data scientist as well." These programs have been really, really successful so far. At least the ones I've seen.
Chris: Yes, absolutely. And I completely agree. I think that there is a wealth of knowledge to be gained by being very specialized in one specific topic, but you miss out on information from other specialized verticals. So as you said, physicists, chemists, whoever, political scientists, they all may know very, very deeply their topic, but you don't have to have 20 years of experience in data science to actually open up many of those doors.
Triveni: That's the whole point of a data-driven organization. If you want to actually use data to improve your bottom line, improve the business, maybe even improve society if all goes well, then you need to make sure that that data science knowledge, that data science understanding, is integrated all along, and it's not just sitting in one little part of the organization. So again, back to the AI norm trend, it is really about normalizing it across the board. And these citizen data science programs are actually a really great way to do that. And there's so many different ways that you can do it. I worked with some clients on a really comprehensively built out, centralized program that brought together people from all these different parts of the organization, from the manufacturing side, the finance side, HR. And it was really great for them to be able to see and talk to each other about their use cases.
But then you also have plenty of external resources, online learning, different kinds of code camps, all that great stuff. And really, it's just about shifting the way that your domain experts think. Because they're already thinking in a data-driven way. They already know their field. They already know that, "When I see this thing happening, it indicates to me that something else is going to happen, or this other aspect might occur." But now you're just giving them, like you said, Chris, the tools or the language to talk about it in a way that is understandable by everybody in the organization.
Chris: Yeah. There's always this impactful moment whenever we're talking to clients or working on certain projects that do have a lot of domains, niche data realm, where I may show something to the client and say, "I'm seeing this in the data. Does this make sense?" And I don't know anything other than the programming and the numbers and maybe the variables or whatever, but the client actually comes back and says, "Oh my God, that's so informative. That actually can impact us in so-and-so way." I would never know that point without the domain, and they would never know that correlation or whatnot existing without the data science. And it's really that meeting of the minds that I think we're going to see more of in 2021.
Triveni: And now it's time for that part of the podcast where we explain complex data science concepts in plain English. So Chris, can you explain latency to me in English please?
Chris: Sure. So latency is really just a fancy word for a delay. And when we talk about latency in the data world, it's the delay between the transfer of that data from one place to another. And you can think about latency manifesting in a physical realm, whether it's data on my computer traversing a network and landing in somebody else's computer, or even you can think about latency between physical realms. How long does it take for my information to reach somebody else who's a mile away if I am shouting at them? That delay we usually want to actually be as small as possible so that we can get things done a lot faster.
Triveni: Thanks for explaining that in English.
Chris: Actually, Triveni, thanks for bringing that up because that's another trend that I really do want to discuss here. I think a lot of recent discussions have been about latency in the data world and how that's going to be reduced vastly in the future.
Triveni: And usually the discussion around latency is happening at a level where time is hypercritical, like surgeries or financial institutions where there are dollars being traded on the millisecond, whatever it might be. Those kinds of latency issues are always going to be relevant. I think for us, this trend is that the question of latency and reducing it overall is going to become a little bit more pressing for these organizations that are taking AI to the next stage in their business, that are bringing more people to the table, and as a result, compute costs, cloud costs, all that fun stuff is going to go up. So how can we build solutions that are going to be low latency and actually help these new data scientists get things done?
Chris: And there's got to be a lot of research surrounding that. And I think it's been top of mind, especially recently, as a lot of us have been remote and distributed across the globe, trying to get many, many different things done. Not only is it a question of how we can support this from an infrastructure standpoint, but also what that reduction of latency actually will impact in the future. You can imagine, to your point, Triveni, about surgeries. A doctor being on the West Coast and operating some physical item in front of them and operating on somebody in New York, and why latency would matter there is that the actual intricate movements of their hands would translate over to what they are seeing on a screen with no latency to a patient in a hospital bed in New York.
Triveni: I don't think that it's just the reduction in latency that's the trend. I think the infrastructure part is going to happen no matter what, and that's not even related to AI, that's just generally our globalized world. But for the AI-driven organization, I really think it's about shifting where we do our compute. And what I have seen a lot lately is that a client comes to me and says, "Oh, this thing is running so slow. Why is it taking so long? I don't want to have to wait five hours."
And the fact of the matter is that they're doing this complex pull data from point A, bring it down to computer C, do some operation, then push it out to point B, and it's not efficient. So I think that reduction of latency means that we're going to see a real push to computing where the data is. There's no need to actually move data around so much if we can actually just go to the data and say, "All right, here, give me these trends, build this model, do whatever it might be." That's really going to be, for the AI world, where reduction in latency is critical.
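The difference between pulling data down to compute on it and computing where the data lives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for "point A," and the `sales` table, its columns, and the function names are all hypothetical. The first function moves every row to the client before aggregating; the second asks the database to aggregate and moves only the result.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table that stands in for a remote data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [("east", 10.0), ("east", 20.0),
            ("west", 5.0), ("west", 15.0), ("west", 25.0)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def totals_by_pulling(conn: sqlite3.Connection):
    """Move every row to the client, then aggregate in Python."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)  # result plus number of rows transferred


def totals_by_pushdown(conn: sqlite3.Connection):
    """Ask the database to aggregate, so only the summary rows are moved."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print(totals_by_pulling(conn))
    print(totals_by_pushdown(conn))
```

Both approaches give the same totals, but the push-down version transfers one row per region instead of one row per sale. On a table with millions of rows behind a network, that gap is where much of the waiting comes from.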
Chris: And why I think this matters to everyone is because it boils down to the old adage of time is money. And the more time you have, the more money you're going to have.
Triveni: Our last trend that I think we want to cover that we see coming up in 2021 is around the concept of Responsible AI, which we've been talking about for a while, and you know me, I'm always talking about it, but I think what's new in 2021, what we're really going to see, is that people are going to stop talking and start doing. Up until now, I think a lot of the conversations have been, "We haven't thought about these things," or, "How can we think about them and how can we acknowledge the harms and impacts that AI has on the world?" People are frankly getting tired of talking. And now the trend I'm seeing is how do I put this into practice? How does it matter for a data scientist versus an analyst versus a product designer versus a business user? How do I make this real? And I think that's where 2021 is going to really take off in terms of putting Responsible AI to work.
Chris: Yeah, it's very similar to the adoption of AI becoming the norm. I think Responsible AI is also the norm now. It's not enough to just collect data, but you have to make sure that you're collecting that data well. It's not enough to have a research question of interest, but you have to make sure that you're defining that question in a way that is ethical. It's not good enough to just have some initiatives at a company, you have to make sure that those initiatives are crunching this data in a way that is actually going to step you forward rather than take a step back.
Triveni: I think that's the point, is that it's not lip service anymore. Now we're actually ready to do it. And all the things we've been talking about between AI being the norm, reducing latencies, you're making things a lot faster and easier to do, you're teaching new people how to use data, how to think about data science. All of that means that Responsible AI has to be a part of the growth, right, of the next step for any organization. Because you're giving more people access to data, you're making it easier for them to use it and manipulate it, and you're saying that this is what we want you to be doing anyway.
So if you're going to do that, you have to do it with some guidelines in place with some common agreed standards of what is okay and what is not okay. Now we're going to really start seeing these companies coming in and saying, "Sure, here you go citizen data scientist. Here's all the data on our customers. But by the way, you should be making sure you're not doing XYZ, and we have someone in charge who's going to review and make sure that you're not doing XYZ."
Chris: Yeah, we talk about this all the time. AI can be used for good or evil, and citizen data scientists, to be quite frank, they do have all the domain knowledge we talked about before, so they do have a lot to bring to the table with pairing their skills with data science tools, but they are more liable or more at risk for maybe possibly doing things incorrectly or maybe unethically if they are not educated on Responsible AI.
Triveni: We're seeing more regulations come forward across the globe, and I think governments and companies themselves are starting to recognize that, "Yeah, what we do has a really big impact, so we need to be very careful with what we're doing." It's funny, I had a client ask me, "I want you to come and give a talk on Responsible AI, but you don't need to explain to us what it is. We need you to come in and tell us what to do." And it's just really that simple, where companies are saying, "We get it. But how?"
The way that I think we might see, at least in early 2021, is a lot of companies coming together and internally agreeing on a standard set of ethics for their company, and then saying, "Based on these values that we hold, we don't want to discriminate against a certain group of people, or we don't want our models to create harm in this way. Based on that, there are going to be these checklists put into place." Things like, "Okay, if your goal is to not have a negative impact on group X, what are the steps you're taking during your entire AI pipeline to ensure that?"
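One way a checklist item like this could become something a pipeline actually runs is sketched below in Python. It is a hedged example, not a method described in the episode: it compares how often a model gives a positive prediction to each group and flags the result when the lowest rate falls below a chosen fraction of the highest. The function names, the sample data, and the 0.8 cutoff (a commonly cited rule of thumb) are all assumptions, and a real review would look at much more than this single ratio.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def selection_rates(groups: Sequence[Hashable],
                    predictions: Sequence[int]) -> Dict[Hashable, float]:
    """Share of positive (1) predictions within each group."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        counts[group] += 1
        positives[group] += int(pred)
    return {group: positives[group] / counts[group] for group in counts}


def passes_ratio_check(rates: Dict[Hashable, float],
                       min_ratio: float = 0.8) -> bool:
    """True if the lowest group rate is at least min_ratio of the highest.

    min_ratio=0.8 is an illustrative default, not a standard from the episode.
    """
    highest = max(rates.values())
    if highest == 0:
        return True  # no group receives positive predictions at all
    return min(rates.values()) / highest >= min_ratio


if __name__ == "__main__":
    # Hypothetical model outputs for two groups of four people each.
    groups = ["x", "x", "x", "x", "y", "y", "y", "y"]
    predictions = [1, 1, 1, 0, 1, 0, 0, 0]
    rates = selection_rates(groups, predictions)
    print(rates, passes_ratio_check(rates))
```

A failed check does not say why the gap exists; it only gives the person "in charge who's going to review" a concrete signal to investigate.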
Chris: I think you hit the nail on the head with that client interaction. It's no longer a recognition that we need to be doing this, but a question of how are we going to be doing this? And that's what we're seeing on the horizon.
Triveni: So as we've been talking about these trends, Chris, I think that there's sort of a thread going through all of them, and you might even argue, and I'm going to throw this out there, a little crazy. I'm going to argue that these aren't actually four separate trends. They're all part of one trend. I think that these are all part of this bigger move towards bridging technical expertise with domain expertise, towards bridging an organization around a common goal of data-driven insights that are both efficient in terms of compute, that involve a lot of different people across the pipeline who might have different layers of expertise, and that are actually ethically executed. That is our trend for 2021, that we're going to see this amazing collaboration across these previously siloed groups and more people becoming a part of the data drive in their organization.
Chris: Yes, absolutely. I think that's where all of the gold lies and consolidates everything that we talked about before. Whether it is adopting AI as the norm. Well, we're bridging that gap between the domain experts and the data science experts. If it's the citizen data scientists program, similar — educating and bringing that education to those individuals to enable them to broaden their skillsets. With Responsible AI, as you just mentioned, we don't need to convince people that this is the way to go, but bring the expertise to implement the ethical use of AI. And with reduction of latency, it's reducing that delay, bringing party A and party B together, closer to collaborate with one another.
Triveni: I don't know that this is really a new concept. What do you think Chris?
Chris: Nothing here really is novel. I just do think it is a turning point in the time of AI.
Triveni: I would argue that even this turning point wasn't going to happen unless certain global events happened this year. I think that the pandemic, the movement to everybody working remotely really changed the map. When I think back to where we were last year, 2020 was going to be all about scale. Bigger, better, bolder. And now it's, "Well, okay, we probably need to retool things a little bit and change how we work together, given that we're now all working in a new world." I don't think that when we return to normal life, whatever that might be, I don't think that need to collaborate is actually going to go away.
Chris: Well, that's it for season four. I know I had a lot of fun this season, my inaugural season with you Triveni, and I know it's been a big part with spending this time together with you.
Triveni: I'm really looking forward to where we go next.
Chris: And also thank you so much for all those listeners out there. We've enjoyed having you with us along this way, and rest assured we'll be back next year with much more data musings. You can find all of our Banana Data episodes here, enjoy!