At the Intersection of AI/ML and HCI with Douglas Eck of Google (MongoDB Podcast)

Michael Lynn, Anaiya Raisinghani • 30 min read • Published Aug 10, 2021 • Updated Feb 27, 2023

MongoDB

Doug Eck is a principal scientist at Google and a research director on the Brain Team. He created the ongoing research project, Magenta, which focuses on the role of machine learning in the process of creating art and music. He is joining Anaiya Raisinghani, Michael Lynn, and Nic Raboy today to discuss all things artificial intelligence, machine learning, and to give us some insight into his role at Google.

We are going to be diving head first into HCI (Human-Computer Interaction), OpenAI’s GPT-3 language model, and some of the hard issues with combining databases and deep learning. With all the hype surrounding AI, you may have some questions about its past and potential future, so stay tuned to hear from one of Google’s best.

Doug Eck :[00:00:00] Hi everybody. My name is Doug Eck and welcome to the MongoDB podcast.

Michael Lynn : [00:00:08] Welcome to the show. Today we're talking with Doug Eck. He's a principal scientist at Google and a research director on the Brain Team. He also created and helps lead the Magenta team, an ongoing research project exploring the role of machine learning in the process of creating art and music. Today's episode was produced and the interview was led by Anaiya Raisinghani. She's a summer intern here at MongoDB. She's doing a fantastic job. I hope you enjoy this episode.

We've got a couple of guests today and our first guest is a summer intern at MongoDB.

Anaiya Raisinghani : [00:00:55] Hi everyone. My name is Anaiya Raisinghani and I am the developer advocacy intern here at MongoDB.

Michael Lynn : [00:01:01] Well, welcome to the show. It's great to have you on the podcast. Before we begin, why don't you tell the folks a little bit about yourself?

Anaiya Raisinghani : [00:01:08] Yeah, of course. I'm from the Bay Area. I grew up here and I go to school in LA at the University of Southern California. My undergrad degree is in Computational Linguistics, which is half CS, half linguistics. And I want to say my overall interest in artificial intelligence really came from the cool classes I have the unique opportunity to take, like speech recognition and natural language processing, and just being able to use machine learning libraries like TensorFlow in some of my school projects. So I feel very lucky to have had earlier exposure to AI than most.

Michael Lynn : [00:01:42] Well, great. And I understand that you brought a guest with you today. Do you want to talk a little bit about who that is and what we're going to discuss today?

Anaiya Raisinghani : [00:01:48] Yes, definitely. So today we have a very, very special guest Doug Eck, who is a principal scientist at Google, a research director on the Brain Team and the creator of Magenta, so today we're going to be chatting about machine learning, AI, and some other fun topics. Thank you so much, Doug, for being here today.

Doug Eck :[00:02:07] I'm very happy to be here, Anaiya.

Michael Lynn : [00:02:08] Well, Doug, it's great to have you on the show. Thanks so much for taking the time to talk with us. And at this point, I kind of want to turn it over to Anaiya. She's got some prepared questions. This is kind of her field of study, and she's got some passion and interest around it. So we're going to get into some really interesting topics in the machine learning space. And Anaiya, I'll turn it over to you.

Anaiya Raisinghani : [00:02:30] Perfect. Thank you so much, Mike. Just to get us started, Doug, could you give us a little background about what you do at Google?

Doug Eck :[00:02:36] Sure, thanks, Anaiya. Well, right now in my career, I go to a lot of meetings. By that, I mean I'm running a large team of researchers on the Google Brain team, and I'm trying to help keep things going. Sometimes it feels like herding cats because we hire very talented and very self-motivated researchers who are doing fundamental research in machine learning. Going back a bit, I've been doing something like this, God, it's terrifying to think about, but almost 30 years. In a previous life when I was young, like you, Anaiya, I was playing a lot of music, playing guitar. I was an English major as an undergrad, doing a lot of writing, and I just kept getting drawn into technology. And once I finished my undergrad, I worked as a database programmer.

Well, well, well before MongoDB. And, uh, I did that for a few years and really enjoyed it. And then I decided that my passion was somewhere in the overlap between music and artificial intelligence. And at that point in my life, I'm not sure I could have provided a crisp definition of artificial intelligence, but I knew I wanted to do it.

I wanted to see if we can make intelligent computers help us make music. And so I made my way back into grad school. Somehow I tricked a computer science department into letting an English major do a PhD in computer science with a lot of extra math. And, uh, I made my way into an area of AI called machine learning, where our goal is to build computer programs that learn to solve problems, rather than kind of trying to write down the recipe ourselves.

And for the last 20 years, I've been active in machine learning as a post-doc doing a post-doctoral fellowship in Switzerland. And then I moved to Canada and became a professor there and worked with some great people at the University of Montreal, just like changing my career every, every few years.

So, uh, after seven years there, I switched and came to California and became a research scientist at Google. And I've been very happily working here at Google ever since, for 11 years. I feel really lucky to have had a chance to be part of the growth and the, I guess, renaissance of neural networks and machine learning across a number of really important disciplines, and to have been part of spearheading a bit of interest in AI and creativity.

Anaiya Raisinghani : [00:04:45] That's great. Thank you so much. So there's currently a lot of hype around just AI in general and machine learning, but for some of our listeners who may not know what it is, how would you describe it in the way that you understand it?

Doug Eck :[00:04:56] I was afraid you were going to ask that because I said, you know, 30 years ago, I couldn't have given you a crisp definition of AI, and I'm not sure I can now without resorting to Wikipedia and cheating. I would define artificial intelligence as the task of building software that behaves intelligently. And traditionally there have been two basic approaches to AI. In the distant past, in the eighties and nineties, we called this neat versus scruffy. Neat was the idea of writing down sets of rules, writing down a recipe that defined complex behavior like translation, maybe, or writing a book, and then having computer programs that can execute those rules. Contrast that with scruffy, called scruffy because it's a bit messier. Instead of thinking we know the rules, we build programs that can examine data, that can look at large datasets. Sometimes those are datasets that have labels, like this is a picture of an orangutan, this is a picture of a banana, et cetera, and the programs learn the relationship between those labels and that data. And that's a kind of machine learning, where our goal is to help the machine learn to solve a problem, as opposed to building in the answer. And long-term, at least on the current evidence, where we are right now in 2021, for many, many hard tasks, probably most of them, it's better to teach the machine how to learn rather than to try to provide the solution to the problem. And so that's how I would define machine learning: writing software that learns to solve problems by processing information like datasets, what might come out of a camera, what might come out of a microphone, and then learns to leverage what it's learned from that data to solve specific sub-problems like translation or labeling, or you pick it. There are thousands of possible examples.
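Editor's sketch: the neat versus scruffy contrast Doug describes can be illustrated with a toy example. The code below is only a minimal sketch under stated assumptions: the two features (length and "yellowness"), the data values, and all function names are hypothetical, and the learned model is a simple nearest-centroid classifier rather than a neural network.

```python
# Hypothetical toy data: (length_cm, yellowness from 0 to 1) paired with a label.
TRAINING_DATA = [
    ((18.0, 0.90), "banana"),
    ((20.0, 0.80), "banana"),
    ((17.0, 0.95), "banana"),
    ((120.0, 0.20), "orangutan"),
    ((130.0, 0.10), "orangutan"),
    ((110.0, 0.30), "orangutan"),
]


def rule_based_label(features):
    """'Neat' approach: a person writes the decision rule down by hand."""
    length_cm, yellowness = features
    return "banana" if length_cm < 50 and yellowness > 0.5 else "orangutan"


def train_centroids(examples):
    """'Scruffy' approach: learn the mean feature vector of each label from data."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def learned_label(centroids, features):
    """Predict the label whose learned centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = train_centroids(TRAINING_DATA)
print(rule_based_label((19.0, 0.85)))
print(learned_label(centroids, (19.0, 0.85)))
print(learned_label(centroids, (125.0, 0.15)))
```

In the first function the answer is built in by the programmer; in the second, the only thing written by hand is the procedure for learning from labeled examples.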

Anaiya Raisinghani : [00:06:51] That's awesome. Thank you so much. So I also wanted to ask, because you said from 30 years ago, you wouldn't have known that definition. What has it been like to see how machine learning has improved over the years? Especially now from an inside perspective at Google.

Doug Eck :[00:07:07] I think I've consistently underestimated how fast we can move. Perhaps that's human nature. I noticed a statistic that, this isn't about machine learning, but something less than 70 years, 60, 61 years, passed between the Wright brothers' first flight and landing on the moon. And like 60 years isn't very long. It's pretty shocking how fast we moved. And so I guess it shouldn't be, in retrospect, a surprise that we've moved so fast. I did a retrospective where I was looking at the quality of image generation. I'm sure all of you have seen these hyper-realistic faces that are not really faces, or maybe you've heard some very realistic sounding music, or you've seen a machine learning algorithm able to generate really realistic text, and this was all happening, you know, in the last five years, really. I mean, the work has been there and the ideas have been there and the efforts have been there for at least two decades, but somehow I think it's the combination of scale, so having very large datasets, and also processing power, having one large computer or many coupled computers, usually running GPUs or TPUs, basically what you think of as a video card, giving us the processing power to scale to much more information. And, uh, I don't know. It's been really fun. I mean, every year I'm surprised. I get up in the morning on Monday morning and I don't dread going to work, which makes me feel extremely lucky. And, uh, I'm really proud of the work that we've done at Google, but I'm also really proud of what's happened in the entire research community.

Michael Lynn : [00:08:40] So Doug, I want to ask you and you kind of alluded to it, but I'm curious about the advances that we've made. And I realize we are very much standing on the shoulders of giants and the exponential rate at which we increase in the advances. I'm curious from your perspective, whether you think that's software or hardware and maybe what, you know, what's your perspective on both of those avenues that we're advancing in.

Doug Eck :[00:09:08] I think it's a trade-off. It's a very clear trade-off. When you have slow hardware or not enough hardware, then you need to be much, much more clever with your software. So arguably the models, the approaches that we were using in the late 1990s, if you like terminology, if your crowd likes buzzwords: support vector machines, random forests, boosting. These are all, especially SVMs, support vector machines, relatively complicated. There's a lot of machinery there. And for very small datasets and for limited processing power, they can outperform simpler approaches. A simpler approach, and it may not sound simple because it's got a fancy name, is a neural network. The underlying mechanism is actually quite simple, and it's all about having a very simple rule to update a few numbers. We call them parameters, or maybe we call them weights. And neural networks don't work all that well for small datasets and for small neural networks compared to other solutions. So in the 1980s and 1990s, it looked like they weren't really very good. If you scale these up and you run a very simple neural network with a lot of weights, a lot of parameters that you can adjust, and you have a lot of data, giving the model some information to really grab onto, they work astonishingly well, and they seem to keep working better and better as you make the datasets larger and you add more processing power. And that could be because they're simple. There's an argument to be made that there's something so simple that it scales to different dataset sizes and different processing power. We can talk about calculus, if you want. We can dive into the chain rule. It's only two applications of the chain rule to get to backprop.
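Editor's sketch: to make the remark about "a very simple rule to update a few numbers" concrete, here is a minimal example, assuming a hypothetical two-weight network with one tanh hidden unit and a squared-error loss. The function names, starting weights, and learning rate are illustrative choices, not anything from the conversation. Each gradient is obtained by applying the chain rule, and the update rule simply nudges each weight against its gradient.

```python
import math


def forward(w1, w2, x):
    """Tiny network: one hidden tanh unit followed by one linear output weight."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(w1, w2, x, target):
    """Squared error between the network output and the target."""
    _, y = forward(w1, w2, x)
    return (y - target) ** 2


def gradients(w1, w2, x, target):
    """Backpropagation: apply the chain rule from the loss back to each weight."""
    h, y = forward(w1, w2, x)
    dloss_dy = 2.0 * (y - target)            # d(loss)/d(output)
    grad_w2 = dloss_dy * h                   # chain rule through y = w2 * h
    dloss_dh = dloss_dy * w2                 # pass the error back to the hidden unit
    grad_w1 = dloss_dh * (1.0 - h * h) * x   # chain rule through h = tanh(w1 * x)
    return grad_w1, grad_w2


def train(w1, w2, x, target, learning_rate=0.1, steps=200):
    """The 'simple rule': move each weight a small step against its gradient."""
    for _ in range(steps):
        g1, g2 = gradients(w1, w2, x, target)
        w1 -= learning_rate * g1
        w2 -= learning_rate * g2
    return w1, w2


start_loss = loss(0.5, -0.3, 1.0, 0.8)
w1, w2 = train(0.5, -0.3, x=1.0, target=0.8)
print(start_loss, loss(w1, w2, 1.0, 0.8))
```

Real networks repeat the same pattern over millions of weights and many examples, which is where the scale of data and hardware Doug mentions comes in.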

Michael Lynn : [00:10:51] I appreciate your perspective. I do want to ask one more question about, you know, we've all come from this conventional digital, you know, binary based computing background and fascinating things are happening in the quantum space. I'm curious, you know, is there anything happening at Google that you can talk about in that space?

Doug Eck :[00:11:11] Well, absolutely. We have. So first caveat, I am not an expert in quantum. We have a top-tier quantum group down in Santa Barbara, and they have been making great progress all along, with a couple of breakthroughs last year. My understanding of the situation is that there's a certain class of problems that are extraordinarily difficult to solve with a traditional computer, but which a quantum computer will solve relatively easily. And that in fact, some of these core problems can form the basis for solving a much broader class of problems if you kind of rewrite these other problems as one of these core problems, like factorizing prime numbers, et cetera. And I have to admit, I am just simply not a quantum expert. I'm as fascinated about it as you are, and we're invested. I think the big question mark is whether the class of problems that matter to us is big enough to warrant the investment, and basically I've underestimated every other technological revolution. Right. You know, like I didn't think we'd get to where we are now. So I guess, you know, my skepticism about quantum is just, this is my personality, but I'm super excited about what it could be. It's also, you know, possible that we'll be in a situation where quantum yields some breakthroughs that provide us with some challenges, especially with respect to security and cryptography, if we find new ways to solve massive problems that lead indirectly to us being able to crack cryptographic puzzles. But if there are any quantum folks in the audience and you're shrugging your shoulders and saying, this guy doesn't know what he's talking about: this guy admits he doesn't really know what he's talking about.

Michael Lynn : [00:12:44] I appreciate that. So I kind of derailed the conversation Anaiya, you can pick back up if you like.

Anaiya Raisinghani : [00:12:51] Perfect. Thank you. Um, I wanted to ask you a little bit about HCI, which is human-computer interaction, and what you do in that space. A lot of people may not have heard about human-computer interaction, so for the listeners, I can give a little bit of background if you guys would like. It's really just a field that focuses on the design of computer technology and the way that humans and computers interact. And I feel like when people think about artificial intelligence, the first thing that they think about are, you know, robots or big spaces. So I wanted to ask you, with what you've been doing at Google, do you believe that machine learning can really help advance human-computer interaction and the way that human beings and machines interact ethically?

Doug Eck :[00:13:36] Thank you for that. That's an amazingly important question. So first a bit of a preface. I think we've made a fairly serious error in how we talk about AI and machine learning. And specifically I'm really turned off by the personification of AI. Like the AI is going to come and get you, right? Like it's a conscious thing that has volition and wants to help you or hurt you. And there's this link with AI and robotics, and I'm very skeptical of the sort of techno-utopian folks who believe that we can solve all problems in the world by building a sentient AI. Like there are a lot of real problems in front of us to solve. And I think we can use technology to help us solve them. But I'm much more interested in solving the problems that are right in front of us, on the planet, rather than thinking about super intelligence or AGI, which is artificial general intelligence, meaning something smarter than us. So what does this mean for HCI, human-computer interaction? I believe, fundamentally, we use technology to help us solve problems. We always have, from the very beginning of humanity, with things like arrowheads and fire, right? And I fundamentally don't see AI and machine learning as any different. I think what we're trying to do is use technology to solve problems like translation or, you know, maybe automatic identification of objects in images and things like that. Ideally many more interesting problems than that. And one of the big roadblocks comes from taking a basic neural network or some other model trained on some data and actually doing something useful with it. And often it's a vast, vast, vast distance between a model in a lab that can, whatever, take a photograph and identify whether there's an orangutan or a banana in it, and building something really useful, like perhaps some sort of medical software that will help you identify skin cancer. Right.
And that distance ends up being more and more about how to actually make the software work for people and deal with the messy real-world constraints that exist in our actual world. And, you know, this means that I personally, and our team in general, the Brain team, have become much more interested in HCI. And I think the way you worded it was, can machine learning help revolutionize HCI or help move HCI along? That's the wrong direction. We need HCI's help. So we've been humbled, I think, by our inability to take our fancy algorithms and actually have them matter in people's lives. And I think partially it's because we haven't engaged enough in the past decade or so with the HCI community. And, you know, I personally and a number of people in my world are trying really hard to address that by tackling problems with joint viewpoints: the viewpoint of the mathematically driven AI researcher, caring about what the data is, and then the HCI and user interface folks who are saying, wait, what problem are you trying to solve? And how are you going to actually take what this model can do and put it in the hands of users, and how are you going to do it in a way that's ethical, per your comment, Anaiya? And I hope someone grabbed the analogy of going from an image recognition algorithm to identifying skin cancers. This has been one topic, for example, that has generated a lot of discussion, because skin cancers and skin color correlate with race, and the ability of these algorithms to work across a spectrum of skin colors may differ, and so may our ability to build trust with doctors, so that they want to use the software, and with patients, so that they believe they can trust the software. These issues are so, so complicated and it's so important for us to get them right. So you can tell I'm passionate about this.
I guess I should bring this to a close, which is to say I'm a convert. I guess I have the fervor of a convert who didn't think much about HCI, maybe five, six years ago. I just started to see as these models get more and more powerful that the limiting factor is really how we use them and how we deploy them and how we make them work for us human beings. We're the personified ones, not the software, not the AI.

Anaiya Raisinghani : [00:17:37] That's awesome. Thank you so much for answering my question, that was great. And I appreciate all the points you brought up because I feel like those need to be talked about a lot more, especially in the AI community. I do want to pivot a little bit and take part of what you said and talk about some of the issues that come with deep learning and AI, and kind of connect them with neural networks and databases, because I would love to hear about some of the things that have come up in the past when people have tried to integrate deep learning into databases. And I know that there can be a lot of issues with deep learning and tabular databases, but what about document-collection-based databases? If the documents are analogous to records or rows in a relational database, do you think that machine learning might work, or do you believe that the same issues might come up?

Doug Eck :[00:18:24] Another great question. So first, to put this all in context: arguably, for a machine learning researcher who's really writing code day to day, which I did in the past, and now I'm doing more management work, but you know, you're writing code day-to-day, you're trying to solve a hard problem, maybe 70 or 80% of your time is spent dealing with data and how to manage data and how to make sure that you don't have data errors and how to move the data through your system. Probably like in other areas of computer science, you know, we tend to call it plumbing. You spend a lot of time working on plumbing. And this is a manageable task when you have a dataset of the sort we might've worked with 15 years ago: 10,000 28-by-28-pixel images or something like that. I hope I got the pixels right. Something called MNIST, a bunch of handwritten digits. If we start looking at datasets that are all of the web, basically, represented in some way or another, all of the books in the Library of Congress as a hypothetical, massive, massive image datasets, massive video datasets, right? The ability to just kind of fake it, right, write a little bit of Python code that processes your data and throws it in a flat file of some sort, becomes, you know, basically intractable. And so I think we're at an inflection point right now, maybe we were even at that inflection point a year or two ago, where a lot of machine learning researchers are thinking about scalable ways to handle data. So that's the first thing. The second thing is that we're also, specifically with respect to very large neural networks, wanting predictions to be factual. If we have a chatbot that chats with you, and that chatbot is driven by a neural network, and you ask it, what's the capital of Indiana, my home state, we hope it says Indianapolis every time. Uh, we don't want this to be a roll of the dice.
We don't want it to be a probabilistic model that rolls the dice and says Indianapolis, you know, 50 times, but that 51st time instead says Springfield. So there's this very, very active and rich research area of bridging between databases and neural networks, which are probabilistic, and finding ways to land in the database and actually get the right answer. And it's the right answer because we verify that it's the right answer. We have a separate team working with that database, and we understand how to relate that to some decision-making algorithm that might ask a question: should I go to Indianapolis? Maybe that's a probabilistic question. Maybe it's a roll of the dice. Maybe you all don't want to come to Indianapolis. It's up to you, but I'm trying to make the distinction between these two kinds of decisions, two kinds of information. One of them is probabilistic. Every sentence is unique. We might describe the same scene with a million different sentences. But we don't want to miss on facts, especially if we want to solve hard problems. And so there's an open challenge. I do not have an answer for it. There are many, many smarter people than me working on ways in which we can bridge the gap between products like MongoDB and machine learning. It doesn't take long to realize there are a lot of people thinking about this. If you do a Google search and you limit it to the site reddit.com and you put in MongoDB and machine learning, you see a lot of discussion about how we can back machine learning algorithms with databases. So, um, it's definitely an open topic. Finally, third, you mentioned something about rows and columns and the actual structure of a relational database. I think that's also very interesting because algorithms that are sensitive, I say algorithm, I mean a neural network or some other model, a program designed to solve a problem. You know, those algorithms might actually take advantage of that structure.
Not just like cope with it, but actually understand in some ways, learning how to leverage the structure of the database to make it easier to solve certain problems. And there's evidence outside of databases, in general machine learning, to believe that's possible. So, for example, in work predicting the structure of proteins and other molecules, we have what we might call structural prior information. We have some idea about the geometry of what molecules should look like. And there are ways to leverage that geometry to kind of limit the space of predictions that the model would make. It's kind of given that structure as a foundation for the predictions it's making, such that it won't likely make predictions that violate that structure. For example, graph neural networks actually work on a graph. You can write down a database structure as a graph if you'd like, and take advantage of that graph for solving hard problems. Sorry, that was like a 10-minute answer. I'll try to make them shorter next time, Anaiya, but that's my answer.
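Editor's sketch: the distinction Doug draws between probabilistic answers and verified facts can be illustrated with a few lines of code. Everything here is a hypothetical stand-in: the `VERIFIED_CAPITALS` dictionary plays the role of a curated database, and `MODEL_GUESSES` plays the role of a language model's output distribution (the 2% error rate is an assumption chosen for illustration). It is not a description of how any production system works.

```python
import random

# Stand-in for a verified store of facts (in practice, a database collection).
VERIFIED_CAPITALS = {"Indiana": "Indianapolis", "Illinois": "Springfield"}

# Hypothetical output distribution of a probabilistic model for one question.
MODEL_GUESSES = {"Indiana": [("Indianapolis", 0.98), ("Springfield", 0.02)]}


def sample_from_model(state, rng):
    """Roll the dice: usually right, occasionally wrong."""
    answers, weights = zip(*MODEL_GUESSES[state])
    return rng.choices(answers, weights=weights, k=1)[0]


def answer_capital(state, rng):
    """Prefer the verified store; fall back to sampling only when no fact exists."""
    if state in VERIFIED_CAPITALS:
        return VERIFIED_CAPITALS[state]
    return sample_from_model(state, rng)


rng = random.Random(0)
sampled = [sample_from_model("Indiana", rng) for _ in range(5000)]
grounded = [answer_capital("Indiana", rng) for _ in range(5000)]
print(sampled.count("Springfield"), grounded.count("Springfield"))
```

The sampled answers contain occasional mistakes, while the grounded answers are the same every time because they come from the lookup rather than from the dice.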

Anaiya Raisinghani : [00:23:03] Yeah. Cause, well, I was researching for this, and then also when I got the job, a lot of the questions during the interview were, like, how would you use machine learning during your internship. And I saw articles stretching all the way back to the early 2000s talking about how applying artificial neural networks, ANNs, to large modern databases seems like such a great idea in theory, because, you know, they offer potential fault tolerance and they're inherently parallel. Um, and the intersection between them just looks really attractive. But I found this article about that, and the date was 2000, and then I looked for other stuff, and everything from there was about the issues with connecting databases and deep learning. So thank you so much for your answer. I really appreciate that. I feel like, especially on this podcast, it was a great, great answer to a hard question.

Doug Eck :[00:23:57] Can I throw in one more thing before you move on? There is also some of what I call low-hanging fruit. Like a bunch of simpler problems that we can tackle. So one of the big areas of machine learning that I've been working in is that of models of language, of text. Right? And so think of translation: you type in a string in one language, and we translate it to another language. And if your listeners have paid attention to some new machine learning models that you can chat with, chatbots like Google's LaMDA, or some large language models that can write stories, we're realizing we can use those for data augmentation and maybe indirectly for data verification. So we may be able to use neural networks to predict bad data entries. We may be able to, for example, let's say your database is trying to provide a thousand different ways to describe a scene. We may be able to help automate that. And then you'd have a human who's coming in, like the human always needs to be there, I think, to be responsible, you know, saying, okay, here are like, you know, 20 different ways to describe this scene at different levels of complexity, but we use the neural network to help make their work much, much faster. And so if we move beyond trying to solve the entire problem of like, what is a database and how do we generate it, or how do we do upkeep on it? Like, that's one thing, that's like the holy grail. But we can be thinking about using neural networks, in particular language models, to basically supercharge human data quality people in ways that I think are just going to sweep through the field and help us do a much, much better job of that kind of validation. And even I remember, from a long time ago when I did databases, data validation is a pain, right? Everybody hates bad data. It's garbage in, garbage out. So if we can make cleaner, better data, then we all win.

Anaiya Raisinghani : [00:25:39] Yeah. And on the subject of language models, I also wanted to talk about GPT-3. I saw an article from MIT recently about how they're thinking it can replace Google's PageRank. And I would just love to hear your thoughts on what you think might happen in the future and if language models actually could replace indexing.

Doug Eck :[00:25:58] So to be clear, we will still need to do indexing, right? We still need to index the documents and we have to have some idea of what they mean. Here's the best way to think about it. So we talked at I/O this year about using some large language models to improve search in our products. And we've talked about it in other blogs. I don't want to get myself in trouble by poorly stating what has already been stated. I'd refer you there because, you know, nobody wants to have to talk to their boss after the podcast comes out and hear, why did you say that? You know, but here's the thing that strikes me. And this is just my opinion. Google's PageRank. For those of you who don't know what PageRank is, the basic idea is, instead of looking at a document and what the document contains, we decide the value of the document by other documents that link into that document and how much we trust the other documents. So if a number of high-profile websites link to a document that happens to be about automobiles, we'll trust that that document is about automobiles, right? Um, and so it's a graph problem where we assign trust and propagate it from incoming links. Um, thank you, Larry and Sergey. Behind that is this like fundamental mistrust of being able to figure out what's in a document. Right, like the whole idea is to say, we don't really know what's in this document. So we're going to come up with a trick that allows us to value this document based upon what other documents think about it. Right. And one way you could think about this revolution in large language models, um, like GPT-3, which came from OpenAI and which is based upon some core technology that came from our group called Transformer. That's the T in GPT-3. There are always friendly rivalries. The folks at OpenAI are great, and I think our team is great too. We're kind of ratcheting up who can move faster. Um, cheers to OpenAI.
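Editor's sketch: the idea of assigning trust and propagating it along incoming links can be illustrated with a simplified PageRank computed by repeated propagation. This is a textbook-style sketch, not Google's production algorithm. The link graph and page names are hypothetical, and the damping factor of 0.85 is the conventionally cited value, used here as an assumption.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page repeatedly shares its score with the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of incoming links.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / len(pages)
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: each key links to the pages in its list.
LINKS = {
    "news_site": ["car_review"],
    "auto_magazine": ["car_review"],
    "car_review": ["news_site"],
    "personal_blog": ["car_review", "auto_magazine"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

Note that nothing in this computation looks at what any page says. The score of `car_review` comes entirely from who links to it, which is the "trick" Doug describes.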
Now we have some pretty good ways of taking a document full of words and, if you want to think about this abstractly, projecting it into another space of numbers. So maybe for that document, which may have as many words as you need for the document, let's say it's between 500 and 2,000 words, right, we take a neural network and we run that sequence through the neural network. And we come out with this vector of numbers. That vector, that sequence of numbers, maybe it's a thousand numbers, right. Now, thanks to the neural network, those thousand numbers actually do a really good job of describing what's in the document. We can't read it with our eyes, cause it's just a sequence of numbers. But if we take that vector and compare it to other vectors, what we'll find is that similar vectors correspond to documents that contain very similar information, and they might be written completely differently. Right. But topically they're similar. And so what we get is the ability to understand massive, massive datasets of text vis-a-vis what it's about, what it means, who it's for. And so we do a much better job of knowing what's in a document now, and we can use that information to augment what we know about how people use documents, how they link to them and how much they trust them. And so that just gives us a better way to surface relevant documents for people. And that's kind of the crux in my mind, or at least in my view, of why a large language model might matter for a search company. It helps us understand language, and fundamentally most of search is about language.
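Editor's sketch: comparing document vectors, as described above, is commonly done with cosine similarity. The example below is a minimal sketch with hand-made four-number vectors standing in for the roughly thousand-number vectors a neural network would produce. The document names and values are hypothetical, and no language model is involved.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means very similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical document vectors; a real system would get these from a neural network.
DOCUMENT_VECTORS = {
    "sedan_review": [0.9, 0.1, 0.0, 0.2],
    "truck_buying_guide": [0.8, 0.2, 0.1, 0.3],
    "sourdough_recipe": [0.0, 0.9, 0.8, 0.1],
}


def most_similar(query_name, vectors):
    """Return the name of the other document whose vector is closest to the query's."""
    query = vectors[query_name]
    scores = {
        name: cosine_similarity(query, vector)
        for name, vector in vectors.items()
        if name != query_name
    }
    return max(scores, key=scores.get)


print(most_similar("sedan_review", DOCUMENT_VECTORS))
```

The two automobile documents end up close together even though, as text, they could be worded completely differently. That is the property that makes such vectors useful for surfacing relevant documents.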

Anaiya Raisinghani : [00:29:11] I also wanted to talk to you about, because language is one of the big things with AI, but then now there's been a lot of movement towards art and music. And I know that you're really big into that. So I wanted to ask you, for the listeners, if you could explain a little bit behind Magenta, and then I also wanted to talk to you about Yacht, because I heard that they used Magenta for, yeah, for their new album. And so, like, what are your thoughts on utilizing AI to continue on legacies in art and music and just creation?

Doug Eck :[00:29:45] Okay, cool. Well, this is a fun question for me. Uh, so first, what's Magenta? Magenta is an open source project that I'm very proud to say I created initially about six years ago. And our goal with Magenta is to explore the role of machine learning as a tool in the creative process. If you want to find it, it's at g.co/magenta. We've been out there for a long time. You could also just search for Google Magenta and you'll find us. Um, everything we do goes in open source. We basically provide tools for musicians and artists, mostly musicians based upon the team, we are musicians at heart, that you can use to extend your musical, uh, your musical self. You can generate new melodies, you can change how things sound, you can understand more, uh, the technology. You can use us to learn JavaScript or Python, but everything we do is about extending people and their music making. So one of the first things I always say is, I think it would be, it's kind of cool that we can generate realistic sounding melodies that, you know, maybe sound like Bach or sound like another composer, but that's just not the point. That's not fun. Like, I think music is about people communicating with people. And so we're really more in the, in the heritage of, you know, Les Paul, who invented, was one of the inventors of the electric guitar, or the cool folks that invented guitar pedals or amplifiers, or pick your favorite technology that we use to make a new kind of music. Our real question is, can we, like, build a new kind of musical instrument or a new kind of music making experience using machine learning? And we've spent a lot of time doing fundamental research in this space, published in conferences and journals of the sort that all computer scientists do. And then we've done a lot of open source work in JavaScript so that you can do stuff really fast in the browser. 
Also plugins for popular software for musicians like Ableton, and then sort of core hardcore machine learning in Python, and we've done some experimental work with some artists. So we've tried to understand better, on the HCI side, how this all works for real artists. And one of the first groups we worked with is, in fact, thank you for asking, a group called Yacht. They're, in my mind, a phenomenal pop band. I think some part LCD Soundsystem, I don't know who else to even add. They're from LA. Their front person, we don't say front man because it's Claire, is Claire Evans. She's an amazing singer, an utterly astonishing presence on stage. She's also a tech person, a tech writer, and she has a great book out that everybody should read, especially every woman in tech, Anaiya, called Broad Band, the story of, um, of women in the internet. I mean, I don't remember if I've got the subtitle right. So anyway, very interesting people, and what they did was they came to us, and they worked with a bunch of other AI folks, not just Google at all. Like, we're one of like five or six collaborators, and they just dove in headfirst and they just wrestled with the technology and they tried to do something interesting. And what they did was they took from us, they took a machine learning model that's able to generate variations on a theme. So, and they use pop music. So, you know, you give it, right, and then suddenly the model is generating lots of different variations and they can browse around the space and they can play around and find different things. And so they had this, like, slight AI extension of themselves. Right. And what they did was utterly fascinating. I think it's important. Um, they, they first just dove in and technically dealt with the problems we had. Our HCI game was very low then. Like, we were, like, quite, quite literally, first type this command into, into, into a console. 
And then it'll generate some MIDI files, and, you know, there are musicians, like, they're actually quite technically good, but another set of musicians are like, what's a command line? Right. You know, like, what's a terminal? So, you know, you have these people that don't work with our tooling, so we didn't have anything, like, fancy for them. But then they also set constraints. So, uh, Jona and Rob, the other two folks in the band, they came up with kind of a rule book, which I think is really interesting. They said, for example, if we take a melody generated by the Magenta model, we won't edit it ever, ever, ever. Right. We might reject it. Right. We might listen to a bunch of them, but we won't edit it. And so in some sense, they forced themselves to, like, and I think if they didn't do that, it would just become this mush. Like they, they wouldn't know what the AI had actually done in the end. Right. So they did that and they did the same with another, uh, some other folks, uh, generating lyrics, same idea. They generated lots and lots of lyrics. And then Claire curated them. So curation was important for them. And, uh, this curation process proved to be really valuable for them. I guess I would summarize it as curation without editing. They also liked the mistakes. They liked when the networks didn't do the right thing. So they liked breakage, like this idea that, oh, this didn't do what it was supposed to. I like that. And so this combination of, like, curiosity, work, they said it was really hard work, um, and in a sense of kind of building some rules, building a kind of what I would call a grammar around what they're doing, the same way that, like, filmmakers have a grammar for how you tell a story. They told a really beautiful story, and I don't know. I, I really love Chain Tripping. That's the album. If you listen to it, every bassline was written by a Magenta model. The lyrics were written by, uh, an LSTM network by another group. 
The cover art is done by this brilliant, uh, artist in Australia, Tom White. You know, it's just a really cool album overall.

Anaiya Raisinghani : [00:35:09] Yeah, I've listened to it. It's great. I feel like it just alludes to how far technology has come.

Doug Eck :[00:35:16] I agree. Oh, by the way, the, the drum beats, the drum beats come from the same model. But we didn't actually have a drum model. So they just threw away the notes and kept the durations, you know, and the basslines come from a model that was trained on piano, where the, both of, both Rob and Jona play bass, but Rob, the guy who usually plays bass in the band, is like, it would generate these basslines that are really hard to play. So you have this, like, idea of, like, the AI is, like, sort of generating stuff that they're just physically not used to playing on stage. And so I love that idea too, that it's, like, pushing them, even in ways that, like, onstage they're having to do things slightly differently with their hands than they would have to do. Um, so it kind of pushes them out.
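The "threw away the notes and kept the durations" trick can be pictured with a small sketch. It assumes a melody is represented as a list of notes with a pitch, a start time, and a duration; the `Note` type, the function name `keep_rhythm_only`, and the choice of MIDI key 36 (Bass Drum 1 in the General MIDI percussion map) are illustrative assumptions, not a description of what the band or the Magenta tooling actually did.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    pitch: int       # MIDI note number
    start: float     # onset time in beats
    duration: float  # length in beats


def keep_rhythm_only(melody: List[Note], drum_pitch: int = 36) -> List[Note]:
    """Discard each note's pitch and keep only its timing.

    Every note is mapped to a single drum sound (an assumed choice),
    so the melody's rhythm survives as a drum pattern.
    """
    return [Note(drum_pitch, note.start, note.duration) for note in melody]


# Hypothetical generated melody: four notes with an uneven rhythm.
generated_melody = [
    Note(60, 0.0, 0.5),
    Note(64, 0.5, 0.25),
    Note(67, 0.75, 0.25),
    Note(62, 1.0, 1.0),
]

if __name__ == "__main__":
    for hit in keep_rhythm_only(generated_melody):
        print(hit)
```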

Michael Lynn : [00:35:54] So I'm curious about the authoring process with Magenta, and I mean, maybe even specifically with the way Yacht put this album together, what are the input files? What trains the system?

Doug Eck :[00:36:07] So in this case, this was great. We gave them the software, they provided their own MIDI stems from their own work, so that they really controlled the process. You know, our software is put out and is licensed, you know, it's an Apache license, but we make no claims on what's being created. They put in their own data, they own it all. And so that actually made the process much more interesting. They weren't, like, working with some, like, weird, like, classical music piano dataset, right? They were, like, working with their own stems from their own, um, their own previous recordings.

Michael Lynn : [00:36:36] Fantastic.

Anaiya Raisinghani : [00:36:38] Great. For my last question to kind of round this out, I just wanted to ask, what do you see that's shocking and exciting about the future of machine learning?

Doug Eck :[00:36:49] I'm so bad at crystal ball. Um,

Michael Lynn : [00:36:53] I love the question though.

Doug Eck :[00:36:56] Yeah. So, so here, I think, I think first, we should always be humble about what we've achieved. If you, if you look, you know, humans are really smart, like way smarter than machines. And if you look at the generated materials coming from deep learning, for example, faces, when they first come out, whatever new model first comes out, it's like, oh my God, I can't tell them from human faces. And then if you play with them for a while, you're like, oh yeah, they're not quite right. They're not quite right. And this has always been true. I remember reading about, like, when the phonograph first came out and they would, they would demo the phonograph on, on like a stage in a theater. And this is like a, with a wax cylinder, you know? People would leave saying, it sounds exactly like an orchestra. I can't tell it apart. Right. They're just not used to it. Right. And so, like, first I think we should be a little bit humble about what we've achieved. I think, especially with, like, GPT-3-like models, large language models, we've achieved a kind of fluency that we've never achieved before. So the model sounds like it's doing something, but, like, it's not really going anywhere. Right. And so I think, I think by and large, the real shocking new, new breakthroughs are going to come as we think about how to make these models controllable. So can a user really shape the output of one of these models? Can a policymaker add layers to the model that allow it to be safer? Right. So can we really have, like, use this core neural network as, you know, as a learning device to learn the things it needs to find patterns in data, but to provide users with much, much more control about how, how those patterns are used in a product? And that's where I think we're going to see the real wins, um, an ability to actually harness this, to solve problems in the right way.

Anaiya Raisinghani : [00:38:33] Perfect. Doug, thank you so much for coming on today. It was so great to hear from you.

Doug Eck :[00:38:39] That was great. Thanks for all the great questions, Anaiya. It was fantastic.

Michael Lynn : [00:38:44] I'll reiterate that. Thanks so much, Doug. It's been great chatting with you. Thanks for listening. If you enjoyed this episode, please like and subscribe. Have a question or a suggestion for the show? Visit us in the MongoDB community forums at community.mongodb.com.

Thank you so much for taking the time to listen to our episode today. If you would like to learn more about Doug’s work at Google, you can find him through his LinkedIn profile or his Google Research profile. If you have any questions or comments about the episode, please feel free to reach out to Anaiya Raisinghani, Michael Lynn, or Nic Raboy.

You can also find this and all episodes of the MongoDB Podcast on your favorite podcast network.
