Why Data Scientists Need MLOps with Mikiko Bazeley

Episode 17

MLOps Weekly Podcast

Mikiko Bazeley, Head of MLOps, Featureform


[00:00:06.170] - Simba Khadder

Hey everyone. Simba Khadder here and you are listening to the MLOps Weekly Podcast. Today we're doing something a little bit different. I'm going to be speaking with our head of MLOps here at Featureform, Mikiko. Before I go further, Mikiko, why don't you go ahead and introduce yourself.

 

[00:00:22.930] - Mikiko Bazeley

Hey, everyone, my name is Mikiko Bazeley and I joined Featureform pretty recently, let's say around last October, as head of MLOps, as Simba mentioned. Prior to joining Featureform, I worked in a number of roles as a data analyst, data scientist, and more recently as an MLOps engineer working on the Mailchimp ML platform team.

 

[00:00:51.780] - Mikiko Bazeley

I've also worked in a ton of different industries, like you name it: real estate tech, HR tech, antipiracy, which yeah, I know, I feel bad about that, but a girl's got to pay rent, solar, even for example 3D design modeling, which has become more relevant for a company like Autodesk. Really super happy to finally have this conversation on the podcast.

 

[00:01:16.610] - Simba Khadder

I know we've been talking about doing this forever, so I'm really excited to be able to have you on. Man, there's so much to talk about. I would love to start maybe by talking about how you started your career as a data scientist, but then went and made the jump from more applied data scientist to more ML engineer. I would love to first jump into that transition, that difference that you saw between being a data scientist and an ML engineer. Yeah, let's just start there.

 

[00:01:43.430] - Mikiko Bazeley

Yeah, it's funny because when I've talked to other MLOps engineers or the platform engineer types, everyone's origin story really starts off from the story of trying to move your model from a Jupyter Notebook to something that's actually in production.

 

[00:01:58.930] - Mikiko Bazeley

Specifically, that moment came to me early on. Right before I think quarantine hit, I was working as a data scientist focused on growth marketing at Livongo, which was soon to be acquired by this company called Teladoc, which was involved in telemedicine and was considered one of the biggest telemedicine companies in the US, at least.

 

[00:02:22.210] - Mikiko Bazeley

I worked for Livongo. Our main focus was really on IoT data for people with chronic conditions, specifically with diabetes, hypertension, and there's a few others. We were just starting to get into kidney failure. Essentially, the way we did that was Livongo created a glucose meter or a diabetes monitor. We'd get that information, analyse it, and then try to make the patients or the users aware that, hey, you might be having a spike.

 

[00:02:52.860] - Mikiko Bazeley

My particular focus, though, at that company was actually trying to get people enrolled into the programs. At Livongo, we would partner with companies like, for example, Home Depot or a few others, where they are either self-insured or even maybe they're part of Medicare/Medicaid.

 

[00:03:10.790] - Mikiko Bazeley

My goal was to, number one, make sure that when the company sent out any emails or even mail (they were actually dealing with physical mail), we understood, well, first off, whether or not someone was going to sign up and get enrolled into the program. Secondly, what were the potential factors as to why they wouldn't sign up?

 

[00:03:34.050] - Mikiko Bazeley

There's like a number of things we were trying to understand. We wanted to get people enrolled in these health programs that we really believe would improve their lives, especially with chronic conditions. Of course, in that instance, or in that scenario, what you really want is to be able to very quickly segment users or patients who are going to enroll or not enroll.

 

[00:03:56.570] - Mikiko Bazeley

You want to be able to very quickly deliver that to our nurses, to our in-service caretakers, etc. They need that information sooner rather than later. This is especially true if you're trying to look at spikes in their glucose or A1C for diabetes. Being able to have those inferences and predictions served as real-time as possible, whether it's truly real-time or whether it's in fact more of like a batch process, was really important.

 

[00:04:28.790] - Mikiko Bazeley

Of course, when I was first trying to develop predictive models, I had come out of a boot camp prior to that. But the actual deployment and the serving and monitoring was really just not talked about or taught. I'm not blaming the boot camp for it, but I think those are really, really complicated problems.

 

[00:04:48.700] - Mikiko Bazeley

A lot of times we just did not have either the engineering resources or even the experience on the engineering side dealing with machine learning to be able to get those models in production. I was doing a lot of bootstrapping. I saw people around me, even some of the senior data scientists, having that same issue.

 

[00:05:09.190] - Mikiko Bazeley

I realised that, look, this is actually like a real bottleneck for whatever reasons, and it's something that I really want to work on. I feel like it was really valuable because no matter what, I could create the most beautiful models in the world or the most beautiful analyses, but if I couldn't actually get it in front of people in a scalable way, my value as a data scientist was going to be severely bottlenecked and limited.

 

[00:05:33.910] - Simba Khadder

Got it. I guess you started seeing that problem, the ML engineering problem. It sounds like that's what drove you to want to work on that problem in particular, moving to the ML engineering side at Mailchimp.

 

[00:05:46.930] - Mikiko Bazeley

Yeah, absolutely. A huge part of it, too, was I noticed that early on there was this gap between data science practitioners, or people who are building these data science or machine learning assets, and the deployment and the scaling layer. How do we standardise and how do we make this into a practice that can really be enabled?

 

[00:06:10.730] - Mikiko Bazeley

I was super interested in that, especially since I was like a boot camp grad, self-taught and formally taught. I wanted to really understand what are ways that we can enable experimentation and innovation at scale. Yes, that's when I started saying, okay, I want to pivot more to the platform side, start to make those moves, and then join the Mailchimp team to do exactly that work.

 

[00:06:36.930] - Simba Khadder

You talked about value both on the data science side and the MLOps side. On the data science side, you mentioned that MLOps was bottlenecking your ability to create value as a data scientist. I want to first go into that piece. As the macro environment is changing, people, especially data scientists, I think many of them are looking around and trying to say, hey, how am I bringing value to my organisation? And trying to be able to show that.

 

[00:07:03.940] - Simba Khadder

My first question to you is, how does a data scientist do that? Where do you feel like data scientists provide value to an organisation?

 

[00:07:12.890] - Mikiko Bazeley

Yeah, and actually, I would argue that what I was being bottlenecked on was a lack of MLOps practice and tooling. I would say that was actually the bigger bottleneck. In terms of where data scientists bring value in an organisation, I remember a few years ago, and it's funny how stuff just changes, what people think is really important in a data scientist role. A few years ago, everyone was all about the full stack data scientist, similar to how they're all about the full stack web unicorn or what have you. Everyone was saying, look, data scientists should be able to do infrastructure. They should be able to do Kubernetes and Jenkins and Terraform, etc.

 

[00:07:57.960] - Mikiko Bazeley

I really don't think that's where data scientists excel. For me, personally, I was very blessed in being able to see really awesome data scientists at Teladoc, who created some amazing analyses and research on what were some of the risk factors for our chronic condition population with regards to COVID. There was some fantastic work there.

 

[00:08:23.220] - Mikiko Bazeley

I also saw some really awesome data scientists at Mailchimp, some of the really awesome projects that they were doing. Even before, I'm trying to think. This was maybe around the time Stable Diffusion came out, and I think a little bit after, and maybe right before ChatGPT-3 was being announced, or sorry, GPT-3 was being announced, not ChatGPT, but GPT-3 was being announced.

 

[00:08:48.270] - Mikiko Bazeley

Right in between that time period, for example, some of our data scientists were actually trying to figure out ways to create business value using some of the existing generative AI models out there for Mailchimp's small and medium-sized business users.

 

[00:09:03.680] - Mikiko Bazeley

I remember there was a very specific set of models that they were working on to power what's called Creative Assistant. Specifically, what Creative Assistant does is help small and medium-sized business owners relieve the burden of, for example, having to create email content, like the images and the copy, rather than having to go hire and organise, like, a designer off of Fiverr to do that.

 

[00:09:33.830] - Mikiko Bazeley

Instead, what someone could do was use Creative Assistant to scrape their own website or business, to get all the design elements out of their existing site or storefront with Mailchimp. They could then, pretty easily in a UI, generate different email layouts, generate different social media layouts. They could generate different photographic assets, like color and copy, from a number of the machine learning pipelines that were running.

 

[00:10:06.530] - Mikiko Bazeley

I thought that was awesome work. The team at Mailchimp, they were proving business value with cutting-edge ML. A huge part of why they were able to do that was, one, they had a very strong understanding of ML systems and ML algorithms specifically.

 

[00:10:29.140] - Mikiko Bazeley

For example, how would you design a recommendation pipeline? What are the best algorithms for certain tasks? How do you measure those algorithms? What are some things you need to look for and worry about in the data?

 

[00:10:42.580] - Mikiko Bazeley

As well as, for example, being able to interface with the product teams, being able to interface with marketing. I think there's a lot of things that data scientists can do. Largely it's adding the creative innovation and the awareness of research that's going on.

 

[00:11:00.680] - Mikiko Bazeley

I just don't feel like infrastructure is where they should be spending their time, or even figuring out what the happy path is for getting models deployed. They should really be spending time on experimentation and training.

 

[00:11:14.500] - Simba Khadder

Got it. Just to repeat back to you what I think I heard, it almost sounds like what you're saying is most of a data scientist's value is in doing data science. It's in taking data and finding insights from it, or even feeding those insights back into product features, like you mentioned Creative Assistant. Is that fair? Is that how you should think about it?

 

[00:11:34.510] - Mikiko Bazeley

Yeah, absolutely.

 

[00:11:36.190] - Simba Khadder

What's the difference? I know you were talking about how you were working on the growth side. On the growth side, it seemed like it'd be more internal facing. There's also kind of this external facing side, I think people call it like a product data scientist now, where you're building, like, a recommender system is a very obvious example of this.

 

[00:11:52.630] - Simba Khadder

Do you feel like there's these different classes of data science, or do you feel like it's all one umbrella and that any data scientist can jump between them?

 

[00:12:00.300] - Mikiko Bazeley

I feel like when there is specialisation, it's more about domain as opposed to external versus internal. I do see this external versus internal thing, and I do feel like that's almost like a maladaptive practice. Where a data scientist should be specialising is maybe in domain and the types of problems that they're equipped to work on, as opposed to whether they're working on stuff that's internal versus external.

 

[00:12:31.470] - Mikiko Bazeley

I think that's for the platform engineers, to be able to enable data scientists in a company to create models regardless of where the ultimate outcome or the end goal is. I'm curious. I feel like I see data scientists, when they do eventually specialise, it's like computer vision versus NLP versus forecasting. Do you feel like that's a fair summarization?

 

[00:12:58.330] - Simba Khadder

I've actually never thought of it this way, but I actually think you're spot on. Why I haven't thought of it this way is my background was much more recommender systems, which are almost always external. These have a wide variety of problems that just don't exist in what we were calling internal problems. There's some overlap, but I just think that recommender systems are a specific specialisation that typically just happens to be tied to what we're calling external.

 

[00:13:27.340] - Simba Khadder

But if you think about computer vision, there are many use cases I could imagine. Let's say you're doing NLP, like document processing. You could be doing document processing because you are providing people, I don't know, like autocorrect on their email, or because you are trying to do analytics for an internal team.

 

[00:13:50.830] - Simba Khadder

I feel like as a data scientist, assuming that the MLOps platform was very strong, most of the techniques you're going to be using are the same. The things that are different are like the scale and other things, which again in theory are abstracted away by the MLOps platform or ML platform you're using.

 

[00:14:10.220] - Simba Khadder

I think that's right. I feel like as a data scientist, you might become specialised in the type of data, the type of model, the type of problem space you're working in. And hey, I'm doing a million things externally, or hey, I'm just outputting a spreadsheet, almost shouldn't matter.

 

[00:14:29.270] - Simba Khadder

If you're truly trying to extract the pure value a data scientist can provide, but no one else can provide, you want to just get them as close as possible to driving insights, building models. Everything else is a detail, and all the other things about external and internal, though they do exist, those are just more maybe artifacts of the fact that we're so naive in how we build our ML platforms today. It's just there's so much to be done.

 

[00:14:59.790] - Simba Khadder

We could almost draw the same analogy in building, let's say, a typical dev service. If you're building an internal dev service, chances are the scale is way lower. The UI doesn't have to be as good. There's all this stuff that doesn't matter as much if it is internal as opposed to if you're building...

 

[00:15:19.190] - Simba Khadder

On the external side, there's this whole new problem space. But as time has gone on, sure, we can ignore more things if we're building internal versus external just based on scale. But I mean, nowadays it's all like Kubernetes anyway, they're all like services anyway, they're all in Docker, it's all written in the same languages.

 

[00:15:37.550] - Simba Khadder

Over time, it's almost like we've gotten rid of that differentiation. The only difference is that your requirement space might be slightly different, but that's really the only place that comes up.

 

[00:15:48.750] - Mikiko Bazeley

Yeah, it's interesting because if you think about it, like a platform or an ML stack that's done well, the data scientist shouldn't even have to think about the implementation details other than their work on the training, experimentation, and a good chunk of the data side.

 

[00:16:06.680] - Mikiko Bazeley

Yet I feel I've definitely seen teams where, for whatever reason, they just get stuck in the tooling or the infrastructure of the platform, and then they essentially have to relearn really bad or atypical patterns. That might seem very intuitive to pure engineers, but from a data science perspective, some patterns are just absolutely painful.

 

[00:16:33.590] - Simba Khadder

Yeah, I think there's a lot of differences between data scientists and engineers. Data science inherently has no clear path. It's not like every iteration you do is better or closer to where you're going to end up. Whereas in software engineering, typically, you know where you're going. You're moving towards solving this requirement set.

 

[00:16:55.010] - Simba Khadder

With data science, it's a lot more of a winding path, and a lot more like, experimentation doesn't really exist in software engineering, not in the same way. You might experiment on product features, you run tests, but you don't really experiment in the sense of, I'm just going to try this giant approach and then, let's just throw it away and try this other approach.

 

[00:17:17.490] - Simba Khadder

It just doesn't happen. You don't throw things away. If you threw away 99% of what you do in software engineering, that would be awful. Whereas in data science, it would be pretty normal if you threw away most of what you did, 'cause it's all about learning more so that you can eventually build the best thing.

 

[00:17:33.030] - Simba Khadder

But yeah, actually going into the ML stack, we did talk about how these patterns are different, and I think we've both talked in depth before about how the ML stack and ML platform should really be focused on the data scientist as their end user. If they hate it, then you aren't doing your job as the ML platform team.

 

[00:17:54.630] - Simba Khadder

But maybe just to even broaden that question a bit for you, what do you think is the goal of an ML platform? What are the key metrics or maybe even the North Star of an ML platform?

 

[00:18:08.170] - Mikiko Bazeley

Yeah, totally. I think something that interests me, and I'd be curious to hear your thoughts about this later, is how often it feels like ML platforms are almost not treated seriously as platforms, or even the concept of platform as a product, like how often ML platforms are not treated as products.

 

[00:18:29.810] - Mikiko Bazeley

At least I've worked on all different maturity layers of the ML stack, from trying to create an ML stack, which was fun, for a very early-stage real estate tech startup, which was hard. That was where I found out that you in fact cannot write a production data pipeline using pandas, and that is a very bad way to go.
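The usual failure mode behind that lesson is a notebook-style pipeline that pulls an entire dataset into memory in one call. A minimal sketch of the contrast, with a tiny in-memory CSV standing in for what would, in a real pipeline, be far more data than the machine can hold:

```python
import io
import pandas as pd

# Stand-in for a large source file; in production this would be far bigger than memory.
raw = io.StringIO("date,amount\n2020-01-01,5\n2020-01-01,7\n2020-01-02,3\n")

# The notebook habit is pd.read_csv(...) on the whole file, then a groupby. That works on a
# sample, but memory grows with the data and the job eventually falls over in production.
# Reading in bounded chunks keeps memory flat, though you end up hand-rolling the kind of
# incremental processing an orchestrator or warehouse would normally handle.
totals = {}
for chunk in pd.read_csv(raw, chunksize=2):
    for date, amount in chunk.groupby("date")["amount"].sum().items():
        totals[date] = totals.get(date, 0) + amount

print(totals)  # {'2020-01-01': 12, '2020-01-02': 3}
```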

 

[00:18:52.490] - Mikiko Bazeley

Early stage to late stage, like, for example, Mailchimp, I thought there were a few pieces that were a little bit missing, like that bridging from the dev, training, and experimentation to the serving part was a little bit rough, but I think they were 80%-90% of the way there.

 

[00:19:12.160] - Mikiko Bazeley

In terms of North Star metrics, the thing that is fascinating is, for I guess people who aren't aware, there are North Star metrics, and there's also the one metric that matters. The idea of a North Star metric was a single metric that everyone in a company or business could rally around, including marketing, sales, revenue, product, to keep the boat going forward.

 

[00:19:42.090] - Mikiko Bazeley

I feel like we do a very bad job of measuring ROI or even efficacy when it comes to ML platforms. Like, for example, there was a fun conversation, actually there have been a few conversations, in the MLOps community but also in other Discords I've seen, where people are trying to figure out what metric to measure.

 

[00:20:04.750] - Mikiko Bazeley

Should it be time to deploy? Should it be deployment frequency? Should it be change failure rate? Should it be the number of manual tasks, or what have you? I think a huge part of it comes down to the fact that we like to conflate project metrics with platform metrics. Very specifically, a lot of times platform teams like to try to measure their velocity through how long it takes projects to get up and running.

 

[00:20:36.620] - Mikiko Bazeley

That's challenging for a number of reasons. One, data science projects can kind of go sideways early on. Because there's always this big cloud of question marks, I think it's really hard to peg, like, we're going to peg the efficacy of our platform to how well the data science pod is getting their project through.

 

[00:20:58.390] - Mikiko Bazeley

Then the second part is a lot of platform teams focus on too many metrics that are not tied to the direct behavior that they're trying to measure. They're measuring all these baseline metrics that won't even tell you much. For example, deployment frequency, to some degree, is related to the size of your data science team. Same with mean time to restore; it could be an interesting one, but at the same time, ideally, your models aren't breaking.

 

[00:21:32.450] - Mikiko Bazeley

I think there's a couple of areas where we could streamline our understanding of metrics. I don't necessarily feel like the DORA, SRE platform metrics of time to restore, change failure rate, lead time for change, deployment frequency, I don't think these are really adequate, or that they're super relevant, for measuring what a data scientist's relationship with an ML platform is.
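To make the distinction concrete, here is a minimal sketch of those DORA-style measurements computed from a deployment log; the record fields and numbers are assumptions for illustration, not from any platform discussed in the episode:

```python
from datetime import datetime

# Hypothetical deployment records for a model-serving platform.
deployments = [
    {"committed": datetime(2023, 3, 1), "deployed": datetime(2023, 3, 3), "failed": False},
    {"committed": datetime(2023, 3, 6), "deployed": datetime(2023, 3, 10), "failed": True,
     "restored": datetime(2023, 3, 10, 6)},
    {"committed": datetime(2023, 3, 15), "deployed": datetime(2023, 3, 21), "failed": False},
]
window_days = 30

# Deployment frequency: deploys per week over the window.
deploys_per_week = len(deployments) / (window_days / 7)

# Lead time for change: average commit-to-deploy time in days.
lead_time_days = sum((d["deployed"] - d["committed"]).days for d in deployments) / len(deployments)

# Change failure rate: share of deployments that needed remediation.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

# Mean time to restore: averaged over failed deployments only.
failed = [d for d in deployments if d["failed"]]
mttr_hours = sum((d["restored"] - d["deployed"]).total_seconds() / 3600 for d in failed) / len(failed)

print(f"{deploys_per_week:.1f} deploys/week, {lead_time_days:.1f} day lead time, "
      f"{change_failure_rate:.0%} change failure rate, {mttr_hours:.1f}h MTTR")
```

All four numbers are easy to produce from a deployment log, but none of them says whether a data scientist actually finds the platform worth using, which is the gap being described here.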

 

[00:22:00.010] - Simba Khadder

Yeah, it's almost like SRE again, like software in general, you write your code and you test it and you expect the business logic to be correct. But in any distributed system, which is where SREs are going to be, there's just so much that can go wrong. It's like you can have a network outage, you can have a partition, you can have weird delays in packets where just like all of a sudden you get these really strange race conditions.

 

[00:22:30.930] - Simba Khadder

I feel like a lot of what SREs focus on is solving those problems where it's just like, hey, there's no way we can write, or it just doesn't make sense to write, your code so it's actually perfect. Let's just write it so it's good enough and just let the platform abstract it away. Like, yeah, one in every million requests on Netflix is a 404. That's fine. We can accept that. The UI will just refresh it. That's okay.

 

[00:22:57.700] - Simba Khadder

It's better to do that and just handle the error really well than to try to write the code in such a way where it's perfect, because honestly, it can't be. If it's a certain type of network partition, it just actually can't work.

 

[00:23:09.230] - Simba Khadder

Whereas it sounds like what you're saying, which I really like, is, for data science, the problem space is less about, hey, let's release things quickly and make sure they don't break. It sounds like a lot of the... There's just different stages that almost don't exist, like experimentation and even just data analysis.

 

[00:23:30.390] - Simba Khadder

The types of things that can break, it's almost like things don't break in a very binary way, too. Like a model might just drift, and then all of a sudden you end up with, hey, we just might want to retrain, or we might want to change our training set in such a way where it better encapsulates the behavior we're seeing in production today.

 

[00:23:51.180] - Simba Khadder

You mentioned metrics that aren't great or maybe are imperfect. Is there such a thing as a set of metrics, or even just a single North Star metric for an ML platform team, if you had to say one? Or do you think, in general, it depends? What does it depend on? If I'm running an ML platform team and I'm trying to decide, hey, how do we measure our efficacy, what do I do?

 

[00:24:12.920] - Mikiko Bazeley

Yeah, and I'd be curious to hear your thoughts about this afterwards. I do think it depends on the stage of the... It's almost like the maturity of the data science and ML org, and then the stage they are in terms of implementing tooling and platforms.

 

[00:24:31.490] - Mikiko Bazeley

Because, for example, what's really interesting is that so much of the conversation about stacks has really been driven by the Google-scale companies. But I think what we're seeing, for example, even at Featureform, is there are so many companies that want to build ML platforms that just don't even fit. They're not the big tech companies.

 

[00:24:54.700] - Mikiko Bazeley

Some of them are startups, some of them are mid-size. Some of them, for example, have maybe only deployed a few models, and some of them have deployed hundreds of models. For some folks, they have one data scientist who's also the data person. Other teams have like 10-20 data scientists.

 

[00:25:13.000] - Mikiko Bazeley

I feel like there's so many different requirements on stacks. I don't think there's one single metric that's ever going to be right for all the stages. But I think if we're thinking more about platforms as products, we should be thinking, for example, about how would you measure a product, how would you measure adoption and engagement?

 

[00:25:31.970] - Mikiko Bazeley

Reliability metrics are great, but really, if you are trying to develop a platform that people love, I mean, sometimes also I think the metric could just be how many people are using it. But I'd be curious, what do you think about the different flavors and maturities of stacks that are out there? Is there a single way to design and build a stack?

 

[00:25:55.160] - Simba Khadder

Yeah, there's always two sets of problems that exist. One is what I would call the people and organisation problems, which is like getting a group of data scientists to work together productively, or even by themselves. It's like being productive alone and organised, and giving them the tools they need to do that.

 

[00:26:17.130] - Simba Khadder

Then there are almost the infrastructure problems, where it's things like, I need to hit this level of latency, I need to be able to handle this much data. I need to be able... It's usually very binary. Those look like more traditional, I guess, North Star KPI-like metrics. Like, hey, we need to be able to handle this much data and this latency, this P99 latency.

 

[00:26:45.010] - Simba Khadder

Those are great. Depending on the type of company, whether big or small, you might have that. It just really depends. If you're a SaaS company, even though you might have a lot of revenue and a lot of employees, a lot of data scientists, chances are your amount of data is dwarfed by even a much smaller B2C company or fintech company, which just typically have way, way more data way earlier on. You actually can see that in the fact that they have way more data scientists per headcount as a ratio. Even early on, they'll have big data science teams.

 

[00:27:20.510] - Simba Khadder

That's the first set of problems, which I think there is a place for, and the Google-scale thing is a good way to think of it, but it's almost not necessarily how big of a company you are, it's almost the scale of data and where you're deployed. If you're doing real-time recommendations on a ton of users nowadays, you can be a small company and still have that problem.

 

[00:27:40.390] - Simba Khadder

Then the other set of problems, which are like the organisational problems, are very much going to be correlated to how big a team you are. Typically also there's other aspects, like how regulated are you? If you're in banking, you probably have a lot more regulation, and you'd be much more likely to make your data science team less productive to make sure that they were not going against regulation, and vice versa. In the sense of, I'd rather make sure that we are correct on our regulation, even if it requires a few more steps from data scientists, than to risk them self-managing it.

 

[00:28:14.160] - Simba Khadder

That's one piece, and I think that piece looks very similar to the SRE-type metrics. The other types of metrics, which I think are typically way more important and way less understood, are the exact ones you're talking about, which are the product metrics.

 

[00:28:28.760] - Simba Khadder

Engagement is obviously a good one here. I mean, I would be very surprised if any MLOps platform lead has ever run an NPS score on their platform of [inaudible 00:28:41] data scientists. It would be super interesting. You would learn so much. Just go ask your data scientists, like, do you like our platform? Why do you like it? Why do you not like it?
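For context, NPS is simple survey arithmetic: respondents answer 0-10, scores of 9-10 count as promoters, 0-6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch with made-up survey answers standing in for a platform team's internal survey:

```python
def nps(scores):
    """Net Promoter Score on a -100..100 scale: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)

# Hypothetical answers from data scientists to "How likely are you to recommend our ML platform?"
survey = [9, 7, 4, 10, 6, 8, 3, 9]
print(nps(survey))  # 0.0 -- as many promoters as detractors
```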

 

[00:28:50.670] - Simba Khadder

It's almost like the age-old startup question, like, have you talked to your users? It's like, no. It's like, okay, well, maybe start there. Then you'll probably learn a million things you didn't know before just by talking to your users.

 

[00:29:04.500] - Simba Khadder

We have funny stories where we talked to a data scientist who was using Featureform, or looking at using Featureform, and they're like, yeah, we don't have a feature store. I had one situation where this happened, and then a month later the person emails me and says, yeah, I was starting to use it more, and then I was notified that we actually have an internal feature store.

 

[00:29:25.990] - Simba Khadder

Then from my perspective, it was so funny because this was a huge data science team. I'm like, wait, so there's a whole platform team building a feature store, but one of the dozens of data scientists was completely unaware that all this stuff exists internally.

 

[00:29:40.850] - Simba Khadder

It just shows that the platform team is building for the sake of building, and maybe focusing on these other metrics of what's our latency, what's our uptime, and not really worrying about, hey, are we actually being used? Do people like using it? Are people going to use it?

 

[00:29:53.170] - Simba Khadder

'Cause if you don't have that, it's not going to stick, and you're going to end up with like 45 different ML platforms. I've seen some of the large banks where they have dozens of different ML platforms, and there's never really one standard that rules them all, 'cause they all have their own problems. None of them, I think, were run with that product-focused mindset.

 

[00:30:12.630] - Mikiko Bazeley

Yeah, it's really fascinating how enablement is always like the last mile of an ML platform tool that just no one ever wants to do, even something as simple as making sure you have clear, centralised documentation for where you can find things. For example, what were the models that were trained? What features are being used within models? It's really fascinating how hard some of those problems are, of just making stuff usable.

 

[00:30:38.040] - Mikiko Bazeley

Something to me that's really fascinating is I feel like so many people, especially when they look at the most recent MAD landscape of 2023, just feel this absolute sense of chaos looking at that map. To me, what's really fascinating is, if we were to go back and look at all the tools that are on there, I would be so curious about that NPS score, like from the end users of the tools and all that.

 

[00:31:09.340] - Mikiko Bazeley

What's the NPS score of, for example, someone who's using a data orchestration tool or a training or model registry tool or what have you? Because the current landscape just lays out everything, but not all those tools are equal.

 

[00:31:28.500] - Mikiko Bazeley

It's fascinating because I think when companies and teams are looking at risk and tool adoption, they'll look at all those tools as being the exact same, when there's actually a lot of differences between those tools and the implications for which ones are best for which stacks or use cases.

 

[00:31:47.390] - Simba Khadder

Yeah, I bet if you take the MAD landscape, show it to, let's call it 1,000 data scientists, and ask them to point out or circle the tools that they love, you would take the MAD landscape and probably drop it down to like 15 products or something. It's really very part...

 

[00:32:07.030] - Simba Khadder

I mean, we're so early. It's not that... there's a lot of products that are still finding their way and that have a ton of potential. But I do think, especially if you count internal products, the right metric for a lot of companies comes once you check off the metrics about latency and all those; they're like check-offs.

 

[00:32:26.110] - Simba Khadder

You need four nines of uptime. Do we have that, yes or no? If no, go solve it. If yes, then you're good. Chances are... I mean, it depends, but in a lot of these situations, it's binary. Same with latency.

 

[00:32:42.150] - Simba Khadder

You can make it better, but let's say you have a recommender system and you need to serve a feature in under eight milliseconds. Let's say you were at eight milliseconds or seven milliseconds, and you drop it to five. Well, if it takes the page a certain amount of time to load anyway, you haven't really made anything better by making that metric better.
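Worked through with rough, assumed numbers (none of them from a real system), the arithmetic behind both points looks something like this:

```python
# Four nines as a yes/no check: 99.99% availability leaves roughly 52.6 minutes
# of allowed downtime per year. Either you fit inside that budget or you don't.
downtime_budget_min = (1 - 0.9999) * 365 * 24 * 60
print(f"99.99% uptime budget: {downtime_budget_min:.1f} minutes/year")

# Feature-serving latency against total page load (numbers are illustrative).
page_load_ms = 800
for feature_ms in (8, 5):
    print(f"feature lookup {feature_ms} ms -> page ready in ~{page_load_ms + feature_ms} ms")
# Shaving 3 ms off an ~800 ms page load changes nothing a user can perceive,
# which is why the latency target is effectively pass/fail rather than a North Star.
```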

 

[00:33:03.910] - Simba Khadder

A lot of these metrics are also very binary, but a metric that is not binary, and it's surely a good one to focus on if you're building or leading the ML platform team, is NPS and engagement. Just like, do people like using this thing?

 

[00:33:18.980] - Simba Khadder

Obviously, it needs to solve all the other problems, but the UX is, in my opinion, the hardest part to solve. That's why products like Terraform, and you could name a lot of... Snowflake is another good example. Why is Snowflake such a big company? Well, they just really figured out NPS. People love using Snowflake because it just works and it's really simple.

 

[00:33:40.150] - Simba Khadder

I feel like the closer MLOps platforms can move to that, the better. A lot of times they want to add that new feature, that bell and whistle, and a lot of times it's just like, make it easier, make it simpler. Just solve the problem so that you check off all these binary check marks, but as a data scientist, it just feels natural. It fits my flow, it fits my way of thinking.

 

[00:33:59.500] - Mikiko Bazeley

Yeah. I think one of the reasons why I was so excited to join Featureform and to work with everyone here is because I feel like, in some ways, MLOps is failing the original end users of MLOps, which is the data scientists.

 

[00:34:18.370] - Mikiko Bazeley

We should be making it really easy for them to do the right things. I feel like as an ecosystem, it's questionable how much we've really succeeded with that. I think asking data scientists to learn infrastructure was maybe not the right direction.

 

[00:34:33.920] - Mikiko Bazeley

I don't know. I'm super excited for the new cycle of MLOps tools, both in orchestration and workflows, in terms of getting back to that state of making it easy and seamless for data scientists to do the right things to make themselves more productive. And also so the platform engineers are pulling their hair out a little bit less.

 

[00:34:58.830] - Simba Khadder

I think one thing that's great that's happening is, with the hype around MLOps, like the bad type of hype around MLOps, finally starting to fall out, all that's left is value creation. Anyone who's still building MLOps and still cares about MLOps is not doing it because there's a ton of VC money or there's another unicorn every month. Because if you wanted to be doing that, you'd be doing generative AI now.

 

[00:35:27.930] - Simba Khadder

If you're still doing boring MLOps, it's because you actually care about the problem. If you actually care about the problem, I think what we're seeing a lot of now is people finally doing a lot of what we're talking about and treating their ML platform as a product, and not as this weird low-level infra tool.

 

[00:35:47.220] - Simba Khadder

It's a product that data scientists use, and I'm very excited for it. I think that I'm excited for us as an ecosystem, beyond just Featureform, to actually start to make data scientists' lives easier and to get to this point where it's almost like a duh, like CI/CD. There was a time where it was like, that's crazy that you do that, and nowadays it's crazy if you don't.

 

[00:36:12.130] - Simba Khadder

I think we're going to see the same in ML platforms. It becomes so easy, it becomes so much a part of the workflow, that everyone just uses it because that's just how you do data science.

 

[00:36:21.660] - Simba Khadder

When you do a boot camp, it's not like the MLOps chapter is a whole new complicated chapter. It just fits in with the original things, like the way you do it from the beginning. This inherently is built to be scaled, to be reliable, to easily fit into a platform.

 

[00:36:37.100] - Simba Khadder

We already do that with Docker, we do that with a lot of tools in traditional software engineering. I think we're going to see the same thing in data science.

 

[00:36:44.760] - Simba Khadder

Mikiko, I feel like we could talk all day, and we probably will continue to talk after this, but I do know we're at time, and I just want to thank you for making the time, hopping on, and chatting with me for everyone to listen to.

 

[00:36:57.970] - Mikiko Bazeley

Yeah, this is great. Luckily, since I'm part of the team now, hopefully we'll have future conversations, too.

 

[00:37:03.830] - Simba Khadder

I think we will. I think we will. Awesome. Thank you.
