Webinar

IN – Navigating the Efficient Frontier for Large Language Models (LLMs)

Balancing Security, Cost, Quality, and Speed

  • Date

    15 Feb'24
  • Time

    11:00 AM - 12:00 PM IST
  • Duration

    60 mins
Transcript

24
00:03:52.330 –> 00:04:06.500
Renga: We are going to introduce you to a webinar where we talk about the efficient frontier for LLMs. Simply put, we're going to talk about

25
00:04:06.650 –> 00:04:21.859
Renga: AI adoption in the enterprise and what it entails for you in terms of cost, quality, and, mainly, security, and the quality of the results that you get from deploying each of these LLMs.

26
00:04:21.959 –> 00:04:41.939
Renga: To take you through this webinar, we have our CTO, Ranjit, who's going to walk us through the various aspects and considerations you need to keep in mind while choosing these LLMs, and the challenges and considerations we kept in mind when we started deploying our own Gen AI infrastructure in our enterprise.

27
00:04:42.150 –> 00:04:45.490
Renga: Without much further ado, Ranjit, over to you.

28
00:04:46.530 –> 00:05:02.800
Ranjit: Thanks, Renga, and thanks to everyone who's joined. Thanks for making the time. My goal is to spend maybe the next 40 to 50 minutes with you, discussing some of the capabilities we have here at Autonomate, but mostly focusing on some ideas that we have

29
00:05:02.860 –> 00:05:11.390
Ranjit: built and that we are taking forward as we join this generative AI revolution.

30
00:05:11.420 –> 00:05:19.309
Ranjit: So let me get started by giving you a brief description of what we do at Autonomate, and this will give you an idea of

31
00:05:19.330 –> 00:05:25.330
Ranjit: why we're approaching these problems and why these problems are important for us to solve.

32
00:05:25.370 –> 00:05:44.020
Ranjit: At Autonomate, our slogan is "enabling the autonomous enterprise." What this really means is that, over time, we believe that in much the same way that vehicles became autonomous, enterprises can become autonomous by becoming self-driving, learning,

33
00:05:44.020 –> 00:06:04.550
Ranjit: and having the ability to change course based on real data. That's the high-level idea. Today, I wanted to talk about how we can start using LLMs and the capabilities they bring into the whole idea of the autonomous enterprise and into workflow automation, which is our core capability.

34
00:06:04.560 –> 00:06:13.410
Ranjit: So around this, I want to talk about the concept of an efficient frontier, and I’ll explain what this means, and I’ll show you a few slides and a few demos.

35
00:06:14.350 –> 00:06:19.750
Ranjit: I just wanted to spend maybe three to five minutes describing what we do at Autonomate.

36
00:06:20.020 –> 00:06:45.389
Ranjit: So, as indicated on this slide, we hyperautomate enterprise workflows. Hyperautomation is a somewhat loose term, defined by Gartner a couple of years ago, but the idea is that it's an aggressive approach to automation where everything that has the potential to be automated is automated. This includes everything ranging from voice

37
00:06:45.390 –> 00:06:55.680
Ranjit: to text to customer interaction to back-end processes and all of that. By doing this, we help our customers save time and money,

38
00:06:55.700 –> 00:07:07.099
Ranjit: and specifically for one of our key segments, the financial segment, we enable them to be flexible with change and maintain compliance with regulations.

39
00:07:08.150 –> 00:07:27.310
Ranjit: First of all, just a high-level, bird's-eye view of the platform we offer. We offer what is called an intelligent platform as a service, which means we give customers, in this case enterprises, the capability to build and deploy their own workflows,

40
00:07:27.500 –> 00:07:41.300
Ranjit: and these are end-to-end workflows, starting from when a customer initiates an engagement with the enterprise all the way until whatever issue they came with has been resolved. So, broadly, we have

41
00:07:41.370 –> 00:07:48.100
Ranjit: three key capabilities in any workflow. We think of it as a customer journey. The first

42
00:07:48.350 –> 00:08:12.690
Ranjit: element of this is what we call the intake process, whereby a customer comes in and engages with the enterprise. They can do this through many different channels: through a virtual agent like a chatbot, a live agent, mobile apps, and so on. We provide a suite of low-code tools that enable enterprises to shape this front-end experience.

43
00:08:12.690 –> 00:08:31.970
Ranjit: Once the information comes in through the front end, we have on the back end a powerful workflow engine that can use that information to resolve the customer's problem. For example, if someone's applying for a loan, they would enter information about the loan through a bunch of different channels,

44
00:08:31.990 –> 00:08:47.380
Ranjit: and then, once this information is collected on the back end, they could put into place many different workflows like credit checks, salary checks, financial appraisals, and so on. These can integrate with back-end systems so that

45
00:08:47.580 –> 00:08:54.910
Ranjit: the entire loan application can be carried through from start to finish, and eventually a decision gets sent back to the customer.

46
00:08:54.990 –> 00:09:02.810
Ranjit: During this process, we also provide significant analytics capability, with a bunch of off-the-shelf tools as well.

47
00:09:03.080 –> 00:09:09.170
Ranjit: Now, the reason I'm telling you this is that, as we saw

48
00:09:09.210 –> 00:09:24.830
Ranjit: over the past 18 to 24 months the wave of generative AI coming along, we have started to infuse the capabilities of Gen AI into our platform, and so I wanted to give you a bit of a breakdown of how we do that.

49
00:09:25.130 –> 00:09:44.700
Ranjit: If you look at the high-level modules present in our platform, there are front-end capabilities: low-code tools to build front-end experiences. You can build conversations with the chat tool, you can build apps with 8 Solo, and you can have human interaction in the loop through

50
00:09:44.700 –> 00:09:56.580
Ranjit: the 8 Live product. Then on the back end you have 8 Studio, which is kind of the umbrella tool with which enterprises can design these journeys. 8 Flow is our back-end

51
00:09:56.770 –> 00:10:11.899
Ranjit: engine that enables you to create and deploy business processes. Finally, all of this is built on top of an analytics engine that gives you reports, analytics, and those kinds of things. We saw the promise of Gen AI a long time ago,

52
00:10:11.900 –> 00:10:31.769
Ranjit: and we've infused it into every stage of our low-code platform. Every screen in our platform allows you to enter a prompt, whether it's to generate code, to create workflows, to create SQL queries for reports, or to

53
00:10:31.890 –> 00:10:40.229
Ranjit: create user interfaces. Most of these tools are out, and a few are coming out in the next couple of months. So Gen AI is important.

54
00:10:40.490 –> 00:10:52.380
Ranjit: So I wanted to start getting into what Gen AI means, and how you can efficiently use LLMs when you're using Gen AI.

55
00:10:52.580 –> 00:11:19.150
Ranjit: If you look at how Gen AI has impacted companies, everyone's excited about the capabilities and the promise of Gen AI, but they're a little hesitant about some of the underlying issues related to cost, security, and integration with legacy tools. These are some of the challenges we're trying to address, and I'll talk about some of the ways we address them in today's webinar.

56
00:11:20.240 –> 00:11:26.129
Ranjit: To get into that, I wanted to introduce this idea of an efficient frontier.

57
00:11:26.310 –> 00:11:31.269
Ranjit: An efficient frontier is an old term that's been used in the stock market,

58
00:11:31.320 –> 00:11:41.659
Ranjit: and the idea is that at any given time you want to have a portfolio of stocks that manages the risk and reward

59
00:11:41.720 –> 00:11:52.169
Ranjit: of your investments in a way that's most suitable for you. So that's what it is. To maintain an efficient frontier, what you want to do is

60
00:11:52.240 –> 00:12:03.549
Ranjit: have the ability to buy and sell stocks on a continuous basis, so that you maintain the proper ratio of risk and reward.

61
00:12:04.160 –> 00:12:29.979
Ranjit: Now, we are taking the idea of the efficient frontier and applying it to using large language models. Like I said, some of the challenges around using LLMs are security, cost, quality, and speed. These are the four challenges that we've identified, and what we've done is build a sort of adaptive framework that has access to multiple models

62
00:12:30.550 –> 00:12:47.949
Ranjit: and allows you, at any given time, to decide what's most important to you. If you decide that cost is most important, we can balance cost by using the models that will give you the highest quality at the cost that you prefer, and so on. So the idea is that just like you have risk and reward

63
00:12:47.950 –> 00:13:01.149
Ranjit: being balanced in the world of investing, you have security, cost, quality, and speed being managed through this efficient frontier that we're building for large language models.

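[Editor's note: to make the efficient-frontier idea concrete, here is a minimal sketch, not Autonomate's actual implementation, of how a router might score candidate models against user-supplied weights for security, cost, quality, and speed. The model names and metric values are illustrative only.]

```python
# Illustrative sketch only: scoring candidate LLMs against user priorities.
# Metric values and model names are made up for demonstration.
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    cost_per_1k_tokens: float   # USD, lower is better
    avg_latency_ms: float       # lower is better
    quality_score: float        # 0..1, higher is better
    security_score: float       # 0..1, higher is better (e.g. self-hosted = 1.0)

def pick_model(candidates, weights):
    """Return the model with the best weighted score.

    `weights` maps 'cost', 'speed', 'quality', 'security' to relative importance.
    Cost and latency are inverted so that higher is better for every term.
    """
    def score(m: ModelProfile) -> float:
        return (
            weights.get("cost", 0) * (1.0 / (1.0 + m.cost_per_1k_tokens))
            + weights.get("speed", 0) * (1.0 / (1.0 + m.avg_latency_ms / 1000))
            + weights.get("quality", 0) * m.quality_score
            + weights.get("security", 0) * m.security_score
        )
    return max(candidates, key=score)

candidates = [
    ModelProfile("model-a", 0.03, 1800, 0.90, 0.6),
    ModelProfile("model-b", 0.002, 600, 0.75, 0.6),
    ModelProfile("model-c", 0.004, 900, 0.80, 0.9),
]
# A cost-conscious profile: cost first, security second, speed third.
print(pick_model(candidates, {"cost": 0.5, "security": 0.3, "speed": 0.2}).name)
```
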
64
00:13:01.760 –> 00:13:06.969
Ranjit: So one of the key lessons of using these large language models is that you need

65
00:13:07.250 –> 00:13:30.239
Ranjit: to have a diversification strategy. If you buy stocks, you don't put all your money in one company; you diversify. You need to diversify in the same way when you're investing in large language models. You need to decide which one is suitable for which use case, and you need a flexible approach to acquiring these models, integrating with them, and deploying them.

66
00:13:30.240 –> 00:13:47.160
Ranjit: And you can see that some of the large vendors are supporting this idea. Google has the Model Garden, which allows you to bring up servers with different open-source LLMs, Microsoft has a model catalog, and so on. So

67
00:13:47.360 –> 00:13:58.390
Ranjit: the point is, you should use the LLM that's best suited for your problems, you should use as many of them as you need, and you should not get locked into any one particular solution.

68
00:13:59.230 –> 00:14:04.570
Ranjit: Just for illustrative purposes, I've given you a bunch of different

69
00:14:04.740 –> 00:14:20.780
Ranjit: criteria by which people can evaluate LLMs. If you want something that's instructional, say, something that has the ability, in a customer support situation, to take a problem and give you instructions about it, there are LLMs tuned for those kinds of

70
00:14:20.820 –> 00:14:45.080
Ranjit: purposes. There are multimodal models, which means they can deal with image, text, voice, and all of those different types of data, and so on. There's a whole bunch of them listed here. I just wanted to show you this list because there's a diversity of criteria that may be important to you, and a large number of models that will probably satisfy your requirements.

71
00:14:46.230 –> 00:14:49.190
Ranjit: I wanted to quickly take you through

72
00:14:49.330 –> 00:14:51.100
Ranjit: our approach

73
00:14:51.130 –> 00:15:00.040
Ranjit: that we use and make available to our customers when it comes to applying this efficient frontier idea.

74
00:15:00.210 –> 00:15:06.309
Ranjit: Our approach is that we provide you with the infrastructure

75
00:15:06.540 –> 00:15:19.179
Ranjit: that allows you to specify what's important to you. Based on your use case, let's say you decide that cost is most important, security is second most important, and maybe the speed of response is third.

76
00:15:19.180 –> 00:15:46.629
Ranjit: Based on that, we have a framework that can use, deploy, and configure the appropriate LLMs to meet those requirements. As an example, some models are more costly than others; you may want to avoid those if you're cost conscious. Some are slower than others but maybe better; you might want to avoid those if you want a fast response for your customers. This picture basically shows you that there's an

77
00:15:46.690 –> 00:15:53.679
Ranjit: infrastructure that sits between the customer and the LLMs, and we manage that infrastructure for you.

78
00:15:53.790 –> 00:16:03.360
Ranjit: This infrastructure allows you to minimize cost, optimize security using many different approaches, maximize quality, and maximize speed.

79
00:16:05.690 –> 00:16:12.530
Ranjit: I wanted to get into a few demos here, just to show you how something like this would work.

80
00:16:12.660 –> 00:16:19.800
Ranjit: The first tool I wanted to show is an LLM tracker, and the LLM tracker allows you to

81
00:16:19.940 –> 00:16:45.599
Ranjit: explore, track, and use different LLMs to see which one is most suitable for your use case. Along with that, as you start to use them, we give you information on costs, performance, quality, and so on. We have a simple demo that is connected to all the LLMs listed here,

82
00:16:45.630 –> 00:16:49.319
Ranjit: and I'll show you how this works through

83
00:16:49.410 –> 00:17:00.030
Ranjit: an interactive example first, and then by using a real-time dashboard and running a bunch of queries, just to show you how this concept works.

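[Editor's note: as a rough sketch of what a tracker like this might record per query. The pricing numbers, the model client, and the quality heuristic below are placeholders, not the actual tool.]

```python
# Sketch of per-query tracking: latency, estimated cost, and a placeholder
# quality score. `call_model` stands in for whatever LLM client is used and is
# assumed to return (response_text, output_token_count).
import time

PRICE_PER_1K_OUTPUT_TOKENS = {"model-a": 0.03, "model-b": 0.002}  # illustrative

def tracked_query(call_model, model_name: str, prompt: str) -> dict:
    start = time.perf_counter()
    response_text, output_tokens = call_model(model_name, prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    cost = PRICE_PER_1K_OUTPUT_TOKENS.get(model_name, 0) * output_tokens / 1000
    quality = min(1.0, len(response_text) / 500)  # crude placeholder heuristic
    return {
        "model": model_name,
        "latency_ms": round(latency_ms, 1),
        "cost_usd": round(cost, 6),
        "quality": round(quality, 2),
    }
```
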
84
00:17:01.710 –> 00:17:04.619
Ranjit: By the way, Renga, if people have questions, let me know.

85
00:17:06.079 –> 00:17:14.250
Renga: I'm keeping track of that, Ranjit. And, by the way, for the audience: if you have any questions, please don't wait for the end of the session.

86
00:17:14.280 –> 00:17:19.590
Renga: Please keep them coming in the Q&A widget, and I will relay them to Ranjit for you. Thank you.

87
00:17:20.369 –> 00:17:25.040
Ranjit: So the first thing I wanted to show you is this dashboard here.

88
00:17:25.170 –> 00:17:27.999
Ranjit: So this dashboard at any given time

89
00:17:28.099 –> 00:17:36.140
Ranjit: shows me... think of it this way: maybe there's an enterprise that has a bunch of different teams, each of which is using different models.

90
00:17:36.500 –> 00:17:51.510
Ranjit: What this dashboard does is collect data from many different models and give you a real-time view into what's going on, split into three key areas. The first one is the response time and speed. You'll see at this moment

91
00:17:51.550 –> 00:18:01.899
Ranjit: the model Text Bison, which is a Google model, is doing the best, and the 70-billion-parameter version, the large Meta Llama model, is not doing that great.

92
00:18:01.970 –> 00:18:24.490
Ranjit: Likewise, in terms of quality, you have some of the GPT models right on top. The third area is cost, where things are almost flipped: some of the Llama and Mistral models give you the best pricing, some of the Google models sit in the middle of the pack, and the GPT models are lower down.

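[Editor's note: a leaderboard like the one being described could be built by aggregating per-query records; this is only a sketch of the aggregation step, assuming records shaped like the tracker output sketched earlier.]

```python
# Sketch: aggregate per-query records into per-model averages for a leaderboard.
from collections import defaultdict

def leaderboards(records):
    totals = defaultdict(lambda: {"n": 0, "latency": 0.0, "cost": 0.0, "quality": 0.0})
    for r in records:
        t = totals[r["model"]]
        t["n"] += 1
        t["latency"] += r["latency_ms"]
        t["cost"] += r["cost_usd"]
        t["quality"] += r["quality"]
    avg = {
        m: {k: t[k] / t["n"] for k in ("latency", "cost", "quality")}
        for m, t in totals.items()
    }
    return {
        "fastest": sorted(avg, key=lambda m: avg[m]["latency"]),
        "best_quality": sorted(avg, key=lambda m: -avg[m]["quality"]),
        "cheapest": sorted(avg, key=lambda m: avg[m]["cost"]),
    }
```
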
93
00:18:24.600 –> 00:18:33.459
Ranjit: So let me show you how this works. This is the dashboard that we have; I'll shrink that dashboard for a minute,

94
00:18:33.570 –> 00:18:36.370
Ranjit: and I'm going to show you a tool called the LLM Explorer.

95
00:18:37.340 –> 00:18:46.589
Ranjit: This is just a simple tool that we have. We can pick any LLM that we want; let's say we go with Text Bison.

96
00:18:46.630 –> 00:18:50.810
Ranjit: So I'm going to pick Text Bison, and I'm going to, very randomly,

97
00:18:51.670 –> 00:18:54.939
Ranjit: you know, copy and paste a query

98
00:18:55.220 –> 00:18:56.060
Ranjit: from

99
00:18:56.540 –> 00:18:59.900
Ranjit: from a collection of queries that I have.

100
00:19:00.130 –> 00:19:07.679
Ranjit: I've never tried these queries before; I'm just copy-pasting from a corpus of queries. So,

101
00:19:08.150 –> 00:19:13.159
Ranjit: as this query gets run, you'll notice that this number changed from 137 to 138,

102
00:19:13.660 –> 00:19:19.329
Ranjit: and in addition to being able to run the query, I got how much my query cost me,

103
00:19:19.360 –> 00:19:22.180
Ranjit: the response time in milliseconds. And

104
00:19:22.270 –> 00:19:35.479
Ranjit: and the quality, which is 0.8. Quality is measured on a scale of, I guess, 0 to 1, so 0.8 is pretty good quality. You'll see that this was reflected

105
00:19:35.550 –> 00:19:38.929
Ranjit: in this dashboard. So the next thing I’m going to do

106
00:19:39.620 –> 00:19:40.770
Ranjit: is to run

107
00:19:40.780 –> 00:19:43.480
Ranjit: (just give me one second here)

108
00:19:46.070 –> 00:19:49.060
Ranjit: a set of queries,

109
00:19:49.240 –> 00:19:56.860
Ranjit: about 10 queries. The way we have it set up is that the system automatically picks from a

110
00:19:57.080 –> 00:20:11.949
Ranjit: corpus of about 25,000 queries, randomly picks a model, randomly picks a query, and runs them. So I'm going to run that now. Our dashboard is set to refresh every 5 seconds, so I'm going to run these 10 queries

111
00:20:12.230 –> 00:20:15.209
Ranjit: one after the other, and as they run

112
00:20:15.570 –> 00:20:21.530
Ranjit: you should start to see the dashboard changing.

113
00:20:21.540 –> 00:20:42.789
Ranjit: The most obvious one will be the change in the number of prompts, but as you go forward you should also see the cost slowly creeping up. This shows me the most recent query that was run, and you'll see that here and there some elements of the leaderboard start to swap,

114
00:20:42.840 –> 00:20:57.670
Ranjit: and things like that. But what you get is a good bird's-eye view of which models are working in terms of response and what your total cost is. So, just to explain this dashboard a little bit:

115
00:20:57.970 –> 00:21:11.569
Ranjit: this one shows you that I have 11 different LLMs currently accessible. This shows you the average response time. Obviously, Text Bison continues to do really well; this is a Google one.

116
00:21:11.570 –> 00:21:28.050
Ranjit: There are the GPT ones, which are from OpenAI. Gemini Pro is the next-gen Google LLM, and then for Text Bison there's a 32K version and the regular version. Mistral and Mixtral are both variations of the Mistral

117
00:21:28.170 –> 00:21:30.639
Ranjit: tool from a French company.

118
00:21:31.080 –> 00:21:40.319
Ranjit: Then you have the two Llamas, the 13-billion-parameter one and the 70-billion-parameter one; you'll see that this one is not that fast.

119
00:21:40.430 –> 00:21:56.129
Ranjit: Then we have a bunch of other models. You have different versions of GPT, and Text Unicorn, which is allegedly the most capable model that Google has, but it's not doing that great from a timing perspective.

120
00:21:56.360 –> 00:22:09.229
Ranjit: So this is the leaderboard for response time. This is the leaderboard for quality. Don't take the quality stats too seriously; they're somewhat made up. I have a very lightweight way of

121
00:22:09.250 –> 00:22:21.830
Ranjit: figuring out what the quality is. The correct way to figure out quality is to actually get user feedback, so take this one with a grain of salt. And then this is the actual cost. You'll see that,

122
00:22:22.060 –> 00:22:37.629
Ranjit: in terms of cost, the Mistral and Mixtral ones are the least expensive, and this one is the most expensive, in terms of average cost. And if you look at these little

123
00:22:37.790 –> 00:22:44.000
Ranjit: stats here, they show you the total that you have spent for each model,

124
00:22:44.420 –> 00:23:00.900
Ranjit: for these 146 queries. You'll see that, and this is in US dollars, having spent 45 cents, almost half of that is consumed by GPT-4, and then about a quarter is consumed by GPT-4 Turbo,

125
00:23:00.930 –> 00:23:13.010
Ranjit: and then you have small contributions from a couple of the Google models, but all the rest of them consume only a very small amount of cost. So

126
00:23:13.250 –> 00:23:17.569
Ranjit: when you look at this picture, a couple of things come to mind.

127
00:23:17.760 –> 00:23:21.859
Ranjit: The first one, which is very obvious, is that when it comes to response time,

128
00:23:21.950 –> 00:23:32.949
Ranjit: the Google Text Bison model is doing pretty well. The Text Bison model is on top in average response time, and these are queries that have about

129
00:23:33.080 –> 00:23:34.520
Ranjit: maybe

130
00:23:35.740 –> 00:23:42.319
Ranjit: 20, maybe even 40, tokens. So for these queries, it's doing a really good job.

131
00:23:42.480 –> 00:23:48.409
Ranjit: Text Bison has not even used one cent

132
00:23:48.600 –> 00:23:51.089
Ranjit: so far for all the queries.

133
00:23:51.170 –> 00:24:06.919
Ranjit: In terms of quality, like I said, don't worry too much about this, and it's sort of in the middle of the pack when it comes to cost. So if you had to pick a particular LLM just based on this, Text Bison would probably be a pretty good choice.

134
00:24:06.960 –> 00:24:23.999
Ranjit: GPT-3.5 Turbo is pretty good: it's better than Text Bison in terms of cost, just a little bit behind in terms of response time, pretty good in terms of quality, and so on. So what a chart like this allows you to do is decide which

135
00:24:24.160 –> 00:24:35.950
Ranjit: model is best, and then you can further subdivide and drill down based on different criteria, and say, okay, when I'm doing large

136
00:24:36.130 –> 00:24:43.570
Ranjit: queries I want to use this model, when I'm doing short queries I want to use this model, when I'm using multimodal I want to use this one. So

137
00:24:43.610 –> 00:25:03.649
Ranjit: all of these kinds of decisions can be made based on this data. The way our efficient frontier infrastructure works is that we look at all the data coming in from all the different models, and then, based on what you decide is good for you, we can pick the model that will satisfy your requirements.

138
00:25:03.660 –> 00:25:08.820
Ranjit: So this is the first demo I wanted to show. Let me go back to my PowerPoint.

139
00:25:12.020 –> 00:25:18.850
Ranjit: The highlight is that we have a real-time dashboard, and it can adapt to all your business

140
00:25:18.920 –> 00:25:25.909
Ranjit: metrics and things like that. Just to give you a bit of a view into LLM pricing:

141
00:25:26.150 –> 00:25:34.300
Ranjit: as of today, the pricing looks like this per thousand output tokens,

142
00:25:34.400 –> 00:25:41.010
Ranjit: and you can think of a token as being, I think, approximately 3 to 4 characters. So

143
00:25:41.360 –> 00:25:44.569
Ranjit: roughly 3,000 to 4,000 characters, I suppose, for a thousand of them,

144
00:25:45.070 –> 00:25:50.649
Ranjit: where, with some of these models, like GPT-4 32K,

145
00:25:51.120 –> 00:25:58.090
Ranjit: you're paying a lot of money, and if you're doing a million queries or whatever, this money can add up really fast.

146
00:25:58.350 –> 00:26:06.060
Ranjit: This is not zero; this is obviously some bug. But anyway, these are the numbers that

147
00:26:06.270 –> 00:26:24.560
Ranjit: we have: GPT-4 32K and GPT-4, and then the most expensive Google model is Text Unicorn, but the others are relatively inexpensive. Gemini Pro is their newest model and, as you can see, it's quite inexpensive.

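[Editor's note: a back-of-the-envelope illustration of how per-1K-token prices add up at volume. The prices and volumes here are placeholders, not current list prices.]

```python
# Sketch: monthly cost estimate from price per 1K output tokens and query volume.
def monthly_cost(price_per_1k_output_tokens: float,
                 avg_output_tokens_per_query: int,
                 queries_per_month: int) -> float:
    return (price_per_1k_output_tokens * avg_output_tokens_per_query / 1000
            * queries_per_month)

# e.g. an expensive model at $0.12/1K output tokens vs. a cheap one at $0.002/1K,
# both answering 1,000,000 queries of ~300 output tokens each (illustrative numbers).
print(monthly_cost(0.12, 300, 1_000_000))   # 36000.0 USD
print(monthly_cost(0.002, 300, 1_000_000))  # 600.0 USD
```
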
148
00:26:25.530 –> 00:26:39.820
Ranjit: So, just to summarize this whole point on cost control: one of the challenges that enterprises have is that the cost for third-party LLM use can add up very quickly.

149
00:26:39.880 –> 00:26:43.059
Ranjit: If you decide that you want to host your own

150
00:26:43.380 –> 00:26:55.689
Ranjit: large language model, you're looking at a high capital expense to put this into your data center. Of course, if you're doing millions and millions of queries, that starts to

151
00:26:55.720 –> 00:27:05.649
Ranjit: pay for itself, so you get the ROI. But if you're not doing that many queries, the cost of actually running your own model in house is high.

152
00:27:06.520 –> 00:27:25.349
Ranjit: Our approach is that we try to solve the bigger problem. We're not just focused on costs by themselves; we want to solve the bigger problem of how an enterprise can comfortably deploy models. So we're looking at

153
00:27:25.690 –> 00:27:40.250
Ranjit: augmenting these models with workflows. Let's say you get a response from an LLM that's not to your liking. You can actually divert it to a curation workflow, supported by our workflow tools,

154
00:27:40.370 –> 00:27:54.939
Ranjit: where people can contribute a better answer and put that back into the database. From that point onward, you don't even need to go to the model and incur its cost in order to get an answer to that question.

155
00:27:54.950 –> 00:27:58.559
Ranjit: We can do things like caching results. You can do

156
00:27:58.860 –> 00:28:02.119
Ranjit: things like setting up

157
00:28:02.130 –> 00:28:08.900
Ranjit: a very dynamic model where, depending almost on the query, you can pick which LLM to use, and so on.

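[Editor's note: a minimal sketch of the caching idea just mentioned, checking a local store of previously approved answers before paying for another LLM call. The cache key scheme and the `call_llm` client are assumptions.]

```python
# Sketch: answer from a local cache of curated/previous responses when possible,
# and only fall back to a (paid) LLM call on a cache miss.
import hashlib

answer_cache: dict[str, str] = {}  # could be a database or vector store in practice

def cache_key(question: str) -> str:
    return hashlib.sha256(question.strip().lower().encode()).hexdigest()

def answer(question: str, call_llm) -> str:
    key = cache_key(question)
    if key in answer_cache:
        return answer_cache[key]          # no LLM cost incurred
    response = call_llm(question)         # paid call to the selected model
    answer_cache[key] = response
    return response
```
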
158
00:28:10.340 –> 00:28:14.680
Ranjit: So that's the first demo. Now,

159
00:28:15.460 –> 00:28:18.340
Ranjit: since we’re talking about use cases

160
00:28:18.710 –> 00:28:29.760
Ranjit: and the best LLM to use for a specific use case, I'm going to walk you through a couple of examples of what we've done. The next demo I wanted to show you is for agent assessment.

161
00:28:30.030 –> 00:28:32.519
Ranjit: The idea here is that

162
00:28:32.540 –> 00:28:35.670
Ranjit: many companies,

163
00:28:36.220 –> 00:28:54.899
Ranjit: when they have agents, want a way to assess how well the conversation between the agent and the customer went. So they generate transcripts of these calls, and then a supervisor reviews these transcripts, generally against some rubric or set of questions that

164
00:28:55.140 –> 00:29:02.860
Ranjit: must be answered, and makes an evaluation of how well the agent did based on that.

165
00:29:02.980 –> 00:29:17.270
Ranjit: So what we built is a chat-based example of agent assessment, and the model that worked best here was Gemini Pro. Some of the features that made it useful were the speed,

166
00:29:17.520 –> 00:29:20.170
Ranjit: the quality, the cost,

167
00:29:20.240 –> 00:29:30.080
Ranjit: and the fact that it takes a good number of tokens, because conversational transcripts can have a lot of text in them.

168
00:29:30.400 –> 00:29:37.589
Ranjit: It was very simple to train and configure this in less than a week. So I'm going to show you this demo.

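[Editor's note: a rough sketch of how a transcript-plus-rubric assessment prompt might be assembled and sent to a model. The `generate` function is a placeholder for whichever client, Gemini Pro or otherwise, is actually used.]

```python
# Sketch: build an assessment prompt from a call transcript and a rubric,
# then send it to a model. `generate(prompt)` stands in for the real client call.
def assess_agent(transcript: str, rubric_questions: list[str], generate) -> str:
    questions = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(rubric_questions))
    prompt = (
        "You are reviewing a customer support conversation.\n"
        "Answer each rubric question with Yes/No and a one-sentence justification, "
        "then give an overall assessment of the agent.\n\n"
        f"Transcript:\n{transcript}\n\nRubric questions:\n{questions}"
    )
    return generate(prompt)
```
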
169
00:29:38.360 –> 00:29:50.499
Ranjit: I'm going to show you the agent assessment demo with a couple of examples.

170
00:29:51.270 –> 00:29:56.799
Ranjit: Okay. So the first thing that the agent assessment bot asks for is

171
00:29:58.200 –> 00:30:07.799
Ranjit: the transcript. I'm just going to be copying and pasting this information from another screen. It says, "Please provide the agent transcript." Let me just...

172
00:30:07.810 –> 00:30:17.320
Ranjit: I hope you can all see my screen now. This is the transcript; as you can see, it's quite long, and then we have the questions at the end. So I'm just going to copy

173
00:30:18.150 –> 00:30:20.099
Ranjit: the text from the transcript

174
00:30:21.950 –> 00:30:23.369
Ranjit: and paste it in here.

175
00:30:26.210 –> 00:30:30.159
Ranjit: And so I pasted the transcript. The next thing I’m going to do is paste in the questions.

176
00:30:31.120 –> 00:30:34.919
Ranjit: and these are the questions, or this is sort of the rubric against which

177
00:30:35.140 –> 00:30:40.559
Ranjit: the quality of the conversation is measured. So I'm going to copy that and paste it in here,

178
00:30:41.240 –> 00:30:44.660
Ranjit: and I’m going to hit return. In this case

179
00:30:45.220 –> 00:30:49.810
Ranjit: the example we picked is a polite agent who's done

180
00:30:49.880 –> 00:30:56.780
Ranjit: most things pretty well, and we hope that the assessment from the chatbot reflects that.

181
00:30:56.980 –> 00:31:15.669
Ranjit: As you can see, it presents the results in a very human-friendly way. It looks at all the questions and answers and gives me a question-by-question itemized response: was the CSA able to open and close the conversation? Yes. Were they polite? Yes. Did they respond appropriately? Did they summarize everything?

182
00:31:15.760 –> 00:31:35.590
Ranjit: Did they resolve the customer's concern promptly? And all of these kinds of things. The tool was able to do this, which is pretty remarkable, and having the ability to make a tool like this available to our customers

183
00:31:35.750 –> 00:31:42.850
Ranjit: through our chatbot is one of the capabilities that we bring to the table. So I'm going to try another one.

184
00:31:42.860 –> 00:31:46.649
Ranjit: the first one was what you might call a polite agent.

185
00:31:46.720 –> 00:31:53.040
Ranjit: The second one is going to be a not-so-polite agent, and I'm just going to show you this.

186
00:31:53.920 –> 00:32:01.099
Ranjit: The agent starts by saying, "I'm in a hurry, please don't waste my time," and then

187
00:32:01.170 –> 00:32:07.080
Ranjit: makes a bunch of spelling mistakes and

188
00:32:07.430 –> 00:32:17.170
Ranjit: uses garbled sentences like this, and things like that. I'm just using this as an example to see if the

189
00:32:17.350 –> 00:32:23.810
Ranjit: agent assessment is able to actually realize that this is not a very good agent. So I'm entering the transcript,

190
00:32:26.070 –> 00:32:29.099
Ranjit: And I’m going to enter basically the same questions.

191
00:32:33.690 –> 00:32:34.390
Ranjit: Okay.

192
00:32:35.740 –> 00:32:38.539
Ranjit: So we'll give this tool a little bit of time

193
00:32:42.920 –> 00:32:46.350
Ranjit: and let’s see what it comes back with.

194
00:32:47.210 –> 00:32:49.540
Ranjit: So in this case,

195
00:32:50.480 –> 00:32:51.580
Ranjit: correctly,

196
00:32:51.690 –> 00:32:58.169
Ranjit: the assessment was generally negative, though there were a few highlights. For example:

197
00:32:58.400 –> 00:33:06.350
Ranjit: the CSA was not warm or engaging. They started the conversation by saying they were in a hurry and not to waste their time, which created a negative impression from the outset.

198
00:33:06.760 –> 00:33:10.530
Ranjit: According to this,

199
00:33:10.750 –> 00:33:16.370
Ranjit: the questions were relevant, which is probably true, and they responded appropriately and reassured the customer,

200
00:33:16.610 –> 00:33:21.979
Ranjit: and a few other things. But then, when it comes to creating a great experience: no.

201
00:33:22.070 –> 00:33:32.979
Ranjit: Did they positively represent the merchant? No. Did the CSA use the customer's name? No. And things like that. So, as you can see, the

202
00:33:33.340 –> 00:33:39.459
Ranjit: tool was able to give an almost human-like assessment of this,

203
00:33:39.700 –> 00:33:58.579
Ranjit: off of the transcript. Gemini Pro has shown itself to be really powerful in this sense, like I said, in terms of capability, cost, and speed. And the best part is, it's a multimodal model, so I could have taken the same information

204
00:33:58.720 –> 00:34:00.040
Ranjit: as voice

205
00:34:00.130 –> 00:34:11.449
Ranjit: and submitted it to the agent assessment tool, and it would have been able to provide a strong assessment, because that's what the Gemini Pro model is capable of doing.

206
00:34:13.300 –> 00:34:26.760
Ranjit: So let's come back here. We use our LLM manager here as the infrastructure that sits between the chatbot and the back-end LLM.

207
00:34:28.090 –> 00:34:31.920
Ranjit: The next tool I wanted to show you is around email automation.

208
00:34:32.050 –> 00:34:33.110
Ranjit: And

209
00:34:33.489 –> 00:34:44.420
Ranjit: here again we found that Gemini Pro does a very good job. But, to be transparent, here we used a combination of Gemini Pro and

210
00:34:44.429 –> 00:34:50.630
Ranjit: OpenAI GPT for the next demo that you're going to see. So for this next demo,

211
00:34:52.679 –> 00:34:56.110
Ranjit: I’m going to be a

212
00:34:57.440 –> 00:34:58.500
Ranjit: customer

213
00:34:58.530 –> 00:35:03.619
Ranjit: who's writing an email to

214
00:35:04.740 –> 00:35:24.209
Ranjit: the customer support team of a particular company. The scenario is that we're talking about a company that deals with SIMs and mobile phones, and this person is going to send an email asking a particular question.

215
00:35:24.250 –> 00:35:43.639
Ranjit: The way this tool works is that normally, when an agent gets a question like this, they pick up some information from their knowledge base, they may look at public sources of data, they combine all of this, put it into an email, and send a response back to the customer. We are going to try and automate all of that.

216
00:35:43.870 –> 00:35:51.489
Ranjit: My identity here is William Chen, and William is going to be sending a message

217
00:35:52.010 –> 00:35:59.139
Ranjit: to an email gateway, and I'll explain how this works.

218
00:36:00.940 –> 00:36:15.069
Ranjit: Maybe I'll just say the subject is "SIM not working," and I'm going to say, "My SIM has been activated, but I can't make or receive calls." That's all I'm going to do, just send this particular message.

219
00:36:16.280 –> 00:36:23.139
Ranjit: So I'm going to send this message. We're using an email gateway from

220
00:36:23.180 –> 00:36:26.249
Ranjit: Pipedream, and so I'm just going to

221
00:36:27.490 –> 00:36:32.810
Ranjit: keep an eye on this email gateway. You'll see that the email has come into this gateway.

222
00:36:34.300 –> 00:36:45.319
Ranjit: Basically, like I said, just to recap: what the tool is going to do is pull information from many different sources, assemble it into a nice,

223
00:36:45.860 –> 00:36:50.860
Ranjit: user-friendly email response, and send an email back to the person who sent the question.

224
00:36:52.280 –> 00:37:00.700
Ranjit: This is still working, and hopefully, if it goes green, that means all the appropriate

225
00:37:01.070 –> 00:37:08.260
Ranjit: services went through properly. So I'm going to come back here now and close this,

226
00:37:08.290 –> 00:37:12.390
Ranjit: and you'll see now that this person has got a response.

227
00:37:13.300 –> 00:37:19.309
Ranjit: When I click this response, you see that it's actually doing a fairly

228
00:37:19.500 –> 00:37:26.650
Ranjit: comprehensive assessment of all the data. It's pulling data from

229
00:37:27.150 –> 00:37:28.830
Ranjit: a couple of different sources.

230
00:37:28.990 –> 00:37:32.440
Ranjit: It does a few things. First,

231
00:37:32.530 –> 00:37:47.609
Ranjit: it personalizes the response: it knows who sent the email. It generates what we call a generative summary, which is the top-level summary response to the email. It then picks the best results from the knowledge base, which are shown here, and for each result

232
00:37:47.890 –> 00:38:00.630
Ranjit: it gives you a summary as well as what we call a generative snippet, meaning it pulls one or more snippets from the document, connects them together into a response, and sends it

233
00:38:01.120 –> 00:38:18.669
Ranjit: back to the user. It also found that there's public information available on the web, so it creates, again, a summary or generative response of the public information, and then gives you links to the top sites from which this information was gathered. So, for example,

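[Editor's note: conceptually, the reply is assembled from a generated top-level summary, knowledge-base snippets, and public web links. This is only a simplified sketch; the retrieval and generation helpers are placeholders, not the actual product.]

```python
# Sketch: assemble an email reply from a generated summary, knowledge-base
# snippets, and public web results. All helper functions are placeholders.
def build_reply(customer_name: str, question: str,
                generate, search_kb, search_web) -> str:
    kb_hits = search_kb(question)     # e.g. [{"title": ..., "snippet": ...}, ...]
    web_hits = search_web(question)   # e.g. [{"title": ..., "url": ...}, ...]
    summary = generate(f"Write a short, friendly answer to: {question}\n"
                       f"Use these notes:\n{kb_hits}\n{web_hits}")
    kb_section = "\n".join(f"- {h['title']}: {h['snippet']}" for h in kb_hits)
    links = "\n".join(f"- {h['title']}: {h['url']}" for h in web_hits)
    return (f"Hi {customer_name},\n\n{summary}\n\n"
            f"From our knowledge base:\n{kb_section}\n\n"
            f"Helpful links:\n{links}\n\nBest regards,\nSupport Team")
```
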
234
00:38:18.910 –> 00:38:20.699
Ranjit: if I click on this site.

235
00:38:21.780 –> 00:38:24.120
Ranjit: it takes me to the page where.

236
00:38:24.670 –> 00:38:25.950
Ranjit: you know

237
00:38:27.110 –> 00:38:28.330
Ranjit: there’s a

238
00:38:28.550 –> 00:38:33.859
Ranjit: response about what to do when your SIM is not working.

239
00:38:36.730 –> 00:38:42.670
Ranjit: So this is the email automation, and again, we used the Gemini Pro model for this.

240
00:38:43.020 –> 00:38:51.459
Renga: Sorry to interrupt you, but would you mind taking a question while you're on this topic, or would you prefer to wait?

241
00:38:51.500 –> 00:38:58.429
Renga: We just have a question asking: if we get bad-quality responses from a given LLM, how do you fix it?

242
00:38:59.570 –> 00:39:05.650
Ranjit: Okay. So one of the ways that you deal with that...

243
00:39:06.170 –> 00:39:09.580
Ranjit: Okay, let me back up a bit. Anytime you roll out an LLM,

244
00:39:09.640 –> 00:39:15.200
Ranjit: You have to give it a period of probably 30 to 60 days to evaluate

245
00:39:15.440 –> 00:39:23.749
Ranjit: the quality of the LLM and to tune it so that it gets to a stable, high-quality state after that time.

246
00:39:23.790 –> 00:39:37.300
Ranjit: So what you do is you have either human intervention or some other mechanism to evaluate the responses that come back from the LLM. What we propose

247
00:39:37.540 –> 00:39:42.000
Ranjit: is using sort of a combination of

248
00:39:42.140 –> 00:39:49.070
Ranjit: automated tools, where you can look for certain keywords in the response and make sure all those keywords are covered,

249
00:39:49.880 –> 00:40:03.900
Ranjit: and human intervention, where you can take the response, send it to the back end, get it curated, and have someone actually create a better response for the particular question that came in.

250
00:40:03.910 –> 00:40:20.120
Ranjit: Once you create that response, you can just keep it locally in your knowledge base, so you don't need to go back to the LLM each time that question comes up. You do two things: you save money, and you get a better-quality response. So that would be the approach.

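[Editor's note: a minimal sketch of the combination described in this answer: an automated keyword check on the LLM's response, routing failures to a human curation queue, and reusing curated answers so the model is not called again for that question. All names here are illustrative.]

```python
# Sketch: keyword coverage check plus a human-curation fallback.
curated_answers: dict[str, str] = {}        # question -> human-approved answer
curation_queue: list[tuple[str, str]] = []  # (question, draft) awaiting review

def passes_keyword_check(response: str, required_keywords: list[str]) -> bool:
    text = response.lower()
    return all(kw.lower() in text for kw in required_keywords)

def answer_with_review(question: str, required_keywords: list[str], call_llm) -> str:
    if question in curated_answers:
        return curated_answers[question]      # curated: skip the LLM entirely
    draft = call_llm(question)
    if passes_keyword_check(draft, required_keywords):
        return draft
    curation_queue.append((question, draft))  # send to a human for a better answer
    return draft  # or a holding reply, depending on the workflow
```
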
251
00:40:22.690 –> 00:40:25.639
Renga: Awesome. Thank you, Ranjit.
Ranjit: Yeah, thanks, Renga.

252
00:40:25.970 –> 00:40:49.170
Ranjit: Okay. So the next, and I think the last, demo I want to show you... before I get into that, I want to give you an idea of some of the issues around security with LLMs. This has obviously been a major point of discussion, and

253
00:40:49.340 –> 00:40:54.650
Ranjit: people are worried about proprietary data going into

254
00:40:54.950 –> 00:41:12.000
Ranjit: third-party models and being used for training, about PII being revealed, and, an important one, about how to handle or minimize misinformation. These are some of the challenges that people have from a security perspective.

255
00:41:12.720 –> 00:41:16.669
Ranjit: So we are taking a few different approaches to support this.

256
00:41:16.710 –> 00:41:25.649
Ranjit: and we expect to be rolling out more of these capabilities over the rest of this year. From the perspective of what we do,

257
00:41:25.670 –> 00:41:43.440
Ranjit: at the first level, we make sure that all data is encrypted. This is the first level of security; it's standard. You want to make sure that you use HTTPS in and out, and you want to make sure that data stored in a database, especially user data, is encrypted.

258
00:41:43.690 –> 00:41:57.879
Ranjit: The second point is all around protecting PII. What this means is that ultimately the meaning of PII protection is that nobody should be able to,

259
00:41:58.250 –> 00:42:09.500
Ranjit: based on the information they have been given, figure out who this person is. That's basically PII protection. So in some cases you need to

260
00:42:10.010 –> 00:42:25.810
Ranjit: make information vague, and in some cases you have to remove information. For example, if I have a phone number, it's pretty likely that I'll be able to figure out who the person is, so you want the ability to redact phone numbers.

261
00:42:25.870 –> 00:42:28.980
Ranjit: There are a couple of interesting ideas

262
00:42:29.200 –> 00:42:43.649
Ranjit: around this area. One is called differential privacy, and this goes sort of hand in hand with a fancily named approach called homomorphic encryption. The general idea is that

263
00:42:43.890 –> 00:42:45.420
Ranjit: essentially,

264
00:42:45.510 –> 00:42:48.470
Ranjit: anytime you send the data

265
00:42:48.850 –> 00:42:52.890
Ranjit: to a third party and you don't want that data to be disclosed,

266
00:42:52.920 –> 00:43:03.159
Ranjit: you disguise that data so that, from the outside, to the third party it looks like normal data, and they treat it as they would any normal data, but

267
00:43:03.370 –> 00:43:05.470
Ranjit: it is actually

268
00:43:05.550 –> 00:43:15.720
Ranjit: not revealing any information about the person or any personal information. That's the whole idea here, and this is what I'll show you a demo about.

269
00:43:16.120 –> 00:43:33.160
Ranjit: And then the third aspect is to use various techniques, like setting parameters or RAG and tools like that, to reduce hallucinations, to put guardrails around your data, and to make sure it doesn't go out of scope when it's answering questions, and things like that.

270
00:43:33.770 –> 00:43:48.510
Ranjit: So I wanted to show you a simple demo. This is a sort of engineering-friendly demo, because it's not dealing with a UI issue or anything like that; it's basically a way for you

271
00:43:48.860 –> 00:43:50.850
Ranjit: you know,

272
00:43:51.230 –> 00:44:03.220
Ranjit: to protect information. I'm going to show you a couple of examples. When we talk about data encryption, what we do is give users the ability

273
00:44:03.470 –> 00:44:06.089
Ranjit: to decide what they want to do

274
00:44:06.210 –> 00:44:10.839
Ranjit: when they see certain types of information in a query.

275
00:44:10.860 –> 00:44:19.479
Ranjit: In this case, the configuration has been set so that anytime you see a location, you substitute it; anytime you see a number, you encrypt it;

276
00:44:19.900 –> 00:44:23.959
Ranjit: you substitute the person's name; you redact the phone number;

277
00:44:24.070 –> 00:44:37.409
Ranjit: and you encrypt any information that's a price. There are a few challenges here. For example, you have to detect all of these things, called entities, in the text first.

278
00:44:37.550 –> 00:44:46.590
Ranjit: Once you detect these entities, you have to apply these rules, and then the information that you end up sending

279
00:44:46.830 –> 00:44:58.900
Ranjit: to the LLM is disguised and does not reveal any personal information about the person who's actually asking the question. I'm going to walk through this in a few stages. This is the configuration.

280
00:44:59.540 –> 00:45:20.550
Ranjit: Let's say the user query looks like this: "In San Francisco, Dave Johnson bought 4 apples at $20 apiece. He can be reached here. Where did he buy apples?" This is obviously a very silly question; I've just created it to make a point. This is a location, for example, this is a person, this is a

281
00:45:20.870 –> 00:45:25.100
Ranjit: price/number, this is a phone number, and all of that.

282
00:45:25.120 –> 00:45:34.990
Ranjit: So I'm going to continue and show you how this system works. The first thing it does is extract these entities, these little nuggets, from within the

283
00:45:35.000 –> 00:45:47.299
Ranjit: sentence. It's able to extract the price, the number, and the phone number. Now, once you do that, you can apply the rules: you can say, I'm going to redact this, I'm going to substitute this, and things like that.

284
00:45:48.090 –> 00:45:54.909
Ranjit: So the actions that need to be taken are encryption of the numbers, redaction of the phone number, and

285
00:45:55.190 –> 00:45:58.429
Ranjit: substitution for the person and location.

286
00:46:00.090 –> 00:46:18.970
Ranjit: To do this, you have to create what's called an encryption map. What this means is that anytime you're doing a substitution and sending information to an LLM, when the information comes back, you want to map it back to the original values so that you can send the appropriate information back to the user.

287
00:46:19.000 –> 00:46:20.909
Ranjit: So you create an encryption map

288
00:46:20.990 –> 00:46:25.890
Ranjit: that is used for reverse-mapping the response that comes from the LLM.

289
00:46:26.790 –> 00:46:28.529
Ranjit: So this is a user query

290
00:46:29.480 –> 00:46:39.029
Ranjit: "In San Francisco, Dave..." and all of this stuff. But in the encrypted query that you send to OpenAI, you've substituted the name and the location,

291
00:46:39.240 –> 00:46:45.999
Ranjit: you've encrypted the numbers, in other words, you've used different numbers, and you've redacted the phone number.

292
00:46:46.720 –> 00:46:52.029
Ranjit: For this particular question, this works. So this query then gets sent to the LLM,

293
00:46:52.400 –> 00:46:57.849
Ranjit: and the response comes back and says, "Andy Smith bought apples in New York."

294
00:46:58.030 –> 00:47:00.080
Ranjit: You then apply the reverse mapping.

295
00:47:00.540 –> 00:47:10.219
Ranjit: Using the encryption map, you decode it, and once you decode it, you can send the actual response back to the user.

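[Editor's note: a stripped-down sketch of the substitution, redaction, and reverse-mapping flow just walked through. It uses a naive regex and a hard-coded substitution list rather than a real entity extractor; everything here is illustrative.]

```python
# Sketch: substitute/redact detected entities before sending a query to an LLM,
# keep a reverse map, and restore the real values in the LLM's answer.
# Entity detection here is a naive stand-in for a real extractor.
import re

def protect(query: str):
    mapping = {}
    # Redact phone numbers outright (nothing to map back).
    query = re.sub(r"\+?\d[\d\-\s]{7,}\d", "[REDACTED PHONE]", query)
    # Substitute a known person/location list (a real system would detect these).
    substitutions = {"Dave Johnson": "Andy Smith", "San Francisco": "New York"}
    for real, fake in substitutions.items():
        if real in query:
            query = query.replace(real, fake)
            mapping[fake] = real            # remember how to reverse it
    return query, mapping

def restore(response: str, mapping: dict) -> str:
    for fake, real in mapping.items():
        response = response.replace(fake, real)
    return response

safe_query, m = protect("In San Francisco, Dave Johnson bought 4 apples at $20 apiece. "
                        "He can be reached at 415-555-0100. Where did he buy apples?")
# safe_query goes to the third-party LLM; its answer is then passed through
# restore(answer, m) before being shown to the user.
```
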
296
00:47:10.460 –> 00:47:15.079
Ranjit: So this is one example. The second example I wanted to show you is a little bit different.

297
00:47:15.270 –> 00:47:26.629
Ranjit: It's a similar kind of sentence, but in this case we actually want to do something with the information, which is "8 apples at $25 apiece. How much did he pay?"

298
00:47:26.930 –> 00:47:30.530
Ranjit: So in this case you cannot encrypt this information.

299
00:47:30.850 –> 00:47:41.879
Ranjit: Or at least, there are tools where you can potentially encrypt this, but if you take the normal approach, you cannot encrypt it, because then the value of how much he paid will get modified.

300
00:47:42.710 –> 00:47:46.519
Ranjit: So again, same kind of idea.

301
00:47:48.260 –> 00:47:49.920
Ranjit: so this is the user query.

302
00:47:52.060 –> 00:48:03.869
Ranjit: we extract the entities, and in this case we're not encrypting the number information. Then the same thing happens: you substitute the person and the location.

303
00:48:04.050 –> 00:48:05.960
Ranjit: You create this encryption map.

304
00:48:07.960 –> 00:48:09.500
Ranjit: This is the user query.

305
00:48:09.560 –> 00:48:19.589
Ranjit: You convert it here, but you're not disclosing who bought these apples. All you know is that these apples were bought, but you're not giving away the location or the name of the person.

306
00:48:20.640 –> 00:48:38.500
Ranjit: Then you get the response back, use the map to decrypt the response, and give this back to the user. This is just a simple illustration of techniques that we can use to prevent PII from actually going to these third-party LLMs.

307
00:48:39.170 –> 00:48:48.670
Ranjit: Just as a quick note: this is user-configurable, and we use data substitution, redaction, and encryption.

308
00:48:48.860 –> 00:49:02.940
Ranjit: I think we're coming up on time now, so that's all from a demo perspective. We have a couple more demos, but we'll probably put them up as videos or something like that, so that you can

309
00:49:03.080 –> 00:49:04.620
Ranjit: have a look at them later

310
00:49:05.140 –> 00:49:18.269
Renga: That works. And we also have a new question; I believe it was sent before the security demo started, Ranjit. The question says: those two examples were text-based, how do you deal with multimodal data?

311
00:49:19.240 –> 00:49:28.140
Renga: Would you mind explaining a bit more, beyond the initial introduction, what multimodal data means for the audience, Ranjit?

312
00:49:28.290 –> 00:49:29.060
Ranjit: you could

313
00:49:29.440 –> 00:49:58.120
Ranjit: So multimodal data basically means data of different types: text, voice, images, audio, video, all of that. It's a mixture of data that comes from many different sources. Generally, the approach that you take with multimodal data is different depending on the type of data. Earlier, it used to be that you would translate everything into text and then operate in a text-based manner.

314
00:49:58.200 –> 00:50:04.889
Ranjit: But now there are tools that have been developed by some companies that can redact

315
00:50:04.910 –> 00:50:29.949
Ranjit: information on the fly. So you can take a voice stream, delay it by maybe a couple of seconds, and then you can actually detect when people are giving personal information or names and things like that, redact it, and send that information to the third party. That's the best that you can do right now with voice data.

316
00:50:30.150 –> 00:50:42.440
Ranjit: There's no easy way, no company or tool that I know of, that can deal at the moment with video data, so you have to use some other techniques, like maybe

317
00:50:42.510 –> 00:50:57.549
Ranjit: transcribing it into text and then processing that, or something like that. But that's the idea. So, for example, phone calls you can redact, and you can apply many of these techniques to them, but when it comes to video streams it's very hard to redact.

318
00:51:00.380 –> 00:51:09.829
Renga: I hope that answers the question. We don't have any more new questions. Any specific points that you wanted to add to close your presentation, Ranjit?

319
00:51:10.350 –> 00:51:16.009
Ranjit: Yeah, one of the key points I want to make is that we have this...

320
00:51:18.640 –> 00:51:37.999
Ranjit: this is the architecture of the future that we are planning to have at Autonomate, and the idea here is that we want to give you choice. We want to give you the ability to choose the LLM that's most suited for your use case. The idea is that

321
00:51:38.060 –> 00:51:50.579
Ranjit: at the moment, we are deciding which model should be used on a use-case-by-use-case basis. But as we go forward, we want to get to the point where we decide even on a query-by-query basis. So maybe a short

322
00:51:50.950 –> 00:51:53.010
Ranjit: query in a different language

323
00:51:53.330 –> 00:52:06.889
Ranjit: might trigger a different LLM to be used, as opposed to a long query in English, or something like that. We look at a bunch of different features of the incoming query, and based on that we'll choose the LLM that's most suited for

324
00:52:07.000 –> 00:52:08.660
Ranjit: responding.

325
00:52:09.250 –> 00:52:16.569
Ranjit: So I just want to make the high-level point: don't get locked in to any one LLM. Keep your options open.

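[Editor's note: a tiny sketch of the query-by-query routing idea just described: inspect simple features of the incoming query, such as length and language, and pick a model accordingly. The thresholds, language check, and model names are purely illustrative assumptions.]

```python
# Sketch: route each query to a model based on simple features of the query.
# The thresholds, language check, and model names are illustrative only.
def route_query(query: str) -> str:
    word_count = len(query.split())
    is_ascii = query.isascii()          # crude stand-in for language detection
    if not is_ascii:
        return "multilingual-model"     # e.g. a model strong on non-English text
    if word_count > 200:
        return "long-context-model"     # large-context, possibly more expensive
    return "fast-cheap-model"           # default for short English queries

print(route_query("My SIM has been activated but I can't make or receive calls."))
```
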
326
00:52:17.910 –> 00:52:22.289
Renga: Fantastic. Any more questions from the audience before we call it a wrap?

327
00:52:26.990 –> 00:52:30.060
Renga: Going once... going twice...

328
00:52:31.760 –> 00:52:53.320
Renga: Alright. If there are any more questions, or you were not able to catch the webinar in time, please don't worry; the webinar is being recorded. You'll get a link to the recording shortly after this presentation, and if you still have any questions, please reach out to us through the same email that the recording was sent from, and the team will route them to Ranjit to get your queries answered.

329
00:52:53.480 –> 00:52:59.399
Renga: Thank you so much for orchestrating this webinar for us, Ranjit, and we want to make sure

330
00:52:59.430 –> 00:53:02.470
Renga: we also announce that you'll be

331
00:53:02.730 –> 00:53:18.470
Renga: taking the audience through a series of webinars, so stay tuned for further announcements this month and the next, and we'll be happy to take up any more topics that would be of interest to you. Thank you so much, and have a good day.
Ranjit: And thank you, Renga.

332
00:53:18.950 –> 00:53:21.000
Renga: Thank you. Bye.

333
00:53:22.820 –> 00:53:23.900
Renga: bye, everyone.