{"id":298163,"date":"2024-04-12T08:21:35","date_gmt":"2024-04-12T15:21:35","guid":{"rendered":"https:\/\/www.saastr.com\/?p=298163"},"modified":"2024-04-12T08:21:35","modified_gmt":"2024-04-12T15:21:35","slug":"a-technical-deep-dive-into-building-ai-products-for-the-enterprise-with-contextual-ais-ceo-douwe-kiela","status":"publish","type":"post","link":"https:\/\/www.saastr.com\/a-technical-deep-dive-into-building-ai-products-for-the-enterprise-with-contextual-ais-ceo-douwe-kiela\/","title":{"rendered":"A Technical Deep Dive Into Building AI Products for the Enterprise with Contextual AI\u2019s CEO Douwe Kiela"},"content":{"rendered":"<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">If you work on anything related to <a href=\"https:\/\/www.saastr.com\/category\/topics\/ai\/\" target=\"_blank\" rel=\"noopener\">Artificial Intelligence<\/a>, you know we\u2019re in the age of language models. But when it comes to Enterprises specifically, language models can change the way we work, <\/span><i><span style=\"font-weight: 400;\">and <\/span><\/i><span style=\"font-weight: 400;\">they have very big issues. 
At SaaStr AI Day, Contextual AI\u2019s CEO, Douwe Kiela, deep dives into what it takes to build AI products for the Enterprise.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298167 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4800.jpeg?resize=600%2C338&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"338\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/338;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298167\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4800.jpeg?resize=600%2C338&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"338\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">You might be familiar with some of the common language model issues.\u00a0<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol style=\"font-weight: 400;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Hallucination with very high confidence \u2014 making up stuff that isn\u2019t true but seems true.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Attribution \u2014 knowing why these language models are saying what they\u2019re saying.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Data privacy \u2014 we send valuable data to someone else\u2019s data center.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Staleness and compliance \u2014 language models must be up-to-date, compliant with regulations, and able to be revised on the fly without having to retrain the entire thing every time.\u00a0<\/span><\/li>\n<\/ol>\n<p><iframe title=\"YouTube video player\" 
data-src=\"https:\/\/www.youtube.com\/embed\/NOAcuI7qag4?si=Pk3DU4PreZ_XfO33\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" class=\"lazyload\" data-load-mode=\"1\"><\/iframe><\/li>\n<\/ol>\n<p><b>The big issue for Enterprises<\/b><span style=\"font-weight: 400;\">: cost-quality tradeoffs. For many Enterprises, it\u2019s not just about the quality of the language model. It\u2019s also about the price point and whether it makes sense financially.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The first half of this article is a more academic look at the best approach for building AI products for the Enterprise. The second half focuses on what Enterprises care about relating to these solutions.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">RAG is One Solution to These Problems<\/span><\/h2>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298168 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4801.jpeg?resize=600%2C335&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"335\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/335;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298168\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4801.jpeg?resize=600%2C335&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"335\" \/><\/noscript><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Douwe was part of the original work for Retrieval Augmented Generation (RAG) during his time at Facebook AI Research. The basic idea of RAG is simple. 
It\u2019s saying, \u201cIf we have a generator, i.e., a language model, we can enrich that language model with additional data.\u201d\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">We want that language model to be able to operate off of different versions of, say, Wikipedia or an internal knowledge base. So, how do we give that data to the language model? Through a retrieval mechanism. We\u2019re augmenting the generation with retrieval.\u00a0<\/span><span style=\"font-weight: 400;\">A key insight of this paper is at the top \u2014 end-to-end backprop. The retriever was learning at the same time as the generator. <\/span><span style=\"font-weight: 400;\">Fast forward from 2020 to now, and you can see RAG everywhere. <\/span><b>But the way people do RAG is wrong.\u00a0<\/b><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Douwe is one of the few who can say this since he was on the original paper introducing the idea.\u00a0<\/span><span style=\"font-weight: 400;\">What people do is have a kind of chunking mechanism that turns documents into little pieces, chunking them up and encoding all of that information using embedding models. Then, they put it in a vector database and do an approximate nearest neighbor search.\u00a0<\/span><span style=\"font-weight: 400;\">All of these parts are completely frozen. 
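The frozen pipeline described here can be sketched end to end. This is a minimal illustration, not Contextual AI's code: a bag-of-words vector stands in for the embedding model, and a brute-force cosine scan stands in for the vector database and its approximate nearest neighbor search.

```python
import math
import re
from collections import Counter

# Minimal sketch of the "Frankenstein" RAG pipeline: every component is a
# frozen, off-the-shelf stand-in -- no machine learning happens anywhere.

def chunk(document: str, size: int = 8) -> list[str]:
    """Chunking mechanism: split a document into fixed-size word chunks."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Frozen 'embedding model': a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, index: list[tuple[Counter, str]], k: int = 1) -> list[str]:
    """'Vector database': brute-force nearest-neighbor search over chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Index a toy internal document, retrieve context for a question, and build
# the prompt that would be handed to the (also frozen) language model.
doc = ("The refund policy was updated in March 2024 and allows returns "
       "within 30 days. The office cafeteria serves lunch from noon to two.")
index = [(embed(c), c) for c in chunk(doc)]
context = retrieve("When was the refund policy updated?", index)
prompt = f"Context: {context[0]}\nQuestion: When was the refund policy updated?"
```

Because `chunk`, `embed`, and `retrieve` never see a gradient, swapping in better components is the only way to improve the system; that is the ceiling the article describes.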
There\u2019s no Machine Learning happening here, just off-the-shelf components.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298169 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4802.jpeg?resize=600%2C336&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"336\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/336;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298169\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4802.jpeg?resize=600%2C336&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"336\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">At Contextual AI, they refer to this architecture as Frankenstein\u2019s monster, a cobbled-together embedding model, vector database, and language model. But because there is no Machine Learning, you have a very clear ceiling you can\u2019t go beyond.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><b>Frankenstein is great for building quick demos but not great if you want to build something for production in a real Enterprise setting.\u00a0<\/b><\/p>\n<h2><span style=\"font-weight: 400;\">iPhone Approach vs. 
Frankenstein Approach<\/span><\/h2>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298170 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4803.jpeg?resize=600%2C337&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"337\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/337;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298170\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4803.jpeg?resize=600%2C337&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"337\" \/><\/noscript><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Many people in the field think you could try to build something more like an iPhone of RAG, where all the parts are made to work together. Everything is jointly optimized as one big system where you solve a problem for a very specific use case and do so exceptionally well.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">That\u2019s what Enterprises would like to see. This is the iPhone vs. Frankenstein approach. RAG 1.0 is the old way of doing it, vs. 
2.0, the new and better way.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298171 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4804.jpeg?resize=600%2C341&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"341\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/341;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298171\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4804.jpeg?resize=600%2C341&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"341\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Think of the RAG system as a brain split into two halves. The left brain retrieves and encodes the information before giving it to the right half of the brain, which is the language model.\u00a0<\/span><span style=\"font-weight: 400;\">In the Frankenstein setup, the two halves of the brain don\u2019t know they\u2019re working together. They\u2019re entirely separate.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">If you know you\u2019ll do RAG, you should have these two halves of the brain be aware of each other and grow up together. Pre-train them together so that from the beginning, they learn to work together very effectively.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">That would be RAG 2.0, a very specialized, powerful solution. 
With RAG 2.0, you train the entire system in three stages.\u00a0<\/span><\/p>\n<ol style=\"font-weight: 400;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Pre-train<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Fine-tune<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">RLHF \u2014 reinforcement learning from human feedback\u00a0<\/span><\/li>\n<\/ol>\n<h2><span style=\"font-weight: 400;\">What Happens When You Take a Systems Approach?<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298172 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4805.jpeg?resize=600%2C338&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"338\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/338;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298172\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4805.jpeg?resize=600%2C338&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"338\" \/><\/noscript><br \/>\n<\/span><\/h2>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">You get a system that performs much better than a Frankenstein setup. 
Douwe believes this is the future of Enterprise solutions.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298177 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4806.jpeg?resize=600%2C353&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"353\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/353;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298177\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4806.jpeg?resize=600%2C353&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"353\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Let\u2019s look at an example in a side-by-side comparison of Frankenstein vs. RAG 2.0. You ask the language model, \u201cWhen did Frontier purchase Spirit Airlines?\u201d This didn\u2019t happen because the deal was called off.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">If you give this to a Frankenstein system, it\u2019ll make up this random hallucination about it being purchased, which isn\u2019t true. You get a lot of mistakes with relevance, too. All language models are pretty good, but what matters is how you contextualize the language model. If you give it the wrong context, there\u2019s no way it will get it right.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">In this comparison example, you also want to be able to retrieve different facts simultaneously. 
These are the more sophisticated things we\u2019ll be able to do with this technology in the future.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Should You Have Long Context Windows?<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298178 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4808.jpeg?resize=600%2C334&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"334\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/334;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298178\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4808.jpeg?resize=600%2C334&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"334\" \/><\/noscript><br \/>\n<\/span><\/h2>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">One question Douwe often gets regarding RAG is long context windows. When Google announced Gemini 1.5, many people declared the death of RAG because you don\u2019t need retrieval if you can fit everything in the context window.\u00a0<\/span><span style=\"font-weight: 400;\">That\u2019s definitely something you can do, but it\u2019s very expensive and inefficient.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">For example, say you put an entire Harry Potter book in the context window of a language model and then asked a simple question like, \u201cWhat\u2019s the name of Harry\u2019s owl?\u201d You don\u2019t have to read the entire book to find the answer to a simple question.\u00a0<\/span><span style=\"font-weight: 400;\">In RAG, you search for \u201cowl\u201d and find relevant information. 
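The efficiency gap is easy to make concrete. In this toy sketch (placeholder text stands in for the book, and a keyword match stands in for a learned retriever), the long-context approach pays for every word while retrieval pays only for the sentences that mention the query term:

```python
# Toy comparison of prompt sizes: stuffing a whole "book" into the context
# window vs. retrieving only the passage that mentions the query term.
# The book text here is repeated placeholder prose, not the actual novel.

book = ("Chapter 1. Harry lived with the Dursleys. " * 200 +
        "Harry's snowy owl was named Hedwig and carried his letters. " +
        "Chapter 2. Quidditch practice ran late into the evening. " * 200)

question = "What's the name of Harry's owl?"

# Long-context approach: the model must read every word of the book.
long_context_tokens = len(book.split())

# Retrieval approach: keyword-match sentences, send only the hits.
sentences = book.split(". ")
hits = [s for s in sentences if "owl" in s.lower()]
rag_tokens = sum(len(s.split()) for s in hits)

# rag_tokens ends up a tiny fraction of long_context_tokens, because a
# single retrieved sentence already answers the question.
```

A production retriever would use embeddings rather than a literal keyword match, but the cost argument is the same: retrieval scales with the relevant passage, not with the corpus.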
It\u2019s much more efficient.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">What Enterprises Care About<\/span><\/h2>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298179 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4809.jpeg?resize=600%2C331&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"331\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/331;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298179\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4809.jpeg?resize=600%2C331&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"331\" \/><\/noscript><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">SaaStr AI Day is about Enterprises and how things make it into the wild. Exciting things are happening in Enterprise around this technology, and they\u2019re starting to adopt it. 
2023 was the Year of the Demo, and 2024 is the year of production deployments of AI in Enterprise.\u00a0<\/span><span style=\"font-weight: 400;\">People pretend the world is such that models are the most exciting things and that systems don\u2019t really matter.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298180 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4810.jpeg?resize=600%2C328&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"328\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/328;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298180\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4810.jpeg?resize=600%2C328&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"328\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">If you\u2019re an AI practitioner and work in big Enterprise, you know how the world really works. It\u2019s mostly about systems in Machine Learning and little about the model. 
The model is about 10-20% of the system when you want to solve a real problem in the world.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298181 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4811.jpeg?resize=600%2C325&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"325\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/325;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298181\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4811.jpeg?resize=600%2C325&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"325\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The other thing that matters is the feedback loop, making sure the system with this model can be optimized, get better over time, and solve your problem.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Efficiency Matters a Lot for Enterprise<\/span><\/h3>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298182 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4812.jpeg?resize=600%2C340&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"340\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/340;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298182\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4812.jpeg?resize=600%2C340&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"340\" \/><\/noscript><\/p>\n<p style=\"font-weight: 400;\"><span 
style=\"font-weight: 400;\">While academics might not care much about efficiency, Enterprises sure do. Contextual AI has work called GRIT, Generative Representational Instruction Tuning, which shows that you can have a single language model that\u2019s best-in-class at representation and generation. The same model can be good at both.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Another thing that Enterprises care a lot about is multi-modality. You can make language models that see by taking computer vision models that turn images into text and feeding that text directly into your RAG system or language model. This endows it with multi-modal capabilities without needing to do anything really fancy.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Contextual AI\u2019s Enterprise Observations<\/span><\/h2>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298183 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4813.jpeg?resize=600%2C335&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"335\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/335;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298183\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4813.jpeg?resize=600%2C335&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"335\" \/><\/noscript><\/p>\n<ol style=\"font-weight: 400;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">They care about the quality of the technology, but sometimes they care more about the deployment model. 
Security really is the most important question because companies don\u2019t like when data has to go beyond their own security boundaries (and for good reason!).\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">No single Enterprise deployment is only about accuracy. It\u2019s always about more than that with things like latency, speed, inference, compliance, deployment model, etc. Model features beyond accuracy also add something, like auditability and traceability through attribution, and a model saying \u201cI don\u2019t know\u201d instead of making something up randomly and risking the business.\u00a0<\/span><\/li>\n<\/ol>\n<h3><span style=\"font-weight: 400;\">Enterprise Problems Are Very Different From Consumer Problems<\/span><\/h3>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">If we look towards where we\u2019re going, there\u2019s a lot of excitement about AGI. It\u2019s a gateway to the rest of the world learning about AI. But if you look at Enterprise problems, they\u2019re very different from consumer problems.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">AGI is for the consumer market, where you don\u2019t know what consumers want.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">In Enterprise, they usually know exactly what they want. It\u2019s a very constrained problem and a very specific workflow. 
Enterprise workflows are a specialized thing where you want to do Artificial Specialized Intelligence and not necessarily AGI.<br \/>\n<img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298184 lazyload\" data-src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4814.jpeg?resize=600%2C336&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"336\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" style=\"--smush-placeholder-width: 600px; --smush-placeholder-aspect-ratio: 600\/336;\" \/><noscript><img data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-medium wp-image-298184\" src=\"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/IMG_4814.jpeg?resize=600%2C336&#038;quality=70&#038;ssl=1\" alt=\"\" width=\"600\" height=\"336\" \/><\/noscript><br \/>\n<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">For example, if you have a system that has to give financial product recommendations or antibody recommendations or help with experimental design, it doesn\u2019t need to know about Shakespeare or quantum mechanics.\u00a0<\/span><\/p>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Specialization gives you better performance and cost-quality tradeoffs, and that\u2019s what Enterprises care about.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Key Takeaways<\/span><\/h2>\n<p style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Douwe has two simple observations.\u00a0<\/span><\/p>\n<ol style=\"font-weight: 400;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">It\u2019s about systems, and models are 10-20% of those systems. 
We should all be thinking about this from a systems perspective if we want to start productionizing AI in Enterprises.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Specialization is the key to having the optimal tradeoff between cost and quality.\u00a0<\/span><\/li>\n<\/ol>\n<h2><span style=\"font-weight: 400;\">What This Means for Enterprises and AI Practitioners<\/span><\/h2>\n<ol style=\"font-weight: 400;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Data is the only moat. This is true now and will be more true in the future. If you\u2019re an Enterprise, you have really valuable data, and you want to be careful where you send that data because, in the long term, compute will get commoditized. Algorithms aren\u2019t that interesting anymore, and data will be the main differentiator for all of us.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Be much more pragmatic, and don\u2019t leave performance on the table. If you\u2019re pragmatic in choosing systems and don\u2019t follow the hype cycles or pick the model with the maximum number of parameters because it sounds cool, you can get much better performance at a much better tradeoff point.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">We are just getting started. Sometimes it feels like OpenAI is so far ahead or that AI is almost solved. But when you\u2019re in the field and seeing how it\u2019s getting deployed, it\u2019s only just starting to happen. 
These models are relatively easy to build, but building systems is much harder and time-consuming.\u00a0<\/span><\/li>\n<\/ol>\n<p><iframe title=\"YouTube video player\" data-src=\"https:\/\/www.youtube.com\/embed\/NOAcuI7qag4?si=Pk3DU4PreZ_XfO33\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" class=\"lazyload\" data-load-mode=\"1\"><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you work on anything related to Artificial Intelligence, you know we\u2019re in the age of language models. But when it comes to Enterprises specifically, language models can change the way we work, and they have very big issues. At SaaStr AI Day, Contextual AI\u2019s CEO, Douwe Kiela, deep dives into what it takes to build AI products for the Enterprise.<\/p>\n","protected":false},"author":13,"featured_media":298474,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","om_disable_all_campaigns":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_wpscp_schedule_draft_date":"","_wpscp_schedule_republish_date":"","_wpscppro_advance_schedule":false,"_wpscppro_advance_schedule_date":"","_wpscppro_custom_social_share_image":0,"_facebook_share_type":"","_twitter_share_type":"","_linkedin_share_type":"","_pinterest_share_type":"","_linkedin_share_type_page":"","_instagram_share_type":"","_medium_share_type":"","_threads_share_type":"","_selected_social_profile":[]},"categories":[24898,31,29,68,122,20],"tags":[],"class_list":["post-298163","post","type-post","status-publis
h","format-standard","has-post-thumbnail","hentry","category-ai","category-blog-posts","category-early","category-featured-posts","category-featured-videos","category-videos"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.saastr.com\/wp-content\/uploads\/2024\/04\/youtube-thumbnails-AI-48.png?fit=1000%2C563&quality=70&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p5oib2-1fz5","jetpack_sharing_enabled":true,"fifu_image_url":"https:\/\/www.saastr.com\/wp-content\/uploads\/2024\/04\/youtube-thumbnails-AI-48.png","_links":{"self":[{"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/posts\/298163","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/comments?post=298163"}],"version-history":[{"count":0,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/posts\/298163\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/media\/298474"}],"wp:attachment":[{"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/media?parent=298163"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/categories?post=298163"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.saastr.com\/wp-json\/wp\/v2\/tags?post=298163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}