Generative AI Enterprise Adoption:

Deepam Mishra | Mar 19, 2023

Real Challenges No One Is Talking About

(and some solutions) — Part 2 of 4

Deepam Mishra | www.tbicorp.com | linkedin.com/in/deepammishra

Background

This is the second article in a series of four that addresses the challenges enterprises face when implementing generative AI in production. The series focuses on practical challenges and solutions that are often overlooked in popular discussions. This article addresses the concern around closed models like ChatGPT vs. open-source models.

The articles are here:

1. Generative AI gives factually wrong or made-up responses (e.g. hallucinations) — read here

2. “We have no control over data privacy and security” (current)

3. “Generative AI is too expensive for routine tasks” — coming soon

4. “Generative AI systems cannot be trusted or may harm users (black box)” — coming soon

Business Concern: “ChatGPT is a closed model. I will not have control over how it works. What if it misuses my data?”

Open Source vs. Closed-Access Foundation Models?

Closed and Open Models Defined

Closed models like ChatGPT are those that do not disclose their network architecture and weights. Users can only send their queries to the hosted model endpoints and get answers back. For the most part, Foundation Models (FMs) do not have anything inherently proprietary about their architecture: they are dense layers of deep neural networks, built on the popular Transformer architecture. However, large FM developers spend a lot of time and resources (up to tens of millions of dollars per model) to train a model and often decide to keep the trained parameters and weights confidential. They seek to earn a fee on the use of these proprietary parameters, usually on a per-use basis (closed FMs are commonly priced per number of tokens in a query, where the number of tokens roughly corresponds to the number of words).
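To make that interaction pattern concrete, here is a minimal sketch of querying a hosted closed model. It assumes the pre-1.0 `openai` Python client and an API key in your environment; other closed-model providers expose similar, though not identical, APIs. Note that the response also reports the token count that per-use pricing is based on.

```python
# Minimal sketch: querying a hosted, closed foundation model.
# Assumes the pre-1.0 `openai` Python client and OPENAI_API_KEY set in the environment;
# other providers (Azure OpenAI, Cohere, etc.) expose similar but not identical APIs.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the benefits of managed AI services in two sentences."}],
)

# The model weights never leave the provider; we only see the generated text...
print(response["choices"][0]["message"]["content"])

# ...and the token count, which is what per-use (per-token) pricing is based on.
print(response["usage"]["total_tokens"])
```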

Some closed models may provide users with an API to fine-tune the FM. Fine-tuning means that the model owner will allow users to change the weights of the last few layers of the Deep Neural Network (DNN). This process creates a new, customized model that technically is ‘owned’ by that user — in other words, the parameters and weights of the fine-tuned layers are for that user’s exclusive use. Even after fine-tuning, the fine-tuned custom models are still closed, as users will not have access to the model details.

Parameters associated with a typical Deep Neural Network
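As a rough illustration of what “fine-tune only the last few layers” means, the sketch below freezes every weight except the final classification head, so only those parameters change during training. It uses an open-source checkpoint (`distilbert-base-uncased`) purely for illustration, since closed providers expose the equivalent capability only through their APIs; the `transformers` and `torch` packages are assumed.

```python
# Conceptual sketch of "fine-tune only the last few layers", shown with an open-source
# model for illustration; closed providers expose the equivalent only through an API.
# Assumes the `transformers` and `torch` packages and the public distilbert-base-uncased checkpoint.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Freeze every parameter, then unfreeze only the classification head,
# so training touches just the final layer(s).
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Training {trainable:,} of {total:,} parameters")
```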

Open-source models (e.g. HuggingFace’s BLOOM, Stable Diffusion) expose all of the model parameters and weights, so users can download and run them on their own infrastructure without any such limitations.
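A minimal sketch of that workflow, assuming the `transformers` package and the small public `bigscience/bloom-560m` checkpoint (standing in for larger open models): the weights are downloaded once from the Hub and inference then runs entirely on your own hardware.

```python
# Minimal sketch: downloading and running an open-source model locally.
# Assumes the `transformers` package; weights are fetched from the Hugging Face Hub
# on first use and cached on your own infrastructure. The small bigscience/bloom-560m
# checkpoint stands in for larger BLOOM variants.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")

# Inference happens entirely on local hardware; no query leaves your environment.
print(generator("Generative AI in the enterprise is", max_new_tokens=30)[0]["generated_text"])
```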

Which is Better?

Before you conclude that open is always better than closed, think again. Different businesses and use cases will be better suited to closed vs. open-source models. While this is not the subject of this article, a simple comparison is below.

The key insight I want to underscore is that both closed and open-source models can be adapted for secure enterprise use cases with little or no “practical risk” that these models will cause uncontrollable harm. Let us look at that in a bit more detail.

How To Incorporate FMs Into the Enterprise Stack

Expected Generative AI Application Stack (Source: https://www.insightpartners.com/ideas/generative-ai-stack/)

(a) Working With Closed Models — these are typically backed by contractual and technical guarantees from the large infrastructure ‘Cloud’ providers we already trust with our enterprise workloads. For example, Microsoft offers the Azure OpenAI service with contractual guarantees that your data will not be misused, similar to any other AI service you currently use from Microsoft Azure. Similarly, Cohere and AWS offer access to Cohere’s closed models with the same protections as most other AWS cloud services. So it is an overstatement to suggest that closed-model providers will misuse your data, or that they carry a higher likelihood of data loss, just because they are closed in nature. Your data can stay within your cloud instance for the most part. For FM inference, you will have to send at least some data outside of your cloud instance to your cloud service provider. This is technically less secure than staying within your instance, but there are industry-standard techniques to reduce that risk, one of which is sketched below.
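One common risk-reduction technique is to redact or tokenize sensitive fields before a prompt leaves your cloud instance. The sketch below is a deliberately simplified, regex-based illustration of that idea; production systems typically rely on dedicated PII-detection services rather than hand-written patterns.

```python
# Simplified illustration of one risk-reduction technique: redacting obvious PII
# before a prompt leaves your cloud instance for a hosted FM endpoint.
# Real deployments typically use dedicated PII-detection/tokenization services;
# the regex patterns here are only a sketch.
import re

REDACTIONS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),        # email addresses
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),  # US-style phone numbers
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),                # US SSNs
]

def redact(prompt: str) -> str:
    """Mask common PII patterns before the prompt is sent outside the instance."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(redact("Email jane.doe@example.com, call 415-555-0199, SSN 123-45-6789."))
# -> "Email [EMAIL], call [PHONE], SSN [SSN]."
```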

As argued later in this series, ALL generative AI models need to be controlled to ensure safe responses. In other words, Gen-AI models need some supervision and controls to reduce the risk that their answers are factually erroneous, harmful, or likely to cause a data security breach.
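As a toy example of where such a control sits, the snippet below screens a model response against a small, illustrative blocklist before it reaches the user. Real guardrails combine moderation models, grounding checks, and PII scanners; the terms and logic here are assumptions for illustration only.

```python
# Toy illustration of an output control: block or flag responses that look unsafe
# before they reach the user. Production systems use far richer policies
# (moderation models, grounding checks, PII scanners); this is only a sketch.
BLOCKED_TERMS = {"confidential", "password", "api key"}  # illustrative policy, not exhaustive

def check_response(text: str) -> str:
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        # Route to a human reviewer or return a safe refusal instead of the raw answer.
        return "This response was withheld pending review."
    return text

print(check_response("The admin password is hunter2"))
```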

Imho, closed models should be considered for at least the following scenarios:

  • Prototyping and rapid assessments
  • Defining the art of the possible using high-performance models to benchmark
  • Use cases where inference costs are low (infrequently used models that do not justify building your own FMs)
  • Businesses that do not have in-house or partner-provided skills for managing FM training, etc.
  • Use cases where very high-performance language skills are required out of the box

(b) Working With Open Source Models — many businesses will feel more secure with this option, as it allows more control. Open-source models such as those from HuggingFace can be downloaded and run on-premises. They are also the preferred choice for applications where customers want to re-train an FM to directly control its performance.

Of course, with more power comes more responsibility! Training large FMs is not for everyone. It can require specialized techniques for handling very large datasets, and for models that must be split across multiple GPUs. Advanced data-parallel and model-parallel training techniques are often deployed to do these tasks efficiently and at lower cost. Training these FMs once may also not be enough: users may have to optimize, host, and continuously upgrade large OSS models as the state of the art (SOTA) keeps changing. This is somewhat akin to managing your own infrastructure.
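For a flavor of what data-parallel training looks like in practice, here is a minimal sketch using Hugging Face `accelerate`, which shards batches across however many GPUs the launcher provides. The model, the placeholder corpus, and the hyperparameters are illustrative assumptions, not a training recipe.

```python
# Minimal sketch of data-parallel fine-tuning with Hugging Face `accelerate`.
# The model, placeholder corpus, and hyperparameters are illustrative only.
import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM, AutoTokenizer

accelerator = Accelerator()
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

texts = ["Example training sentence."] * 64  # placeholder corpus
batch = tokenizer(texts, return_tensors="pt", padding=True)
dataset = torch.utils.data.TensorDataset(batch["input_ids"], batch["attention_mask"])
loader = torch.utils.data.DataLoader(dataset, batch_size=8)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# `prepare` wraps model, optimizer, and dataloader for distributed execution.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

model.train()
for input_ids, attention_mask in loader:
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=input_ids)
    accelerator.backward(outputs.loss)  # handles gradient sync across devices
    optimizer.step()
    optimizer.zero_grad()
```

In practice you would launch this with `accelerate launch` across your GPUs, and model-parallel or sharded approaches take over once a single model no longer fits on one device.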

However, there is some good news for OSS enthusiasts. New ‘hybrid’ business models are emerging that allow you to have the best of both worlds: OSS models on managed infrastructure. For example, AWS and HuggingFace have announced a partnership to offer HuggingFace containers on AWS’s SageMaker service. Hence you can use OSS models via HuggingFace and AWS such that the model is fully hosted in your instance, yet the underlying model and its training and inference infrastructure are managed for you by the infrastructure service provider. This is a new and exciting field of innovation, so stay tuned.
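A sketch of what that hybrid pattern can look like with the SageMaker Python SDK’s Hugging Face support: the open-source checkpoint is served from an endpoint inside your own AWS account while AWS manages the container and hardware. The IAM role, library versions, and instance type below are placeholders; check the current SageMaker documentation for supported combinations.

```python
# Sketch of the "hybrid" pattern: an open-source model deployed into your own AWS
# account via the SageMaker Hugging Face containers. Role, versions, and instance
# type are placeholders, not recommendations.
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "bigscience/bloom-560m",  # open-source checkpoint pulled from the Hub
        "HF_TASK": "text-generation",
    },
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder IAM role
    transformers_version="4.26",  # placeholder; use a version supported by SageMaker
    pytorch_version="1.13",
    py_version="py39",
)

# The endpoint runs inside your AWS account; AWS manages the serving infrastructure.
predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.g5.xlarge")

print(predictor.predict({"inputs": "Generative AI in the enterprise is"}))
```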

Imho, the typical use cases for OSS FMs are:

  • When data privacy and control are sacrosanct, e.g. in financial or other regulated industries
  • In use cases where inference cost is extremely important and the value per use is low/modest
  • When performance fine-tuning is critical and closed models do not provide this option
  • Where mid-sized FMs (e.g. <100B parameters) are sufficient for the use case and hence the training/management costs are modest

Key Takeaways

Closed and open-source models are both suitable for enterprise use cases, though often not interchangeably. Customers should clearly understand the differences and trade-offs before deciding on a path. I expect much faster and broader innovation from the OSS world, but that is a personal bias (hope?).

Next Up: “Generative AI is too expensive for routine tasks” — coming soon


Deepam Mishra

Student of corporate innovation, startups, and AI/ML/computer vision. Over 18 years building and scaling innovations across all three dimensions.