Small Language Models Enable Generative AI in Packaging and Processing Equipment
OEMs can introduce AI into their equipment by embedding small language models, which run directly on a standalone or embedded PC, require no Internet connection, and eliminate the latency and cloud costs associated with AI. (Updated December 2024.)
Editor's note: Advances in small language models are proceeding rapidly. This article was updated in December 2024 to account for the newest developments. (It was originally published in August 2024.)
In my previous column, I wrote about how declines in workforce quantity and quality are taking their toll on CPG customers. I sketched out how OEMs might leverage generative AI to create a new generation of machines that talk to operators and technicians, in any language. Essentially, packaging and processing OEMs can build their own version of ChatGPT right into their equipment, allowing operators and technicians to carry on a dialog with the machine. In this column I’ll walk through actual small language models that your engineers can download and begin experimenting with today.
Wait a minute, you say: there's no way your CPG customers will stand for your machine maintaining an open connection to the cloud in order for generative AI to work. You're right about that, which is why there's so much buzz right now in the AI community about a new generation of self-contained language models designed to run on a local machine. No Internet connection required. Yes, you read that right.
In fact, several language models (which some have taken to calling small language models, or SLMs) are available for download, either open source or under a commercial-friendly license, at a cost of exactly nothing. Why free? There's an arms race underway among the big tech companies in AI model development, and users of these models, like you, benefit from their innovation. For more context, see this recent post from Mark Zuckerberg on why he believes the future of AI is open source, and what's in it for his company.
Small language models can run on any PC-based control or PC-based HMI, or even on a dedicated PC embedded in your machine for the purpose of running a generative AI interface. A recurring theme with small language models is that they are a fraction of the size of current large language models, require a fraction of the computing power, and, depending on the application, can come close to the performance of large language models. They also eliminate any latency associated with round-trip communications to the cloud. All of the above makes this a game-changing technology.
The size of language models is measured by the number of parameters. (If you're curious, see a detailed explanation of what is meant by parameters when it comes to LLMs.) For example, GPT-3.5, the model behind the original ChatGPT, uses 175 billion parameters.
In this column I'm going to focus on several SLMs that your engineers can begin experimenting with, all of them 8 billion parameters or smaller. In fact, the more interesting models are 2 billion parameters and smaller. Counterintuitively, when it comes to embedding AI into your equipment, smaller is better: the smaller the number of parameters, the less computing power is needed. I'm guessing that if you create your own SLM specific to your machine, one that hoovers up every single word on every page of every manual ever written on that equipment, including hours of interviews with your design engineers, 2-billion-parameter models will be plenty powerful enough.
One final technical note. Many of these models are designed to run faster on a PC with a GPU (graphics processing unit), typically from Nvidia. These are the chips every AI company is desperate to lay its hands on to power its data centers, but GPUs can also be found in everyday PC workstations, and even some HMIs contain GPUs to aid in visualization and graphics. You'll want to check the hardware that you are using in or on your equipment, as a GPU will make these models run faster. That said, speed differences may be immaterial depending on what you design and how much information you incorporate into your custom models.
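By the way, if your engineers want a quick way to tell whether a given PC or HMI has a usable GPU, a few lines of Python will do it. This is just my illustration (it assumes the free PyTorch library is installed), not something from any model vendor:

```python
# Quick hardware check: is a CUDA-capable (Nvidia) GPU present on this PC?
# Assumes PyTorch is installed: pip install torch
import torch

if torch.cuda.is_available():
    print("GPU detected:", torch.cuda.get_device_name(0))
else:
    print("No CUDA GPU found; models will run on the CPU (slower, but workable).")
```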
For details on the following models, have your engineers download this spreadsheet that we compiled, containing details on all the language models, including:
Name of the model
Company that provides the model
Whether it’s open source
Links to detailed information on the model
Links to AI chats with Perplexity for details on the model
All the models are free, and all run on a PC or PC-based HMI. The spreadsheet also covers the hardware requirements of each model, where I was able to discern them. Once your engineers download the spreadsheet, they can begin tinkering, starting with something like the sketch below.
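To make "tinkering" concrete, here is roughly the smallest possible first experiment, written in Python against the free Hugging Face transformers library. Consider it a sketch rather than production code: the model ID points at the tiniest SmolLM variant (covered later in this column), the prompt is invented, and any model from the spreadsheet can be swapped in.

```python
# First experiment: download a tiny open model once, then generate text locally.
# Requires: pip install transformers torch
# The first run fetches the model from Hugging Face; after that, no Internet needed.
from transformers import pipeline

generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM-135M")
prompt = "The most common cause of a film-feed jam is"
result = generator(prompt, max_new_tokens=40)
print(result[0]["generated_text"])
```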
The six families
When it comes to small language models, I am tracking five major players that are actively competing in the space right now: Google, Microsoft, Meta (Facebook), Mistral, and Hugging Face. (OpenAI is conspicuously missing from the list, but I’ll get to that.) A recent entrant into the SLM arms race is IBM, which I cover below.
Let's start with Google. You've probably used Google's Gemini chatbot. In February 2024, Google released its Gemma small language model, based on its Gemini large language model. In June, Google released Gemma 2, the next version. And at the end of July, it released Gemma 2 2B, which Google claims outperforms the original ChatGPT 3.5 that stunned the world less than two years ago, despite weighing in at only 2.6 billion parameters. Gemma 2 2B was specifically designed to balance performance with efficiency, along with purported AI safety enhancements. Google is taking AI very seriously and does not want to let OpenAI or any other company win this arms race. Toward the end of 2024, Google released Gemini Nano, which is designed to run on a device such as a smartphone or tablet.
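For the hands-on crowd, here's what a first Gemma experiment might look like in Python with the transformers library. Two caveats, because this is a sketch under assumptions: the gemma-2-2b-it repository is gated, so your engineers must accept Google's license on Hugging Face and log in (huggingface-cli login) before the download works, and the prompt here is invented.

```python
# Chat-style generation with the instruction-tuned Gemma 2 2B model.
# Requires: pip install transformers torch
# Recent transformers versions accept a chat-style message list directly.
from transformers import pipeline

chat = pipeline("text-generation", model="google/gemma-2-2b-it")
messages = [{"role": "user", "content": "Explain what a 'no product, no bag' setting does on a bagger."}]
reply = chat(messages, max_new_tokens=120)
print(reply[0]["generated_text"][-1]["content"])  # the last message is the model's answer
```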
Not surprisingly, Microsoft is also investing heavily in small language models. In addition to literally investing in OpenAI, Microsoft is building its own AI tech: earlier this year it released its Phi-3 model family, which comes in several sizes. Its Phi-3-mini model is the one that interests me most, weighing in at 3.8 billion parameters. Microsoft claims that the Phi-3 family outperforms "models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks." Microsoft has also released a Phi-1.5 model (1.3 billion parameters), but it appears to be more of a research proof of concept and may or may not be ready for prime time. These models are evolving rapidly, though, and this one's worth keeping an eye on.
Interestingly, Microsoft has also released what it calls adapted AI models for industry: fine-tuned models that are pre-trained on industry-specific data. Notably, Microsoft is working with industrial automation suppliers Siemens Digital Industries and Rockwell Automation, among others, on these models.
Meta (popularly known as Facebook) has open-sourced its Llama family of AI models, which come in several sizes. Meta is iterating Llama at a breakneck pace, moving from version 3 in the spring of 2024 to version 3.1 during the summer and Llama 3.2 in September 2024. The smallest model is Llama 3.2 1B, a 1-billion-parameter model specifically designed for edge computing devices (that is, devices in the field). The 1B model is a significant development, because as recently as this summer the smallest model was 8B, which would require a hefty PC. One note about Llama: before you dismiss the idea of building something from Facebook into a serious machine application, I would remind you that tech companies like Meta have been known to release highly durable technologies that gain wide adoption thanks to their reliability and functionality. Meta's React, the open-source JavaScript library for building web user interfaces, is a great example; millions of websites use it today. It's not going anywhere.
Another company serious about the AI arms race is the French AI company Mistral. Mistral is well known in AI circles and is one of the few companies outside the U.S. and China working on AI models, especially small language models. In the summer of 2024, Mistral released Codestral Mamba, a specialized model for code generation and reasoning. It could be worth checking out.
There's one last family to mention, and it's intriguing because it's a series of really small models. The family is called SmolLM, from the company Hugging Face, and it comes in 135 million, 360 million, and 1.7 billion parameter versions. Before you dismiss this one because of the funny name: Hugging Face is extremely well known and well respected in the AI community as a provider of tools and data to that community. Confusingly, the actual language models from most of the other companies mentioned in this column can also be downloaded from Hugging Face. (More on that in a future column.) But SmolLM is Hugging Face's own offering.
For packaging and processing machinery applications, SmolLM is worth experimenting with if you're running into hardware constraints with your existing HMI or PC hardware and the other models. Another reason to experiment with SmolLM: the models are so compact they can reportedly run on an inexpensive Raspberry Pi, which, as many techies and geeks know, is a single-board computer that can fit in the palm of your hand. Raspberry Pis were originally designed for hobbyists, but the Raspberry Pi 4 with at least 4GB of RAM, which costs only around $50, can easily be incorporated into a packaging or processing machine without messing with your existing controls architecture.
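For the curious, here is a CPU-only SmolLM sketch in Python, the same code that would run on a Raspberry Pi-class board (expect slow generation there). The model ID is real, but the prompt is invented and this is my illustration, not vendor sample code:

```python
# CPU-only generation with a SmolLM instruct model; no GPU required.
# Requires: pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-360M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # loads on the CPU by default

messages = [{"role": "user", "content": "List the steps to clear an infeed jam safely."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=80)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```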
I recently learned of IBM's offering in the small language model space: Granite, now in its third generation of SLMs. According to the company, these open-sourced models deliver high performance on safety benchmarks and across a wide variety of tasks, from cybersecurity to Retrieval-Augmented Generation (RAG), which is the use case I've been covering in this column series.
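Since Retrieval-Augmented Generation keeps coming up in this series, a deliberately oversimplified sketch may help show the idea: find the manual passage that best matches the operator's question, then hand it to a local model as context. Real RAG systems use vector embeddings rather than word overlap, and the manual snippets, fault codes, model ID, and question below are all invented for illustration:

```python
# Toy RAG: retrieve the best-matching manual passage by word overlap,
# then let a local SLM answer the question using that passage as context.
from transformers import pipeline

manual_passages = [
    "To clear an infeed jam: stop the machine, open guard door 2, remove the product, then reset fault F17.",
    "To splice film: thread the new roll, overlap the ends 50 mm, apply splice tape, and close the dancer arm.",
]

def retrieve(question: str) -> str:
    # Score each passage by how many words it shares with the question.
    q_words = set(question.lower().split())
    return max(manual_passages, key=lambda p: len(q_words & set(p.lower().split())))

question = "How do I clear a jam on the infeed?"
context = retrieve(question)

chat = pipeline("text-generation", model="HuggingFaceTB/SmolLM-360M-Instruct")
messages = [{"role": "user", "content": f"Manual excerpt: {context}\n\nQuestion: {question}\nAnswer briefly."}]
print(chat(messages, max_new_tokens=80)[0]["generated_text"][-1]["content"])
```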
What about the famous ChatGPT from OpenAI, the company that started the generative AI revolution? Is there a small language model version available? As of this writing, the closest OpenAI offers is a low-cost small model it calls GPT-4o mini. But that model runs only in the cloud, ruling it out for use in packaging and processing equipment, for now.
The small language model space is exploding with upstarts and new entrants from all over the world, and the forecast is rapid evolution of these models for the foreseeable future. (Companies like Anthropic and xAI, maker of Grok, do not offer small language models currently, but given the heat and light in this space, it wouldn't surprise me at all if that changes in the coming months.)
Conclusion
The small language model space is clearly heating up. SLMs portend a future where AI can be built into many products without requiring a permanent connection to the cloud.
I don't endorse any particular model, but I do endorse the idea of allocating some engineering time to begin experimenting with one or more of these technologies. As mentioned earlier in this column, you can download our Researched List of all of the above models, complete with detailed specs and links to learn more and download. These models are developing at a breathtaking pace; in the span of just a few weeks, I repeatedly had to update this article with newer information even as I was writing it. We'll continue to update this article and the researched list above as things develop.
What about accuracy, hallucinations and overall risk of something bad happening? I’ll tackle that in my next column, so stay tuned.
OEM Magazine is pleased to inaugurate this semi-occasional column tracking the rapid advances in AI and how packaging and processing machine builders can leverage them to build next-generation equipment. Reach out to Dave at [email protected] and let him know what you think or what you’re working on when it comes to AI.