<aside> ⚠️
Note: This is experimental work.
This guide is a work in progress and is intended for experimentation and prototyping.
Expect rough edges, manual steps, and a few patches along the way.
Happy building! 👷🛠️✨
</aside>
In this post, I’ll walk you through how I built llama.cpp on IBM i, step by step: from configuring the environment and compiling the source to running a local LLM right on the system. You’ll also learn how to prep a compatible model and test it using the built-in CLI, HTTP server, and web UI.
llama.cpp is a lightweight, blazing-fast C++ implementation for running Large Language Models (LLMs) locally on a wide range of hardware, with no internet, no cloud, and no external dependencies. It’s the engine powering many popular tools like Ollama, LM Studio, and other open-source AI apps. And now, for the first time, you can run it directly on IBM i 🧵🎉
If you’ve been curious about integrating LLMs into your applications, this is the perfect way to get started. Running LLMs directly on IBM i enables quick, cost-free prototyping of AI-powered features, all within your existing environment:
First, make sure you have the open source environment installed on your IBM i system. You can follow the instructions at https://ibmi-oss-docs.readthedocs.io/en/latest/yum/README.html#installation to configure your environment.
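If you want to sanity-check the environment before continuing, a quick test from an SSH session might look like the sketch below. This is only an example and assumes the standard /QOpenSys/pkgs/bin install location for the open source tools.

```bash
# From an SSH session into the IBM i system (PASE shell).
# Assumes the open source environment is installed in the standard location.
export PATH=/QOpenSys/pkgs/bin:$PATH

# Verify that yum is available and can reach its repositories.
yum --version
yum repolist
```

If both commands succeed, yum is ready and you can move on to installing the build tools.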
Install the required open source packages: