Auto-tuning RAG Models With Katib In Kubeflow

Wajeeh Ul Hassan


Open source LLMs are now widely available. Larger models can be difficult to fine-tune, but their generative ability can be improved with retrieval-augmented generation (RAG). Luckily, finding the best RAG configuration is similar to tuning the hyperparameters of a machine learning or deep learning model. It is also easier to fine-tune an SLM (small language model) or apply RAG to one.
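To make the analogy concrete, here is a minimal, self-contained sketch of treating RAG settings as hyperparameters. The `evaluate_rag` scoring function below is invented purely for illustration; in a real pipeline it would run retrieval and generation over an evaluation set and return a quality metric.

```python
import itertools

# Toy stand-in for a RAG evaluation. The formula is made up: it simply
# pretends quality peaks around chunk_size=512, overlap=64, temperature=0.2.
def evaluate_rag(chunk_size: int, overlap: int, temperature: float) -> float:
    return -(
        ((chunk_size - 512) / 512) ** 2
        + ((overlap - 64) / 64) ** 2
        + ((temperature - 0.2) / 0.2) ** 2
    )

# The same knobs we would later hand to a tuner like Katib.
search_space = {
    "chunk_size": [256, 512, 1024],
    "overlap": [32, 64, 128],
    "temperature": [0.1, 0.2, 0.7],
}

def grid_search(space):
    """Exhaustively try every combination and keep the best-scoring one."""
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*space.values()):
        params = dict(zip(space.keys(), values))
        score = evaluate_rag(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

best_params, best_score = grid_search(search_space)
print(best_params)  # → {'chunk_size': 512, 'overlap': 64, 'temperature': 0.2}
```

A plain grid search like this works but runs trials one at a time; the point of Katib, covered next, is to run these trials in parallel on Kubernetes and support smarter search algorithms than a grid.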

We can use different tools to tune a RAG pipeline. If you are already using Kubeflow, tuning RAG is even simpler and faster, because Kubeflow comes with Katib.

What Is Katib (AutoML)?

“Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports hyperparameter tuning, early stopping and neural architecture search (NAS). Learn more about AutoML at fast.ai, Google Cloud, Microsoft Azure or Amazon SageMaker.” — https://www.kubeflow.org/docs/components/katib/overview/

Kubeflow is designed to manage the machine learning model lifecycle at scale.

Hyperparameter Tuning A RAG Model

With the help of Katib, we can tune RAG configurations in parallel, saving a lot of time. We define parameters such as chunk_size, overlap, temperature, and chunking_strategy as hyperparameters, and Katib takes care of the rest: it runs trials over different RAG configurations and reports the best one.
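As a sketch of what this looks like in practice, the dict below mirrors the structure of a Katib `Experiment` manifest (`kubeflow.org/v1beta1`) with the RAG parameters above as the search space. The trial container details, the metric name `answer_quality`, and the experiment name are hypothetical placeholders, and the `trialSpec` (the Kubernetes Job that actually runs one trial) is omitted for brevity.

```python
# Sketch of a Katib Experiment, built as a plain dict mirroring the
# kubeflow.org/v1beta1 Experiment CRD. Metric and container names are made up.
experiment = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "rag-tuning", "namespace": "kubeflow"},
    "spec": {
        "objective": {
            "type": "maximize",
            # Hypothetical metric: the trial container must emit it so
            # Katib's metrics collector can read it.
            "objectiveMetricName": "answer_quality",
        },
        "algorithm": {"algorithmName": "random"},
        "parallelTrialCount": 3,   # trials run in parallel
        "maxTrialCount": 12,
        "maxFailedTrialCount": 3,
        # The RAG knobs expressed as Katib search-space parameters.
        "parameters": [
            {"name": "chunk_size", "parameterType": "int",
             "feasibleSpace": {"min": "128", "max": "1024"}},
            {"name": "overlap", "parameterType": "int",
             "feasibleSpace": {"min": "0", "max": "256"}},
            {"name": "temperature", "parameterType": "double",
             "feasibleSpace": {"min": "0.0", "max": "1.0"}},
            {"name": "chunking_strategy", "parameterType": "categorical",
             "feasibleSpace": {"list": ["fixed", "sentence", "recursive"]}},
        ],
        # Wires suggested values into the trial container; the actual
        # trialSpec (a Kubernetes Job manifest) is omitted for brevity.
        "trialTemplate": {
            "primaryContainerName": "rag-eval",
            "trialParameters": [
                {"name": "chunkSize", "reference": "chunk_size"},
                {"name": "overlap", "reference": "overlap"},
                {"name": "temperature", "reference": "temperature"},
                {"name": "chunkingStrategy", "reference": "chunking_strategy"},
            ],
        },
    },
}
```

Applied to a cluster (for example with `kubectl apply`), Katib would schedule up to three trials at a time, each evaluating one RAG configuration, until the trial budget is exhausted.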


This is enough to find the best RAG configuration and, after considerable testing and evaluation, push it into production.

However, managing the machine learning lifecycle isn’t easy. Once the RAG model is in production, we may need to update it in the future and keep full track of the different metrics and of the datasets/documents. To take care of the whole machine learning lifecycle, we need a platform with the relevant tools. Kubeflow solves that problem, as it is flexible and allows integration of different tools. To learn more about MLOps, check out Hands On MLOps With Kubeflow.

It is now possible to automate many of the machine learning tasks, making the system more robust and reducing human error.

Thank you.
