Last Update: Jun 7, 2024



So, you want to run a ChatGPT-like chatbot on your own computer? Want to learn more about LLMs or just be free to chat away without others seeing what you’re saying? This is an excellent option for doing just that.

I’ve been running several LLMs and other generative AI tools on my computer lately. I’ve discovered this web UI from oobabooga for running models, and it’s incredible. You have a ton of options, and it works great.

That’s what we will set up today in this tutorial.

The easy way

If you’re in Windows using WSL, you can run a simple batch file, and it might work great. Super easy.

Clone the repo:

git clone https://github.com/oobabooga/text-generation-webui.git

Then change into the text-generation-webui folder and run the batch file:

start_wsl.bat

It will ask you to choose your GPU/platform setup:

[Screenshot: the installer asking for a GPU/platform choice]

And it’s up and running:

[Screenshot: text-generation-webui up and running]

If this works, skip to the Run the WebUI step.

But if it fails (which I’ve seen), you must do it manually. Below are the instructions to install it manually in WSL. The same instructions work on regular old Linux. Let’s get started.

Install Anaconda

I’m using Ubuntu in WSL. So here are the commands we’ll run:

sudo apt-get update

Always a good idea.

sudo apt-get install wget

Change into the tmp directory:

cd /tmp

Then, we want to get the latest version of the installation script from the Anaconda archive at https://repo.anaconda.com/archive/. At the time of this writing, this is the most current version for Linux-x86_64:

wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh

This script is huge. After it’s done downloading, you should see something like this:

[Screenshot: wget output after the download finishes]

Then you’ll want to validate it:

sha256sum Anaconda3-2023.09-0-Linux-x86_64.sh

This prints the file’s SHA-256 hash. Compare it with the hash published for this installer on the Anaconda site. If the two match, you’re good to go:

[Screenshot: sha256sum output]
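
To turn this into a pass/fail check, the printed hash has to be compared with the one published for the installer. Here’s a minimal Python sketch of that comparison. The function names are my own, and the expected hash is something you’d copy from the Anaconda site yourself; I’m deliberately not filling one in.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so a large installer never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

You’d call it as sha256_matches("Anaconda3-2023.09-0-Linux-x86_64.sh", "<hash copied from the site>") and only run the installer if it returns True.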

Now it’s time to run it!

bash Anaconda3-2023.09-0-Linux-x86_64.sh

Accept the license terms (if you want to use it) and press enter.

It will ask where you want to install it. I chose the default location:

[Screenshot: the installer’s install-location prompt]

Then, grab a beverage and wait a while. I prefer ice water with lemon.

It’s going to ask if you want to initialize Conda automatically. I do a ton of Python stuff, so I select yes. Choose whatever works best for you.

[Screenshot: the installer’s Conda initialization prompt]

Now exit the shell and restart your WSL window.

Install the Text UI

Next, we will install the web UI for our models. It’s a Gradio web UI for Large Language Models.

As stated in the repo, their goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation.

Clone it into a folder you’ll want to work in:

git clone https://github.com/oobabooga/text-generation-webui.git

If you have a base environment active, deactivate it first:

conda deactivate

Then create a new environment and activate it:

conda create -n textgen python=3.11
conda activate textgen

If you see (textgen) in front of your prompt, it’s working.

[Screenshot: shell prompt prefixed with (textgen)]
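
If you want a second opinion beyond the prompt prefix, here’s a small Python sketch you can run inside the environment. It relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the helper names are mine, not part of conda.

```python
import os
import sys


def in_conda_env(name, environ=os.environ):
    """True if conda reports `name` as the currently active environment."""
    return environ.get("CONDA_DEFAULT_ENV") == name


def python_matches(major, minor, version=sys.version_info):
    """True if the interpreter's major.minor version is the one we asked for."""
    return (version[0], version[1]) == (major, minor)


print("textgen active:", in_conda_env("textgen"))
print("Python 3.11:", python_matches(3, 11))
```

Both lines should print True if the environment was created and activated with the commands above.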

Now, we need to install PyTorch. I’m using an NVIDIA card, so I type in:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

If you aren’t using an NVIDIA card and want to run on CPU only, use this:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

You’ll see a lot of this:

[Screenshot: pip install output]

Wait for it to finish. If you are running an NVIDIA card, you may also need to install the CUDA runtime:

conda install -y -c "nvidia/label/cuda-12.1.0" cuda-runtime
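
Before moving on, it can help to confirm which PyTorch build actually landed in the environment. This is a minimal sketch, assuming you run it inside the textgen environment; describe_torch is a name I made up, and it only uses torch.__version__, torch.cuda.is_available() and torch.cuda.get_device_name().

```python
import importlib.util


def describe_torch():
    """Summarize the PyTorch install without crashing if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        # Device 0 is the first GPU PyTorch can see.
        return f"torch {torch.__version__} sees GPU: {torch.cuda.get_device_name(0)}"
    return f"torch {torch.__version__} is running CPU-only"


print(describe_torch())
```

If you installed the cu121 build and it still reports CPU-only, that’s a hint the CUDA runtime or driver isn’t visible from inside WSL.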

Next, we need to install some more dependencies. This will depend on your machine.

cd text-generation-webui
pip install -r <requirements file according to table below>

Requirements file to use:

GPU        CPU        Requirements file to use
NVIDIA     has AVX2   requirements.txt
NVIDIA     no AVX2    requirements_noavx2.txt
AMD        has AVX2   requirements_amd.txt
AMD        no AVX2    requirements_amd_noavx2.txt
CPU only   has AVX2   requirements_cpu_only.txt
CPU only   no AVX2    requirements_cpu_only_noavx2.txt

(This table comes from the installation instructions in the repo.)
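
If you’re not sure whether your CPU has AVX2, here’s a rough Python sketch that reads the flags from /proc/cpuinfo on Linux (including WSL) and looks up the matching file from the table above. The function names and the "nvidia" / "amd" / "cpu" keys are my own shorthand, not something the repo defines.

```python
from pathlib import Path

# Mirrors the table above: (gpu, has_avx2) -> requirements file.
REQUIREMENTS = {
    ("nvidia", True): "requirements.txt",
    ("nvidia", False): "requirements_noavx2.txt",
    ("amd", True): "requirements_amd.txt",
    ("amd", False): "requirements_amd_noavx2.txt",
    ("cpu", True): "requirements_cpu_only.txt",
    ("cpu", False): "requirements_cpu_only_noavx2.txt",
}


def has_avx2(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def pick_requirements(gpu, cpuinfo_text):
    """Look up the requirements file for a GPU type and the CPU's flags."""
    return REQUIREMENTS[(gpu.lower(), has_avx2(cpuinfo_text))]


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    # Swap "nvidia" for "amd" or "cpu" to match your machine.
    print(pick_requirements("nvidia", cpuinfo.read_text()))
```

Whatever file name it prints is the one to pass to pip install -r.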

After everything is installed, you should be ready to run the WebUI.

Run the WebUI

Now we’re ready to run! In the text-generation-webui directory, run the following:

python server.py

And you should see this:

[Screenshot: server.py startup output]

Awesome! Let’s load it up in the web browser, using the local URL the server prints in the terminal:

[Screenshot: the web UI in the browser]

If you see this, you’re golden! However, you can’t do anything with it yet. You’ll need a model.
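
If the page doesn’t load, a quick check can tell you whether the server is listening at all. Here’s a minimal sketch. It assumes the Gradio default of 127.0.0.1 on port 7860, which this post never states, so use whatever address your terminal actually printed. port_is_open is my own helper name.

```python
import socket


def port_is_open(host="127.0.0.1", port=7860, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host was unreachable.
        return False


print("web UI reachable:", port_is_open())
```

A False here while server.py is still running usually means the server is bound to a different port or address than the one you’re checking.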

Downloading an LLM model

Your models will be downloaded and placed in the text-generation-webui/models folder. There are several ways to download the models, but the easiest way is in the web UI.

Click on “Model” in the top menu:

[Screenshot: the Model tab in the top menu]

Here, you can click on “Download model or Lora” and put in the URL for a model hosted on Hugging Face.

There are tons to choose from. The first one I will load up is the Hermes 13B GPTQ.

To do this, I only need to enter the username/model path from Hugging Face:

TheBloke/Nous-Hermes-13B-GPTQ

And I can then download it through the web interface.

[Screenshot: downloading the model through the web interface]

After I click refresh, I can see the new model available:

[Screenshot: the new model in the model list]

Select it, and press load. Now we’re ready to go!
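
A small aside on the username/model path: if you copied a full Hugging Face URL instead, this sketch shows one way to reduce it to the short form. The folder-name helper is an assumption on my part. As far as I can tell, the downloader stores each model under models/ as username_model, but check your own models folder rather than taking my word for it.

```python
HF_PREFIX = "https://huggingface.co/"


def normalize_model_id(text):
    """Turn a Hugging Face URL or 'username/model' string into 'username/model'."""
    cleaned = text.strip()
    if cleaned.startswith(HF_PREFIX):
        cleaned = cleaned[len(HF_PREFIX):]
    parts = [part for part in cleaned.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected username/model, got {text!r}")
    # Anything after the first two segments (such as /tree/main) is dropped.
    return "/".join(parts[:2])


def assumed_folder_name(model_id):
    """Assumption: the downloader names the folder 'username_model'."""
    return normalize_model_id(model_id).replace("/", "_")


print(normalize_model_id("https://huggingface.co/TheBloke/Nous-Hermes-13B-GPTQ"))
print(assumed_folder_name("TheBloke/Nous-Hermes-13B-GPTQ"))
```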

Having a Chat

There are a ton of parameters you can adjust. You can get lost in the settings, and once I learn more about them, I’ll certainly share what I find here.

Here was my test chat:

[Screenshot: the test chat]

Hey! It works! Awesome, and it’s running locally on my machine.

I decided to ask it about a coding problem:

[Screenshot: the chat about a coding problem]

Okay, not quite as good as GitHub Copilot or ChatGPT, but it’s an answer! I’ll play around with this and share what I’ve learned soon.

Conclusion

You may want to run a large language model locally on your own machine for many reasons. I’m doing it because I want to understand LLMs better and understand how to tune and train them. I am deeply curious about the process and love playing with it. You may have your own reasons for doing it, such as content generation or a chatbot to joke around with. The fact that you don’t have to be connected to the internet or pay a monthly fee is awesome.

What are you doing with LLMs today? Let me know! Let’s talk.

Also, if you have any questions or comments, feel free to reach out.

Happy hacking!






Published: Oct 21, 2023 by Jeremy Morgan. Contact me before republishing this content.