JeVoisBase  1.23
JeVois Smart Embedded Machine Vision Toolkit Base Modules
PyLLM.py
import pyjevois
if pyjevois.pro: import libjevoispro as jevois
else: import libjevois as jevois

import asyncio
import ollama
import cv2
import os # to delete temp image
import psutil
import subprocess # for subprocess.run()

## Interact with a large-language model (LLM) or vision-language model (VLM) in a chat box
#
# This module uses the ollama framework from https://ollama.com to run a large language model (LLM) or vision-language
# model (VLM) right inside the JeVois-Pro camera. The default model is tinydolphin, an experimental LLM (no vision)
# with 1.1 billion parameters, obtained by training the TinyLlama model on the popular Dolphin dataset by Eric
# Hartford.
#
# For now, the model runs fairly slowly, on CPU (multithreaded).
#
# Try asking questions like "how can I make a ham and cheese sandwich?", or "why is the sky blue?", or "when does
# summer start?", or "how does asyncio work in Python?"
#
# Also pre-loaded on microSD is moondream2 with 1.7 billion parameters, a VLM that can answer text queries, describe
# images captured by JeVois-Pro, and answer queries about those images. However, this model is very slow, as just
# sending one image to it as an input is like sending it 729 tokens... So, consider it an experimental feature for
# now. Hopefully smaller models will be available soon.
#
# With moondream, you can use the special keyword /videoframe/ to pass the current frame from the live video to the
# model. You can also add more text to the query, for example:
#
# user: /videoframe/ how many people?
# moondream: there are five people in the image.
#
# If you only input /videoframe/, then the following query text is automatically added: "Describe this image:"
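#
# Under the hood, a /videoframe/ query becomes a regular ollama chat message with an image attached: the current frame
# is saved to /tmp/pyllm.jpg and that path is passed in the 'images' field. As a sketch (the query text here is just
# an example), the message dict sent to the model looks like:
#
# @code{.py}
# {'role': 'user', 'content': 'how many people?', 'images': ['/tmp/pyllm.jpg']}
# @endcode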
#
# This module uses the ollama Python library from https://github.com/ollama/ollama-python
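#
# For reference, the streaming chat pattern this module relies on boils down to the following minimal standalone
# sketch (outside of JeVois; the model name is just an example of an installed model):
#
# @code{.py}
# import asyncio, ollama
#
# async def demo():
#     client = ollama.AsyncClient()
#     stream = await client.chat(model = 'tinydolphin',
#                                messages = [{'role': 'user', 'content': 'why is the sky blue?'}],
#                                stream = True)
#     async for part in stream: # each part carries the next chunk of the reply
#         print(part['message']['content'], end = '', flush = True)
#
# asyncio.run(demo())
# @endcode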
#
# More models
# -----------
#
# Other models can run as well. The main questions are how slowly, and whether we will run out of RAM or out of space
# on our microSD card. Have a look at https://ollama.com for supported models. You need a working internet connection
# to be able to download and install new models. Installing new models may involve lengthy downloads and possible
# issues with the microSD getting full. Hence, we recommend that you restart JeVois-Pro to Ubuntu command-line mode
# (see under the System tab of the GUI), then login as root/jevois, then:
#
# df -h /             # check available disk space
# ollama list         # show installed models
# ollama rm somemodel # delete some installed model if running low on disk space
# ollama run newmodel # download and run a new model, e.g., tinyllama (< 2B parameters recommended); if you like it,
#                     # exit ollama (CTRL-D), and run jevoispro.sh to try it out in the JeVois-Pro GUI.
#
# Disclaimer
# ----------
#
# LLM research is still in its early stages, despite the recent hype. Remember that these models may return statements
# that are inaccurate, biased, possibly offensive, factually wrong, or complete garbage. At the end of the day, always
# remember: it's just next-token prediction. You are not interacting with a sentient, intelligent being.
#
# @author Laurent Itti
#
# @displayname PyLLM
# @videomapping JVUI 0 0 30.0 YUYV 1920 1080 30.0 JeVois PyLLM
# @email itti\@usc.edu
# @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
# @copyright Copyright (C) 2024 by Laurent Itti, iLab and the University of Southern California
# @mainurl http://jevois.org
# @supporturl http://jevois.org/doc
# @otherurl http://iLab.usc.edu
# @license GPL v3
# @distribution Unrestricted
# @restrictions None
# @ingroup modules
class PyLLM:
    # ###################################################################################################
    ## Constructor
    def __init__(self):
        self.chatbox = jevois.ChatBox("JeVois-Pro large language model (LLM) chat")
        self.messages = []
        self.client = ollama.AsyncClient()
        self.statusmsg = "---"
        self.waitmsg = "Loading Model"

    # ###################################################################################################
    ## JeVois optional extra init once the instance is fully constructed
    def init(self):
        # Create some parameters that users can adjust in the GUI:
        self.pc = jevois.ParameterCategory("LLM Parameters", "")

        models = []
        avail_models = subprocess.run(["ollama", "list"], capture_output=True, encoding="ascii").stdout.split('\n')
        for m in avail_models:
            if m and m.split()[0] != "NAME": models.append(m.split()[0])
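
        # For reference, the 'ollama list' output parsed above looks like the following (names, sizes, and dates
        # are only illustrative); the loop keeps the first column of every line except the header:
        #
        #   NAME               ID              SIZE      MODIFIED
        #   smollm2:135m       xxxxxxxxxxxx    270 MB    3 weeks ago
        #   moondream:latest   xxxxxxxxxxxx    1.7 GB    5 weeks ago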

        self.modelname = jevois.Parameter(self, 'modelname', 'str',
                    'Model to use. Other models available at https://ollama.com, typically select one with ' +
                    '< 2B parameters. Working internet connection and space on microSD required to download a ' +
                    'new model. You need to download the model from the ollama command-line before using it here.',
                    'smollm2:135m', self.pc)
        self.modelname.setCallback(self.setModel)
        self.modelname.setValidValues(models)

    # ###################################################################################################
    ## JeVois optional extra un-init before destruction
    def uninit(self):
        if os.path.exists("/tmp/pyllm.jpg"):
            os.remove("/tmp/pyllm.jpg")

    # ###################################################################################################
    ## Instantiate a new model each time the model name is changed:
    def setModel(self, name):
        if hasattr(self, 'generator'):
            self.task.cancel()
            del(self.task)
            del(self.generator)
        self.messages = []
        self.chatbox.clear()
        self.waitmsg = "Loading Model"
        jevois.LINFO("Selected model " + name)

    # ###################################################################################################
    ## Run the LLM model asynchronously
    async def runmodel(self):
        # Try to get some more reply words:
        async for response in await self.generator:
            content = response['message']['content']
            self.chatbox.writeString(content)
            self.currmsg['content'] += content

            if response['done']:
                self.messages.append(self.currmsg)
                self.chatbox.writeString("\n")

    # ###################################################################################################
    ## Process function with GUI output on JeVois-Pro
    def processGUI(self, inframe, helper):
        # Start a new display frame, get its size, and also whether mouse/keyboard are idle:
        idle, winw, winh = helper.startFrame()

        # Draw full-resolution input frame from camera:
        x, y, w, h = helper.drawInputFrame("c", inframe, False, False)
        helper.itext('JeVois-Pro large language model (LLM) chat')

        # Draw the chatbox window:
        self.chatbox.draw()

        # Get access to the event loop:
        loop = asyncio.get_event_loop()

        # Wait for user input, or wait for more words from the LLM response?
        if hasattr(self, 'generator'):
            # We have a generator that is working on getting us a response; run it a bit and check if complete:
            try:
                # Run our runmodel() task for up to 25ms; on timeout, wait_for() throws, and shield() protects
                # the task from being cancelled by that timeout, so it can resume on the next video frame:
                loop.run_until_complete(asyncio.wait_for(asyncio.shield(self.task), timeout = 0.025))

                # If no exception was thrown, the response is complete; nuke generator & task, back to user input:
                del(self.task)
                del(self.generator)
                self.chatbox.freeze(False, self.waitmsg)
                self.waitmsg = "Working" # message to show on subsequent queries
            except:
                # Timeout, no new response words from the LLM yet:
                pass

        else:
            # We are not generating a response, so we are waiting for user input. Any new user input?
            if self.chatbox.wasCleared():
                self.messages = []
                self.currmsg = []

            if user_input := self.chatbox.get():
                # Do we want to pass an image to moondream or a similar VLM?
                if '/videoframe/' in user_input:
                    img = inframe.getCvBGRp()
                    cv2.imwrite('/tmp/pyllm.jpg', img)
                    user_input = user_input.replace('/videoframe/', '')
                    if len(user_input) == 0: user_input = 'Describe this image:'
                    self.messages.append({'role': 'user', 'content': user_input, 'images': ['/tmp/pyllm.jpg']})
                else:
                    self.messages.append({'role': 'user', 'content': user_input})

                # Prepare to get a response from the LLM:
                self.currmsg = {'role': 'assistant', 'content': ''}
                self.chatbox.freeze(True, self.waitmsg)

                # Create a response generator and associated asyncio task:
                try:
                    self.generator = self.client.chat(model = self.modelname.get(),
                                                      messages = self.messages, stream = True)
                    self.task = loop.create_task(self.runmodel())
                except Exception as e:
                    helper.reportError(str(e))

        # Because ollama runs in a different process (we are just running a web client to it here), get general CPU
        # load and other system info to show to the user:
        if jevois.frameNum() % 15 == 0:
            temp = "UNK"
            try:
                # The thermal zone reports milli-degrees Celsius; convert to degrees:
                with open("/sys/class/thermal/thermal_zone1/temp", "r") as f:
                    temp = int(f.readline()) / 1000
            except: pass
            freq = "UNK"
            try:
                # CPU frequency is reported in kHz; convert to MHz:
                with open("/sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq") as f:
                    freq = int(int(f.readline()) / 1000)
            except: pass
            self.statusmsg = "{}% CPU, {}% RAM, {}C, {} MHz".format(psutil.cpu_percent(), psutil.virtual_memory().percent,
                                                                    temp, freq)
        helper.iinfo(inframe, self.statusmsg, winw, winh)

        # End of frame:
        helper.endFrame()