JeVoisBase  1.22
JeVois Smart Embedded Machine Vision Toolkit Base Modules
PyLLM.py
import pyjevois
if pyjevois.pro: import libjevoispro as jevois
else: import libjevois as jevois

import asyncio
import ollama
import cv2
import os         # to delete temp image
import psutil
import subprocess # for subprocess.run()

## Interact with a large-language model (LLM) or vision-language model (VLM) in a chat box
#
# This module uses the ollama framework from https://ollama.com to run a large language model (LLM) or vision language
# model (VLM) right inside the JeVois-Pro camera. The default model is tinydolphin, an experimental LLM (no vision)
# with 1.1 billion parameters, obtained by training the TinyLlama model on the popular Dolphin dataset by Eric
# Hartford.
#
# For now, the model runs fairly slowly and on CPU (multithreaded).
#
# Try asking questions, like "how can I make a ham and cheese sandwich?", or "why is the sky blue?", or "when does
# summer start?", or "how does asyncio work in Python?"
#
# Also pre-loaded on microSD is moondream2 with 1.7 billion parameters, a VLM that can answer text queries, describe
# images captured by JeVois-Pro, and answer queries about them. However, this model is very slow, as sending it a
# single image as input already amounts to 729 tokens... So, consider it an experimental feature for now. Hopefully
# smaller models will be available soon.
#
# With moondream, you can use the special keyword /videoframe/ to pass the current frame from the live video to the
# model. You can also add more text to the query, for example:
#
# user: /videoframe/ how many people?
# moondream: there are five people in the image.
#
# If you only input /videoframe/ then the following query text is automatically added: "Describe this image:"
#
# This module uses the ollama python library from https://github.com/ollama/ollama-python
#
# More models
# -----------
#
# Other models can run as well. The main questions are how slowly they will run, and whether we will run out of RAM
# or of space on our microSD card. Have a look at https://ollama.com for supported models. You need a working internet
# connection to be able to download and install new models. Installing new models may involve lengthy downloads and
# possible issues with the microSD getting full. Hence, we recommend that you restart JeVois-Pro into Ubuntu
# command-line mode (see under the System tab of the GUI), then log in as root/jevois, then:
#
# df -h /             # check available disk space
# ollama list         # shows installed models
# ollama rm somemodel # delete some installed model if running low on disk space
# ollama run newmodel # download and run a new model, e.g., tinyllama (<2B parameters recommended); if you like it,
#                     # exit ollama (CTRL-D), and run jevoispro.sh to try it out in the JeVois-Pro GUI.
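# ollama pull newmodel # (alternative) just download a model, without starting an interactive chat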
54#
# Disclaimer
# ----------
#
# LLM research is still in its early stages, despite the recent hype. Remember that these models may return statements
# that are inaccurate, biased, possibly offensive, factually wrong, or complete garbage. At the end of the day, always
# remember: it is just next-token prediction. You are not interacting with a sentient, intelligent being.
#
# @author Laurent Itti
#
# @displayname PyLLM
# @videomapping JVUI 0 0 30.0 YUYV 1920 1080 30.0 JeVois PyLLM
# @email itti\@usc.edu
# @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
# @copyright Copyright (C) 2024 by Laurent Itti, iLab and the University of Southern California
# @mainurl http://jevois.org
# @supporturl http://jevois.org/doc
# @otherurl http://iLab.usc.edu
# @license GPL v3
# @distribution Unrestricted
# @restrictions None
# @ingroup modules
class PyLLM:
    # ###################################################################################################
    ## Constructor
    def __init__(self):
        self.chatbox = jevois.ChatBox("JeVois-Pro large language model (LLM) chat")
        self.messages = []
        self.client = ollama.AsyncClient()
        self.statusmsg = "---"

    # ###################################################################################################
    ## JeVois optional extra init once the instance is fully constructed
    def init(self):
        # Create some parameters that users can adjust in the GUI:
        self.pc = jevois.ParameterCategory("LLM Parameters", "")

        models = []
        avail_models = subprocess.run(["ollama", "list"], capture_output=True, encoding="ascii").stdout.split('\n')
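        # 'ollama list' prints a header row (first token "NAME") followed by one row per installed model; keep only
        # the first whitespace-separated token (the model name) of each non-header row: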
        for m in avail_models:
            if m and m.split()[0] != "NAME": models.append(m.split()[0])

        self.modelname = jevois.Parameter(self, 'modelname', 'str',
                        'Model to use, one of:' + str(models) + '. Other models available at ' +
                        'https://ollama.com, typically select one with < 2B parameters. Working internet connection ' +
                        'and space on microSD required to download a new model. You need to download the model ' +
                        'from the ollama command-line first, before using it here.',
                        'qwen2.5:0.5b', self.pc)
        self.modelname.setCallback(self.setModel)

    # ###################################################################################################
    ## JeVois optional extra un-init before destruction
    def uninit(self):
        if os.path.exists("/tmp/pyllm.jpg"):
            os.remove("/tmp/pyllm.jpg")

    # ###################################################################################################
    # Parameter callback: cancel any ongoing response and clear the chat each time the model name is changed:
    def setModel(self, name):
        if hasattr(self, 'generator'):
            self.task.cancel()
            del(self.task)
            del(self.generator)
        self.messages = []
        self.chatbox.clear()

    # ###################################################################################################
    ## Run the LLM model asynchronously
    async def runmodel(self):
        # Try to get some more reply words:
        async for response in await self.generator:
            content = response['message']['content']
            self.chatbox.writeString(content)
            self.currmsg['content'] += content

            if response['done']:
                self.messages.append(self.currmsg)
                self.chatbox.writeString("\n")

    # ###################################################################################################
    ## Process function with GUI output on JeVois-Pro
    def processGUI(self, inframe, helper):
        # Start a new display frame, gets its size and also whether mouse/keyboard are idle:
        idle, winw, winh = helper.startFrame()

        # Draw full-resolution input frame from camera:
        x, y, w, h = helper.drawInputFrame("c", inframe, False, False)
        helper.itext('JeVois-Pro large language model (LLM) chat')

        # Draw the chatbox window:
        self.chatbox.draw()

        # Get access to the event loop:
        loop = asyncio.get_event_loop()

        # Wait for user input or wait for more words from the LLM response?
        if hasattr(self, 'generator'):
            # We have a generator that is working on getting us a response; run it a bit and check if complete:
            try:
                # This will run our runmodel() function until timeout, which would throw:
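                # (asyncio.shield() protects self.task: on timeout, wait_for() cancels only the shield wrapper,
                # so the LLM task keeps its partial progress and resumes on the next video frame.)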
                loop.run_until_complete(asyncio.wait_for(asyncio.shield(self.task), timeout = 0.025))

                # If no exception was thrown, response complete, nuke generator & task, back to user input:
                del(self.task)
                del(self.generator)
                self.chatbox.freeze(False)
            except asyncio.TimeoutError:
                # Timeout, no new response words from the LLM yet; try again on the next frame:
                pass

        else:
            # We are not generating a response, so we are waiting for user input. Any new user input?
            if user_input := self.chatbox.get():
                # Do we want to pass an image to moondream or similar VLM?
                if '/videoframe/' in user_input:
                    img = inframe.getCvBGRp()
                    cv2.imwrite('/tmp/pyllm.jpg', img)
                    user_input = user_input.replace('/videoframe/', '')
                    if len(user_input) == 0: user_input = 'Describe this image:'
                    self.messages.append({'role': 'user', 'content': user_input, 'images': ['/tmp/pyllm.jpg']})
                else:
                    self.messages.append({'role': 'user', 'content': user_input})

                # Prepare to get response from LLM:
                self.currmsg = {'role': 'assistant', 'content': ''}
                self.chatbox.freeze(True)

                # Create a response generator and associated asyncio task:
                self.generator = self.client.chat(model = self.modelname.get(), messages = self.messages, stream=True)
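                # Note: client.chat() is not awaited here; it returns a coroutine that runmodel() awaits once
                # to obtain the streaming async generator it then iterates over.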
                self.task = loop.create_task(self.runmodel())

        # Because ollama runs in a different process (we are just running a web client to it here), get general CPU
        # load and other system info to show to user:
        if jevois.frameNum() % 15 == 0:
            temp = "UNK"
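            # The thermal_zone temperature is reported by the kernel in milli-degrees Celsius, and
            # scaling_cur_freq in kHz: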
            with open("/sys/class/thermal/thermal_zone1/temp", "r") as f:
                temp = float(int(f.readline()) / 100) / 10
            freq = "UNK"
            with open("/sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq") as f:
                freq = int(int(f.readline()) / 1000)
            self.statusmsg = "{}% CPU, {}% RAM, {}C, {} MHz".format(psutil.cpu_percent(), psutil.virtual_memory().percent,
                                                                    temp, freq)
        helper.iinfo(inframe, self.statusmsg, winw, winh)

        # End of frame:
        helper.endFrame()