JeVoisBase  1.21
JeVois Smart Embedded Machine Vision Toolkit Base Modules
PyLLM.py
import pyjevois
if pyjevois.pro: import libjevoispro as jevois
else: import libjevois as jevois

import asyncio
import ollama
import cv2
import os # to delete temp image
import psutil

## Interact with a large-language model (LLM) or vision-language model (VLM) in a chat box
#
# This module uses the ollama framework from https://ollama.com to run a large language model (LLM) or vision-language
# model (VLM) right inside the JeVois-Pro camera. The default model is tinydolphin, an experimental LLM (no vision)
# with 1.1 billion parameters, obtained by training the TinyLlama model on the popular Dolphin dataset by Eric
# Hartford.
#
# For now, the model runs fairly slowly and on CPU (multithreaded).
#
# Try asking questions, like "how can I make a ham and cheese sandwich?", or "why is the sky blue?", or "when does
# summer start?", or "how does asyncio work in Python?"
#
# Also pre-loaded on microSD is moondream2 with 1.7 billion parameters, a VLM that can both answer text queries and
# also describe images captured by JeVois-Pro, and answer queries about them. However, this model is very slow, as just
# sending it one image as input is like sending it 729 tokens... So, consider it an experimental feature for
# now. Hopefully smaller models will be available soon.
#
# With moondream, you can use the special keyword /videoframe/ to pass the current frame from the live video to the
# model. You can also add more text to the query, for example:
#
# user: /videoframe/ how many people?
# moondream: there are five people in the image.
#
# If you only input /videoframe/, then the following query text is automatically added: "Describe this image:"
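# For example (a hypothetical exchange; actual model output will vary):
#
# user: /videoframe/
# moondream: the image shows a man sitting at a desk, in front of a computer monitor.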
36#
37# This module uses the ollama python library from https://github.com/ollama/ollama-python
38#
# More models
# -----------
#
# Other models can run as well. The main questions are how slowly they will run, and whether we will run out of RAM or
# out of space on our microSD card. Have a look at https://ollama.com for supported models. You need a working internet
# connection to be able to download and install new models. Installing new models may involve lengthy downloads and
# possible issues with the microSD getting full. Hence, we recommend that you restart JeVois-Pro to Ubuntu command-line
# mode (see under the System tab of the GUI), then login as root/jevois, then:
#
# df -h /             # check available disk space
# ollama list         # show installed models
# ollama rm somemodel # delete some installed model if running low on disk space
# ollama run newmodel # download and run a new model, e.g., tinyllama (< 2B parameters recommended); if you like it,
#                     # exit ollama (CTRL-D), and run jevoispro.sh to try it out in the JeVois-Pro GUI.
#
# Disclaimer
# ----------
#
# LLM research is still in its early stages, despite the recent hype. Remember that these models may return statements
# that are inaccurate, biased, possibly offensive, factually wrong, or complete garbage. At the end of the day, always
# remember: it's just next-token prediction. You are not interacting with a sentient, intelligent being.
#
# @author Laurent Itti
#
# @displayname PyLLM
# @videomapping JVUI 0 0 30.0 YUYV 1920 1080 30.0 JeVois PyLLM
# @email itti\@usc.edu
# @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
# @copyright Copyright (C) 2024 by Laurent Itti, iLab and the University of Southern California
# @mainurl http://jevois.org
# @supporturl http://jevois.org/doc
# @otherurl http://iLab.usc.edu
# @license GPL v3
# @distribution Unrestricted
# @restrictions None
# @ingroup modules
class PyLLM:
    # ###################################################################################################
    ## Constructor
    def __init__(self):
        self.chatbox = jevois.ChatBox("JeVois-Pro large language model (LLM) chat")
        self.messages = [] # chat history: list of {'role': ..., 'content': ...} dicts sent to ollama
        self.client = ollama.AsyncClient()
        self.statusmsg = "---"

    # ###################################################################################################
    ## JeVois optional extra init once the instance is fully constructed
    def init(self):
        # Create some parameters that users can adjust in the GUI:
        self.pc = jevois.ParameterCategory("LLM Parameters", "")

        self.modelname = jevois.Parameter(self, 'modelname', 'str',
                        'Model to use, must be one of the supported names at https://ollama.com and typically ' +
                        '< 2B parameters. Working internet connection and space on microSD required to download ' +
                        'a new model. You need to download the model from the ollama command-line first, before ' +
                        'using it here.',
                        'tinydolphin', self.pc)
        self.modelname.setCallback(self.setModel)

    # ###################################################################################################
    ## JeVois optional extra un-init before destruction
    def uninit(self):
        if os.path.exists("/tmp/pyllm.jpg"):
            os.remove("/tmp/pyllm.jpg")

    # ###################################################################################################
    ## Cancel any ongoing response and clear the chat each time the model name is changed
    def setModel(self, name):
        if hasattr(self, 'generator'):
            self.task.cancel()
            del self.task
            del self.generator
        self.messages = []
        self.chatbox.clear()

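    # Note: no model is actually loaded by setModel() above; the ollama server loads the newly selected model on
    # demand, when the next chat request names it.
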
    # ###################################################################################################
    ## Run the LLM model asynchronously
    async def runmodel(self):
        # Try to get some more reply words:
        async for response in await self.generator:
            content = response['message']['content']
            self.chatbox.writeString(content)
            self.currmsg['content'] += content

            if response['done']:
                self.messages.append(self.currmsg)
                self.chatbox.writeString("\n")

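    # Note: runmodel() is not awaited to completion in one go. processGUI() below slices it into ~25ms chunks,
    # one per video frame, by running the event loop with a timeout; asyncio.shield() prevents the timeout from
    # cancelling the underlying task, so generation resumes where it left off on the next frame.
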
    # ###################################################################################################
    ## Process function with GUI output on JeVois-Pro
    def processGUI(self, inframe, helper):
        # Start a new display frame, get its size and also whether mouse/keyboard are idle:
        idle, winw, winh = helper.startFrame()

        # Draw full-resolution input frame from camera:
        x, y, w, h = helper.drawInputFrame("c", inframe, False, False)
        helper.itext('JeVois-Pro large language model (LLM) chat')

        # Draw the chatbox window:
        self.chatbox.draw()

        # Get access to the event loop:
        loop = asyncio.get_event_loop()

        # Are we waiting for user input, or for more words from the LLM response?
        if hasattr(self, 'generator'):
            # We have a generator that is working on getting us a response; run it a bit and check if complete:
            try:
                # This will run our runmodel() task for up to 25ms; on timeout, asyncio.wait_for() throws:
                loop.run_until_complete(asyncio.wait_for(asyncio.shield(self.task), timeout = 0.025))

                # If no exception was thrown, the response is complete; nuke generator & task, back to user input:
                del self.task
                del self.generator
                self.chatbox.freeze(False)
            except asyncio.TimeoutError:
                # Timeout, no new response words from the LLM yet:
                pass

        else:
            # We are not generating a response, so we are waiting for user input. Any new user input?
            if user_input := self.chatbox.get():
                # Do we want to pass an image to moondream or similar VLM?
                if '/videoframe/' in user_input:
                    img = inframe.getCvBGRp()
                    cv2.imwrite('/tmp/pyllm.jpg', img)
                    user_input = user_input.replace('/videoframe/', '')
                    if len(user_input) == 0: user_input = 'Describe this image:'
                    self.messages.append({'role': 'user', 'content': user_input, 'images': ['/tmp/pyllm.jpg']})
                else:
                    self.messages.append({'role': 'user', 'content': user_input})

                # Prepare to get response from LLM:
                self.currmsg = {'role': 'assistant', 'content': ''}
                self.chatbox.freeze(True)

                # Create a response generator and associated asyncio task:
                self.generator = self.client.chat(model = self.modelname.get(), messages = self.messages, stream = True)
                self.task = loop.create_task(self.runmodel())

        # Because ollama runs in a different process (we are just running a web client to it here), get general CPU load
        # and other system info to show to the user:
        if jevois.frameNum() % 15 == 0:
            temp = "UNK"
            with open("/sys/class/thermal/thermal_zone1/temp", "r") as f:
                temp = float(int(f.readline()) / 100) / 10 # millidegrees C to degrees C, one decimal
            freq = "UNK"
            with open("/sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq") as f:
                freq = int(int(f.readline()) / 1000) # kHz to MHz
            self.statusmsg = "{}% CPU, {}% RAM, {}C, {} MHz".format(psutil.cpu_percent(), psutil.virtual_memory().percent,
                                                                    temp, freq)
        helper.iinfo(inframe, self.statusmsg, winw, winh)

        # End of frame:
        helper.endFrame()