Fully-Local AI Agent Runs On Raspberry Pi, With A Little Patience

[Simone]’s AI assistant, dubbed Max Headbox, is a wakeword-triggered local AI agent capable of following instructions and doing simple tasks. It’s an experiment in many ways, but also a great demonstration of what is possible with the kinds of open tools and hardware available to a modern hobbyist, as well as a reminder of just how far some of these software tools have come in only a few short years.

Max Headbox is not just a local large language model (LLM) running on Pi hardware; the model is able to make tool calls in a loop, chaining them together to complete tasks. This means the system can break down a spoken instruction (for example, “find the weather report for today and email it to me”) into a series of steps to complete, utilizing software tools as needed throughout the process until the task is finished.
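
The write-up doesn’t spell out the exact implementation, but the agentic loop described above boils down to something like the following sketch, where the tool names and the shape of the model’s replies are illustrative assumptions rather than [Simone]’s actual code:

```python
# A minimal sketch of a tool-calling loop: the model proposes a tool call,
# the host runs it, and the result is fed back in until the model decides
# the task is done. Tool names and the reply format are illustrative.
import json

TOOLS = {
    "get_weather": lambda city: f"Sunny, 22C in {city}",   # stand-in tools
    "send_email": lambda to, body: f"sent to {to}",
}

def run_agent(llm, task, max_steps=8):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = llm(messages)            # assumed: the model returns a JSON string
        action = json.loads(reply)
        if action["type"] == "final":    # the model says the task is finished
            return action["content"]
        # run the requested tool and feed the result back into the conversation
        result = TOOLS[action["tool"]](*action["args"])
        messages.append({"role": "tool", "content": result})
    return "gave up"
```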

Continue reading “Fully-Local AI Agent Runs On Raspberry Pi, With A Little Patience”


Using Moondream AI To Make Your Pi “See” Like A Human

[Jaryd] from Core Electronics shows us human-like computer vision with Moondream on the Pi 5.

Using the Moondream vision language model, which runs directly on your Raspberry Pi rather than in the cloud, you can answer questions such as “are the clothes on the line?”, “is there a package on the porch?”, “did I leave the fridge open?”, or “is the dog on the bed?” [Jaryd] compares Moondream to an alternative visual AI system, You Only Look Once (YOLO).
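
As a sketch of what this looks like in practice, here is how a question could be put to Moondream from Python, assuming the Hugging Face moondream2 checkpoint and its query() helper; exact method names vary between model revisions:

```python
# A minimal sketch of asking Moondream a question about a camera frame.
# Assumes the "vikhyatk/moondream2" checkpoint and its query() helper;
# the revision you download may expose a slightly different API.
from transformers import AutoModelForCausalLM
from PIL import Image

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2",
    trust_remote_code=True,  # the checkpoint ships its own inference code
)

image = Image.open("porch.jpg")
result = model.query(image, "Is there a package on the porch?")
print(result["answer"])
```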

Processing a question with Moondream on your Pi can take anywhere from just a few moments to 90 seconds, depending on the model used and the nature of the question. Moondream comes in two sizes: a two-billion-parameter model and a five-hundred-million-parameter model. The larger model is more capable and more accurate, but slower, with the fastest responses coming in at about 22 to 25 seconds. The smaller model is faster, at about 8 to 10 seconds, but as you might expect its results are not as good. Indeed, [Jaryd] says the answers can be infuriatingly bad.

In the write-up, [Jaryd] runs you through how to use Moondream on your Pi 5, and the video (embedded below) shows it in action. Fair warning, though: Moondream is quite RAM-intensive, so you will need at least 8 GB of memory in your Pi if you want to play along.

If you’re interested in machine vision you might also like to check out Machine Vision Automates Trainspotting With Unique Full-Length Portraits.

Continue reading “Using Moondream AI To Make Your Pi “See” Like A Human”

Regretfully: $3,000 Worth Of Raspberry Pi Boards

We feel for [Jeff Geerling]. He spent a lot of effort building an AI cluster out of Raspberry Pi boards, and $3,000 later, he’s a bit regretful. As you can see in the video below, it is a neat build. As Jeff points out, it is relatively low-power and dense. But dollar for dollar, it isn’t much of a supercomputer.

Of course, the most obvious thing is that there’s plenty of CPU, but no GPU. We can sympathize, too, with the fact that he had to strip the cluster down and rebuild it twice, for three builds in total: once to homogenize the SSDs across the boards, and once to affix the heatsinks. It is always something.

With ten “blades”, otherwise known as compute modules, the plucky little computer turned in about 325 gigaflops on tests. That sounds pretty good, but a cluster of four Framework Desktops manages 1,180 gigaflops. What’s more, the Framework turned out cheaper per gigaflop, too: each dollar bought about 110 megaflops from the Pis, but about 140 from the Framework.
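
For the curious, the price/performance arithmetic works out as below; note that the Framework cluster’s sticker price isn’t quoted here, so its cost is back-derived from the megaflops-per-dollar figure:

```python
# Back-of-the-envelope check of the price/performance figures above.
pi_cost_usd = 3_000
pi_gflops = 325
pi_mflops_per_dollar = pi_gflops * 1_000 / pi_cost_usd
print(f"Pi cluster: {pi_mflops_per_dollar:.0f} MFLOPS per dollar")  # ~108, i.e. "about 110"

# The Framework cluster's price isn't given above; taking the
# ~140 MFLOPS/$ figure as stated, its implied cost follows:
fw_gflops = 1_180
fw_mflops_per_dollar = 140
fw_cost_usd = fw_gflops * 1_000 / fw_mflops_per_dollar
print(f"Framework: implied cost ~${fw_cost_usd:,.0f}")  # ~$8,400
```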

Continue reading “Regretfully: $3,000 Worth Of Raspberry Pi Boards”

[Image: a man holds a novelty Georgia license plate (P00-5000) in front of a black F-150 Lightning’s tailgate; a transparent sticker speckled with black gives the plate the appearance of digital mud.]

A Deep Dive On Creepy Cameras

George Orwell might’ve predicted the surveillance state, but it’s still surprising how many entities took 1984 as a how-to manual instead of a cautionary tale. [Benn Jordan] decided to take a closer look at the creepy cameras invading our public spaces and how to circumvent them.

[Jordan] starts us off with an overview of how machine learning “AI” is used in Automated License Plate Reader (ALPR) cameras, and some of the history behind their usage in the United States. Basically, when you drive by one of these cameras, an “image segmentation model or something similar” detects the license plate and then runs optical character recognition (OCR) on the plate contents. It will also catalog any bumper stickers along with the make and model of the car, giving a pretty good guess that a vehicle is yours even if the OCR isn’t 100% sure of the exact plate sequence.
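
To make that pipeline concrete, here is a minimal sketch of the detect-then-OCR approach using off-the-shelf open tools; the weights filename is a hypothetical stand-in, and this is emphatically not the vendors’ actual stack:

```python
# A sketch of the two-stage ALPR pipeline described above: a detector
# finds the plate, then OCR reads its contents.
from ultralytics import YOLO
import pytesseract
from PIL import Image

detector = YOLO("license_plate_detector.pt")  # hypothetical plate-detection weights

frame = Image.open("dashcam_frame.jpg")
for box in detector(frame)[0].boxes:
    # crop out the detected plate region
    x1, y1, x2, y2 = map(int, box.xyxy[0])
    plate = frame.crop((x1, y1, x2, y2))
    # OCR the crop as a single line of text
    text = pytesseract.image_to_string(plate, config="--psm 7")
    print("plate:", text.strip())
```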

Where the video gets really interesting is when [Jordan] starts disassembling, building, and designing countermeasures to these systems. We get a teardown of a Motorola ALPR for in-vehicle use that is better at being closed hardware than it is at reading license plates, and [Jordan] uses a Raspberry Pi 5, a Hailo AI board, and You Only Look Once (YOLO) recognition software to build a “computer vision system that’s much more accurate than anything on the market for law enforcement” for $250.

[Jordan] was able to develop a transparent sticker that renders a license plate unreadable to the ALPR but still plainly visible to a human observer. Interestingly, depending on the pattern, the system would either read the plate as an incorrect alphanumeric sequence or fail to detect it entirely. It turns out that filtering all the rectangles in the world down to just license plates is a tricky problem if you’re a computer. You can find the code on his GitHub if you want to take a gander.

You’ve probably heard about using IR LEDs to confuse security cameras, but what about yarn? If you’re looking for more artistic uses for AI image processing, how about this camera that only takes nudes or this one that generates a picture based on geographic data?

Continue reading “A Deep Dive On Creepy Cameras”

Image Recognition On 0.35 Watts

Much of the expense of developing AI models, and much of the recent backlash to said models, stems from the massive amount of power they tend to consume. If you’re willing to sacrifice some capability and accuracy, however, you can get decent results from minimal hardware. That’s the tradeoff taken by the Grove Vision AI board, which runs image recognition in near-real time on only 0.35 watts.

The heart of the board is a WiseEye processor, which combines two Arm Cortex-M55 CPUs with an Ethos-U55 NPU for AI acceleration. The board connects to a camera module and a host device, such as another microcontroller or a more powerful computer. When the host sends the signal, the Grove board takes a picture, runs image recognition on it, and sends the results back. A library makes signaling over I2C convenient, but in this example [Jaryd] used a UART.
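
The host side of that handshake could look something like the sketch below; the command string, baud rate, and JSON reply format are illustrative assumptions, not Seeed’s documented protocol:

```python
# A sketch of the host side: ask the Grove board for one inference over
# UART and read back the result. Command bytes and reply shape assumed.
import json
import serial

port = serial.Serial("/dev/ttyUSB0", baudrate=921600, timeout=2)

port.write(b"AT+INVOKE=1\r\n")     # hypothetical "run one inference" command
reply = port.readline().decode(errors="replace")
detections = json.loads(reply)      # e.g. {"boxes": [...], "score": ...}
print(detections)
```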

To run on such low-power hardware, the image recognition model needs some limits; it can run YOLOv8, but it can only recognize one object, runs at a reduced resolution of 192×192, and has to be quantized down to INT8. Within those limits, though, the performance is impressive: 20-30 fps, good accuracy, and, as [Jaryd] points out, less power consumption than a single key on a typical RGB-backlit keyboard. If you want another model, there are quite a few available, though apparently of varying quality. If all else fails, you can always train your own.
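
As a rough illustration of that shrinking process, here is how a stock YOLOv8 model could be exported at reduced resolution with INT8 quantization using the ultralytics package. The Grove board’s NPU additionally requires vendor conversion tooling, which this sketch omits:

```python
# A sketch of the resolution/precision tradeoff described above:
# export the smallest stock YOLOv8 model at 192x192, quantized to INT8.
# INT8 export typically needs a small calibration dataset; ultralytics
# can fetch a default one. Deploying to the WiseEye NPU needs further
# vendor tooling not shown here.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # smallest stock detection model
model.export(format="tflite", imgsz=192, int8=True)
```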

Continue reading “Image Recognition On 0.35 Watts”

Pong Cloned By Neural Network

Although not the first video game ever produced, Pong was the first to achieve commercial success, and it has had a tremendous influence on our culture as a whole. In its day, Pong’s popularity ushered in the arcade era that would last for more than two decades. Today, it remains popular partly for its approachability: the gameplay is simple, the original machine used hardwired logic rather than a microprocessor, and it offers insight into the state of computer science at the time. For these reasons, [Nick Bild] has decided to recreate this arcade classic, but not in a traditional way. He’s trained a neural network to become the game instead.
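
[Nick Bild]’s exact architecture isn’t detailed here, but the core idea can be sketched as a small network that learns Pong’s state-transition function from recorded gameplay; the state encoding and layer sizes below are illustrative guesses:

```python
# A minimal sketch of the idea: instead of hardwired game logic, a small
# network learns to predict Pong's next state from the current state and
# the players' inputs. Not [Nick Bild]'s actual design.
import torch
import torch.nn as nn

# state:  [ball_x, ball_y, ball_vx, ball_vy, left_paddle_y, right_paddle_y]
# action: [left_paddle_move, right_paddle_move]
model = nn.Sequential(
    nn.Linear(6 + 2, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 6),   # predicted next state
)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(state, action, next_state):
    """One supervised step on a (state, action, next_state) batch."""
    pred = model(torch.cat([state, action], dim=-1))
    loss = loss_fn(pred, next_state)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

At play time, the network simply replaces the game engine: feed it the current state and inputs each frame, and render whatever state it predicts.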

Continue reading “Pong Cloned By Neural Network”


Llama Habitat Continues To Expand, Now Includes The PSP

Organic llamas have a rather restricted range in nature: the Andes Mountains, and that’s it. Humans weren’t content to let the fluffy, friend-shaped creatures stay in their natural habitat, however, and they can now be found on every continent except Antarctica. The Llama2 large language model is like that: while it may have started on a GPU somewhere, thanks to enterprising hackers like [Caio Madeira], who has ported Llama2 to the PlayStation Portable (PSP), the fluffiest LLM can be found just about anywhere.

[Image: the AI, in all its glory, dooming yet another system.]

Ultimately this project has its roots in llama2.c by [karpathy], a project we’ve seen used on a Pentium II under Windows 98, on DOS machines running 486 processors, and even on the venerable Commodore 64, of all impossible things. Now it’s the PSP’s turn. This implementation uses the same 260K-parameter tinystories model as the C64 port, upon which it is based. Of course, the PSP’s RAM has room for a much larger model, but [Caio] apparently prefers to run the tiny model faster on this less-ancient gaming hardware.

It’s getting to the point that it’s harder to find systems that won’t run LLMs than ones that will. Given that Llama2 seems to be the new DOOM, it’s probably only a matter of time before virtual fur is all over our old equipment. Fortunately for allergy sufferers, virtual fur cannot trigger a histamine response.

If you know of another system getting LLMs (Alpaca-adjacent or otherwise), send in a tip.