Run LLMs on all your Linux IoT devices

Michael will guide attendees through the evolving landscape of Large Language Model (LLM) applications, particularly the shift away from Python toward more efficient compiled languages such as C, C++, and Rust. This transition is driven by the need for LLM applications that are faster, less resource-intensive, and easier to manage.

The talk will highlight how popular frameworks such as llama2.c, whisper.cpp, and llama.cpp are adapting to this change by eliminating Python dependencies. It will also explain how combining WebAssembly (Wasm) and WASI-NN with these compiled languages can significantly improve the performance of LLM apps and make them portable across Linux IoT devices and other operating systems and hardware.

Michael will demonstrate how to run the Llama 2 series of models in Wasm. The session will also cover the development of LLM agents in Rust, showcasing its advantages for building lightweight, high-performance, and scalable applications within Wasm sandboxes. Examples include LLM-based code review tools and knowledge-assisted agents, which give a realistic perspective on how these technologies can be integrated into everyday workflows. The talk offers a pragmatic look at the future of LLM app development, emphasizing practicality and applicability in Linux IoT environments.

Format

Presentation

When

Saturday, April 13, 3:30 PM - 4:15 PM

Where

Room 4

Speaker

Michael Yuan