Web AI Monthly #19: Meta's 8B Llama3 LLM in browser, let's meet...
May the 4th be with you! Yes, May is upon us, and many exciting events are lined up. Maybe we will even cross paths in real life this year? Read on below to find out where we shall be speaking and demoing! In this edition, we have some seriously cool new models and resources for you to check out.
If you enjoy the content, please do give us a share with friends, colleagues, and family - everyone is welcome. I aim to centralize the community around all the amazing work being produced in this space and bring light to our most awesome creations. Tag me (Jason Mayes) if you make something noteworthy for future editions so I can help get eyes on your Web AI work - many readers work for top global tech companies or high-growth startups. We have subscribers ranging from decision-makers (think C-level, VPs, and Directors) to folks on the frontlines using this stuff day to day (SWEs, web engineers, and researchers). You never know who may see your creations. Alright, let's go!
Exciting News in the AI Space
Breaking news: in case you didn't have enough LLMs to choose from already, the amazing folks at MLC (hat tip to Charlie Ruan) ported Meta's newest 8B parameter LLM to run entirely in the browser, client-side (no cloud needed for inference) with WebGPU - on the same day AI at Meta launched the model! Incredible efficiency by the community there - well played.
This 8 billion parameter model is a really nice size that runs at a great speed even on my very old NVIDIA 1070 GPU. That's a serious win: many gamers are still sitting on 1070/1080 cards from way back when, so even more folks can use this great tech on a device with a single GPU. See it in action - captured in real time - below:
I guess you want to try it out, right? Head on over to https://webllm.mlc.ai/ to do that. Be sure to select the right model from the drop-down before you send a chat message - at which point it will download and cache the selected model. Be warned: the model is several GB in size, but once cached, it will load much faster on subsequent visits.
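If you'd rather embed this in your own page than use the hosted demo, MLC also publishes WebLLM as an npm package. Here's a minimal sketch - assuming the `@mlc-ai/web-llm` package's engine API and a WebGPU-capable browser; the exact model ID string is an assumption, so check WebLLM's model list for the current names:

```javascript
import * as webllm from "@mlc-ai/web-llm";

async function main() {
  // First run downloads the weights (several GB) and caches them;
  // later visits load from cache. Progress is reported via the callback.
  const engine = await webllm.CreateMLCEngine(
    "Llama-3-8B-Instruct-q4f16_1-MLC", // model ID is an assumption - see WebLLM's model list
    { initProgressCallback: (report) => console.log(report.text) }
  );

  // OpenAI-style chat completion, running entirely client-side on the GPU.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```

Note this only runs in a browser with WebGPU enabled (Chrome 113+ on supported hardware), not in Node - the whole point is that inference happens on the visitor's own device.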
Collaboration with Chrome Team
I'm also pleased to announce a collaboration with the Chrome team (shout-outs to Alexandra W., Maud Nalpas, André Bandarra, and Paul Kinlan), who are helping to create a wonderful centralized resource for web devs who use AI.
Future of AI Models for Web Engineers
This brand new website will be different from all the existing Python-biased AI docs online; instead, it will be written from the perspective, needs, and wants of a web engineer/developer - putting the 70% of engineers who are JS developers top of mind, with an AI lens focused on the tasks and processes that matter to us.
It's clear that web folk have different requirements and expectations when using AI client-side in the browser vs folks working server-side with near-infinite resources at their disposal. The models we choose may not be the same as those chosen by a researcher chasing an extra percentage point of accuracy when training a new state-of-the-art model.
I hope that, longer term, we can narrow down some of this complexity for you when choosing a model for various on-device scenarios in a given target vertical or use case.
After speaking to many of you, it's clear that documentation and benchmarking of AI models are neither obvious nor consistent. It would be nice to have opinions/comparisons for popular model types - with production use cases in mind rather than research - all in one place, so you can pick a model knowing with some confidence it will fare well in the web browser environment.
So go check out the beginnings of this site, but remember this is just the start, not the end - so we welcome feedback for what you want to see more of on such a site: https://web.dev/explore/ai
Bookmark and check back regularly as we continue to add articles throughout the year touching on new topics that matter in the space.