• 📅 ThursdAI - May 30 - 1000 T/s inference w/ SambaNova, <135ms TTS with Cartesia, SEAL leaderboard from Scale & more AI news

  • May 31, 2024
  • Duration: 1 h 53 m
  • Podcast


  • Summary

  • Hey everyone, Alex here! Can you believe it's already the end of May? Two huge AI company conferences are behind us (Google IO, MSFT Build), and Apple's WWDC is just ahead, in 10 days. Exciting! I was really looking forward to today's show; we had quite a few guests, and I'll add all their socials below the TL;DR, so please give them a follow. And if you only read the newsletter, why don't you give the podcast a try 🙂 It's impossible for me to pack the density of knowledge shared on stage for 2 hours into the newsletter! Also, before we dive in, I'm hosting a free workshop soon about building evaluations from scratch. If you're building anything with LLMs in production, you're more than welcome to join us on June 12th (it'll be virtual).

TL;DR of all topics covered:

* Open Source LLMs
  * Mistral open weights Codestral - 22B dense coding model (X, Blog)
  * Nvidia open sources NV-Embed-v1 - Mistral-based SOTA embeddings (X, HF)
  * HuggingFace Chat with tool support (X, demo)
  * Aider beats SOTA on SWE-Bench with 26% (X, Blog, Github)
  * OpenChat - SOTA finetune of Llama3 (X, HF, Try It)
  * LLM 360 - K2 65B - fully transparent and reproducible (X, Paper, HF, WandB)
* Big CO LLMs + APIs
  * Scale announces SEAL Leaderboards - with private evals (X, leaderboard)
  * SambaNova achieves >1000 T/s on Llama-3 at full precision
  * Groq hits back, breaking 1200 T/s on Llama-3
  * Anthropic tool support in GA (X, Blogpost)
  * OpenAI adds GPT-4o, Web Search, Vision, Code Interpreter & more for free users (X)
  * Google Gemini & Gemini Flash are topping the eval leaderboards, now in GA (X)
  * Gemini Flash finetuning coming soon
* This week's Buzz (what I learned at WandB this week)
  * Sponsored a Mistral hackathon in Paris
  * We have an upcoming workshop in 2 parts - come learn with me
* Vision & Video
  * LLama3-V - SOTA OSS VLM (X, Github)
* Voice & Audio
  * Cartesia AI - super fast SSM-based TTS with very good sounding voices (X, Demo)
* Tools & Hardware
  * Jina Reader (https://jina.ai/reader/)
Co-Hosts and Guests

* Rodrigo Liang (@RodrigoLiang) & Anton McGonnell (@aton2006) from SambaNova
* Itamar Friedman (@itamar_mar) - Codium
* Arjun Desai (@jundesai) - Cartesia
* Nisten Tahiraj (@nisten) - Cohost
* Wolfram Ravenwolf (@WolframRvnwlf)
* Eric Hartford (@erhartford)
* Maziyar Panahi (@MaziyarPanahi)

Scale SEAL leaderboards (Leaderboard)

Scale AI has announced a new initiative, the SEAL leaderboards, which aims to provide yet another point of reference for how we understand frontier models and their performance against each other. We've of course been sharing LMSys arena rankings here, as well as the openLLM leaderboard from HuggingFace, but there are issues with both of those approaches. Scale is approaching measurement differently, focusing on very private benchmarks and datasets curated by their experts (like Riley Goodside).

The focus of SEAL is private, novel assessments across coding, instruction following, math, Spanish, and more. The main reason they keep these private is so that models can't train on the benchmarks if they leak to the web and then show inflated performance due to data contamination. They are also using ELO-style scores (Bradley-Terry), and I love this footnote from the actual website: "To ensure leaderboard integrity, we require that models can only be featured the FIRST TIME when an organization encounters the prompts."

This means they are taking the contamination issue very seriously, and it's great to see such dedication to being a trusted source in this space. It's also specifically interesting that on their benchmarks, GPT-4o is not better than Turbo at coding, and definitely not by 100 points, as LMSys and OpenAI announced when they released it!

Gemini 1.5 Flash (and Pro) in GA and showing impressive performance

As you may remember from my Google IO recap, I was really impressed with Gemini Flash, and I felt that it went under the radar for many folks.
Given its throughput, 1M-token context window, multimodality, and price tier, I strongly believed that Google was onto something here. Well, this week, not only was I proven right, I didn't actually realize how right I was 🙂 as we heard breaking news from Logan Kilpatrick during the show: the models are now in GA, Gemini Flash gets upgraded to 1000 RPM (requests per minute), and finetuning is coming and will be free of charge! Not only will finetuning cost you nothing, inference on your tuned model is going to cost the same as the base model, which is very impressive.

There was a sneaky adjustment from the announced pricing to the GA pricing that doubled the cost of output tokens, but even despite that, Gemini Flash at $0.35/1MTok for input and $1.05/1MTok for output is probably the best deal there is right now for LLMs of this level. This week it was also confirmed, both on LMSys and on the Scale SEAL leaderboards, that Gemini Flash is a very good coding LLM, beating Claude Sonnet and LLama-3 70B!

SambaNova ...
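To put those Gemini Flash rates in perspective, here's a minimal cost sketch at the quoted GA pricing. The `flash_cost` helper and the example token counts are mine, purely illustrative; only the per-million-token rates come from the post:

```python
# Gemini 1.5 Flash GA pricing quoted above:
# $0.35 per 1M input tokens, $1.05 per 1M output tokens.
INPUT_PER_MTOK = 0.35
OUTPUT_PER_MTOK = 1.05

def flash_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted GA rates."""
    return (input_tokens * INPUT_PER_MTOK
            + output_tokens * OUTPUT_PER_MTOK) / 1_000_000

# A long-context call: 100k tokens in, 1k tokens out -> roughly 3.6 cents.
print(f"${flash_cost(100_000, 1_000):.4f}")
```

Even a call that uses a large chunk of the context window stays in the cents range, which is what makes the "best deal at this level" claim plausible.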