• Kodsnack 567 - Arrow straight through, with Matt Topol and Lars Wikman

  • Jan 30 2024
  • Length: 1 hr and 23 mins
  • Podcast


  • Summary

  • Fredrik has Matt Topol and Lars Wikman over for a deep and wide chat about Apache Arrow and many, many topics in the orbit of the language-independent columnar memory format for flat and hierarchical data. What does that even mean? What is the point? And why does Arrow only feel more and more interesting and useful the more you think about deeply integrating it into your systems? Feeding data to systems fast enough is a problem that gets much less attention than it ought to. With Arrow you can send data over the network, process it on the CPU (or GPU, for that matter) and send it along to the database. All without parsing, transformation, or copies unless absolutely necessary.

    Thank you Cloudnet for sponsoring our VPS!

    Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.

    Links

    • Lars
    • Matt
    • Øredev
    • Matt’s Øredev presentations: State of the Apache Arrow ecosystem: How your project can leverage Arrow! and Leveraging Apache Arrow for ML workflows
    • Kallbadhuset
    • Apache Arrow
    • Lars talks about his Arrow rabbit hole in Regular programming
    • SIMD/vectorization
    • Spark
    • Explorer - builds on Polars
    • Null bitmap
    • Zeromq
    • Airbyte
    • Arrow flight
    • Dremio
    • Arrow flight SQL
    • Influxdb
    • Arrow flight RPC
    • Kafka
    • Pulsar
    • Opentelemetry
    • Arrow IPC format - also known as Feather
    • ADBC - Arrow database connectivity
    • ODBC and JDBC
    • Snowflake
    • DBT - SQL to SQL
    • Jinja
    • Datafusion
    • Ibis
    • Substrait
    • Meta’s Velox engine
    • Arrow’s project management committee (PMC)
    • Voltron data
    • Matt’s Arrow book - In-memory analytics with Apache Arrow
    • Rapids and Cudf
    • The Theseus engine - accelerator-native distributed compute engine using Arrow
    • The composable codex
    • The standards chapter
    • Dremio
    • Hugging face
    • Apache Hop - orchestration data scheduling thing
    • Directed acyclic graph
    • UCX - libraries for finding fast routes for data
    • Infiniband
    • NUMA
    • CUDA
    • GRPC
    • Foam bananas
    • Turkish pepper - Tyrkisk peber
    • Plopp
    • Marianne

    Titles

    • For me, it started during the speaker’s dinner
    • Old, dated, and Java
    • A real nerd snipe
    • Identical representation in memory
    • Working on columns
    • It’s already laid out that way
    • Pass the memory, as is
    • Null plus null is null
    • A wild perk
    • Arrow into the thing
    • So many curly brackets you need to store
    • Arrow straight through
    • Something data people like to do
    • So many backends
    • The SQL string is for people
    • I’m rude, and he’s polite
    • Feed the data fast enough
    • A depressing amount of JSON
    • Arrow the whole way through
    • These are the problems in data
    • Reference the bytes as they are
    • Boiling down to Arrow
    • Data lakehouses
    • Removing inefficiency
