The Python Podcast.__init__

By: Tobias Macey
  • Summary

  • The podcast about Python and the people who make it great
    © 2024 Boundless Notions, LLC.
Episodes
  • Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River
    Dec 12 2022
    Preamble
    This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

    Summary
    The majority of machine learning projects that you read about or work on are built around batch processes. The model is trained, and then validated, and then deployed, with each step being a discrete and isolated task. Unfortunately, the real world is rarely static, leading to concept drift and model failures. River is a framework for building streaming machine learning projects that can constantly adapt to new information. In this episode Max Halford explains how the project works, why you might (or might not) want to consider streaming ML, and how to get started building with River.

    Announcements
      • Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
      • Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started!
      • Your host is Tobias Macey and today I’m interviewing Max Halford about River, a Python toolkit for streaming and online machine learning.

    Interview
      • Introduction
      • How did you get involved in machine learning?
      • Can you describe what River is and the story behind it?
      • What is "online" machine learning? What are the practical differences with batch ML?
      • Why is batch learning so predominant?
      • What are the cases where someone would want/need to use online or streaming ML?
      • The prevailing pattern for batch ML model lifecycles is to train, deploy, monitor, repeat. What does the ongoing maintenance for a streaming ML model look like?
      • Concept drift is typically due to a discrepancy between the data used to train a model and the actual data being observed. How does the use of online learning affect the incidence of drift?
      • Can you describe how the River framework is implemented? How have the design and goals of the project changed since you started working on it? How do the internal representations of the model differ from batch learning to allow for incremental updates to the model state?
      • In the documentation you note the use of Python dictionaries for state management and the flexibility offered by that choice. What are the benefits and potential pitfalls of that decision?
      • Can you describe the process of using River to design, implement, and validate a streaming ML model? What are the operational requirements for deploying and serving the model once it has been developed? What are some of the challenges that users of River might run into if they are coming from a batch learning background?
      • What are the most interesting, innovative, or unexpected ways that you have seen River used?
      • What are the most interesting, unexpected, or challenging lessons that you have learned while working on River?
      • When is River the wrong choice?
      • What do you have planned for the future of River?

    Contact Info
      • Email
      • @halford_max on Twitter
      • MaxHalford on GitHub

    Parting Question
    From your perspective, what is the biggest barrier to adoption of machine learning today?

    Closing Announcements
      • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
      • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
      • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story.
      • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
      • River
      • scikit-multiflow
      • Federated Machine Learning
      • Hogwild! Google Paper
      • Chip Huyen concept drift blog post
      • Dan Crenshaw Berkeley Clipper MLOps
      • Robustness Principle
      • NY Taxi Dataset
      • RiverTorch
      • River Public Roadmap
      • Beaver tool for deploying online models
      • Prodigy ML human in the loop labeling

    The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
    1 hr and 16 mins
  • Declarative Machine Learning For High Performance Deep Learning Models With Predibase
    Dec 5 2022
    Preamble
    This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

    Summary
    Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of Predibase, Travis Addair, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle.

    Announcements
      • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great!
      • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
      • Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format.

    Interview
      • Introduction
      • How did you get involved in machine learning?
      • Can you describe what Predibase is and the story behind it?
      • Who is your target audience and how does that focus influence your user experience and feature development priorities?
      • How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted? Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths?
      • Can you describe how the Predibase platform is implemented? How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers?
      • The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery?
      • Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product?
      • In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision? How did you approach the semantic and syntactic design of the dialect?
      • What is your vision for PQL in the space of "declarative ML" that you are working to define?
      • Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model? Once a model has been deemed satisfactory, what is the path to production?
      • How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase?
      • What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase?
      • What are the most interesting, innovative, or unexpected ways that you have seen Predibase used?
      • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase?
      • When is Predibase the wrong choice?
      • What do you have planned for the future of Predibase?

    Contact Info
      • LinkedIn
      • tgaddair on GitHub
      • @travisaddair on Twitter

    Parting Question
    From your perspective, what is the biggest barrier to adoption of machine learning today?

    Closing Announcements
      • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
      • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
      • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
      • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

    Links
      • Predibase
      • Horovod
      • Ludwig Podcast.__init__ Episode Support Vector ...
    59 mins
  • Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks
    Nov 28 2022
    Preamble
    This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

    Summary
    Machine learning has the potential to transform industries and revolutionize business capabilities, but only if the models are reliable and robust. Because of the fundamental probabilistic nature of machine learning techniques it can be challenging to test and validate the generated models. The team at Deepchecks understands the widespread need to easily and repeatably check and verify the outputs of machine learning models and the complexity involved in making it a reality. In this episode Shir Chorev and Philip Tannor explain how they are addressing the problem with their open source deepchecks library and how you can start using it today to build trust in your machine learning applications.

    Announcements
      • Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
      • Do you wish you could use artificial intelligence to drive your business the way Big Tech does, but don’t have a money printer? Graft is a cloud-native platform that aims to make the AI of the 1% accessible to the 99%. Wield the most advanced techniques for unlocking the value of data, including text, images, video, audio, and graphs. No machine learning skills required, no team to hire, and no infrastructure to build or maintain. For more information on Graft or to schedule a demo, visit themachinelearningpodcast.com/graft today and tell them Tobias sent you.
      • Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out!
      • Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more Excel sheets or ad-hoc Python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30 day trial and a 30% discount on the product thereafter. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today!
      • Your host is Tobias Macey and today I’m interviewing Shir Chorev and Philip Tannor about Deepchecks, a Python package for comprehensively validating your machine learning models and data with minimal effort.

    Interview
      • Introduction
      • How did you get involved in machine learning?
      • Can you describe what Deepchecks is and the story behind it?
      • Who is the target audience for the project? What are the biggest challenges that these users face in bringing ML models from concept to production and how does DeepChecks address those problems?
      • In the absence of DeepChecks how are practitioners solving the problems of model validation and comparison across iterations? What are some of the other tools in this ecosystem and what are the differentiating features of DeepChecks?
      • What are some examples of the kinds of tests that are useful for understanding the "correctness" of models? What are the methods by which ML engineers/data scientists/domain experts can define what "correctness" means in a given model or subject area?
      • In software engineering the categories of tests are tiered as unit -> integration -> end-to-end. What are the relevant categories of tests that need to be built for validating the behavior of machine learning models?
      • How do model monitoring utilities overlap with the kinds of tests that you are building with deepchecks?
      • Can you describe how the DeepChecks package is implemented? How have the design and goals of the project changed or evolved from when you started working on it? What are the assumptions that you have built up from your own experiences that have been challenged by your early users and design partners?
      • Can you describe the workflow for an individual or team using DeepChecks as part of their model training and deployment lifecycle?
      • Test engineering is a deep discipline in its own right. How have you approached the user experience and API design to reduce the overhead for ML practitioners to adopt good practices?
      • What are the interfaces available for creating reusable tests and composing test suites together?
      • What are ...
    48 mins
