PyData NYC 2024

Advanced Excel Analytics: Python Integration Workshop
11-06, 15:10–16:40 (US/Eastern), Winter Garden

This 90-minute interactive tutorial will guide participants through the practical application of Python in Excel and Anaconda Toolbox, focusing on building a comprehensive data analysis workflow. Attendees will actively engage in hands-on exercises, culminating in the creation of a functional model that integrates advanced Python capabilities within Excel. The session is designed for data scientists, engineers, and senior leaders looking to understand and implement these tools in their data ecosystems. Participants will explore custom reprs, configurable return types, and real-world applications, gaining practical skills applicable to various industries.


This immersive tutorial will guide you through building a complete data analysis workflow, demonstrating how these tools can enhance and streamline complex data operations within the familiar Excel environment.

  • Introduction and Setup (10 minutes):
  • Brief overview of Python in Excel and Anaconda Toolbox
  • Ensuring all participants have necessary access and tools

  • Custom Reprs (25 minutes):

  • Hands-on exercise: Defining a custom repr for Python in Excel
  • Implementing and configuring Card View for custom data
  • Practical application: Creating industry-specific data types

  • Conversion Types (20 minutes):

  • Interactive demonstration: Modifying xl() function return types
  • Exercise: Implementing callbacks to switch between DataFrame and NumPy array outputs
  • Discussion: Implications for different data analysis scenarios

  • Building an End-to-End Workflow (30 minutes):

  • Guided exercise: Constructing a data analysis pipeline
  • Integrating custom data types and return type configurations
  • Implementing natural language processing using NLTK within Excel
  • Creating visualizations with Anaconda Toolbox

  • Wrap-up and Q&A (5 minutes):

  • Review of the completed workflow
  • Discussion on potential applications and limitations

Throughout the tutorial, participants will actively work in their own Excel spreadsheets, following along to build components that will contribute to a final, functional data analysis model. This hands-on approach will provide practical experience with:
- Defining and using custom data types in Excel
- Manipulating return types for different analytical needs
- Integrating Python libraries like NLTK for advanced analysis
- Utilizing Anaconda Toolbox for enhanced data handling and visualization

By the end of the session, attendees will have created a working model that demonstrates the integration of these advanced features, applicable to real-world data analysis scenarios.
This tutorial aims to provide a balanced, practical understanding of how Python in Excel and Anaconda Toolbox can be effectively implemented in various data science workflows. Whether you're considering adopting these tools or looking to optimize existing processes, this session will equip you with hands-on experience and insights into the evolving landscape of data analytics within Excel.


Prior Knowledge Expected

No previous knowledge expected

See also:

Jim Kitchen is the lead engineer for the team that built the Anaconda Toolbox and Anaconda Code add-ins for Excel. His work at Anaconda has included consulting with large financial institutions to convert Excel-based stress models into production-ready Python code. His interest in connecting Excel and Python originally began at Xerox where he rewrote several Excel-based tools into a comprehensive Python analysis suite for printer logs. Jim is based in Austin Texas.

Timothy Hewitt is a senior product manager at Anaconda focusing on supporting Python in Excel and bringing those new to Python into the fold. Timothy is a linguist at heart and gets super nerdy if you ask him about dictionaries and natural language syntactic structures.