SciPy 2017 Talk Highlights

Posted on Mon 17 July 2017 in science

The full SciPy 2017 conference took place from July 10 to 16. The talks and tutorials are now live in a YouTube playlist created by Enthought. I have watched most of the available talks that seemed interesting from a materials science perspective, specifically talks geared towards building scientific packages and computational tools like pycalphad. Read on for the four talks I found most interesting.

Computing has been driving forward a revolution in how science and technology can solve new problems. Python has grown to be a central player in this game, from computational physics to data science. I would like to explore some lessons learned doing science with Python as well as doing Python libraries for science. What are the ingredients that the scientists need? What technical and project-management choices drove the success of projects I've been involved with? How do these demands and offers shape our ecosystem?

Gael offered some interesting insights into how we should develop packages and APIs for science that people actually want to use and are able to use. One takeaway for me was on slide 34, where Gael suggests using example-driven development. Example-driven development is about writing code and APIs that solve the problems people actually want to solve. In my mind, this is a guiding principle behind pycalphad development. There are so many things to do in the CALPHAD space that the best path forward is to enable things that are hard or impossible with other software tools and iterate on those features to make them intuitive and easy to use.

Dask enables parallel computing in Python. While commonly used for its parallel NumPy and Pandas implementations, Dask is also capable of a variety of more advanced parallel computing workflows. This talk dives into these advanced features and applications beyond the typical distributed dataframe to talk about asynchronicity, dynamic and self-building computations, multi-user workflows, and more.

This talk covered some of the advanced usage that we rely on in pycalphad and ESPEI with the distributed scheduler and Client. Note: you should be familiar with Dask, or watch an introductory talk (from PyCon 2017), first.
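To give a feel for what a "dynamic and self-building" computation means, here is a minimal sketch using only Python's standard library `concurrent.futures`, which stands in for Dask here. Dask's distributed `Client` offers a similar futures interface (`submit` and `result`), but this is not code from the talk, from pycalphad, or from ESPEI. The `square` task and the `run_dynamic` helper are hypothetical names chosen for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def square(x):
    """Toy task standing in for an expensive calculation (hypothetical)."""
    return x * x


def run_dynamic(seeds, limit):
    """Grow the task graph at runtime: each finished result may spawn new work.

    Seeds are assumed to be integers greater than 1, so values always grow
    and the loop terminates.
    """
    results = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Start with one task per seed.
        pending = {pool.submit(square, s) for s in seeds}
        while pending:
            # Block until at least one task finishes, then handle what is done.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                value = future.result()
                results.append(value)
                # Decide only now, from the result, whether more work is needed.
                if value < limit:
                    pending.add(pool.submit(square, value))
    return sorted(results)


if __name__ == "__main__":
    print(run_dynamic([2, 3], limit=100))
```

The point is that the set of tasks is not known up front. Follow-up work is submitted as results arrive, which is the pattern the futures interface makes convenient.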

If you are a data scientist today, it's actually pretty tough to build a data visualization web-application. If you're not a full-stack developer, you're practically out of luck.

But GUIs like sliders, dropdowns, and text inputs are extremely helpful to the data scientist or engineer. If you're an R programmer, you're in luck with Shiny. If you're a MATLAB programmer, you can use GUIDE (but good luck sharing it!). The dash project introduces a framework for building web-based technical computing apps (GUIs). It's like a Shiny for Python. dash is built off of plotly.js and react.js to provide rich interactive graphing and user interfaces and Python's flask to provide a simple but scalable web server.

This talk will introduce the scientific community to Dash. We'll go over motivations behind the project, the basic architecture of the framework, several interactive examples, and leave with a vision for the future of interactive and sharable technical computing.

Dash is a newly released package from the folks at Plot.ly. I like the idea of Dash because of its potential to generate interactive GUIs with visualizations in the browser. Easily the most interesting part is the tooling for using JavaScript plugins, which means all of the work that has gone into building browser tools in that space can be reused.

JOSS is "A developer friendly journal for research software packages."

The Journal of Open Source Software (JOSS) is operated entirely in the open and offers an alternative to the traditional way of publishing software, which is to write a paper with results that also includes a description of the software. JOSS is mostly operated on GitHub and is transparent about the cost of publishing (free for authors). They have achieved several milestones since starting in May 2016, and Kyle Niemeyer discusses those milestones and the future of JOSS in this talk.

Bonus: I haven't watched any of the tutorials, but the Cython tutorial seemed interesting and new this year. If you're still looking for more talks, PyCon 2017 had a ton of interesting content. Specifically, I'd point to this 4D topology visualization talk by David Dumas and to Jake VanderPlas's talk on the Python visualization landscape.