Dask Blog

Working notes about scaling Python

Improving GroupBy.map with Dask and Xarray November 21, 2024
Dask DataFrame is Fast Now May 30, 2024
High Level Query Optimization in Dask August 25, 2023
Upstream testing in Dask April 18, 2023
Do you need consistent environments between the client, scheduler and workers? April 14, 2023
Deep Dive into creating a Dask DataFrame Collection with from_map April 12, 2023
Shuffling large data at constant memory in Dask March 15, 2023
Managing dask workloads with Flyte February 13, 2023
Easy CPU/GPU Arrays and Dataframes February 02, 2023
Dask Demo Day November 2022 November 21, 2022
Reducing memory usage in Dask workloads by 80% November 15, 2022
Dask Kubernetes Operator November 09, 2022
Understanding Dask’s meta keyword argument August 09, 2022
Data Proximate Computation on a Dask Cluster Distributed Between Data Centres July 19, 2022
Documentation Framework July 15, 2022
How to run different worker types with the Dask Helm Chart February 17, 2022
Reflections on one year as the Dask life science fellow December 15, 2021
Mosaic Image Fusion December 01, 2021
Choosing good chunk sizes in Dask November 02, 2021
CZI EOSS Update October 20, 2021
2021 Dask User Survey September 15, 2021
Google Summer of Code 2021 - Dask Project August 23, 2021
High Level Graphs update July 07, 2021
Ragged output, how to handle awkward shaped results July 02, 2021
Dask Down Under June 25, 2021
Dask Survey 2021, early anecdotes June 18, 2021
The evolution of a Dask Distributed user June 01, 2021
The 2021 Dask User Survey is out now May 25, 2021
Life sciences at the 2021 Dask Summit May 24, 2021
Stability of the Dask library May 21, 2021
Skeleton analysis May 07, 2021
Dask with PyTorch for large scale image analysis March 29, 2021
Image segmentation with Dask March 19, 2021
Measuring Dask memory usage with dask-memusage March 11, 2021
Getting to know the life science community March 04, 2021
Dask User Summit 2021 March 03, 2021
Image Analysis Redux November 12, 2020
2020 Dask User Survey September 22, 2020
Announcing the DaskHub Helm Chart August 31, 2020
Running tutorials August 21, 2020
Comparing Dask-ML and Ray Tune's Model Selection Algorithms August 06, 2020
Configuring a Distributed Dask Cluster July 30, 2020
The current state of distributed Dask clusters July 23, 2020
Faster Scheduling July 21, 2020
Last Year in Review July 17, 2020
Large SVDs May 13, 2020
Dask Summit April 28, 2020
Estimating Users January 14, 2020
Dask Deployment Updates November 01, 2019
DataFrame Groupby Aggregations October 08, 2019
Better and faster hyperparameter optimization with Dask September 30, 2019
Co-locating a Jupyter Server and Dask Scheduler September 13, 2019
Dask on HPC: a case study August 28, 2019
Dask and ITK for large scale image analysis August 09, 2019
2019 Dask User Survey August 05, 2019
Dask Release 2.2.0 August 02, 2019
Extracting fsspec from Dask July 23, 2019
Dask Release 2.0 June 22, 2019
Load Large Image Data with Dask Array June 20, 2019
Python and GPUs: A Status Update June 19, 2019
Dask on HPC June 12, 2019
Experiments in High Performance Networking with UCX and DGX June 09, 2019
Composing Dask Array with Numba Stencils April 09, 2019
cuML and Dask hyperparameter optimization March 27, 2019
Dask and the __array_function__ protocol March 18, 2019
Building GPU Groupby-Aggregations for Dask March 04, 2019
Running Dask and MPI programs together January 31, 2019
Single-Node Multi-GPU Dataframe Joins January 29, 2019
Dask Release 1.1.0 January 23, 2019
Extension Arrays in Dask DataFrame January 22, 2019
Dask, Pandas, and GPUs: first steps January 13, 2019
GPU Dask Arrays, first steps January 03, 2019
Dask Version 1.0 November 29, 2018
Dask-jobqueue October 08, 2018
Refactor Documentation September 27, 2018
Dask Development Log September 17, 2018
Dask Release 0.19.0 September 05, 2018
High level performance of Pandas, Dask, Spark, and Arrow August 28, 2018
Building SAGA optimization for Dask arrays August 07, 2018
Dask Development Log August 02, 2018
Pickle isn't slow, it's a protocol July 23, 2018
Dask Development Log, Scipy 2018 July 17, 2018
Who uses Dask? July 16, 2018
Dask Development Log July 08, 2018
Dask Scaling Limits June 26, 2018
Dask Release 0.18.0 June 14, 2018
Beyond Numpy Arrays in Python May 27, 2018
Dask Release 0.17.2 March 21, 2018
Craft Minimal Bug Reports February 28, 2018
Dask Release 0.17.0 February 12, 2018
Credit Modeling with Dask February 09, 2018
Pangeo: JupyterHub, Dask, and XArray on the Cloud January 22, 2018
Dask Development Log December 06, 2017
Dask Release 0.16.0 November 21, 2017
Optimizing Data Structure Access in Python November 03, 2017
Streaming Dataframes October 16, 2017
Notes on Kafka in Python October 10, 2017
Dask Release 0.15.3 September 24, 2017
Fast GeoSpatial Analysis in Python September 21, 2017
Dask on HPC - Initial Work September 18, 2017
Dask Release 0.15.2 August 30, 2017
Scikit-Image and Dask Performance July 18, 2017
Dask Benchmarks July 03, 2017
Use Apache Parquet June 28, 2017
Dask Release 0.15.0 June 15, 2017
Dask Release 0.14.3 May 08, 2017
Dask Development Log April 28, 2017
Asynchronous Optimization Algorithms with Dask April 19, 2017
Dask and Pandas and XGBoost March 28, 2017
Dask Release 0.14.1 March 23, 2017
Developing Convex Optimization Algorithms in Dask March 22, 2017
Dask Release 0.14.0 February 27, 2017
Dask Development Log February 20, 2017
Experiment with Dask and TensorFlow February 11, 2017
Two Easy Ways to Use Scikit Learn and Dask February 07, 2017
Dask Development Log January 30, 2017
Custom Parallel Algorithms on a Cluster with Dask January 24, 2017
Dask Development Log January 18, 2017
Distributed NumPy on a Cluster with Dask Arrays January 17, 2017
Distributed Pandas on a Cluster with Dask DataFrames January 12, 2017
Dask Release 0.13.0 January 03, 2017
Dask Development Log December 24, 2016
Dask Development Log December 18, 2016
Dask Development Log December 12, 2016
Dask Development Log December 05, 2016
Dask Cluster Deployments September 22, 2016
Dask and Celery September 13, 2016
Dask Distributed Release 1.13.0 September 12, 2016
Dask for Institutions August 16, 2016
Dask and Scikit-Learn -- Model Parallelism July 12, 2016
Ad Hoc Distributed Random Forests April 20, 2016
Fast Message Serialization April 14, 2016
Distributed Dask Arrays February 26, 2016
Pandas on HDFS with Dask Dataframes February 22, 2016
Introducing Dask distributed February 17, 2016
Dask is one year old December 21, 2015
Distributed Prototype October 09, 2015
Caching August 03, 2015
Custom Parallel Workflows July 23, 2015
Write Complex Parallel Algorithms June 26, 2015
Distributed Scheduling June 23, 2015
State of Dask May 19, 2015
Towards Out-of-core DataFrames March 11, 2015
Towards Out-of-core ND-Arrays -- Dask + Toolz = Bag February 17, 2015
Towards Out-of-core ND-Arrays -- Slicing and Stacking February 13, 2015
Towards Out-of-core ND-Arrays -- Spilling to Disk January 16, 2015
Towards Out-of-core ND-Arrays -- Benchmark MatMul January 14, 2015
Towards Out-of-core ND-Arrays -- Multi-core Scheduling January 06, 2015
Towards Out-of-core ND-Arrays -- Frontend December 30, 2014
Towards Out-of-core ND-Arrays December 27, 2014