Oliver Smith
on 25 April 2022
Upgrade your data science workflows with Ubuntu WSL
Ubuntu 22.04 LTS has been released for WSL and is available to download directly from the Microsoft Store. With five years of support and the latest toolchains for Go, Python, Ruby, Perl, PHP and Rust, there has never been a better time to use Ubuntu WSL in your data science workflows on Windows workstations.
WSL features for data science
We’ve put together a host of resources to help data scientists understand the power of Ubuntu WSL: how its deep integration with Windows increases workflow efficiency and reduces the time to innovation, whilst delivering performance comparable to Ubuntu on bare metal. For those wishing to avoid dual-booting, running a second Ubuntu workstation or spinning up a cloud VM, Ubuntu on WSL delivers a compelling package.
For a more in-depth look into the technology behind WSL, detailed performance comparisons and its advantages for data science workflows, read our new whitepaper: Ubuntu WSL for Data Scientists.
You can also check out our webinar, Ubuntu on WSL | An FAQ for Data Scientists and Developers, produced in collaboration with Dell Technologies.
Let’s go through some of the highlights below!
Interoperability
Interoperability is the ability to transparently execute commands and applications, as well as share files and environment variables between Windows and Ubuntu. This enables you to:
- Run Ubuntu commands such as cut, grep or awk from a Windows PowerShell prompt
- Run Windows commands such as explorer.exe or notepad.exe from an Ubuntu Terminal
- Share environment variables between Ubuntu and Windows systems
- Open files on the Windows file system from Ubuntu
- Browse the Ubuntu file system from Windows Explorer
This is particularly advantageous when you need tools that are only available on one platform, or that are more efficient to work with from an Ubuntu CLI (such as pip).
For example, data can be downloaded as an Excel file and accessed in Ubuntu WSL, where it is cleaned, manipulated and analysed using tools like Anaconda, Jupyter or TensorFlow. The resulting visualisations are then easily accessible from Windows for further processing, or to summarise in PowerPoint.
When working within a Windows-centric organisation, this can save valuable time sharing insights between stakeholders without the need to transfer data across operating systems, devices or a cloud VM.
For a deeper understanding of the power of interoperability, including shared environment variables and mixed Windows/Ubuntu commands, check out the following tutorial.
Tutorial: Windows and Ubuntu interoperability
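As a rough illustration of opening Windows files from Ubuntu, the sketch below maps a Windows path to the location where WSL mounts Windows drives by default (/mnt/<drive letter>). The function name windows_to_wsl and the example path are hypothetical, and the /mnt root is an assumption that holds only if the default automount settings have not been changed. Inside WSL, the bundled wslpath utility performs the same translation and is the safer choice.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str, mount_root: str = "/mnt") -> str:
    """Map an absolute drive-letter Windows path to its default WSL mount location.

    Hypothetical helper: assumes Windows drives are mounted under mount_root,
    which is the WSL default but can be changed by the user.
    """
    win = PureWindowsPath(path)
    drive = win.drive  # e.g. "C:" for a drive-letter path
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {path!r}")
    # parts[0] is the anchor (drive plus root); the rest are folder and file names
    return str(PurePosixPath(mount_root, drive[0].lower(), *win.parts[1:]))


if __name__ == "__main__":
    # Example path is illustrative only
    print(windows_to_wsl(r"C:\Users\me\Downloads\sales.xlsx"))
```

A path produced this way can be passed to any Ubuntu tool, for example as the input file of a notebook or script running inside WSL.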
Visual Studio Code & Docker Desktop integration
When deploying models to Ubuntu in the cloud it’s important to be able to test locally on the same operating system to minimise overhead. Native Windows applications Visual Studio Code and Docker Desktop are both fully integrated with WSL, allowing you to develop directly on Linux inside your usual IDE.
Docker Desktop’s integration makes it possible to run a full Linux toolchain for building containers on your local machine. On WSL, the Docker daemon is able to get up and running within seconds with improved resource consumption.
Tutorial: Working with Visual Studio Code on Ubuntu WSL
Linux GUI App Support
An additional feature of WSL2 is WSLg, which lets you run graphical Linux apps on Windows 11 without additional steps. This is useful for visualisation tools like GNU Octave or UGENE, whose output can then be saved to your Windows filesystem. (But don’t forget that you can still access your Jupyter notebooks from your native Windows browser!)
Tutorial: Install Ubuntu WSL on Windows 11 with GUI support
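Before launching a graphical tool from a script, it can help to check whether the session is running under WSL and whether a display is exposed. The sketch below is a heuristic, not an official API: it assumes that WSL kernels report "microsoft" in their release string and that a GUI session sets the DISPLAY or WAYLAND_DISPLAY environment variable. The function names are hypothetical.

```python
import os
import platform
from typing import Mapping, Optional


def is_wsl(release: Optional[str] = None) -> bool:
    """Heuristic: WSL kernel release strings typically contain 'microsoft'."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def gui_display_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if a Wayland or X11 display variable is set (assumed to indicate GUI support)."""
    if env is None:
        env = os.environ
    return bool(env.get("WAYLAND_DISPLAY") or env.get("DISPLAY"))


if __name__ == "__main__":
    print("Running under WSL:", is_wsl())
    print("GUI display available:", gui_display_available())
```

A script could use these checks to fall back to saving a plot to a file when no display is available.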
Comparable performance to bare metal
Previously, this level of convenience and efficiency came at a cost to performance when comparing the same workloads on Ubuntu WSL vs Ubuntu on bare metal, due to the overheads of virtualisation. However, numerous optimisations over time have reduced that trade-off considerably.
A recent Phoronix report summarised benchmarks run by OpenBenchmarking.org. They concluded that Ubuntu WSL performance was around 94% of the speed of bare metal Ubuntu on the same system overall. This figure varies with the type of workload, however, and when focussing on machine learning benchmarks the gap is often smaller.
Read the full Phoronix report.
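To put the 94% figure in concrete terms, the short sketch below converts a relative speed into an estimated runtime. It is simple arithmetic under a strong assumption, namely that the overall ratio applies uniformly to a given job, which the report itself notes is not the case across workload types. The function name and the ten-minute example job are hypothetical.

```python
def estimated_wsl_runtime(bare_metal_seconds: float, relative_speed: float = 0.94) -> float:
    """Estimate WSL runtime from a bare-metal runtime and a relative speed (1.0 = parity)."""
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    # Running at 94% of the speed means the job takes 1 / 0.94 times as long
    return bare_metal_seconds / relative_speed


if __name__ == "__main__":
    # A hypothetical job that takes 10 minutes on bare metal
    print(f"{estimated_wsl_runtime(600):.1f} s")  # prints "638.3 s"
```

Under that assumption, a ten-minute job on bare metal would take a little under 40 seconds longer on WSL.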
GPU Acceleration
But the performance story doesn’t end there. Thanks to NVIDIA’s close collaboration with Microsoft, the Linux NVIDIA CUDA software stack enables GPU-accelerated AI training inside WSL whilst leveraging native Windows NVIDIA drivers across a wide range of hardware. This allows data scientists to use frameworks and toolkits that target Ubuntu, such as TensorFlow, PyTorch and CUDA, while staying on Windows.
Tutorial: Enabling GPU acceleration on Ubuntu on WSL2 with the NVIDIA CUDA Platform
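Once the drivers and CUDA stack are set up, a quick way to confirm that a framework can see the GPU from inside WSL is to ask it directly. The sketch below uses PyTorch as one example and assumes it may or may not be installed; the function name cuda_status is hypothetical, while torch.cuda.is_available() and torch.cuda.get_device_name() are standard PyTorch calls.

```python
import importlib.util


def cuda_status() -> str:
    """Report whether PyTorch (if installed) can see a CUDA device."""
    # Avoid an ImportError on systems where PyTorch has not been installed
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA device visible: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA device is visible"


if __name__ == "__main__":
    print(cuda_status())
```

If no device is reported, the tutorial above is the place to check the driver and toolkit installation steps.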
Get Started Today!
Follow one of the tutorials below to get up and running with WSL:
For Windows 10: Install Ubuntu WSL on Windows 10
For Windows 11: Install Ubuntu WSL on Windows 11
And for more information check out: