Which version of Python should I use?

 

If you want to start a new software project today, you will likely be looking at Python for your codebase. It's easy to use and allows for quick development thanks to its many great third-party libraries. However, a glaring issue pops up when you look at the Python landscape.

There are two versions of Python, Python 2.X and Python 3.X, and both are widely used and officially supported. This article briefly discusses the main differences between the two and makes the case for why you should use Python 3.

Differences

Python 2 and 3 are, at their core, not very different from each other. Originally, improvements added to Python 2 simply led to incremental versions. Python 3 was created because the changes needed to fix fundamental design flaws in the language would break backwards compatibility. The authors intended for the community to shift quickly to Python 3, but this didn't happen in practice, and many new projects were still built in Python 2. Since backwards compatibility was broken in a substantial way, updating existing projects to Python 3 isn't trivial.

Features

Python 2.7 first came out in 2010 and is the last major release in the 2.X series. New major features have only been added to the 3.X series since then. Some of these features are:

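  • Better Unicode support;
  • Safer integer division: `/` always returns the true quotient, and `//` is used for floor division;
  • Built-ins such as `range`, `map`, and `zip` return lazy iterators instead of full lists, which is much more memory efficient;
  • A single `int` type with unlimited precision (Python 2 split this into `int` and `long`);
  • Type hinting to help find bugs sooner;
  • Asynchronous programming support (via asyncio).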
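
A few of the features listed above can be seen in a short Python 3 session. This is only a minimal sketch: `average` is a hypothetical helper made up for illustration, and mypy is named as just one example of an external type checker.

```python
import sys

# Division: "/" always returns a float in Python 3.
# In Python 2, 7 / 2 between two ints would give 3.
true_quotient = 7 / 2        # 3.5
floor_quotient = 7 // 2      # 3, explicit floor division

# range() is lazy in Python 3: it stores start, stop and step
# rather than a billion list elements, so the object stays small.
big_range = range(10**9)
range_size = sys.getsizeof(big_range)

# map() is lazy too: nothing is computed until you iterate.
squares = map(lambda n: n * n, range(5))
first_square = next(squares)  # 0

# Type hints are ignored at runtime, but a checker such as mypy
# can flag a call like average("abc") before the code ever runs.
def average(values: list) -> float:
    """Hypothetical helper: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

print(true_quotient, floor_quotient, first_square)
print("size of range(10**9) in bytes:", range_size)
print(average([1, 2, 3]))
```

The exact size reported for the `range` object depends on the interpreter build, which is why no figure is given here. The point is that it does not grow with the length of the range.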

Python 3 has kept getting better since it was released almost 10 years ago, while Python 2 has mainly just been getting security patches and bug fixes.

Unicode

Python 3 separates textual data from binary data much more clearly than Python 2. In Python 3, a `str` object can only represent textual data (a string of characters). In Python 2, however, it can hold either textual or binary data (a string of bytes).

In Python 2, you always need to keep track of whether your strings are textual or binary data, and you need to remember to convert between the two correctly where needed. To help developers with this problem, Python 2 offers the `unicode` type for text, but in practice it's not used as much as it should be, which introduces many bugs into programs. Python 3 solves this problem by making every `str` Unicode by default and forcing developers to explicitly use the `bytes` type to handle binary data. Text being Unicode by default in Python 3 also comes with the added bonus of supporting any written language.
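
The split between text and binary data can be illustrated with a small Python 3 example. It is a sketch rather than a full treatment of encodings; the sample string and the choice of UTF-8 are assumptions made for the illustration.

```python
# str: a sequence of Unicode characters ("naïve café", written with
# escapes so the example does not depend on the file's encoding).
text = "na\u00efve caf\u00e9"

# bytes: the UTF-8 encoding of that text.
data = text.encode("utf-8")

print(type(text).__name__, len(text))   # str 10
print(type(data).__name__, len(data))   # bytes 12

# Decoding with the same codec gives the original text back.
round_trip = data.decode("utf-8")
print(round_trip == text)               # True

# Mixing the two types is an error in Python 3, not a silent guess.
try:
    text + data
except TypeError as exc:
    mixing_error = str(exc)
    print("TypeError:", mixing_error)
```

The two accented characters each take two bytes in UTF-8, which is why the byte length differs from the character length. Python 3 makes that conversion an explicit step instead of something that happens implicitly.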

For a fully detailed explanation of strings and Unicode in Python, see the official documentation's Unicode HOWTO.

Performance

If you're worried about performance, you might be thinking of using Python 2. It used to be the case that Python 2 was faster than Python 3. This happened because Python 3's early changes saved programming time, but used more CPU time. However, this hasn't been true since the release of Python 3.7. Python 3 is now the fastest version of Python ever, beating Python 2 in most benchmarks.

Official Support

The biggest argument for Python 3 by far is official support. Official support for Python 2 will end on January 1, 2020. This means that no more bug fixes or security patches will be released after that date.

Furthermore, Python 2 has seen no major releases since version 2.7 in 2010, while Python 3 has had a major release almost every year and is still under active development. As a result, recent standard library improvements have only been implemented for Python 3.X.

Libraries

Most major open source Python libraries support both Python 2 and 3. If you're doing any number crunching or machine learning, you'll find that widely used libraries such as NumPy, pandas, and scikit-learn work very well in either version. However, the scientific community has pledged to drop Python 2 support in or before 2020. Dropping the burden of supporting Python 2 will allow developers to simplify their projects and take advantage of Python 3's new features.

On the web development side of things, Django, Python's most popular full-stack web framework, has already dropped support for Python 2.

If you have already built your codebase using Python 2 and you want to keep benefiting from official and third-party support after 2020, you should port it to Python 3. Don’t worry! There's an extensive porting guide available in the official documentation.

TLDR

To summarize the above, if you're starting a new project, simply use Python 3. It will be the only version with official support starting from 2020, it has better features and performance, it makes it much harder to shoot yourself in the foot with Unicode, and it will end up having better library support in the future.