summaryrefslogtreecommitdiffstats
path: root/docs/docsite/rst/dev_guide/developing_python_3.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/docsite/rst/dev_guide/developing_python_3.rst')
-rw-r--r--docs/docsite/rst/dev_guide/developing_python_3.rst393
1 files changed, 393 insertions, 0 deletions
diff --git a/docs/docsite/rst/dev_guide/developing_python_3.rst b/docs/docsite/rst/dev_guide/developing_python_3.rst
new file mode 100644
index 0000000..74385c4
--- /dev/null
+++ b/docs/docsite/rst/dev_guide/developing_python_3.rst
@@ -0,0 +1,393 @@
+.. _developing_python_3:
+
+********************
+Ansible and Python 3
+********************
+
+The ``ansible-core`` code runs Python 3 (for specific versions check :ref:`Control Node Requirements <control_node_requirements>`
+Contributors to ``ansible-core`` and to Ansible Collections should be aware of the tips in this document so that they can write code
+that will run on the same versions of Python as the rest of Ansible.
+
+.. contents::
+ :local:
+
+We do have some considerations depending on the types of Ansible code:
+
+1. controller-side code - code that runs on the machine where you invoke :command:`/usr/bin/ansible`, only needs to support the controller's Python versions.
+2. modules - the code which Ansible transmits to and invokes on the managed machine. Modules need to support the 'managed node' Python versions, with some exceptions.
+3. shared ``module_utils`` code - the common code that is used by modules to perform tasks and sometimes used by controller-side code as well. Shared ``module_utils`` code needs to support the same range of Python as the modules.
+
+However, the three types of code do not use the same string strategy. If you're developing a module or some ``module_utils`` code, be sure to read the section on string strategy carefully.
+
+.. note:
+ - While modules can be written in any language, the above applies to code contributed to the core project, which only supports specific Python versions and Powershell for Windows.
+
+Minimum version of Python 3.x and Python 2.x
+============================================
+
+See :ref:`Control Node Requirements <control_node_requirements>` and :ref:`Managed Node Requirements <managed_node_requirements>` for the
+specific versions supported.
+
+Your custom modules can support any version of Python (or other languages) you want, but the above are the requirements for the code contributed to the Ansible project.
+
+Developing Ansible code that supports Python 2 and Python 3
+===========================================================
+
+The best place to start learning about writing code that supports both Python 2 and Python 3
+is `Lennart Regebro's book: Porting to Python 3 <http://python3porting.com/>`_.
+The book describes several strategies for porting to Python 3. The one we're
+using is `to support Python 2 and Python 3 from a single code base
+<http://python3porting.com/strategies.html#python-2-and-python-3-without-conversion>`_
+
+Understanding strings in Python 2 and Python 3
+----------------------------------------------
+
+Python 2 and Python 3 handle strings differently, so when you write code that supports Python 3
+you must decide what string model to use. Strings can be an array of bytes (like in C) or
+they can be an array of text. Text is what we think of as letters, digits,
+numbers, other printable symbols, and a small number of unprintable "symbols"
+(control codes).
+
+In Python 2, the two types for these (:class:`str <python:str>` for bytes and
+:func:`unicode <python:unicode>` for text) are often used interchangeably. When dealing only
+with ASCII characters, the strings can be combined, compared, and converted
+from one type to another automatically. When non-ASCII characters are
+introduced, Python 2 starts throwing exceptions due to not knowing what encoding
+the non-ASCII characters should be in.
+
+Python 3 changes this behavior by making the separation between bytes (:class:`bytes <python3:bytes>`)
+and text (:class:`str <python3:str>`) more strict. Python 3 will throw an exception when
+trying to combine and compare the two types. The programmer has to explicitly
+convert from one type to the other to mix values from each.
+
+In Python 3 it's immediately apparent to the programmer when code is
+mixing the byte and text types inappropriately, whereas in Python 2, code that mixes those types
+may work until a user causes an exception by entering non-ASCII input.
+Python 3 forces programmers to proactively define a strategy for
+working with strings in their program so that they don't mix text and byte strings unintentionally.
+
+Ansible uses different strategies for working with strings in controller-side code, in
+:ref: `modules <module_string_strategy>`, and in :ref:`module_utils <module_utils_string_strategy>` code.
+
+.. _controller_string_strategy:
+
+Controller string strategy: the Unicode Sandwich
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Until recently ``ansible-core`` supported Python 2.x and followed this strategy, known as the Unicode Sandwich (named
+after Python 2's :func:`unicode <python:unicode>` text type). For Unicode Sandwich we know that
+at the border of our code and the outside world (for example, file and network IO,
+environment variables, and some library calls) we are going to receive bytes.
+We need to transform these bytes into text and use that throughout the
+internal portions of our code. When we have to send those strings back out to
+the outside world we first convert the text back into bytes.
+To visualize this, imagine a 'sandwich' consisting of a top and bottom layer
+of bytes, a layer of conversion between, and all text type in the center.
+
+For compatibility reasons you will see a bunch of custom functions we developed (``to_text``/``to_bytes``/``to_native``)
+and while Python 2 is not a concern anymore we will continue to use them as they apply for other cases that make
+dealing with unicode problematic.
+
+While we will not be using it most of it anymore, the documentation below is still useful for those developing modules
+that still need to support both Python 2 and 3 simultaneously.
+
+Unicode Sandwich common borders: places to convert bytes to text in controller code
+-----------------------------------------------------------------------------------
+
+This is a partial list of places where we have to convert to and from bytes
+when using the Unicode Sandwich string strategy. It's not exhaustive but
+it gives you an idea of where to watch for problems.
+
+Reading and writing to files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In Python 2, reading from files yields bytes. In Python 3, it can yield text.
+To make code that's portable to both we don't make use of Python 3's ability
+to yield text but instead do the conversion explicitly ourselves. For example:
+
+.. code-block:: python
+
+ from ansible.module_utils.common.text.converters import to_text
+
+ with open('filename-with-utf8-data.txt', 'rb') as my_file:
+ b_data = my_file.read()
+ try:
+ data = to_text(b_data, errors='surrogate_or_strict')
+ except UnicodeError:
+ # Handle the exception gracefully -- usually by displaying a good
+ # user-centric error message that can be traced back to this piece
+ # of code.
+ pass
+
+.. note:: Much of Ansible assumes that all encoded text is UTF-8. At some
+ point, if there is demand for other encodings we may change that, but for
+ now it is safe to assume that bytes are UTF-8.
+
+Writing to files is the opposite process:
+
+.. code-block:: python
+
+ from ansible.module_utils.common.text.converters import to_bytes
+
+ with open('filename.txt', 'wb') as my_file:
+ my_file.write(to_bytes(some_text_string))
+
+Note that we don't have to catch :exc:`UnicodeError` here because we're
+transforming to UTF-8 and all text strings in Python can be transformed back
+to UTF-8.
+
+Filesystem interaction
+^^^^^^^^^^^^^^^^^^^^^^
+
+Dealing with filenames often involves dropping back to bytes because on UNIX-like
+systems filenames are bytes. On Python 2, if we pass a text string to these
+functions, the text string will be converted to a byte string inside of the
+function and a traceback will occur if non-ASCII characters are present. In
+Python 3, a traceback will only occur if the text string can't be decoded in
+the current locale, but it's still good to be explicit and have code which
+works on both versions:
+
+.. code-block:: python
+
+ import os.path
+
+ from ansible.module_utils.common.text.converters import to_bytes
+
+ filename = u'/var/tmp/くらとみ.txt'
+ f = open(to_bytes(filename), 'wb')
+ mtime = os.path.getmtime(to_bytes(filename))
+ b_filename = os.path.expandvars(to_bytes(filename))
+ if os.path.exists(to_bytes(filename)):
+ pass
+
+When you are only manipulating a filename as a string without talking to the
+filesystem (or a C library which talks to the filesystem) you can often get
+away without converting to bytes:
+
+.. code-block:: python
+
+ import os.path
+
+ os.path.join(u'/var/tmp/café', u'くらとみ')
+ os.path.split(u'/var/tmp/café/くらとみ')
+
+On the other hand, if the code needs to manipulate the filename and also talk
+to the filesystem, it can be more convenient to transform to bytes right away
+and manipulate in bytes.
+
+.. warning:: Make sure all variables passed to a function are the same type.
+ If you're working with something like :func:`python3:os.path.join` which takes
+ multiple strings and uses them in combination, you need to make sure that
+ all the types are the same (either all bytes or all text). Mixing
+ bytes and text will cause tracebacks.
+
+Interacting with other programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Interacting with other programs goes through the operating system and
+C libraries and operates on things that the UNIX kernel defines. These
+interfaces are all byte-oriented so the Python interface is byte oriented as
+well. On both Python 2 and Python 3, byte strings should be given to Python's
+subprocess library and byte strings should be expected back from it.
+
+One of the main places in Ansible's controller code that we interact with
+other programs is the connection plugins' ``exec_command`` methods. These
+methods transform any text strings they receive in the command (and arguments
+to the command) to execute into bytes and return stdout and stderr as byte strings
+Higher level functions (like action plugins' ``_low_level_execute_command``)
+transform the output into text strings.
+
+.. _module_string_strategy:
+
+Module string strategy: Native String
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In modules we use a strategy known as Native Strings. This makes things
+easier on the community members who maintain so many of Ansible's
+modules, by not breaking backwards compatibility by
+mandating that all strings inside of modules are text and converting between
+text and bytes at the borders.
+
+Native strings refer to the type that Python uses when you specify a bare
+string literal:
+
+.. code-block:: python
+
+ "This is a native string"
+
+In Python 2, these are byte strings. In Python 3 these are text strings. Modules should be
+coded to expect bytes on Python 2 and text on Python 3.
+
+.. _module_utils_string_strategy:
+
+Module_utils string strategy: hybrid
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In ``module_utils`` code we use a hybrid string strategy. Although Ansible's
+``module_utils`` code is largely like module code, some pieces of it are
+used by the controller as well. So it needs to be compatible with modules
+and with the controller's assumptions, particularly the string strategy.
+The module_utils code attempts to accept native strings as input
+to its functions and emit native strings as their output.
+
+In ``module_utils`` code:
+
+* Functions **must** accept string parameters as either text strings or byte strings.
+* Functions may return either the same type of string as they were given or the native string type for the Python version they are run on.
+* Functions that return strings **must** document whether they return strings of the same type as they were given or native strings.
+
+Module-utils functions are therefore often very defensive in nature.
+They convert their string parameters into text (using ``ansible.module_utils.common.text.converters.to_text``)
+at the beginning of the function, do their work, and then convert
+the return values into the native string type (using ``ansible.module_utils.common.text.converters.to_native``)
+or back to the string type that their parameters received.
+
+Tips, tricks, and idioms for Python 2/Python 3 compatibility
+------------------------------------------------------------
+
+Use forward-compatibility boilerplate
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use the following boilerplate code at the top of all python files
+to make certain constructs act the same way on Python 2 and Python 3:
+
+.. code-block:: python
+
+ # Make coding more python3-ish
+ from __future__ import (absolute_import, division, print_function)
+ __metaclass__ = type
+
+``__metaclass__ = type`` makes all classes defined in the file into new-style
+classes without explicitly inheriting from :class:`object <python3:object>`.
+
+The ``__future__`` imports do the following:
+
+:absolute_import: Makes imports look in :data:`sys.path <python3:sys.path>` for the modules being
+ imported, skipping the directory in which the module doing the importing
+ lives. If the code wants to use the directory in which the module doing
+ the importing, there's a new dot notation to do so.
+:division: Makes division of integers always return a float. If you need to
+ find the quotient use ``x // y`` instead of ``x / y``.
+:print_function: Changes :func:`print <python3:print>` from a keyword into a function.
+
+.. seealso::
+ * `PEP 0328: Absolute Imports <https://www.python.org/dev/peps/pep-0328/#guido-s-decision>`_
+ * `PEP 0238: Division <https://www.python.org/dev/peps/pep-0238>`_
+ * `PEP 3105: Print function <https://www.python.org/dev/peps/pep-3105>`_
+
+Prefix byte strings with ``b_``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since mixing text and bytes types leads to tracebacks we want to be clear
+about what variables hold text and what variables hold bytes. We do this by
+prefixing any variable holding bytes with ``b_``. For instance:
+
+.. code-block:: python
+
+ filename = u'/var/tmp/café.txt'
+ b_filename = to_bytes(filename)
+ with open(b_filename) as f:
+ data = f.read()
+
+We do not prefix the text strings instead because we only operate
+on byte strings at the borders, so there are fewer variables that need bytes
+than text.
+
+Import Ansible's bundled Python ``six`` library
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The third-party Python `six <https://pypi.org/project/six/>`_ library exists
+to help projects create code that runs on both Python 2 and Python 3. Ansible
+includes a version of the library in module_utils so that other modules can use it
+without requiring that it is installed on the remote system. To make use of
+it, import it like this:
+
+.. code-block:: python
+
+ from ansible.module_utils import six
+
+.. note:: Ansible can also use a system copy of six
+
+ Ansible will use a system copy of six if the system copy is a later
+ version than the one Ansible bundles.
+
+Handle exceptions with ``as``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In order for code to function on Python 2.6+ and Python 3, use the
+new exception-catching syntax which uses the ``as`` keyword:
+
+.. code-block:: python
+
+ try:
+ a = 2/0
+ except ValueError as e:
+ module.fail_json(msg="Tried to divide by zero: %s" % e)
+
+Do **not** use the following syntax as it will fail on every version of Python 3:
+
+.. This code block won't highlight because python2 isn't recognized. This is necessary to pass tests under python 3.
+.. code-block:: none
+
+ try:
+ a = 2/0
+ except ValueError, e:
+ module.fail_json(msg="Tried to divide by zero: %s" % e)
+
+Update octal numbers
+^^^^^^^^^^^^^^^^^^^^
+
+In Python 2.x, octal literals could be specified as ``0755``. In Python 3,
+octals must be specified as ``0o755``.
+
+String formatting for controller code
+-------------------------------------
+
+Use ``str.format()`` for Python 2.6 compatibility
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Starting in Python 2.6, strings gained a method called ``format()`` to put
+strings together. However, one commonly used feature of ``format()`` wasn't
+added until Python 2.7, so you need to remember not to use it in Ansible code:
+
+.. code-block:: python
+
+ # Does not work in Python 2.6!
+ new_string = "Dear {}, Welcome to {}".format(username, location)
+
+ # Use this instead
+ new_string = "Dear {0}, Welcome to {1}".format(username, location)
+
+Both of the format strings above map positional arguments of the ``format()``
+method into the string. However, the first version doesn't work in
+Python 2.6. Always remember to put numbers into the placeholders so the code
+is compatible with Python 2.6.
+
+.. seealso::
+ Python documentation on format strings:
+
+ - `format strings in 2.6 <https://docs.python.org/2.6/library/string.html#formatstrings>`_
+ - `format strings in 3.x <https://docs.python.org/3/library/string.html#formatstrings>`_
+
+Use percent format with byte strings
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In Python 3.x, byte strings do not have a ``format()`` method. However, it
+does have support for the older, percent-formatting.
+
+.. code-block:: python
+
+ b_command_line = b'ansible-playbook --become-user %s -K %s' % (user, playbook_file)
+
+.. note:: Percent formatting added in Python 3.5
+
+ Percent formatting of byte strings was added back into Python 3 in 3.5.
+ This isn't a problem for us because Python 3.5 is our minimum version.
+ However, if you happen to be testing Ansible code with Python 3.4 or
+ earlier, you will find that the byte string formatting here won't work.
+ Upgrade to Python 3.5 to test.
+
+.. seealso::
+ Python documentation on `percent formatting <https://docs.python.org/3/library/stdtypes.html#string-formatting>`_
+
+.. _testing_modules_python_3: