summaryrefslogtreecommitdiffstats
path: root/docs/basic/params.rst
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--docs/basic/params.rst242
1 files changed, 242 insertions, 0 deletions
diff --git a/docs/basic/params.rst b/docs/basic/params.rst
new file mode 100644
index 0000000..a733f07
--- /dev/null
+++ b/docs/basic/params.rst
@@ -0,0 +1,242 @@
+.. currentmodule:: psycopg
+
+.. index::
+ pair: Query; Parameters
+
+.. _query-parameters:
+
+Passing parameters to SQL queries
+=================================
+
+Most of the times, writing a program you will have to mix bits of SQL
+statements with values provided by the rest of the program:
+
+.. code::
+
+ SELECT some, fields FROM some_table WHERE id = ...
+
+:sql:`id` equals what? Probably you will have a Python value you are looking
+for.
+
+
+`!execute()` arguments
+----------------------
+
+Passing parameters to a SQL statement happens in functions such as
+`Cursor.execute()` by using ``%s`` placeholders in the SQL statement, and
+passing a sequence of values as the second argument of the function. For
+example the Python function call:
+
+.. code:: python
+
+ cur.execute("""
+ INSERT INTO some_table (id, created_at, last_name)
+ VALUES (%s, %s, %s);
+ """,
+ (10, datetime.date(2020, 11, 18), "O'Reilly"))
+
+is *roughly* equivalent to the SQL command:
+
+.. code-block:: sql
+
+ INSERT INTO some_table (id, created_at, last_name)
+ VALUES (10, '2020-11-18', 'O''Reilly');
+
+Note that the parameters will not be really merged to the query: query and the
+parameters are sent to the server separately: see :ref:`server-side-binding`
+for details.
+
+Named arguments are supported too using :samp:`%({name})s` placeholders in the
+query and specifying the values into a mapping. Using named arguments allows
+to specify the values in any order and to repeat the same value in several
+places in the query::
+
+ cur.execute("""
+ INSERT INTO some_table (id, created_at, updated_at, last_name)
+ VALUES (%(id)s, %(created)s, %(created)s, %(name)s);
+ """,
+ {'id': 10, 'name': "O'Reilly", 'created': datetime.date(2020, 11, 18)})
+
+Using characters ``%``, ``(``, ``)`` in the argument names is not supported.
+
+When parameters are used, in order to include a literal ``%`` in the query you
+can use the ``%%`` string::
+
+ cur.execute("SELECT (%s % 2) = 0 AS even", (10,)) # WRONG
+ cur.execute("SELECT (%s %% 2) = 0 AS even", (10,)) # correct
+
+While the mechanism resembles regular Python strings manipulation, there are a
+few subtle differences you should care about when passing parameters to a
+query.
+
+- The Python string operator ``%`` *must not be used*: the `~cursor.execute()`
+ method accepts a tuple or dictionary of values as second parameter.
+ |sql-warn|__:
+
+ .. |sql-warn| replace:: **Never** use ``%`` or ``+`` to merge values
+ into queries
+
+ .. code:: python
+
+ cur.execute("INSERT INTO numbers VALUES (%s, %s)" % (10, 20)) # WRONG
+ cur.execute("INSERT INTO numbers VALUES (%s, %s)", (10, 20)) # correct
+
+ .. __: sql-injection_
+
+- For positional variables binding, *the second argument must always be a
+ sequence*, even if it contains a single variable (remember that Python
+ requires a comma to create a single element tuple)::
+
+ cur.execute("INSERT INTO foo VALUES (%s)", "bar") # WRONG
+ cur.execute("INSERT INTO foo VALUES (%s)", ("bar")) # WRONG
+ cur.execute("INSERT INTO foo VALUES (%s)", ("bar",)) # correct
+ cur.execute("INSERT INTO foo VALUES (%s)", ["bar"]) # correct
+
+- The placeholder *must not be quoted*::
+
+ cur.execute("INSERT INTO numbers VALUES ('%s')", ("Hello",)) # WRONG
+ cur.execute("INSERT INTO numbers VALUES (%s)", ("Hello",)) # correct
+
+- The variables placeholder *must always be a* ``%s``, even if a different
+ placeholder (such as a ``%d`` for integers or ``%f`` for floats) may look
+ more appropriate for the type. You may find other placeholders used in
+ Psycopg queries (``%b`` and ``%t``) but they are not related to the
+ type of the argument: see :ref:`binary-data` if you want to read more::
+
+ cur.execute("INSERT INTO numbers VALUES (%d)", (10,)) # WRONG
+ cur.execute("INSERT INTO numbers VALUES (%s)", (10,)) # correct
+
+- Only query values should be bound via this method: it shouldn't be used to
+ merge table or field names to the query. If you need to generate SQL queries
+ dynamically (for instance choosing a table name at runtime) you can use the
+ functionalities provided in the `psycopg.sql` module::
+
+ cur.execute("INSERT INTO %s VALUES (%s)", ('numbers', 10)) # WRONG
+ cur.execute( # correct
+ SQL("INSERT INTO {} VALUES (%s)").format(Identifier('numbers')),
+ (10,))
+
+
+.. index:: Security, SQL injection
+
+.. _sql-injection:
+
+Danger: SQL injection
+---------------------
+
+The SQL representation of many data types is often different from their Python
+string representation. The typical example is with single quotes in strings:
+in SQL single quotes are used as string literal delimiters, so the ones
+appearing inside the string itself must be escaped, whereas in Python single
+quotes can be left unescaped if the string is delimited by double quotes.
+
+Because of the difference, sometimes subtle, between the data types
+representations, a naïve approach to query strings composition, such as using
+Python strings concatenation, is a recipe for *terrible* problems::
+
+ SQL = "INSERT INTO authors (name) VALUES ('%s')" # NEVER DO THIS
+ data = ("O'Reilly", )
+ cur.execute(SQL % data) # THIS WILL FAIL MISERABLY
+ # SyntaxError: syntax error at or near "Reilly"
+
+If the variables containing the data to send to the database come from an
+untrusted source (such as data coming from a form on a web site) an attacker
+could easily craft a malformed string, either gaining access to unauthorized
+data or performing destructive operations on the database. This form of attack
+is called `SQL injection`_ and is known to be one of the most widespread forms
+of attack on database systems. Before continuing, please print `this page`__
+as a memo and hang it onto your desk.
+
+.. _SQL injection: https://en.wikipedia.org/wiki/SQL_injection
+.. __: https://xkcd.com/327/
+
+Psycopg can :ref:`automatically convert Python objects to SQL
+values<types-adaptation>`: using this feature your code will be more robust
+and reliable. We must stress this point:
+
+.. warning::
+
+ - Don't manually merge values to a query: hackers from a foreign country
+ will break into your computer and steal not only your disks, but also
+ your cds, leaving you only with the three most embarrassing records you
+ ever bought. On cassette tapes.
+
+ - If you use the ``%`` operator to merge values to a query, con artists
+ will seduce your cat, who will run away taking your credit card
+ and your sunglasses with them.
+
+ - If you use ``+`` to merge a textual value to a string, bad guys in
+ balaclava will find their way to your fridge, drink all your beer, and
+ leave your toilet seat up and your toilet paper in the wrong orientation.
+
+ - You don't want to manually merge values to a query: :ref:`use the
+ provided methods <query-parameters>` instead.
+
+The correct way to pass variables in a SQL command is using the second
+argument of the `Cursor.execute()` method::
+
+ SQL = "INSERT INTO authors (name) VALUES (%s)" # Note: no quotes
+ data = ("O'Reilly", )
+ cur.execute(SQL, data) # Note: no % operator
+
+.. note::
+
+ Python static code checkers are not quite there yet, but, in the future,
+ it will be possible to check your code for improper use of string
+ expressions in queries. See :ref:`literal-string` for details.
+
+.. seealso::
+
+ Now that you know how to pass parameters to queries, you can take a look
+ at :ref:`how Psycopg converts data types <types-adaptation>`.
+
+
+.. index::
+ pair: Binary; Parameters
+
+.. _binary-data:
+
+Binary parameters and results
+-----------------------------
+
+PostgreSQL has two different ways to transmit data between client and server:
+`~psycopg.pq.Format.TEXT`, always available, and `~psycopg.pq.Format.BINARY`,
+available most of the times but not always. Usually the binary format is more
+efficient to use.
+
+Psycopg can support both formats for each data type. Whenever a value
+is passed to a query using the normal ``%s`` placeholder, the best format
+available is chosen (often, but not always, the binary format is picked as the
+best choice).
+
+If you have a reason to select explicitly the binary format or the text format
+for a value you can use respectively a ``%b`` placeholder or a ``%t``
+placeholder instead of the normal ``%s``. `~Cursor.execute()` will fail if a
+`~psycopg.adapt.Dumper` for the right data type and format is not available.
+
+The same two formats, text or binary, are used by PostgreSQL to return data
+from a query to the client. Unlike with parameters, where you can choose the
+format value-by-value, all the columns returned by a query will have the same
+format. Every type returned by the query should have a `~psycopg.adapt.Loader`
+configured, otherwise the data will be returned as unparsed `!str` (for text
+results) or buffer (for binary results).
+
+.. note::
+ The `pg_type`_ table defines which format is supported for each PostgreSQL
+ data type. Text input/output is managed by the functions declared in the
+ ``typinput`` and ``typoutput`` fields (always present), binary
+ input/output is managed by the ``typsend`` and ``typreceive`` (which are
+ optional).
+
+ .. _pg_type: https://www.postgresql.org/docs/current/catalog-pg-type.html
+
+Because not every PostgreSQL type supports binary output, by default, the data
+will be returned in text format. In order to return data in binary format you
+can create the cursor using `Connection.cursor`\ `!(binary=True)` or execute
+the query using `Cursor.execute`\ `!(binary=True)`. A case in which
+requesting binary results is a clear winner is when you have large binary data
+in the database, such as images::
+
+ cur.execute(
+ "SELECT image_data FROM images WHERE id = %s", [image_id], binary=True)
+ data = cur.fetchone()[0]