diff options
Diffstat (limited to 'third_party/python/voluptuous/README.md')
-rw-r--r-- | third_party/python/voluptuous/README.md | 723 |
1 files changed, 723 insertions, 0 deletions
diff --git a/third_party/python/voluptuous/README.md b/third_party/python/voluptuous/README.md new file mode 100644 index 0000000000..46e2288f4b --- /dev/null +++ b/third_party/python/voluptuous/README.md @@ -0,0 +1,723 @@ +# Voluptuous is a Python data validation library + +[![Build Status](https://travis-ci.org/alecthomas/voluptuous.png)](https://travis-ci.org/alecthomas/voluptuous) +[![Coverage Status](https://coveralls.io/repos/github/alecthomas/voluptuous/badge.svg?branch=master)](https://coveralls.io/github/alecthomas/voluptuous?branch=master) [![Gitter chat](https://badges.gitter.im/alecthomas.png)](https://gitter.im/alecthomas/Lobby) + +Voluptuous, *despite* the name, is a Python data validation library. It +is primarily intended for validating data coming into Python as JSON, +YAML, etc. + +It has three goals: + +1. Simplicity. +2. Support for complex data structures. +3. Provide useful error messages. + +## Contact + +Voluptuous now has a mailing list! Send a mail to +[<voluptuous@librelist.com>](mailto:voluptuous@librelist.com) to subscribe. Instructions +will follow. + +You can also contact me directly via [email](mailto:alec@swapoff.org) or +[Twitter](https://twitter.com/alecthomas). + +To file a bug, create a [new issue](https://github.com/alecthomas/voluptuous/issues/new) on GitHub with a short example of how to replicate the issue. + +## Documentation + +The documentation is provided [here](http://alecthomas.github.io/voluptuous/). + +## Changelog + +See [CHANGELOG.md](https://github.com/alecthomas/voluptuous/blob/master/CHANGELOG.md). + +## Show me an example + +Twitter's [user search API](https://dev.twitter.com/rest/reference/get/users/search) accepts +query URLs like: + +``` +$ curl 'https://api.twitter.com/1.1/users/search.json?q=python&per_page=20&page=1' +``` + +To validate this we might use a schema like: + +```pycon +>>> from voluptuous import Schema +>>> schema = Schema({ +... 'q': str, +... 'per_page': int, +... 'page': int, +... }) + +``` + +This schema very succinctly and roughly describes the data required by +the API, and will work fine. But it has a few problems. Firstly, it +doesn't fully express the constraints of the API. According to the API, +`per_page` should be restricted to at most 20, defaulting to 5, for +example. To describe the semantics of the API more accurately, our +schema will need to be more thoroughly defined: + +```pycon +>>> from voluptuous import Required, All, Length, Range +>>> schema = Schema({ +... Required('q'): All(str, Length(min=1)), +... Required('per_page', default=5): All(int, Range(min=1, max=20)), +... 'page': All(int, Range(min=0)), +... }) + +``` + +This schema fully enforces the interface defined in Twitter's +documentation, and goes a little further for completeness. + +"q" is required: + +```pycon +>>> from voluptuous import MultipleInvalid, Invalid +>>> try: +... schema({}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "required key not provided @ data['q']" +True + +``` + +...must be a string: + +```pycon +>>> try: +... schema({'q': 123}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "expected str for dictionary value @ data['q']" +True + +``` + +...and must be at least one character in length: + +```pycon +>>> try: +... schema({'q': ''}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "length of value must be at least 1 for dictionary value @ data['q']" +True +>>> schema({'q': '#topic'}) == {'q': '#topic', 'per_page': 5} +True + +``` + +"per\_page" is a positive integer no greater than 20: + +```pycon +>>> try: +... schema({'q': '#topic', 'per_page': 900}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "value must be at most 20 for dictionary value @ data['per_page']" +True +>>> try: +... schema({'q': '#topic', 'per_page': -10}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "value must be at least 1 for dictionary value @ data['per_page']" +True + +``` + +"page" is an integer \>= 0: + +```pycon +>>> try: +... schema({'q': '#topic', 'per_page': 'one'}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) +"expected int for dictionary value @ data['per_page']" +>>> schema({'q': '#topic', 'page': 1}) == {'q': '#topic', 'page': 1, 'per_page': 5} +True + +``` + +## Defining schemas + +Schemas are nested data structures consisting of dictionaries, lists, +scalars and *validators*. Each node in the input schema is pattern +matched against corresponding nodes in the input data. + +### Literals + +Literals in the schema are matched using normal equality checks: + +```pycon +>>> schema = Schema(1) +>>> schema(1) +1 +>>> schema = Schema('a string') +>>> schema('a string') +'a string' + +``` + +### Types + +Types in the schema are matched by checking if the corresponding value +is an instance of the type: + +```pycon +>>> schema = Schema(int) +>>> schema(1) +1 +>>> try: +... schema('one') +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "expected int" +True + +``` + +### URL's + +URL's in the schema are matched by using `urlparse` library. + +```pycon +>>> from voluptuous import Url +>>> schema = Schema(Url()) +>>> schema('http://w3.org') +'http://w3.org' +>>> try: +... schema('one') +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "expected a URL" +True + +``` + +### Lists + +Lists in the schema are treated as a set of valid values. Each element +in the schema list is compared to each value in the input data: + +```pycon +>>> schema = Schema([1, 'a', 'string']) +>>> schema([1]) +[1] +>>> schema([1, 1, 1]) +[1, 1, 1] +>>> schema(['a', 1, 'string', 1, 'string']) +['a', 1, 'string', 1, 'string'] + +``` + +However, an empty list (`[]`) is treated as is. If you want to specify a list that can +contain anything, specify it as `list`: + +```pycon +>>> schema = Schema([]) +>>> try: +... schema([1]) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "not a valid value @ data[1]" +True +>>> schema([]) +[] +>>> schema = Schema(list) +>>> schema([]) +[] +>>> schema([1, 2]) +[1, 2] + +``` + +### Sets and frozensets + +Sets and frozensets are treated as a set of valid values. Each element +in the schema set is compared to each value in the input data: + +```pycon +>>> schema = Schema({42}) +>>> schema({42}) == {42} +True +>>> try: +... schema({43}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "invalid value in set" +True +>>> schema = Schema({int}) +>>> schema({1, 2, 3}) == {1, 2, 3} +True +>>> schema = Schema({int, str}) +>>> schema({1, 2, 'abc'}) == {1, 2, 'abc'} +True +>>> schema = Schema(frozenset([int])) +>>> try: +... schema({3}) +... raise AssertionError('Invalid not raised') +... except Invalid as e: +... exc = e +>>> str(exc) == 'expected a frozenset' +True + +``` + +However, an empty set (`set()`) is treated as is. If you want to specify a set +that can contain anything, specify it as `set`: + +```pycon +>>> schema = Schema(set()) +>>> try: +... schema({1}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "invalid value in set" +True +>>> schema(set()) == set() +True +>>> schema = Schema(set) +>>> schema({1, 2}) == {1, 2} +True + +``` + +### Validation functions + +Validators are simple callables that raise an `Invalid` exception when +they encounter invalid data. The criteria for determining validity is +entirely up to the implementation; it may check that a value is a valid +username with `pwd.getpwnam()`, it may check that a value is of a +specific type, and so on. + +The simplest kind of validator is a Python function that raises +ValueError when its argument is invalid. Conveniently, many builtin +Python functions have this property. Here's an example of a date +validator: + +```pycon +>>> from datetime import datetime +>>> def Date(fmt='%Y-%m-%d'): +... return lambda v: datetime.strptime(v, fmt) + +``` + +```pycon +>>> schema = Schema(Date()) +>>> schema('2013-03-03') +datetime.datetime(2013, 3, 3, 0, 0) +>>> try: +... schema('2013-03') +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "not a valid value" +True + +``` + +In addition to simply determining if a value is valid, validators may +mutate the value into a valid form. An example of this is the +`Coerce(type)` function, which returns a function that coerces its +argument to the given type: + +```python +def Coerce(type, msg=None): + """Coerce a value to a type. + + If the type constructor throws a ValueError, the value will be marked as + Invalid. + """ + def f(v): + try: + return type(v) + except ValueError: + raise Invalid(msg or ('expected %s' % type.__name__)) + return f + +``` + +This example also shows a common idiom where an optional human-readable +message can be provided. This can vastly improve the usefulness of the +resulting error messages. + +### Dictionaries + +Each key-value pair in a schema dictionary is validated against each +key-value pair in the corresponding data dictionary: + +```pycon +>>> schema = Schema({1: 'one', 2: 'two'}) +>>> schema({1: 'one'}) +{1: 'one'} + +``` + +#### Extra dictionary keys + +By default any additional keys in the data, not in the schema will +trigger exceptions: + +```pycon +>>> schema = Schema({2: 3}) +>>> try: +... schema({1: 2, 2: 3}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "extra keys not allowed @ data[1]" +True + +``` + +This behaviour can be altered on a per-schema basis. To allow +additional keys use +`Schema(..., extra=ALLOW_EXTRA)`: + +```pycon +>>> from voluptuous import ALLOW_EXTRA +>>> schema = Schema({2: 3}, extra=ALLOW_EXTRA) +>>> schema({1: 2, 2: 3}) +{1: 2, 2: 3} + +``` + +To remove additional keys use +`Schema(..., extra=REMOVE_EXTRA)`: + +```pycon +>>> from voluptuous import REMOVE_EXTRA +>>> schema = Schema({2: 3}, extra=REMOVE_EXTRA) +>>> schema({1: 2, 2: 3}) +{2: 3} + +``` + +It can also be overridden per-dictionary by using the catch-all marker +token `extra` as a key: + +```pycon +>>> from voluptuous import Extra +>>> schema = Schema({1: {Extra: object}}) +>>> schema({1: {'foo': 'bar'}}) +{1: {'foo': 'bar'}} + +``` + +#### Required dictionary keys + +By default, keys in the schema are not required to be in the data: + +```pycon +>>> schema = Schema({1: 2, 3: 4}) +>>> schema({3: 4}) +{3: 4} + +``` + +Similarly to how extra\_ keys work, this behaviour can be overridden +per-schema: + +```pycon +>>> schema = Schema({1: 2, 3: 4}, required=True) +>>> try: +... schema({3: 4}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "required key not provided @ data[1]" +True + +``` + +And per-key, with the marker token `Required(key)`: + +```pycon +>>> schema = Schema({Required(1): 2, 3: 4}) +>>> try: +... schema({3: 4}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "required key not provided @ data[1]" +True +>>> schema({1: 2}) +{1: 2} + +``` + +#### Optional dictionary keys + +If a schema has `required=True`, keys may be individually marked as +optional using the marker token `Optional(key)`: + +```pycon +>>> from voluptuous import Optional +>>> schema = Schema({1: 2, Optional(3): 4}, required=True) +>>> try: +... schema({}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "required key not provided @ data[1]" +True +>>> schema({1: 2}) +{1: 2} +>>> try: +... schema({1: 2, 4: 5}) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "extra keys not allowed @ data[4]" +True + +``` + +```pycon +>>> schema({1: 2, 3: 4}) +{1: 2, 3: 4} + +``` + +### Recursive / nested schema + +You can use `voluptuous.Self` to define a nested schema: + +```pycon +>>> from voluptuous import Schema, Self +>>> recursive = Schema({"more": Self, "value": int}) +>>> recursive({"more": {"value": 42}, "value": 41}) == {'more': {'value': 42}, 'value': 41} +True + +``` + +### Extending an existing Schema + +Often it comes handy to have a base `Schema` that is extended with more +requirements. In that case you can use `Schema.extend` to create a new +`Schema`: + +```pycon +>>> from voluptuous import Schema +>>> person = Schema({'name': str}) +>>> person_with_age = person.extend({'age': int}) +>>> sorted(list(person_with_age.schema.keys())) +['age', 'name'] + +``` + +The original `Schema` remains unchanged. + +### Objects + +Each key-value pair in a schema dictionary is validated against each +attribute-value pair in the corresponding object: + +```pycon +>>> from voluptuous import Object +>>> class Structure(object): +... def __init__(self, q=None): +... self.q = q +... def __repr__(self): +... return '<Structure(q={0.q!r})>'.format(self) +... +>>> schema = Schema(Object({'q': 'one'}, cls=Structure)) +>>> schema(Structure(q='one')) +<Structure(q='one')> + +``` + +### Allow None values + +To allow value to be None as well, use Any: + +```pycon +>>> from voluptuous import Any + +>>> schema = Schema(Any(None, int)) +>>> schema(None) +>>> schema(5) +5 + +``` + +## Error reporting + +Validators must throw an `Invalid` exception if invalid data is passed +to them. All other exceptions are treated as errors in the validator and +will not be caught. + +Each `Invalid` exception has an associated `path` attribute representing +the path in the data structure to our currently validating value, as well +as an `error_message` attribute that contains the message of the original +exception. This is especially useful when you want to catch `Invalid` +exceptions and give some feedback to the user, for instance in the context of +an HTTP API. + + +```pycon +>>> def validate_email(email): +... """Validate email.""" +... if not "@" in email: +... raise Invalid("This email is invalid.") +... return email +>>> schema = Schema({"email": validate_email}) +>>> exc = None +>>> try: +... schema({"email": "whatever"}) +... except MultipleInvalid as e: +... exc = e +>>> str(exc) +"This email is invalid. for dictionary value @ data['email']" +>>> exc.path +['email'] +>>> exc.msg +'This email is invalid.' +>>> exc.error_message +'This email is invalid.' + +``` + +The `path` attribute is used during error reporting, but also during matching +to determine whether an error should be reported to the user or if the next +match should be attempted. This is determined by comparing the depth of the +path where the check is, to the depth of the path where the error occurred. If +the error is more than one level deeper, it is reported. + +The upshot of this is that *matching is depth-first and fail-fast*. + +To illustrate this, here is an example schema: + +```pycon +>>> schema = Schema([[2, 3], 6]) + +``` + +Each value in the top-level list is matched depth-first in-order. Given +input data of `[[6]]`, the inner list will match the first element of +the schema, but the literal `6` will not match any of the elements of +that list. This error will be reported back to the user immediately. No +backtracking is attempted: + +```pycon +>>> try: +... schema([[6]]) +... raise AssertionError('MultipleInvalid not raised') +... except MultipleInvalid as e: +... exc = e +>>> str(exc) == "not a valid value @ data[0][0]" +True + +``` + +If we pass the data `[6]`, the `6` is not a list type and so will not +recurse into the first element of the schema. Matching will continue on +to the second element in the schema, and succeed: + +```pycon +>>> schema([6]) +[6] + +``` + +## Multi-field validation + +Validation rules that involve multiple fields can be implemented as +custom validators. It's recommended to use `All()` to do a two-pass +validation - the first pass checking the basic structure of the data, +and only after that, the second pass applying your cross-field +validator: + +```python +def passwords_must_match(passwords): + if passwords['password'] != passwords['password_again']: + raise Invalid('passwords must match') + return passwords + +s=Schema(All( + # First "pass" for field types + {'password':str, 'password_again':str}, + # Follow up the first "pass" with your multi-field rules + passwords_must_match +)) + +# valid +s({'password':'123', 'password_again':'123'}) + +# raises MultipleInvalid: passwords must match +s({'password':'123', 'password_again':'and now for something completely different'}) + +``` + +With this structure, your multi-field validator will run with +pre-validated data from the first "pass" and so will not have to do +its own type checking on its inputs. + +The flipside is that if the first "pass" of validation fails, your +cross-field validator will not run: + +``` +# raises Invalid because password_again is not a string +# passwords_must_match() will not run because first-pass validation already failed +s({'password':'123', 'password_again': 1337}) +``` + +## Running tests. + +Voluptuous is using nosetests: + + $ nosetests + + +## Why use Voluptuous over another validation library? + +**Validators are simple callables** +: No need to subclass anything, just use a function. + +**Errors are simple exceptions.** +: A validator can just `raise Invalid(msg)` and expect the user to get +useful messages. + +**Schemas are basic Python data structures.** +: Should your data be a dictionary of integer keys to strings? +`{int: str}` does what you expect. List of integers, floats or +strings? `[int, float, str]`. + +**Designed from the ground up for validating more than just forms.** +: Nested data structures are treated in the same way as any other +type. Need a list of dictionaries? `[{}]` + +**Consistency.** +: Types in the schema are checked as types. Values are compared as +values. Callables are called to validate. Simple. + +## Other libraries and inspirations + +Voluptuous is heavily inspired by +[Validino](http://code.google.com/p/validino/), and to a lesser extent, +[jsonvalidator](http://code.google.com/p/jsonvalidator/) and +[json\_schema](http://blog.sendapatch.se/category/json_schema.html). + +[pytest-voluptuous](https://github.com/F-Secure/pytest-voluptuous) is a +[pytest](https://github.com/pytest-dev/pytest) plugin that helps in +using voluptuous validators in `assert`s. + +I greatly prefer the light-weight style promoted by these libraries to +the complexity of libraries like FormEncode. |