Writing Great Docstrings in Python

Have you ever tried to understand a new project by looking at the source code only to find that the code isn’t clear on its own and is lacking documentation, such as docstrings? I have had that experience a few times which slowed down being able to fix bugs and add new features and frequently also meant that the code wasn’t well structured. In this post we’ll look at best practices for documenting in code and its numerous benefits such as helping you be clear on what you actually need to code and reminding yourself and others about how the code works when you come back from an extended holiday. We will also look at a linter that checks your docstrings to make sure they are complete. Your future self will then never be frustrated about a lack of documentation in code again!

Continue reading →

Giving and Receiving Great Feedback through PRs

Do you struggle with PRs? Have you ever had to change code even though you disagreed with the change just to land the PR? Have you ever given feedback that would have improved the code only to get into a comment war? We’ll discuss how to give and receive feedback to extract maximum value from it and avoid all the communication problems that come with PRs. We’ll start with some thoughts about what PRs are intended to achieve and then first discuss how to give feedback that will be well received and result in improvements to the code followed by how to extract maximum value from feedback you receive without agreeing to suboptimal changes. Finally, we will look at a checklist for giving and receiving feedback you can use as you go through reviews both as an author and reviewer.

Continue reading →

Writing Great Test Documentation

Have you ever needed to understand a new project and started reading the tests only to find that you have no idea what the tests are doing? Writing great test documentation as you are writing tests will improve your tests and help you and others reading the tests later. We will first look at why test documentation is important both when writing tests and for future readers and then look at a framework that helps give some structure to your test documentation. Next, we will look at a showcase of the flake8-test-docs tool that automates test documentation checks to ensure your documentation is great! Finally, we briefly discuss how this framework would apply in more advanced cases, such as when you are using fixtures or parametrising tests.

Continue reading →

Help your Users fix your Errors

Have you ever encountered an error when using a package and then gone to Google to find out how to solve the error only not to find any clear answers? Wouldn’t you have preferred to go directly to documentation that tells you exactly what went wrong and how to resolve that error? A lot of us can tell similar stories, especially when we try something new. I have also abandoned an otherwise promising package after encountering an error that wasn’t clear or when I didn’t know how to solve the error.

Continue reading →

Reduce Duplication in Pytest Parametrised Tests using the Walrus Operator

Do you find yourself having to repeat literal values (like strings and integers) in parametrised tests? I often find myself in this situation and have been looking for ways of reducing this duplication. To show an example, consider this trivial function:

def get_second_word(text: str) -> str | None:
    """Get the second word from text."""
    words = text.split()
    return words[1] if len(words) >= 2 else None

Continue reading →

Python TypedDict Arbitrary Key Names with Totality

In PEP 589 Python has introduced type hints for dictionaries. This helps with defining the structure of dictionaries where the keys and the types of the values are well known. TypedDict also supports keys that may not be present using the concept of totality. Using inheritance a dictionary where some keys are not required can be built. The default syntax for defining TypedDict is a class based syntax (example below). Whilst this is easy to understand, it does limit the names of the keys of the dictionary to valid python variable names. If a dictionary needs to include keys with, for example, a dash, the class based syntax no longer is appropriate for defining the TypedDict. There is an alternative syntax (similar to named tuples) that allows for arbitrary key names (example below). However, this syntax does not support inheritance which means that a dictionary with mixed totality cannot be constructed using this syntax alone.

Continue reading →

Inheritance for SQLAlchemy Models

In software engineering one of the key principles of object oriented software is the concept of inheritance. It can be used to increase code re-use which reduces the volume of tests and speeds up development. You can use inheritance in SQLAlchemy as described here. However, this inheritance is mainly used to describe relationships between tables and not for the purpose of re-using certain pieces of models defined elsewhere.

The openapi specification allows for inheritance using the allOf statement. This means that you could, for example, define a schema for id properties once and re-use that schema for any number of objects where you can customise things like the description that may differ object by object. You can also use allOf to combine objects, which is a powerful way of reducing duplication. You could, for example, define a base object with an id and name property that you then use repeatedly for other objects so that you don’t have to keep giving objects an id and a name.

If this feature could be brought to SQLAlchemy models, you would have a much shorter models.py files which is easier to maintain and understand. The plan for the openapi-SQLAlchemy package is to do just that. The first step has been completed with the addition of support for allOf for column definitions. If you aren’t familiar with the package, the Reducing API Code Duplication article describes the aims of the package.

To start using the column inheritance feature, read the documentation for the feature which describes it in detail and gives and example specification that makes use of it.

openapi-SQLAlchemy Now Supports $ref for Columns

A new version of openapi-SQLAlchemy has been release which adds support for $ref for columns. The package now supports openapi schemas such as the following:

components:
  schemas:
    Id:
      type: integer
      description: Unique identifier for the employee.
      example: 0
    Employee:
      description: Person that works for a company.
      type: object
      properties:
        id:
          $ref: "#/components/schemas/Id"
        name:
          type: string
          description: The name of the employee.
          example: David Andersson.

If you are interested in how this was accomplished with decorators and recursive programming, the following article describes the implementation: Python Recursive Decorators.

Python Recursive Decorators

In Python decorators are a useful tool for changing the behaviour of functions without modifying the original function. For a recent release of openapi-SQLAlchemy, which added support for references among table columns, I used decorators to de-reference columns. I needed a way of supporting cases where a reference to a column was just a reference to another column. The solution was to essentially keep applying the decorator until the column was actually found.

What is a Column Reference?

If you are not familiar with openapi specifications, a simple schema for an object might be the following:

components:
  schemas:
    Employee:
      description: Person that works for a company.
      type: object
      properties:
        id:
          type: integer
          description: Unique identifier for the employee.
          example: 0
        name:
          type: string
          description: The name of the employee.
          example: David Andersson.

To be able to re-use the definition of the id property for another schema, you can do the following:

components:
  schemas:
    Id:
      type: integer
      description: Unique identifier for the employee.
      example: 0
    Employee:
      description: Person that works for a company.
      type: object
      properties:
        id:
          $ref: "#/components/schemas/Id"
        name:
          type: string
          description: The name of the employee.
          example: David Andersson.

In this case, the id property just references the Id schema. This could be done for other property definitions where the same schema applies to reduce duplicate schema definitions.

The Simple Case

The openapi-SQLAlchemy package allows you to map openapi schemas to SQLAlchemy models where an object becomes a table and a property becomes a column. The architecture of the package is broken into a factory for tables which then calls a column factory for each property of an object. The problem I had to solve was that ,when a reference is encountered, the column factory gets called with the following dictionary as the schema for the column (for the id column in this case):

{"$ref": "#/components/schemas/Id"}

instead of the schema for the column. Apart from doing some basic checks, what needs to happen is that the schema of Id needs to be found and the column factory be called with that schema instead of the reference. A perfect case for a decorator! The following is the code (minus some input checks):

_REF_PATTER = re.compile(r"^#\/components\/schemas\/(\w+)$")

def resolve_ref(func):
    """Resolve $ref schemas."""

    def inner(schema, schemas, **kwargs):
        """Replace function."""
        # Checking for $ref
        ref = schema.get("$ref")
        if ref is None:
            return func(schema=schema, **kwargs)

        # Retrieving new schema
        match = _REF_PATTER.match(ref)
        schema_name = match.group(1)
        ref_schema = schemas.get(schema_name)

        return func(schema=ref_schema, **kwargs)

    return inner

The first step is to check if the schema is a reference schema and call the column factory if not (lines 3-4). If it is a reference, then the referenced schema needs to be retrieved (lines 14-16) and the factory called with the referenced schema instead (line 18).

The Recursive Case

The problem with the simple decorator is that the following openapi specification is valid:

components:
  schemas:
    Id:
      $ref: "#/components/schemas/RefId"
    RefId:
      type: integer
      description: Unique identifier for the employee.
      example: 0
    Employee:
      description: Person that works for a company.
      type: object
      x-tablename: employee
      properties:
        id:
          $ref: "#/components/schemas/Id"
        name:
          type: string
          description: The name of the employee.
          example: David Andersson.

Noe that the Id schema just references the RefId schema. When this schema is used, the column factory would now be called with:

{"$ref": "#/components/schemas/RefId"}

That looks a lot like the original problem! The solution is that the decorator needs to be applied again. This also looks like a case for recursive programming. In recursive programming, you have the base case and the recursive case. The base case sounds a lot like the code that checks for whether the schema is a reference (lines 3-4 above). The recursive case needs to take a step towards the base case and apply the function again. In this case this is done by resolving one reference and then calling the decorator again. This means that on line 18 above we need to somehow apply the decorator again. The solution is actually quite simple, the code needs to be changed to the following:

_REF_PATTER = re.compile(r"^#\/components\/schemas\/(\w+)$")

def resolve_ref(func):
    """Resolve $ref schemas."""

    def inner(schema, schemas, **kwargs):
        """Replace function."""
        # Checking for $ref
        ref = schema.get("$ref")
        if ref is None:
            return func(schema=schema, **kwargs)

        # Retrieving new schema
        match = _REF_PATTER.match(ref)
        schema_name = match.group(1)
        ref_schema = schemas.get(schema_name)

        return inner(schema=ref_schema, schemas=schemas, **kwargs)

    return inner

The change is that inner is called instead of the original function func. We then also need to pass in all the required arguments for inner, which in this case, means passing through the schemas which func does not need.

Conclusion

Recursive decorators combine the ideas behind decorators and recursive programming. If the problem a decorator solves can be broken into steps where, after each step, the decorator might need to be applied again, you might have a case for a recursive decorator. The decorator must then implement a base case where the decorator is not applied again and a recursive case where one step is taken towards the base case and the decorator is applied again.

openapi-SQLAlchemy Update

The first alpha version of openapi-SQLAlchemy has been released! It is basically what I wrote about in the post on Reducing API Code Duplication. The project is available on PyPI here: openapi-SQLAlchemy. To read more documentation and about how to use the project you can visit the GitHub repository.