Some context, right now at $dayjob I work on the “platform team” which braodly means I have all the responsibilities of your typical ops/infra girl but I am also the developer of first and last resort for all code, 1st and 3rd party, we use across all our products. I promise it’s not as stressful as it sounds but it does give me a… unique view of codebases. I care a whole heck of a lot about things like logging, error handling, tracing, and stack traces.

Let’s take a look at some real logs from one of our products.

This request will be satisfied with Forms Support Level 8

Can’t verify CSRF token authenticity.

[S3.Client 206 0.067541 0 retries] head_object(Bucket="[FILTERED]", Key="[FILTERED]", PartNumber=1)

http_get on instance credentials

Using email attribute from ’email’: [REDACTED]

Completed unread_count

None

Like I’m not putting you on, this is all I get. There’s no context I’m holding back. You got any idea what the hell any of these mean? Where you might even start looking in the source to see where it’s generated? Me either. I mean the developers were clearly looking our for me by adding a log.source where it show’s me the class the log origi… aaaaannd they’re all JSONApiController. Guess I’m grepping for the string and hoping.

Okay let’s pretend that [REDACTED] was some obvious garbage value and do an analysis. Since we don’t have any code refs off to ag 'Using email attribute from' and huzzah!

    class JwtMixin:
      ...

      USER_CLAIM = "ourapp/username"
      EMAIL_CLAIM = "ourapp/email"

      @classmethod
      def find_email_claim(cls, decoded_token):
        key = next(for c in [cls.USER_CLAIM, cls.EMAIL_CLAIM] if decoded_token[c])
        token_value = decoded_token[key]

        logger.info(f"Using email attribute from '{key}': {token_value}")
        return token_value

Cool. Cool. From here I pretty much have two questions. Where did this method get called from and what was the value of decoded_token that is clearly causing some kind of problem?

Let’s break out that nice little find-references feature of my editor. Ugh, there’s like 20 different places this could have been called from. Great. Maybe I can figure that out with tracing? Haha what tracing? Guess I’ll play “match up the timestamps” with the frontend and lb logs and see what endpoint the user probably hit. Nice I actually got a hit! It’s /api/v2/notifications/unreads which let me check urls.py to see where it goes and JSONApiController. The black box into which all hope dies.

God I wish an actual error had occured because then I might have gotten a stack trace. Okay, let’s just make this codepath actually raise an error and run the tests. Okay it’s not so dire, all of the callsites wind up coming together at this authenticate method.

jwt_mixin.py:127 `find_email_claim'
jwt_mixin.py:100 `jwt_email'
jwt_mixin.py:31 `from_jwt'
jwt_backend.py:113 `user_from_token'
jwt_backend.py:47 `validate_request'
jwt_backend.py:14 `authenticate'

So now I just have to connect the dots and trace all the variables from authenticate to find_email_claim to determine the most likely path that garbage was introduced and the bug will probably show itself.

Or we could just have the logger do all of that.

Proof of concept.


from inspect import currentframe, getouterframes

class Logger:
  @classmethod
  def log(cls, message: str):
      f_cur = currentframe()
      assert f_cur is not None

      print(f"Message: {message}")
      for f in getouterframes(f_cur.f_back):
          print(f'File "{f.filename}", line {f.lineno}, in {f.function}')

          if f.code_context:
              print("\n".join(f.code_context), end="")

          if len(f.frame.f_locals.items()):
              print("Locals")
              for k, v in f.frame.f_locals.items():
                  print(f"    {k}={v!r}")

          if len(f.frame.f_code.co_freevars):
              print("Vars From Clojure")
              for k in f.frame.f_code.co_freevars:
                  print(f"    {k}")

and some code to use it

def outer(a=4, b=3):
    local = f"foo {a} {b}"

    def clojure():
        cloj = local + " bar"
        return middle([cloj, "meep"])

    return clojure()


def middle(x):
    return inner(*x)


def inner(c=15, d="bloop"):
    Logger.log(f"Doing inner {c} {d}")
    return "baz"

Gives us

Message: Doing inner foo 4 3 bar meep
File "[REDACTED]/ccc/logger.py", line 43, in inner
    Logger.log(f"Doing inner {c} {d}")
Locals
    c='foo 4 3 bar'
    d='meep'
File "[REDACTED]/ccc/logger.py", line 39, in middle
    return inner(*x)
Locals
    x=['foo 4 3 bar', 'meep']
File "[REDACTED]/ccc/logger.py", line 33, in clojure
        return middle([cloj, "meep"])
Locals
    cloj='foo 4 3 bar'
    local='foo 4 3'
Vars From Clojure
    local
File "[REDACTED]/ccc/logger.py", line 35, in outer
    return clojure()
Locals
    a=4
    b=3
    clojure=<function outer.<locals>.clojure at 0x10ba8db80>
    local='foo 4 3'
File "<stdin>", line 1, in <module>
Locals
    __name__='__main__'
    __doc__=None
    __package__=None
    __loader__=<_frozen_importlib_external.SourceFileLoader object at 0x10b9a2ca0>
    __spec__=None
    __annotations__={}
    __builtins__=<module 'builtins' (built-in)>
    currentframe=<function currentframe at 0x10bcf9b80>
    getouterframes=<function getouterframes at 0x10bcf9a60>
    Logger=<class 'ccc.logger.Logger'>
    outer=<function outer at 0x10ba8dd30>
    middle=<function middle at 0x10bd00160>
    inner=<function inner at 0x10bd001f0>

😍 😍 😍 I’m in love! Serioiusly, can you imagine debugging with all this juicy info available to you?!?

Prior Art

It looks like it should be possible to get this working but the docs seem to be pointing in the direction that the addition of local vars happens in the traceback formatter (i.e the pretty output) and not part of the data that’s actually logged.

#  hynek/structlog:/src/structlog/processors.py#L579
class StackInfoRenderer:
    """
    Add stack information with key ``stack`` if ``stack_info`` is `True`.

    Useful when you want to attach a stack dump to a log entry without
    involving an exception and works analogously to the *stack_info* argument
    of the Python standard library logging.

    """

Maybe you could make it work with your logger wrapper outlined here but that’s as much effort as making your own Logger class.