TIL: Python logger extra keyword

Published April 21, 2022 in today I learned - 0 Comments

Today I learned that any Python stdlib logger method that logs a message accepts an extra keyword argument: a dict of additional fields to attach to the log record. Those fields can then be used in a formatter or by other logging tools like structlog. Output is: 2022-04-21 00:47:31,691 Critical error message! | Linux #29~20.04.1-Ubuntu SMP Fri […]

Tags: python
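As a rough illustration of the extra keyword from the excerpt above, here is a minimal sketch using only the standard library. The logger name and the os_name field are assumptions made up for this example; the format string only loosely mimics the output shown in the excerpt.

```python
import io
import logging
import platform

# Send records to an in-memory stream so the formatted line is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)

# %(os_name)s is not a built-in LogRecord attribute; it is supplied via `extra`.
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s | %(os_name)s"))

logger = logging.getLogger("til_extra_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

# Each key in `extra` becomes an attribute on the log record.
logger.critical("Critical error message!", extra={"os_name": platform.system()})

formatted = stream.getvalue()
print(formatted, end="")
```

Keys in extra must not clash with attributes the logging module already sets on a record (for example message or asctime). A formatter that references a custom field also expects every log call routed through it to supply that field.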

Python Dependency and Environment Management Deep-Dive

Published April 11, 2022 in devops - 0 Comments

Struggling to make sense of the Python package and virtual environment landscape? Not sure which tools have features that make building CI/CD for your Python project easier? Frustrated by slow dependency resolution? To improve Recursion's Python developer experience, my colleague Dan Maljovec and I did a deep dive on those very topics and wrote […]

Tags: python

TIL: How to pass extra gunicorn arguments to MLflow server

Published March 22, 2022 in today I learned - 0 Comments

The MLflow server exposes the --workers flag to change the number of gunicorn workers, but if you want to pass other arguments to gunicorn then you need the --gunicorn-opts flag. It looks like --waitress-opts serves the same purpose if running MLflow on Windows. If passing the --gunicorn-opts flag to a container running the server on […]

Tags: kubernetes , mlops
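To make the shape of that command concrete, here is a small sketch that only builds and prints the argument list, so it runs without MLflow installed. The helper name and the particular gunicorn options (--timeout, --log-level) are illustrative assumptions, not the values from the original post.

```python
import shlex


def mlflow_server_command(workers: int, gunicorn_opts: str) -> list[str]:
    """Build an argv list for `mlflow server` (hypothetical helper)."""
    return [
        "mlflow", "server",
        "--workers", str(workers),
        # All additional gunicorn options travel as ONE string argument.
        "--gunicorn-opts", gunicorn_opts,
    ]


cmd = mlflow_server_command(4, "--timeout 120 --log-level debug")

# shlex.join shows how the command would need to be quoted in a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run (or using it as container args) avoids a second layer of shell quoting, because the gunicorn options stay a single argument.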

Book Review: The Pragmatic Programmer: your journey to mastery, 20th Anniversary Edition, 2nd edition

Published January 31, 2021 in review - 0 Comments

Disclosure: I was not compensated for writing this post nor was my review solicited. All opinions are 100% my own. Quick Take: mostly worth your time and can be a good review for professional programmers. This book may offer the most value for people entering the field. The Pragmatic Programmer book was recommended by a […]

Tags: review

Python text analysis tools: FuzzyWuzzy’s basic string matching

Published March 29, 2020 in data - 0 Comments

The Python FuzzyWuzzy module uses Levenshtein edit distance to implement fuzzy string matching. FuzzyWuzzy's matching tools return results on a scale from 0 to 100. The simplest matching tool FuzzyWuzzy offers is the ratio(..) function. The basic ratio function works well for simple string matching. However, if you're trying to fuzzy match a single word […]
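To give a feel for a 0 to 100 similarity score without installing anything, the sketch below uses the standard library's difflib as a stand-in. It is not FuzzyWuzzy's API, and difflib's matching-block ratio is a different measure from Levenshtein distance, so scores may differ from what the post shows. The function name is made up for this example.

```python
from difflib import SequenceMatcher


def ratio_0_100(a: str, b: str) -> int:
    """Similarity of two strings rescaled to an integer from 0 to 100."""
    # SequenceMatcher.ratio() returns a float in [0, 1].
    return round(100 * SequenceMatcher(None, a, b).ratio())


print(ratio_0_100("this is a test", "this is a test!"))
print(ratio_0_100("apple", "orange"))
```

Identical strings score 100 and strings with no characters in common score 0; everything else falls in between.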

Ijson coroutines and generators

Published February 27, 2020 in data - 0 Comments

Reader comments on an old post about the ijson parser prompted me to check out the project's more recent releases. The latest pre-release (v3.0rc1) added a coroutine interface, which allows users to supply their own file readers and have more control over when the parser is called. It looked like a fun feature to explore, […]

Python text analysis tools: Levenshtein Distance

Published January 31, 2020 in data - 0 Comments

Figuring out how similar two strings are and then making that similarity a quantitative measurement is a basic problem in text analysis, text mining and natural language processing. There are a number of efficient methods to solve this problem. This survey looks at Python implementations of a simple but widely used method: Levenshtein distance as […]
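For reference, Levenshtein distance can be sketched in plain Python with the classic dynamic-programming recurrence. This is a minimal teaching version, not one of the library implementations the survey covers, and the function name is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the empty prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # → 3
```

Only two rows of the table are kept at a time, so memory grows with the length of the shorter string while running time stays proportional to the product of the two lengths.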

A closer look at Airflow’s KubernetesPodOperator and XCom

Published July 11, 2019 in data - 8 Comments

The KubernetesPodOperator handles communicating XCom values differently than other operators. The basics are described in the operator documentation under the xcom_push parameter. I’ve written up a more detailed example that expands on that documentation. An Airflow task instance described by the KubernetesPodOperator can write a dict to the file /airflow/xcom/return.json (always the same file) that […]
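The container-side half of this can be sketched as a few lines of Python that serialize a dict to the file named in the excerpt. The helper name, the sample payload, and the temporary demo path are assumptions for illustration; inside a real pod the default path would be used, and whether Airflow collects the value depends on the operator's XCom settings mentioned above.

```python
import json
import tempfile
from pathlib import Path

# Fixed location from the excerpt above.
XCOM_PATH = Path("/airflow/xcom/return.json")


def write_xcom(value: dict, path: Path = XCOM_PATH) -> None:
    """Serialize `value` as JSON to the XCom return file (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(value))


# Demo against a temporary directory so the sketch runs outside a pod.
demo_path = Path(tempfile.mkdtemp()) / "xcom" / "return.json"
write_xcom({"row_count": 42}, path=demo_path)  # made-up payload
print(demo_path.read_text())
```

In the task's own container the call would simply be write_xcom(result) as the last step of the job, with result being whatever JSON-serializable dict the task wants to hand to downstream tasks.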
