Today I learned that you can get (sys.getrecursionlimit()) and set (sys.setrecursionlimit(limit)) the Python interpreter’s maximum stack depth. I can see that on my system, the maximum stack depth or recursion limit is 3000. Exceeding the limit results in a RecursionError. This came up in a Prefect flow using a LocalDaskExecutor that attempted to serialize an […]
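As a rough illustration of the two calls mentioned above, the sketch below reads the current limit, triggers a RecursionError by recursing past it, then raises the limit temporarily and retries. The `depth` helper is hypothetical and exists only for the demonstration. The actual limit depends on the environment (CPython defaults to 1000).

```python
import sys


def depth(n: int) -> int:
    """Hypothetical helper: recurse n times and return the depth reached."""
    return 0 if n == 0 else 1 + depth(n - 1)


original_limit = sys.getrecursionlimit()  # environment dependent

# Recursing deeper than the limit raises RecursionError.
try:
    depth(original_limit + 100)
    exceeded = False
except RecursionError:
    exceeded = True

# Raise the limit temporarily, retry, then restore the original value.
sys.setrecursionlimit(original_limit + 1000)
try:
    reached = depth(original_limit + 100)
finally:
    sys.setrecursionlimit(original_limit)
```

Restoring the limit in a `finally` block keeps the change local. A very high limit can still exhaust the underlying C stack, so raising it is usually a stopgap rather than a fix.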
Today I learned that you can pass extra arguments to any Python stdlib logger method that logs a message by supplying the extra keyword argument as a dict. The extra arguments may be used in a formatter or with other logging tools like structlog. Output is: 2022-04-21 00:47:31,691 Critical error message! | Linux #29~20.04.1-Ubuntu SMP Fri […]
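The original post's code is not shown in this excerpt, so here is a minimal sketch of the idea. The logger name `extra_demo`, the field name `system`, and the format string are all assumptions. It writes to an in-memory stream so the result can be inspected.

```python
import io
import logging
import platform

# Capture log output in memory so it can be inspected afterwards.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(system)s is filled from the `extra` dict; the field name is hypothetical.
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s | %(system)s"))

logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

# Each key in `extra` becomes an attribute on the LogRecord.
logger.critical("Critical error message!", extra={"system": platform.system()})

output = stream.getvalue()
print(output, end="")
```

Keys in `extra` must not collide with built-in LogRecord attributes such as `message` or `asctime`. If they do, the logging module raises a KeyError.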
Struggling to make sense of the Python package and virtual environment landscape? Not sure which tools have features that make building CI/CD for your Python project easier? Frustrated by slow dependency resolution? To improve Recursion’s Python developer experience, my colleague Dan Maljovec and I did a deep dive on those very topics and wrote […]
Disclosure: I was given access to Effective Pandas – Standard to review. I was not compensated for writing this post and the links to the course are not affiliate links. All opinions are 100% my own. Quick take: Effective Pandas is a high-quality course with well-thought-out content. I felt like it was […]
The Python FuzzyWuzzy module uses Levenshtein edit distance to implement fuzzy string matching. FuzzyWuzzy’s matching tools return results on a scale from 0 to 100. The simplest matching tool FuzzyWuzzy offers is the ratio(..) function: The basic ratio function works well for simple string matching. However, if you’re trying to fuzzy match a single word […]
Reader comments on an old post about the ijson parser prompted me to check out the project’s more recent releases. The latest pre-release (v3.0rc1) added a coroutine interface, which allows users to supply their own file readers and have more control over when the parser is called. It looked like a fun feature to explore, […]
Figuring out how similar two strings are and then making that similarity a quantitative measurement is a basic problem in text analysis, text mining and natural language processing. There are a number of efficient methods to solve this problem. This survey looks at Python implementations of a simple but widely used method: Levenshtein distance as […]
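For readers who have not seen the method before, here is a minimal pure-Python sketch of Levenshtein distance using the standard dynamic-programming approach. It is not taken from any of the libraries the survey covers, and the function name `levenshtein` is just an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(insertion, deletion, substitution))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

This version keeps only two rows of the table, so memory use grows with the shorter string's length, while running time is proportional to the product of the two lengths.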
The KubernetesPodOperator handles communicating XCom values differently than other operators. The basics are described in the operator documentation under the xcom_push parameter. I’ve written up a more detailed example that expands on that documentation. An Airflow task instance described by the KubernetesPodOperator can write a dict to the file /airflow/xcom/return.json (always the same file) that […]
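A rough sketch of the container side of that exchange follows. It assumes the task's code runs inside the pod and only needs to serialize a dict to the fixed path. The helper name `write_xcom` and the sample payload are hypothetical, and the demonstration writes to a temporary directory instead of the real path so it can run anywhere.

```python
import json
import tempfile
from pathlib import Path

# Fixed location named in the text; always the same file.
XCOM_RETURN_PATH = Path("/airflow/xcom/return.json")


def write_xcom(values: dict, path: Path = XCOM_RETURN_PATH) -> None:
    """Hypothetical helper: serialize a dict as JSON to the XCom return file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(values))


# Local demonstration against a temporary directory, not the real path.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "airflow" / "xcom" / "return.json"
    write_xcom({"row_count": 42}, demo_path)
    round_trip = json.loads(demo_path.read_text())
```

Inside the pod, the code would call `write_xcom(...)` with the default path. The values must be JSON-serializable for this to work.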
This article and code apply to Airflow 1.10.13. Hopefully the REST API will mature as Airflow is developed further, and the authentication methods will be easier. The experimental REST API does not use the Airflow role-based users. Instead, it currently requires a SQLAlchemy models.User object whose data is saved in the database. The code […]
KubernetesExecutor The KubernetesExecutor sets up Airflow to run on a Kubernetes cluster. This executor runs task instances in pods created from the same Airflow Docker image used by the KubernetesExecutor itself, unless configured otherwise (more on that at the end). Getting Airflow deployed with the KubernetesExecutor to a cluster is not a trivial task. I […]