<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Slack on Project Wintermute</title><link>https://wintermutecore.com/tags/slack/</link><description>Recent content in Slack on Project Wintermute</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 18 Mar 2026 08:00:00 +0200</lastBuildDate><atom:link href="https://wintermutecore.com/tags/slack/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a daily data pipeline with Dagu, Python, and a JSONL data lake</title><link>https://wintermutecore.com/posts/daily-data-pipeline-dagu-python/</link><pubDate>Wed, 18 Mar 2026 08:00:00 +0200</pubDate><guid>https://wintermutecore.com/posts/daily-data-pipeline-dagu-python/</guid><description>&lt;p&gt;&lt;strong&gt;TL;DR.&lt;/strong&gt; Three stages (ingest, index, alert), one stage per script, JSONL as the index format, a flat dedup state file, Dagu for scheduling. Boring, reliable, and dramatically less work than reaching for a heavyweight orchestrator.&lt;/p&gt;
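&lt;p&gt;A minimal sketch of the index stage, with the caveat that the file names and the &lt;code&gt;id&lt;/code&gt; field are illustrative assumptions rather than the post&amp;rsquo;s exact code: new records are appended to the JSONL index, and their IDs to the flat dedup state file.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import json
from pathlib import Path

# Illustrative paths; adjust to your layout.
STATE = Path("state/seen_ids.txt")   # flat dedup state: one record ID per line
INDEX = Path("lake/index.jsonl")     # append-only JSONL index

def index_records(records):
    """Append unseen records to the JSONL index and remember their IDs."""
    seen = set(STATE.read_text().split()) if STATE.exists() else set()
    fresh = [r for r in records if r["id"] not in seen]  # "id" field assumed
    INDEX.parent.mkdir(parents=True, exist_ok=True)
    STATE.parent.mkdir(parents=True, exist_ok=True)
    with INDEX.open("a") as idx, STATE.open("a") as st:
        for r in fresh:
            idx.write(json.dumps(r) + "\n")
            st.write(r["id"] + "\n")
    return fresh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both files are append-only, which is what lets the dedup state stay a flat file instead of a database.&lt;/p&gt;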
&lt;p&gt;There is a class of pipeline that does not deserve a Spark cluster, an Airflow deployment, or a multi-tenant orchestrator. It is the &amp;ldquo;fetch a few hundred records from an API every day, index them, alert on the interesting ones&amp;rdquo; job. We have built this kind of thing dozens of times. Here is the shape that has held up.&lt;/p&gt;</description></item></channel></rss>