Posted On 16.01.2026

Publishing Pipeline – Refactoring

0 comments
confdroid.com >> blog >> Publishing Pipeline – Refactoring

From a One-Off Script to a Publishing Platform

Three Weeks of Refactoring, Learning Python, and Building Something That Scales

Three weeks ago, my publishing “pipeline” was exactly what many automation projects start as:
a single Python script, built to solve a single problem, for a single platform.

It worked — until it didn’t.

Today, that script has evolved into a modular, extensible publishing platform that can target multiple services, reuse logic cleanly, and grow without collapsing under its own weight. Along the way, I learned more about Python, architecture, and disciplined automation than I had in the previous three years combined.

This post is a recap of that journey, what changed technically, why it matters, and where this project is heading next.


Where We Started

The original setup had some clear limitations:

  • One script, one platform
  • Tight coupling between:
  • content loading
  • publishing logic
  • database access
  • Minimal state tracking
  • Hard to debug
  • Hard to extend
  • Impossible to reason about once it grew past a few hundred lines

In short: it worked, but it didn’t scale — neither technically nor mentally.

Here is what the flow looked like:

Mermaid diagram

Characteristics

  • One script owns everything
  • Platform logic intertwined
  • Hard to add new platforms
  • Any change risks breaking unrelated behavior
  • Re-running safely is difficult

The New Architecture: Clear Responsibilities Everywhere

Over the past weeks, the pipeline was redesigned around layers, each with a single responsibility.

1. Pipeline Layer (Orchestration)

Responsible only for flow:

  • Load posts
  • Process media
  • Inject backlinks
  • Publish to enabled platforms
  • Track results

No platform-specific logic lives here.


2. Publisher Layer (Per Platform Semantics)

Each platform gets:

  • a Publisher (what to publish, when, and how)
  • a Client (how to talk to the API)

This distinction turned out to be crucial.

For example:

  • WordPress supports updates
  • Dev.to supports updates with constraints
  • X (Twitter) is publish-once-only

That difference now lives exactly where it belongs: inside the publisher, not smeared across the pipeline.


3. Client Layer (Pure API Communication)

Clients do one thing only:

  • authenticate
  • send requests
  • return normalized results

There is

  • No database access
  • No publishing decisions
  • No content logic

This makes them:

  • testable
  • replaceable
  • reusable

4. Database Layer (State & Deduplication)

The database now tracks:

  • what was published
  • where
  • with which content hash
  • which media assets already exist
  • canonical URLs

This enables:

  • idempotent runs
  • safe re-publishing
  • deduplication of media
  • correct canonical handling across platforms

This is what the new flow looks like:

Mermaid diagram

What Changed

  • Pipeline only controls flow
  • Publishers decide what to do
  • Clients only talk to APIs
  • Database provides state and idempotency
  • Each part is testable and replaceable

This is the turning point where the system stops being fragile.


Why This Is Faster (and Calmer)

One unexpected benefit: performance and reliability improved naturally.

Not because of clever optimizations, but because:

  • Small modules load faster
  • Less global state stays in memory
  • Failures are isolated
  • Logs are precise and meaningful
  • Re-runs are cheap and safe

Instead of hoping nothing breaks, the pipeline now knows when nothing needs to happen.


Adding a New Platform Is Now Boring (That’s a Compliment)

To add a new platform today, the steps are:

Advertisements
  • Write a client (platforms/foo_client.py)
  • Write a publisher (publishers/foo.py)
  • Wire it into the composition root

No changes to:

  • media handling
  • backlinks
  • canonical logic
  • database schema
  • pipeline flow

That’s the kind of boring you (well, I for certain) want.


Media, Backlinks, Canonicals — All Solved Properly

Some highlights that are now just working:

  • Media deduplication via hashing
    → upload once, reuse everywhere
  • Featured images correctly attached per post
  • Backlinks injected deterministically
  • Canonical URLs handled centrally and propagated correctly
  • Per-platform semantics respected (no more accidental re-posting)

None of this required hacks — only the right separation of concerns.


What I Learned (The Unexpected Part)

This project wasn’t about learning Python — but it turned into exactly that.

In three weeks, I learned more about:

  • Python module design
  • data modeling
  • immutability vs mutation
  • error handling
  • API semantics
  • architectural boundaries

…than I had in the previous three years of “occasionally writing Python”.

The key insight:

Python becomes dramatically easier once the architecture is clean.

Most complexity wasn’t Python at all — it was unclear responsibility.


What’s Next: Roadmap to v1.4.0

The current version 1.3.1 closes the stabilization phase.
The next milestone, 1.4.0, will focus on capabilities.

Planned Features

  1. Platform-Aware Update Strategies
    * Publish-once (X)
    * Update-in-place (WordPress)
    * Conditional update (Dev.to)
    * Future: quote-tweet or follow-up logic

  2. Rate Limiting & Scheduling
    * Per-platform throttling
    * Optional delayed publishing
    * CI-safe dry-run mode

  3. Observability Improvements
    * Structured logs
    * Optional JSON output
    * Better failure summaries


New Platforms (Starting with One)

At least one new platform will be added in v1.4.0.

Candidates:

  • LinkedIn (high priority)
  • Xing
  • Indeed (content syndication angle)

Later roadmap:

  • Substack
  • Patreon
  • Possibly others

Thanks to the current architecture, adding these is now a contained task, not a rewrite.


And Yes — This Will Become a Series

This post is likely the first of a small series covering:

  • Python for infrastructure-minded people
  • Real-world refactoring
  • Automation that survives growth
  • Designing for change, not just success

Not tutorials — but honest write-ups of systems that evolved under real constraints.


Final Thoughts

What started as a utility script has become a small platform.

Not because it needed to be fancy —
but because it needed to be understandable, extendable, and trustworthy.

And that, more than anything, made the difference.

Down the road it might become a service interested people can use for their own writing. Since I am developing and running my own kubernetes cloud, there should be nothing in the way of making this a container solution accessible for tenants.


Did you find this post helpful? You can support me.

"Buy Me A Coffee"

Substack

ConfDroid Feedback Portal

Related posts

Author Profile

12ww1160DevOps engineer & architect
Advertisements

Leave a Reply

Your email address will not be published. Required fields are marked *

five × 4 =

Related Post

Migrating my cloud to Kubernetes – storage – the final decision SSHFS

Anyone following my quest to migrate to Kubernetes has been reading about the thoughts and…

Renewing Kubernetes cluster certificates via kubeadm can break your kubelet.conf.

Are you running a Kubernetes cluster to optimize your workload? Excellent idea. Running apps in…

Importing Puppet modules to Foreman

In order for Foreman to be able to work with our Puppet modules, we'll need…