Publishing Pipeline - Refactoring

From a One-Off Script to a Publishing Platform

Three Weeks of Refactoring, Learning Python, and Building Something That Scales

Three weeks ago, my publishing “pipeline” was exactly what many automation projects start as:
a single Python script, built to solve a single problem, for a single platform.

It worked — until it didn’t.

Today, that script has evolved into a modular, extensible publishing platform that can target multiple services, reuse logic cleanly, and grow without collapsing under its own weight. Along the way, I learned more about Python, architecture, and disciplined automation than I had in the previous three years combined.

This post is a recap of that journey, what changed technically, why it matters, and where this project is heading next.

Where We Started

The original setup had some clear limitations:

One script, one platform
Tight coupling between:
content loading
publishing logic
database access
Minimal state tracking
Hard to debug
Hard to extend
Impossible to reason about once it grew past a few hundred lines

In short: it worked, but it didn’t scale — neither technically nor mentally.

Here is what the flow looked like:

Characteristics

One script owns everything
Platform logic intertwined
Hard to add new platforms
Any change risks breaking unrelated behavior
Re-running safely is difficult

The New Architecture: Clear Responsibilities Everywhere

Over the past weeks, the pipeline was redesigned around layers, each with a single responsibility.

1. Pipeline Layer (Orchestration)

Responsible only for flow:

Load posts
Process media
Inject backlinks
Publish to enabled platforms
Track results

No platform-specific logic lives here.

2. Publisher Layer (Per Platform Semantics)

Each platform gets:

a Publisher (what to publish, when, and how)
a Client (how to talk to the API)

This distinction turned out to be crucial.

For example:

WordPress supports updates
Dev.to supports updates with constraints
X (Twitter) is publish-once-only

That difference now lives exactly where it belongs: inside the publisher, not smeared across the pipeline.

3. Client Layer (Pure API Communication)

Clients do one thing only:

authenticate
send requests
return normalized results

There is

No database access
No publishing decisions
No content logic

This makes them:

testable
replaceable
reusable

4. Database Layer (State & Deduplication)

The database now tracks:

what was published
where
with which content hash
which media assets already exist
canonical URLs

This enables:

idempotent runs
safe re-publishing
deduplication of media
correct canonical handling across platforms

This is what the new flow looks like:

What Changed

Pipeline only controls flow
Publishers decide what to do
Clients only talk to APIs
Database provides state and idempotency
Each part is testable and replaceable

This is the turning point where the system stops being fragile.

Why This Is Faster (and Calmer)

One unexpected benefit: performance and reliability improved naturally.

Not because of clever optimizations, but because:

Small modules load faster
Less global state stays in memory
Failures are isolated
Logs are precise and meaningful
Re-runs are cheap and safe

Instead of hoping nothing breaks, the pipeline now knows when nothing needs to happen.

Adding a New Platform Is Now Boring (That’s a Compliment)

To add a new platform today, the steps are:

Media, Backlinks, Canonicals — All Solved Properly

Some highlights that are now just working:

Media deduplication via hashing
→ upload once, reuse everywhere
Featured images correctly attached per post
Backlinks injected deterministically
Canonical URLs handled centrally and propagated correctly
Per-platform semantics respected (no more accidental re-posting)

None of this required hacks — only the right separation of concerns.

What I Learned (The Unexpected Part)

This project wasn’t about learning Python — but it turned into exactly that.

In three weeks, I learned more about:

Python module design
data modeling
immutability vs mutation
error handling
API semantics
architectural boundaries

…than I had in the previous three years of “occasionally writing Python”.

The key insight:

Python becomes dramatically easier once the architecture is clean.

Most complexity wasn’t Python at all — it was unclear responsibility.

What’s Next: Roadmap to v1.4.0

The current version 1.3.1 closes the stabilization phase.
The next milestone, 1.4.0, will focus on capabilities.

Planned Features

Platform-Aware Update Strategies
* Publish-once (X)
* Update-in-place (WordPress)
* Conditional update (Dev.to)
* Future: quote-tweet or follow-up logic
Rate Limiting & Scheduling
* Per-platform throttling
* Optional delayed publishing
* CI-safe dry-run mode
Observability Improvements
* Structured logs
* Optional JSON output
* Better failure summaries

New Platforms (Starting with One)

At least one new platform will be added in v1.4.0.

Candidates:

LinkedIn (high priority)
Xing
Indeed (content syndication angle)

Later roadmap:

Substack
Patreon
Possibly others

Thanks to the current architecture, adding these is now a contained task, not a rewrite.

And Yes — This Will Become a Series

This post is likely the first of a small series covering:

Python for infrastructure-minded people
Real-world refactoring
Automation that survives growth
Designing for change, not just success

Not tutorials — but honest write-ups of systems that evolved under real constraints.

Final Thoughts

What started as a utility script has become a small platform.

Not because it needed to be fancy —
but because it needed to be understandable, extendable, and trustworthy.

And that, more than anything, made the difference.

Down the road it might become a service interested people can use for their own writing. Since I am developing and running my own kubernetes cloud, there should be nothing in the way of making this a container solution accessible for tenants.

Did you find this post helpful? You can support me.

Author Profile

12ww1160DevOps engineer & architect

Latest entries

blog06.07.2026CNPG – Configuration – TLS Encryption
blog02.07.2026CNPG – Installation – Monitoring
blog01.07.2026CNPG – Installation – Connection pooling
blog27.06.2026CNPG – Installation – Database Import

Publishing Pipeline – Refactoring

From a One-Off Script to a Publishing Platform

Three Weeks of Refactoring, Learning Python, and Building Something That Scales

Where We Started

Characteristics