How We Are Elevating Resilience at Click2Deploy

At Click2Deploy, we continue to iterate on something many platforms underestimate: real production resilience. This goes beyond infrastructure. It is about how a system behaves under change, failure, and scale.

In our latest updates, we’ve taken a significant step forward in four key areas: dynamic system control, smarter backup automation, more precise error handling, and a much stronger testing strategy.


Full Control with Feature Flags

We’ve introduced a feature flag system that allows us to enable or disable functionality without deploying new code.

But this goes far beyond simple on/off switches.

We can now:

  • Enable features per user
  • Roll out changes progressively
  • Instantly disable critical components if something goes wrong
  • Safely test new capabilities without impacting all users

This is crucial in complex systems, where even small changes can have unexpected consequences. With this approach, we reduce risk and gain fine-grained control.
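
To make the three behaviors above concrete (per-user enablement, progressive rollout, and an instant kill switch), here is a minimal Python sketch. It is an illustration under assumptions, not Click2Deploy's actual implementation: the `FeatureFlag` class, its fields, and the hash-based bucketing are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag record; every name here is illustrative."""

    name: str
    enabled: bool = True       # global kill switch
    rollout_percent: int = 0   # 0-100, share of users in the progressive rollout
    allowed_users: set[str] = field(default_factory=set)  # explicit per-user opt-ins

    def is_enabled_for(self, user_id: str) -> bool:
        # The kill switch wins over everything else.
        if not self.enabled:
            return False
        # Per-user enablement.
        if user_id in self.allowed_users:
            return True
        # Progressive rollout: each user falls into a stable bucket 0-99.
        return self._bucket(user_id) < self.rollout_percent

    def _bucket(self, user_id: str) -> int:
        # Hashing flag name + user id keeps a user's bucket stable across
        # requests while varying it between different flags.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100
```

Because the bucket is derived from a hash rather than a random draw, raising `rollout_percent` only ever adds users to the rollout, and setting `enabled` to `False` turns the feature off for everyone at the next check.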


Backups That Adapt, Not Get in the Way

One of the most impactful changes was integrating feature flags directly into our backup engine.

This allows us to move away from rigid, global backup processes.

Backups can now:

  • Be enabled or disabled per customer
  • Adapt dynamically based on context
  • Avoid unnecessary executions
  • Stop immediately if abnormal behavior is detected

Additionally, critical tasks like orphaned backup recovery and stale backup detection are now dynamically controlled.

This means the system can react in real time, without manual intervention or emergency deployments.
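
One way to picture a flag-aware backup job is sketched below in Python. The `run_backup` function, the `"backups"` flag name, and the `is_enabled` callback are assumptions made for illustration. The point is only that the flag is consulted before the job starts and again before each step, so flipping it mid-run stops the work.

```python
from typing import Callable, Iterable

# Hypothetical flag lookup: (flag_name, customer_id) -> bool
FlagCheck = Callable[[str, str], bool]


def run_backup(
    customer_id: str,
    steps: Iterable[Callable[[], None]],
    is_enabled: FlagCheck,
) -> str:
    """Run backup steps for one customer, honoring the flag throughout."""
    # Per-customer switch: skip the whole job if backups are off.
    if not is_enabled("backups", customer_id):
        return "skipped"

    completed = 0
    for step in steps:
        # Re-check before every step so a flag disabled mid-run
        # (for example after abnormal behavior is detected) stops the job.
        if not is_enabled("backups", customer_id):
            return f"aborted after {completed} step(s)"
        step()
        completed += 1
    return "completed"
```

In this sketch the flag is only checked between steps, so a long-running step still finishes before the job stops. Finer-grained interruption would need cooperation from the step itself.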


Less Uncertainty: Better Error Handling

We also improved how the system handles errors, especially when dealing with external integrations.

Instead of generic exceptions, we now use more precise error handling for cases such as:

  • Invalid webhooks
  • Missing or inactive repositories
  • Missing or invalid signatures

Because each failure mode now surfaces as its own error type, logs and alerts can tell them apart. This improves debugging, enhances observability, and speeds up incident response.
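
As an illustration of what "more precise" can look like, the Python sketch below gives each case its own exception type. The class names, the `verify_webhook` function, the HMAC-SHA256 signature scheme, and the `repo` dictionary shape are assumptions for the example, not a description of our actual code.

```python
import hashlib
import hmac
from typing import Optional


class WebhookError(Exception):
    """Base class for webhook failures (hypothetical hierarchy)."""


class MissingSignatureError(WebhookError):
    pass


class InvalidSignatureError(WebhookError):
    pass


class RepositoryNotFoundError(WebhookError):
    pass


class RepositoryInactiveError(WebhookError):
    pass


def verify_webhook(
    body: bytes,
    signature: Optional[str],
    secret: bytes,
    repo: Optional[dict],
) -> None:
    """Raise a specific WebhookError subclass for each failure mode."""
    if not signature:
        raise MissingSignatureError("webhook has no signature")

    # Assumed scheme: hex-encoded HMAC-SHA256 of the raw request body.
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise InvalidSignatureError("signature does not match payload")

    if repo is None:
        raise RepositoryNotFoundError("no repository matches this webhook")
    if not repo.get("active", False):
        raise RepositoryInactiveError("repository exists but is inactive")
```

A caller can then catch `WebhookError` to reject the request in one place, while logging the concrete subclass name so each failure mode can be counted and alerted on separately.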


Testing That Reflects Reality

Another major improvement is our testing strategy.

We’re no longer relying on simple unit tests alone. We now cover:

  • End-to-end business flows (like vouchers and quote approvals)
  • External integrations (Stripe, AWS S3, Odoo services)
  • Real-world network failures (timeouts, SSL errors, retries)
  • Edge cases that typically only appear in production

We also configured code coverage to exclude noise such as migrations and boilerplate code, so the reported numbers reflect the code that actually matters.

The goal is simple: catch issues before they reach production.
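
The network-failure cases listed above (timeouts, SSL errors, retries) can be simulated without touching a real service. The sketch below uses Python's standard `unittest.mock`. The `call_with_retries` helper and the test names are hypothetical, written only to show the technique.

```python
import ssl
import time
import unittest
from unittest import mock


def call_with_retries(func, attempts=3, delay=0.0,
                      retry_on=(TimeoutError, ssl.SSLError)):
    """Call func(), retrying on transient network errors (hypothetical helper)."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except retry_on as exc:
            last_error = exc
            # Wait between attempts, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


class RetryTests(unittest.TestCase):
    def test_recovers_after_timeout_and_ssl_error(self):
        # The mock fails twice with different errors, then succeeds.
        upload = mock.Mock(side_effect=[
            TimeoutError("timed out"),
            ssl.SSLError("handshake failed"),
            "ok",
        ])
        self.assertEqual(call_with_retries(upload), "ok")
        self.assertEqual(upload.call_count, 3)

    def test_gives_up_when_failures_persist(self):
        # Every call times out, so the last error is re-raised.
        upload = mock.Mock(side_effect=TimeoutError("timed out"))
        with self.assertRaises(TimeoutError):
            call_with_retries(upload, attempts=2)
        self.assertEqual(upload.call_count, 2)
```

Tests like these can be run with `python -m unittest`. Because the failures are injected through the mock's `side_effect`, they are deterministic and do not depend on a flaky network.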


Why This Matters

In deployment and automation platforms, the biggest risk isn’t that something fails. It’s being unable to react when it does.

With these improvements:

  • We can disable critical functionality in seconds
  • We reduce the impact of production issues
  • We improve stability without slowing down development
  • We build a stronger foundation for scaling

What’s Next

These changes are part of a clear direction: building a platform where system behavior is predictable, controllable, and resilient, even under complex conditions.

We will continue investing in:

  • Intelligent automation
  • Observability
  • Dynamic system control

Because in the end, the true quality of a platform isn’t measured when everything works… but when something doesn’t.


If you’re dealing with similar challenges or care about building resilient infrastructure, Click2Deploy is being built exactly for that.