Centova Server / Licensing gone down :(

Why has no announcement been made to keep people informed about the current issues at hand?
The licensing server went down for a few days, and my clients couldn't access their control panels.

When this last happened, a few years ago, we were assured that something had been put in place so the problem wouldn't recur. Why has this happened again?

Hi Brad, I ran into the same situation today.

Try running:

/home/centovacast/system/runascc/runascc exec ccmanage reissuelicense @current
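
In case it helps anyone scripting this after an outage, here's a rough sketch of a retry wrapper around that same command. I'm assuming ccmanage exits with a non-zero status when the reissue fails; I haven't verified that, so check it on your own install before relying on it.

Code:
#!/bin/sh
# Rough retry wrapper for the license reissue command above.
# ASSUMPTION: ccmanage exits non-zero when the reissue fails; verify on your install.
REISSUE="/home/centovacast/system/runascc/runascc exec ccmanage reissuelicense @current"

for attempt in 1 2 3; do
    if $REISSUE; then
        echo "License reissued on attempt $attempt"
        exit 0
    fi
    echo "Attempt $attempt failed, retrying in 60 seconds..."
    sleep 60
done

echo "Could not reissue license after 3 attempts" >&2
exit 1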


TRIBALHOST.NET
Audio & Video Streaming
Thanks, it's working for now, but I'm very disappointed that nothing formal has come from Centova about the issue.
Hi Ricky, that's right: our control panels should not be shut down if/when a Centova licensing server goes offline. Ideally their licensing servers would have redundancy, perhaps a cluster of servers working in round-robin.
I also have an owned license, so I don't think it should need to check in with the licensing servers so often; something like once a fortnight should suffice.
Still no official reply from Centova about what's happened. Very disappointed!
Why has no announcement been made to keep people informed about the current issues at hand?
During the actual outage, we posted frequent updates to keep everyone informed. In fact, those updates were the only thing you could access when visiting secure.centova.com.


When this last happened, a few years ago, we were assured that something had been put in place so the problem wouldn't recur. Why has this happened again?
This was an entirely different scenario, per below.


Hi Ricky, that's right: our control panels should not be shut down if/when a Centova licensing server goes offline.
While I sympathize with everyone's frustration here, this is of course not a realistic expectation... if Centova Cast didn't stop working when it couldn't validate its license, then there would be no point in having a licensing system at all.

That said, Ricky (and only Ricky, so far) claims that the tune-in links do not work with an expired license, and if that's the case then no, that's not the intended behavior.  We are unable to reproduce that on our end, though -- I actually carefully arranged a test scenario in response to a ticket from Ricky and had no trouble tuning in.  Nobody (Ricky included) has been able to demonstrate the issue since then.

If anyone can actually demonstrate this for me to diagnose, I'd gladly investigate and fix it.


Ideally their licensing servers would have redundancy, perhaps a cluster of servers working in round-robin.
We have exactly this.  Our billing system is (for obvious reasons) in control of the licensing servers, however, so an extended (week or more) outage of the billing server will impact the licensing cluster, as was the case in this incident.


Still no official reply from Centova about what's happened. Very disappointed!
I can't personally monitor every single forum thread, but if you search you'll find I've made a number of posts on the forums about this incident. Hell, I'll even save you the effort of searching -- here's one with some technical details and another with a public apology:

Quote
It sounds like you're not aware of the scope of what happened last month.  Our datacenter brought us down hard -- their negligence ranged from causing major data loss on one of our servers, to null-routing a number of our IP addresses for no quantifiable reason during our recovery efforts, to completely ignoring their support SLA and keeping us waiting for days on end in some cases for simple status updates.  This (not the migration) was the root cause for the issues you experienced.  We had no choice but to perform an emergency migration (entirely unplanned) to another DC because the folks at our original DC apparently lost their minds overnight.  We're now in a new facility and working on deploying a failover infrastructure at another, separate DC as well to ensure that nothing like this ever happens again.

In any case, that is not by any means an attempt to pin the blame elsewhere -- it was our fault (my fault, specifically) for choosing what ended up (despite reasonable reviews) being an awful DC -- but I thought it was important to explain that we didn't just randomly and carelessly decide to perform a massive infrastructure migration without notifying our clients.  It was purely a reaction to the situation at hand and was not at all something we had hoped to have to do.

Quote
In a nutshell, I made a horrible, horrible choice to put my trust in a datacenter that seemed ideal, but which I later found out was seemingly unfit to be in business at all, and which repeatedly screwed us in several different ways over the course of a 2 week period.  I (and unfortunately our clients by extension) am now paying the price for my mistake.

The current situation is that we are migrating all of our services to a new pair of geographically-disparate, well-reviewed datacenters, and will be finalizing that transition over the next couple of days.  After that, there should be no further unpleasant "surprises".

Again, my deepest apologies to everyone who has been affected by these licensing issues.  Please know that I am every bit as angry and frustrated by the situation as you all must be, and that the steps we're taking now in migrating to multiple datacenters for redundancy are intended to prevent anything like this from ever happening again.

I should note that the "multiple datacenters for redundancy" in the above post referred to redundancy in our secure (billing/helpdesk) server.  The licensing servers have always been redundant.

At this point everything has been stable for some time now.  Our new Eastern Canada datacenter is hosting our primary infrastructure and we are still working on configuring database and filesystem replication to our new Arizona datacenter, which we hope to have online soon.  Additionally, an aggressive new backup system at our Vancouver office is taking snapshots of our client data at 2hr intervals, as well as imaging our production servers on a regular basis.

Nothing is ever absolute, but with all this in place I hope I'm reasonably safe in saying that we have our bases covered well enough -- and enough copies of data in enough places -- to ensure that in future, no single service provider should be able to do a fraction of the damage that our last one did to us.
Last Edit: October 04, 2013, 04:39:26 am by Centova - Steve B.
Thanks Steve, I appreciate the detailed reply!
Sorry if the tone of my posts was a little harsh; as you know, it's frustrating when things outside your control go sour and you have clients bugging you for answers.

Cheers
  Brad