EdgeWorkers - Enhanced Reliability With Fail Over
There are several reasons why an EdgeWorkers function may fail. Programming errors, unexpected input from end users, changes in documents returned from httpRequest(), and execution timeouts are some of the most common. When an error does occur, the user will likely see a message like:
Not very graceful. It’s important to build in better error handling so that the user experience holds up when the inevitable error does occur.
JavaScript Error Handling
The first line of defense for catching errors is in the code itself. We won’t get into this subject in depth here, but there are plenty of useful resources to assist in catching errors in JavaScript:
Be sure to include logging in your error handling to assist with debugging.
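As a minimal sketch of this pattern, the wrapper below catches an exception thrown by a handler, logs it, and returns a fallback instead of letting the error escape. The names `withErrorLogging`, `parseUserId`, and the `log` and `onError` callbacks are hypothetical and not part of the EdgeWorkers API; in a real bundle the `log` callback would typically forward to the platform logger, and the wrapped function would be one of your event handlers.

```javascript
// Hypothetical helper (not part of the EdgeWorkers API): wraps a handler so
// that any exception is logged and routed to a fallback instead of escaping.
function withErrorLogging(handler, log, onError) {
  return function wrapped(...args) {
    try {
      return handler(...args);
    } catch (err) {
      // Log enough context to debug the failure later.
      log(`handler "${handler.name || 'anonymous'}" failed: ${err.message}`);
      return onError(err, ...args);
    }
  };
}

// Example handler that throws on unexpected input.
function parseUserId(path) {
  const match = /^\/users\/(\d+)$/.exec(path);
  if (!match) throw new Error(`unexpected path: ${path}`);
  return Number(match[1]);
}

const messages = [];
const safeParseUserId = withErrorLogging(
  parseUserId,
  (msg) => messages.push(msg), // stand-in for a real logger
  () => null                   // fallback value when parsing fails
);
```

This sketch only covers synchronous handlers. A handler that returns a promise would need an `async` wrapper with an `await` inside the `try` block so that rejections are caught as well.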
Platform Error Handling
Unfortunately, not all errors can be caught in JavaScript. Certain runtime errors, such as exceeding the memory or CPU limits, happen in the platform and not in the code. Fortunately, these errors can be detected in Property Manager and managed gracefully through the Site Failover behavior. Upon detecting an error, you can take one of the following actions:
- Retry the request.
- Redirect to a different location by specifying a redirect action.
- Serve existing content from cache.
- Continue processing delivery property metadata, ignoring the failed EdgeWorkers function.
- Bypass the failed EdgeWorkers function and serve the same URL.
Setup
The first step in the process is detection of the EdgeWorkers failure. The following Property Manager logic detects errors from all event handlers.
1. Create a rule with a match condition that enables the PMUSER_RP_STATUS variable when the Metadata Stage is client-response.
2. Create another rule with a match condition that enables the PMUSER_RP_ERROR variable when the PMUSER_RP_STATUS is not empty AND is not *success* or *unimplementedHandler*.
3. You also need to enable wildcards for this rule. Click the gear icon in the match condition, then, in the Additional Options window, select the Wildcards in value check box and clear the case-sensitive value check box.
If you don’t see the gear icon, hover your mouse over the match condition.
4. Create a third rule with a match condition that enables the Site Failover behavior when the EdgeWorkers Execution status is Failure OR the PMUSER_RP_ERROR is true.
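The decision these rules encode can be sketched in plain JavaScript. This is only an illustrative model of the logic described in steps 2 and 4, not Property Manager configuration; the function names are hypothetical, and the `'Failure'` string simply mirrors the execution status named above.

```javascript
// Illustrative model of the detection rules. This is not Property Manager
// configuration, and the function names are hypothetical.

// Case-insensitive match in which "*" matches any run of characters.
function wildcardMatch(pattern, value) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`, 'i').test(value);
}

// Step 2: the error flag is set when the status is non-empty and matches
// neither *success* nor *unimplementedHandler*.
function isResponseProviderError(rpStatus) {
  if (!rpStatus) return false;
  return !['*success*', '*unimplementedHandler*'].some((pattern) =>
    wildcardMatch(pattern, rpStatus)
  );
}

// Step 4: fail over when the execution status is Failure OR the flag is set.
function shouldFailOver(executionStatus, rpStatus) {
  return executionStatus === 'Failure' || isResponseProviderError(rpStatus);
}
```

Note that an empty status never triggers failover on its own, which matches the "is not empty" condition in step 2.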
Failover Options
Now that your property is set up to detect EdgeWorkers failures, it’s time to decide what to do upon failure detection.
1. Retry the Request
Retrying the request is useful when failures are occasional and caused by resource limits. A retry will often succeed when the original error came from hitting a CPU or memory limit or from a timeout. Be sure to use enhanced debugging to understand the cause of the failure.
The example above retries the request to the original hostname (user.PMUSER_ORIGINAL_AK_HOST) and path.
2. Failover to Static Content
You can serve alternate content from NetStorage upon failure. This is likely appropriate when the function is used for authentication. Serve an alternate message rather than allowing the user access to restricted content.
3. Failover to a Different Location
Forward the request to a different location so that origin logic can be executed.
4. Serve Stale Content
This option allows you to serve content that is already in cache.
Bypass Failover for Testing
It’s often useful to see the original error and headers when a failure occurs, bypassing the Property Manager failover logic. You can do this by setting and testing for a particular query parameter.
The match condition below can be used to bypass the failover behavior if the query string bypass-failover=true is present on the request.
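In code terms, the check amounts to something like the sketch below. It is an illustrative model only, not Property Manager configuration; the function name `bypassFailover` and the placeholder base URL are assumptions made for the example.

```javascript
// Illustrative model of the bypass match condition. This is not Property
// Manager configuration, and the function name is hypothetical.
function bypassFailover(requestUrl) {
  // The placeholder base is only needed so that relative URLs can be parsed.
  const url = new URL(requestUrl, 'https://example.invalid');
  return url.searchParams.get('bypass-failover') === 'true';
}
```

For example, `bypassFailover('/page?bypass-failover=true')` evaluates to `true`, while a request without the parameter, or with any other value, leaves the failover logic in place.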
Conclusion
JavaScript and platform failures are certain to occur, so it makes sense to plan for them in advance. Handle JavaScript errors by following good exception-handling practices within the code. Handle platform failures with the Site Failover behavior in Property Manager.