New Flood Protection in Ultimate Forms for Office 365
Ultimate Forms gives you a wide variety of capabilities to extend and automate your SharePoint-based business solutions. Our business logic components in particular give you tremendous power to create, update and manage items, lists and sites, as well as data in external business applications. Because it is a sophisticated, generic platform, we put the steering wheel in your hands and allow you to configure any business process you need.
There is, however, a dark side to this approach. When used incorrectly, it can generate an extreme workload for the servers, in some cases grinding them to a halt. In Office 365, Ultimate Forms is implemented as a provider-hosted app, meaning that all the processing takes place on our own servers and not inside SharePoint. Our servers run on the Azure infrastructure and are designed to handle the workload, with the ability to scale out as needed as more customers start using Ultimate Forms. Of course, there is a cost associated with running those servers and, like any business, we try to optimize as much as we can, running only the number of servers needed for normal operation.
In some cases, a customer, without any ill intent, might configure an action (or alert, etc.) in a way that generates an extremely high volume of work. For instance, an action might be configured to update all the items in the list, which triggers the same action over and over again. The items are then updated endlessly, generating a huge workload that overwhelms our servers and causes slowdowns and even interruption of service for customers within the same geographic location.
We understand that this is not intentional, and in most cases the user is not even aware of it. We recently discovered a case where a single list generated over 500,000 action calls in less than 12 hours!
We decided to handle this potential issue proactively and develop an automated system to monitor and resolve such cases before they start affecting our quality of service. This system has already been implemented for actions and will also be added to alerts, item IDs and associated items summary columns.
The idea is quite simple: we need to know when a certain list starts generating unusual volumes of updates and stop it from flooding our system. We now count how many events each list triggers during every minute, resetting the counter as each minute passes. If a list reaches 30 events, we stop handling any more of its events within the same minute. We also count how many times the list reached that threshold during the same day. If it happens 10 times, we disable the event type for all actions on the list until you re-enable it, and send an email to the site collection administrators. You can then review your list, adjust your actions and re-enable event handling. Note that the counter for that day is not reset: if the number of events reaches 30 in a minute again during the same day, the actions will be disabled immediately and you will receive another email.
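The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production code: the names FloodGuard, ListState, allow and re_enable are hypothetical, the 30th event in a minute is assumed to be handled before throttling starts, and the email to the site collection administrators is replaced by a simple callback.

```python
from collections import defaultdict
from dataclasses import dataclass

PER_MINUTE_LIMIT = 30    # events handled per list per minute (from the post)
DAILY_STRIKE_LIMIT = 10  # times per day the limit may be reached (from the post)


@dataclass
class ListState:
    minute: int = -1        # minute window currently being counted
    day: int = -1           # day currently being counted
    events: int = 0         # events handled in the current minute
    strikes: int = 0        # times the per-minute limit was reached today
    disabled: bool = False  # True until an administrator re-enables events


class FloodGuard:
    """Hypothetical sketch of the per-list throttle described above."""

    def __init__(self, notify=print):
        self.states = defaultdict(ListState)
        self.notify = notify  # stand-in for emailing the administrators

    def allow(self, list_id, now_seconds):
        """Return True if the event should be handled, False if dropped."""
        s = self.states[list_id]
        if s.disabled:
            return False
        minute, day = int(now_seconds // 60), int(now_seconds // 86400)
        if day != s.day:          # new day: forget yesterday's strikes
            s.day, s.strikes = day, 0
        if minute != s.minute:    # new minute: reset the event counter
            s.minute, s.events = minute, 0
        if s.events >= PER_MINUTE_LIMIT:
            return False          # throttled for the rest of this minute
        s.events += 1
        if s.events == PER_MINUTE_LIMIT:
            s.strikes += 1
            if s.strikes >= DAILY_STRIKE_LIMIT:
                s.disabled = True
                self.notify(f"Event handling disabled for list {list_id}")
        return True

    def re_enable(self, list_id):
        """Re-enable handling; the day's strike count is deliberately kept."""
        self.states[list_id].disabled = False
```

Because re_enable leaves the strike count untouched, a list that floods again on the same day is disabled as soon as it reaches the per-minute limit once more, which matches the behavior described above.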
We analyzed usage data before settling on these limits. In 99.9% of cases, no list ever reaches such a frequency of events (the average is normally one event per minute or fewer). When over 30 items are updated in a list at the same time, it usually indicates a problem. Note that the mechanism only applies to event-based actions. Manual actions and timer-based actions are not affected.
We hope you understand the motivation behind these changes. We strive to make our system as reliable as possible, with near 100% availability, so that you can build robust and dependable solutions.
UPDATE 2018-08-23: Based on the performance analysis, we decided to relax some of the restrictions. Actions are now allowed to run up to 100 times a minute without being stopped (and the first time the limit is reached, they are not stopped at all), and actions only get blocked if the per-minute count is exceeded 20 times. Only actions are affected by the new rules for now. We are monitoring how the system reacts to the new rules and will consider further improvements, as long as there is no adverse effect.
UPDATE 2018-09-04: We further relaxed the restrictions based on the performance metrics. We currently allow up to 100 events a minute, and the limit may be reached up to 20 times a day, for all components. Also, the first 3 times a list reaches 100 events in a minute, its events are not stopped but allowed to execute. We will keep monitoring the system to ensure these restrictions are suitable, and we might relax or tighten them further based on the information we collect. Note that you can now temporarily disable all event handling, which lets you perform bulk operations without the risk of being throttled.