For the second consecutive June 30th, Office 365 has suffered a service disruption. As reported in The Inquirer, users across Europe had difficulty logging in to services. Have you ever wondered what’s going on with the Office 365 services your company uses? Maybe you can’t really tell if things are working properly. You aren’t alone! There have been a number of such events in the past, and it’s highly likely there will be more in the future. All clouds have bad days. If you look at recent headlines, you quickly see disruptions that have impacted Amazon, Google, Apple and yes, even Office 365.
As an administrator, any problem with a cloud service that the organization relies on is tricky. When things are working, the benefits are obvious. When things aren’t working as expected, the problems are glaring. End users are frustrated and administrators often have little that they can do to restore key systems. With that in mind, here are a few things you can do to make sure productivity remains high, or so that you, as an administrator, can provide an informative update to your staff.
First, try to determine whether the problem is internal. Since Office 365 is a cloud service, it’s important to pinpoint exactly where the issue is. If you don’t first rule out an internal cause, you risk wasting a good deal of time. Try a couple of things before moving on down the list:
- Can you access the internet? Start by trying a couple of websites on more than just one machine. This will determine if there is a problem with your network.
- If you do note an issue on your end, it’s important to determine the cause. Is it affecting all users, some users, or only one or two? Is it an issue at your ISP, an internal hardware issue (a failed router, for example), a DNS problem or an internal DHCP issue? Using an old-school command-line tool like ipconfig, ping or tracert can help narrow it down.
- If the problem is ISP related, the organization should have a secondary ISP ready. When using cloud solutions it’s essential you have multiple ways to access the cloud.
- If it’s not on your end… it has to be on their end. While you’re confirming that your side is solid, you should already be starting on the next step.
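The checks above can be roughly automated. The following is a minimal Python sketch, not a definitive diagnostic: it compares an Office 365 endpoint against a couple of unrelated sites using a DNS lookup and a TCP connection on port 443. The host names, the `probe`, `classify` and `run_checks` function names, and the verdict wording are all assumptions chosen for illustration.

```python
import socket

# Hypothetical probe targets: one Office 365 endpoint plus unrelated
# sites used as a comparison. Swap in hosts that matter to you.
OFFICE_HOST = "outlook.office365.com"
CONTROL_HOSTS = ["www.wikipedia.org", "www.example.com"]


def probe(host, port=443, timeout=3.0):
    """Return (dns_ok, tcp_ok) for a single host."""
    try:
        address = socket.gethostbyname(host)
    except OSError:
        # Name resolution failed, so no connection was attempted.
        return (False, False)
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return (True, True)
    except OSError:
        return (True, False)


def classify(office, controls):
    """Turn probe results into a rough verdict.

    office   -- (dns_ok, tcp_ok) for the Office 365 endpoint
    controls -- list of (dns_ok, tcp_ok) for the unrelated sites
    """
    controls_up = [tcp_ok for _, tcp_ok in controls]
    if not any(dns_ok for dns_ok, _ in controls) and not office[0]:
        return "local DNS problem likely"
    if not any(controls_up) and not office[1]:
        return "local network or ISP problem likely"
    if any(controls_up) and not office[1]:
        return "internet works, Office 365 endpoint unreachable"
    return "no connectivity problem detected"


def run_checks():
    """Probe all hosts over the network and return the verdict string."""
    office_result = probe(OFFICE_HOST)
    control_results = [probe(host) for host in CONTROL_HOSTS]
    return classify(office_result, control_results)
```

Calling `print(run_checks())` on more than one machine mirrors the advice above. A successful TCP connection only shows that the endpoint is reachable, not that sign-in or mail flow is healthy, so treat the verdict as a first filter.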
Assess which methods of service connectivity are being impacted. It’s possible, for example, that users connecting to Exchange Online through Outlook are working as expected but users accessing mail through Outlook on the Web are experiencing problems. Microsoft offers some tips on how to check the Office 365 service health. But that alone may not be enough.
- Answering the question of which services inside Office 365 are being impacted can be tough. The Office 365 admin console is not specific to YOUR tenant. In other words, Microsoft doesn’t sound the alarm unless an outage is large enough to warrant it. In the past there have been service outages during which the health status in the Office 365 admin console continued to show green for several hours before Microsoft switched it to red. You can find some information on the Office 365 status and service status pages.
- Is there any indication that something may be impacting your service? If not, proceed down the list.
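If you retrieve service health data programmatically, filtering it for anything that is not healthy is straightforward. The sketch below assumes a payload shaped like the response of the Microsoft Graph service health overview endpoint; the field names (`value`, `service`, `status`), the `serviceOperational` status string and the `unhealthy_services` helper are assumptions to verify against current Microsoft documentation, and authentication and the HTTP request are deliberately left out.

```python
# Assumed status string for a healthy service; check it against the
# current Microsoft Graph documentation before relying on it.
HEALTHY_STATUS = "serviceOperational"


def unhealthy_services(payload):
    """Return (service, status) pairs for every service not reporting healthy."""
    problems = []
    for entry in payload.get("value", []):
        status = entry.get("status", "unknown")
        if status != HEALTHY_STATUS:
            problems.append((entry.get("service", "unknown"), status))
    return problems


# Hand-written sample payload for illustration only, not real service data.
sample = {
    "value": [
        {"service": "Exchange Online", "status": "serviceDegradation"},
        {"service": "SharePoint Online", "status": "serviceOperational"},
    ]
}
```

Because this data comes from Microsoft’s own reporting, it can lag in the same way as the admin console, so an empty result does not prove that everything is fine.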
Verifying with non-Microsoft sites can be extremely helpful in determining what is happening with the service. Microsoft recently reported there are more than 100 million active users on Office 365. With that many users, if there is a problem, people are likely talking about it and their experiences can help you pinpoint the problem.
- One site that aggregates user experiences with cloud services is DownDetector.com. This site acts as a “canary in the coal mine” and can provide an indication of problems. Be warned that this is a noisy channel and you’ll need to sift through comments to find meaningful information.
- Reddit is another site that can provide updates as they are happening. Here’s a link to a thread from the June 30 Office 365 disruption. Searching Reddit for “Office 365 down” or “Exchange Online down” will uncover these threads.
- Twitter is another good source of real-time updates on what is happening with other tenants around the globe. Administrators frequently turn to social channels to check on the experiences of others. For instance, keep an eye on the #O365Outage hashtag.
- Media sites are also worth checking for broad events. They usually won’t pick up on the small disruptions, but if something major is happening, certain publications usually provide some information. Searching sites like Computer Weekly, Business Insider and The Register can be useful. A Google search filtered to results from the last 24 hours or the past week can also be helpful.
If there is a major problem impacting regional tenants, now is the time to activate continuity plans for your organization. Make sure employees know your plan beforehand. If end users don’t know what to do when there are service connectivity issues, keeping them productive will be difficult. Many applications have offline modes and/or file sync, so users can at least keep working on current tasks. Granted, any connectivity to the outside world will be challenging.
- If you use a third-party service like Mimecast for email connectivity, it’s easy. Administrators can start a continuity event for the whole organization or for specific groups to keep staff connected. Alternatively, individuals can activate continuity themselves to continue sending and receiving mail.
After the event, think about what can be done to make it easier on the organization moving forward.
- Get feedback from the impacted groups. Identify best practices and what needs to change so that you’re more prepared when this happens again.
- Use the event as an opportunity to make changes to processes and technology. A disruption is a reminder that not everything works as expected. Use it to initiate action, because as the event fades into memory, users will go back to assuming everything will always work.
Don’t be caught flat-footed when a cloud service like Office 365 has a disruption that impacts your organization. By quickly assessing the problem, communicating effectively and initiating an action plan, you can keep things running smoothly until the services are back online.