Over the last few days we've had a number of issues delivering emails to Microsoft's consumer email services at Hotmail, Windows Live and Outlook.com. The root cause of these issues has now been identified and addressed, and while our engineering team has already ensured this specific situation won't occur again, we'll be putting in place more proactive monitoring and reporting systems so we can catch compromised or malicious users in future before they can become a problem.
No Accelo systems were breached or client data exposed. The short version is that someone took advantage of our trust and we then delivered some email for them which cost us a lot of time and money.
A spammer compromised a system/server outside of Accelo, and then exploited the trusted email delivery pathway we had set up for that system to select mail domains (including Hotmail, Live and Outlook.com) to deliver spam emails through us, primarily to @hotmail.com email addresses. This occurred primarily on October 29th and 31st.
As a result of this malicious behavior, Microsoft's mail protection teams saw a spike in bounce rates (and a few spam reports), and by comparing the delivery volume during these events to our baseline volumes, Microsoft decided to blacklist one of our IP addresses. This occurred a few days later, on November 3rd.
On the morning of Friday November 3rd (San Francisco, or Pacific time) we received reports from our users of emails being undeliverable because one of our IP addresses was blacklisted. They got a bounce-back from Microsoft, and while the details were still vague, the response was timely.
Our engineers immediately checked the databases of Blacklists and confirmed we weren't listed, as well as checking our SenderScore (where we scored 99%). This was something localized to Microsoft.
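For anyone who wants to run a similar check themselves: most public blacklists are DNS-based, and a lookup amounts to reversing the octets of the IP address, appending the blacklist's zone, and seeing whether that name resolves. Here's a minimal Python sketch of that convention. It is not the tooling our engineers used; the zone `dnsbl.example.org` is a placeholder rather than a real list, and the address is from a range reserved for documentation.

```python
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address on a DNS blacklist."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address, got %r" % ip)
    # DNS blacklists are queried with the octets in reverse order.
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the blacklist zone has an entry for this IP.

    Caveat: any resolution failure (including a network problem) is
    treated here as "not listed", so a real check needs better handling.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# Placeholder zone and documentation-range IP, for illustration only.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Note that a check like this only covers public lists. A provider-specific block like the one we hit from Microsoft won't show up in it.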
The instructions for requesting removal from the Microsoft-specific blacklist were not well maintained (for anyone who has this problem, the form you need to submit at the time of writing is here). Being unable to see any details for the rejection cause at all from Microsoft, we submitted the application for removal at around 8am Pacific on Friday, and it took Microsoft a couple of days to remove us from their blacklist. When removed, they sent us a slightly confusing message: "We have implemented mitigation for your IP and this process may take 24 - 48 hours to replicate completely throughout our system."
After almost a day of successful deliverability (Monday), another part of the Microsoft team decided to Blacklist us again on Monday evening, only this time, instead of bouncing emails immediately, they merely delayed them. When clients reported problems on Tuesday morning, we didn't have any bounce report information to work with (since they were being delayed, not rejected), and given Microsoft had told us it could take up to 48 hours for the blacklist to be "mitigated", we initially thought this was a symptom of the delay Microsoft had advised us to expect.
On Tuesday afternoon we followed up with Microsoft again, and an escalation team informed us that they couldn't look into the IP address issue further, saying cryptically at 8pm Pacific "Our investigation has determined that the above IP(s) do not qualify for mitigation". They did suggest we sign up for the SNDS service (https://postmaster.live.com/snds/trusted.aspx) so we wouldn't be completely in the dark (but this is proving difficult because Google admins intercept emails to abuse@ and postmaster@ so we're still in the dark).
Recognizing this wasn't a one-off error on Microsoft's part but a deliberate choice they'd made, and having zero information from them about their reasons for blocking us, we undertook our own investigation, looking through our many GBs of logs to see if we could find spikes or patterns in delivery to Hotmail/Live/Outlook addresses which would have caused Microsoft to give us the cold shoulder.
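To give a sense of what a search like this involves, here's a minimal Python sketch. It is an illustration, not our actual tooling. It assumes the logs have already been parsed into hypothetical `(day, client, recipient)` records, and the `factor` and `min_messages` thresholds are made-up values. It counts each client's daily volume to the Microsoft consumer domains and flags days far above that client's own median.

```python
from collections import Counter
from statistics import median

# Microsoft consumer domains named in this post.
MS_DOMAINS = {"hotmail.com", "live.com", "outlook.com"}


def daily_ms_volume(records):
    """Count messages per (client, day) sent to Microsoft consumer domains.

    `records` is an iterable of (day, client, recipient) tuples, a
    hypothetical already-parsed form of a mail log.
    """
    counts = Counter()
    for day, client, recipient in records:
        domain = recipient.rsplit("@", 1)[-1].lower()
        if domain in MS_DOMAINS:
            counts[(client, day)] += 1
    return counts


def find_spikes(counts, all_days, factor=10, min_messages=100):
    """Flag (client, day, count) where volume is far above the client's median.

    Days with no traffic count as zero, so a client that only ever sends
    in bursts is still flagged. Thresholds are illustrative assumptions.
    """
    clients = sorted({client for client, _ in counts})
    spikes = []
    for client in clients:
        per_day = [counts.get((client, day), 0) for day in all_days]
        baseline = max(median(per_day), 1)  # avoid a zero threshold
        for day, n in zip(all_days, per_day):
            if n >= min_messages and n > factor * baseline:
                spikes.append((client, day, n))
    return spikes
```

The median is used as the baseline because a couple of burst days barely move it, provided the window covers many more normal days than abnormal ones.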
In this investigation, our engineering team found that one external client server (which appeared to be compromised) was taking advantage of a specific configuration on our side to relay emails specifically to hotmail.com addresses. Our engineering team then changed the configuration (which itself was only implemented to overcome a different Microsoft deliverability issue back in June of 2014), which ensures we can't be taken advantage of again this way by this server or any other trusted servers.
In addition, we also updated our sender IP address to a new IP, and were able to get mail flowing to Microsoft's consumer email servers again (even though the previous address had been blocked). Mail which was sent after 10pm Pacific on Tuesday is now being successfully delivered to Microsoft consumer email servers (we haven't had a single deferral or bounce to a @hotmail.com address since then).
To mitigate the potential of good, trusted clients going bad (either because they've been compromised, or because they turn to the dark side), our engineering team will be implementing proactive analysis of sending patterns so we can block trusted people from doing the wrong thing before it can cause wider ripple effects. This is going to also result in an increase in our support/service burden (since we'll end up doing what Microsoft does and blocking people for our self-protection and then needing to tell them about it after the fact) - we will however work to ensure our communication is less cryptic and more forthcoming when/if these situations arise in the future.
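As a rough illustration of what proactive analysis of sending patterns can mean, here's a minimal Python sketch of a per-sender sliding-window check. It is not a description of the system we're building. The class name, thresholds and window size are all assumptions for the example. It flags a sender whose recent volume or bounce rate crosses a limit.

```python
from collections import defaultdict, deque


class SenderMonitor:
    """Track recent deliveries per sender and flag suspicious patterns.

    All limits are illustrative defaults, not recommended values.
    """

    def __init__(self, window_seconds=3600, max_messages=500,
                 max_bounce_rate=0.2, min_sample=50):
        self.window_seconds = window_seconds
        self.max_messages = max_messages
        self.max_bounce_rate = max_bounce_rate
        self.min_sample = min_sample
        # sender -> deque of (timestamp, bounced) for the current window
        self._events = defaultdict(deque)

    def record(self, sender, timestamp, bounced):
        """Record one delivery attempt and drop events older than the window."""
        events = self._events[sender]
        events.append((timestamp, bounced))
        cutoff = timestamp - self.window_seconds
        while events and events[0][0] <= cutoff:
            events.popleft()

    def should_block(self, sender):
        """Return True if the sender's recent volume or bounce rate is too high."""
        events = self._events[sender]
        total = len(events)
        if total > self.max_messages:
            return True
        # Only judge the bounce rate once there are enough messages to trust it.
        if total >= self.min_sample:
            bounces = sum(1 for _, bounced in events if bounced)
            return bounces / total > self.max_bounce_rate
        return False
```

The `min_sample` guard is there so a low-volume sender isn't blocked over one or two bounced messages, which is exactly the kind of false positive that adds to the support burden.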
We apologize for the inconvenience this issue has caused. Spam is a massive cost to society generally, and it really sucks when it goes beyond an irritation and stops us from doing our work. We also apologize for the frustration caused by us telling clients what we understood and believed to be true - but which, it turned out, wasn't always accurate (because of cryptic and out-of-date information we'd received). We're right there in the infuriated camp with you.
If you've got any further questions, feel free to email [email protected] - given the nature of these issues we can't disclose any more information publicly than we have above (when it comes to spam and security issues, you want to have as narrow an attack surface as possible), so we won't be able to answer questions in the comments below.