A really great insight from Travis Greene on the launch of the AWS Machine Learning cloud service and the impact it may have on IT security.

I agree wholeheartedly with his commentary, but here are a few additional insights from our work at idax (@idaxsoftware) with clients in the identity and access management space and the world of analytics.

Firstly, on the analytics side, I’m really interested to find out what type of algorithms they’re offering. From the pricing structure and the mention of “training”, it looks like it’s based on supervised learning techniques. Supervised learning involves taking an algorithm and training it to spot future trends based on historical data, business rules and business knowledge. It’s a very powerful technique, as Amazon have shown on their website, but it does require huge amounts of data, a serious amount of business knowledge and plenty of setup time to get the best out of it.

What we at idax use is unsupervised learning. These algorithms require less data and no training or business knowledge. As a result, they’re incredibly quick to implement and run, with almost no setup costs, and they give repeatable results based on much smaller quantities of data. For example, we can load an access control file or an Active Directory or LDAP dump into idax on a Monday and be showing senior managers something interesting and actionable by the middle of the week.

Using these techniques we can spot staff who have moved department and not had their access removed; staff whose initial access has been copied from a peer; and staff who have had inadequate reviews.
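To make the first of those cases concrete, here is a minimal sketch of one simple unsupervised check. To be clear, this is not how idax works internally. It is just an illustration, and the user names, entitlement names, function names and the 0.3 threshold are all made up for the example. It scores each person by how closely their access overlaps with their department peers (Jaccard similarity) and flags anyone who looks nothing like the people they sit with, which is what you would expect of someone who moved department and kept their old access.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def peer_outliers(access, department, threshold=0.3):
    """Return {user: score} for users whose mean similarity to their
    department peers falls below `threshold` (an assumed cut-off)."""
    by_dept = defaultdict(list)
    for user, dept in department.items():
        by_dept[dept].append(user)

    flagged = {}
    for users in by_dept.values():
        if len(users) < 2:
            continue  # nobody to compare against
        for user in users:
            peers = [u for u in users if u != user]
            score = sum(jaccard(access[user], access[p]) for p in peers) / len(peers)
            if score < threshold:
                flagged[user] = round(score, 2)
    return flagged


# Hypothetical extract: entitlements per user, plus current department.
access = {
    "alice": {"fin_ledger", "fin_reports", "email"},
    "bob": {"fin_ledger", "fin_reports", "email"},
    "carol": {"fin_ledger", "fin_reports", "email", "fin_approvals"},
    "dave": {"wh_inventory", "wh_scanner", "email"},  # still has warehouse access
}
department = {"alice": "finance", "bob": "finance", "carol": "finance", "dave": "finance"}

print(peer_outliers(access, department))
```

With this sample data only dave is flagged. No training set or hand-written rule about finance versus warehouse access was needed; the peer group itself defines what normal looks like.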

On the security side, Travis points out that we have relied on sets of static rules for too long. Just think about the poor hordes of consultants trying to maintain a static list of toxic combinations and you’ll see he has it spot on. Clearly any democratisation of machine learning that helps get the message across that we should all be using dynamic predictive analytics rather than spreadsheets is a good thing.

At idax we have seen that just trying to get a handle on the access information in an organisation can be overwhelming. If organisations are to have any chance of managing risk effectively, they will need to use methods that don’t just focus on more efficient transaction processing.

Where I disagree with Travis is on the importance of behaviour analytics. Sure, that’s at the end of the theoretical rainbow, but our experience shows that there are plenty of insights, and a great deal of good information, to be gleaned by looking at access rights on their own. Clearly, opening up ML techniques should be about doing something quick and practical, like analysing your current Active Directory to look for outliers. And I promise you, you won’t need a year’s worth of transaction data to tell you something useful. Only recently, we were able to tell an organisation that one of its warehousemen had the same AD privileges to the inventory system as a DBA. You really don’t need to know anything about behaviour to understand what action to take.
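A finding like the warehouseman one can come out of a very small amount of code. The sketch below is an illustration only, not the idax method: the group names, job titles, function name and the `max_share` and `min_size` cut-offs are all assumptions. For each AD group it counts the job titles of the members and flags anyone whose title makes up only a small share of that group.

```python
from collections import Counter


def rare_title_members(group_members, title_of, max_share=0.25, min_size=4):
    """Return (user, group, title) for members whose job title accounts for
    less than `max_share` of a group's membership. Groups smaller than
    `min_size` are skipped because the shares would be meaningless."""
    findings = []
    for group, members in sorted(group_members.items()):
        if len(members) < min_size:
            continue
        counts = Counter(title_of[m] for m in members)
        for member in sorted(members):
            share = counts[title_of[member]] / len(members)
            if share < max_share:
                findings.append((member, group, title_of[member]))
    return findings


# Hypothetical Active Directory extract: group membership and job titles.
group_members = {
    "inventory_db_admin": {"dba1", "dba2", "dba3", "dba4", "wh1"},
    "inventory_read": {"wh1", "wh2", "wh3"},
}
title_of = {
    "dba1": "DBA", "dba2": "DBA", "dba3": "DBA", "dba4": "DBA",
    "wh1": "warehouseman", "wh2": "warehouseman", "wh3": "warehouseman",
}

for user, group, title in rare_title_members(group_members, title_of):
    print(f"{user} ({title}) is an unusual member of {group}")
```

Here the only finding is wh1 in the admin group. Nothing in the code knows what a DBA or a warehouseman does; the outlier falls out of the access data alone, with no behavioural or transaction history involved.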

It is still an open question where these new Machine Learning as a Service offerings will take us, whether enough people will take the plunge and, if they do, whether it will become just another BI front end. But the time for analytics in IT security has definitely arrived. And if that means security data being stored in the cloud, then I guess we’ll all have to get used to it.