Seek has stood up a cloud-based security data lake, initially to improve its incident response but with a view to using it for more proactive activities including threat hunting and attack path mapping.
Head of cloud and product security Andrew Bienert told AWS re:Invent at the end of last year that the employment marketplace is up to its “first iteration and use” of Amazon Security Lake, a relatively new cloud service that was made generally available by AWS in May last year.
Seek had been after a way to bring together security logging and event data from myriad tools and services, both native AWS and third-party.
“The amount of log data that security teams today are expected to manage is just relentlessly increasing,” Bienert said.
“Compounding this problem is the insane number of security tools we have in the industry as well.
“That feeds into us having to manage multiple solutions [and] different ways of managing all of our data. For us, for example, we have one way of dealing with our VPC flow logs, another for CloudTrail, and yet another for how we work with our EDR [endpoint detection and response] logs.
“If you’re in the operations space or an analyst, having to work with all different types of applications all of the time, having to know where the data is, is an overhead. There’s a nontrivial cost of maintaining all these different moving parts at all times as well.”
Bienert said that work had already been underway internally with Seek’s data scientists to build a security-oriented data lake when AWS previewed its service.
“That really helped us focus our efforts [around] what we were designing towards,” Bienert said.
Bienert said that Seek set out to stand up a data lake that was “modular and flexible”, where value could be delivered “incrementally over time”.
He also wanted to stand up a lake that could be utilised to improve incident response in the first instance, but also support other use cases – security and non-security related – as time went on.
“Obviously the basic capability is in incident response, but we wanted to be able to expand out towards threat hunting capabilities and threat detection, look at high risk user behaviour, and also facilitate activities like attack path mapping,” Bienert said.
“What you’ll notice there is we’ve gone from incident response which is a very reactive security use case, to now expanding out into more proactive use cases.”
He added that over time, the team had also realised that “the data stored in the data lake didn’t necessarily need to just support security use cases.”
“We thought we could add value to other parts of the business with this data as well.”
So far, the security data lake is set up to ingest “supported native AWS sources” of log data, including VPC flow logs, CloudTrail, AWS Security Hub and DNS logs, as well as some third-party sources such as CrowdStrike and Netskope.
“We’ve also worked with AWS Professional Services as well as Carbon Black to build out our own custom source for Carbon Black logs to be ingested into Security Lake,” Bienert said.
Ingestion work is continuing, with efforts underway to bring in data from Okta, Proofpoint, GitHub “and other internal tools, as well as additional DNS and firewall logs.”
“We’re currently [also] running a proof of concept with Splunk, trialling out that integration with Security Lake and testing out those queries.”
To aid the expansion of Security Lake’s use cases, other feeds are set to be integrated, from Workday for “business context” data, to VirusTotal and AbuseIPDB.
Analysts run SQL queries on the data lake using Amazon Athena.
Incident investigations
Bienert highlighted two incidents that were investigated – and ultimately found to be benign – with the aid of Security Lake.
In one incident, the lake was utilised to investigate a detection by Amazon GuardDuty of an “unauthorised access event to a [production] S3 bucket from a malicious IP caller from one of our custom threat lists.”
“For anyone who has worked in security for some time or operations or as an analyst … there’s certain events that you see that make you sit up and take notice,” Bienert said. “This was one of those.”
Bienert said the bucket contained customer data but wasn’t public – “We knew that because we’ve got internal controls to prevent that happening”.
After examining individual logs without a hit, Seek utilised the then-four-week-old data lake to ultimately determine the threat to be benign.
“It turned out that the IP address that was on our threat list had been reallocated some time before and so we had to actually remove that from our custom threat list,” Bienert said.
The second case Bienert disclosed was an indicator-of-compromise investigation.
“We had some intel provided to us that a well-known APT [advanced persistent threat] was targeting companies like ours, and we’d been provided an IP address to go and search on. We went and did that in some of our systems and got no hits,” he said.
“Rather than use our legacy VPC flow log system to search, we went straight to Security Lake. The reason for that was this was a ‘looking for a needle in a haystack’ problem, and we needed to search across all of our accounts, all 350-plus of them, across multiple regions, and [in] a variable time window.
“Security Lake was able to provide a hit for us really quite quickly. We were able to identify a host running in one of our accounts that had been communicating with this malicious IP which means that we kicked off a forensic investigation.
“Again, it turned out to be benign, which is fortunate because I can talk about it. Then we were able to close that investigation out.”
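Seek did not share the query it ran. As a rough illustration only, a search of this kind over VPC flow logs could be expressed as an Athena SQL statement along the lines of the sketch below. The database and table names, the column names (`accountid`, `region`, `time_dt`, `src_endpoint.ip`, `dst_endpoint.ip`) and the helper `build_ioc_query` are all assumptions made for the example, loosely modelled on the OCSF-style tables Security Lake exposes, and would need to be checked against an actual Glue catalog.

```python
import ipaddress
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed names for illustration; replace with the database and table
# that appear in your own Glue catalog.
DATABASE = "amazon_security_lake_glue_db_us_east_1"
TABLE = "amazon_security_lake_table_us_east_1_vpc_flow_2_0"


def build_ioc_query(ip: str, days_back: int, now: Optional[datetime] = None) -> str:
    """Build an SQL string that looks for flows to or from one IP address.

    The column names used here are assumptions, not a confirmed schema.
    """
    # Validate the indicator so only a well-formed address reaches the SQL text.
    addr = str(ipaddress.ip_address(ip))  # raises ValueError if malformed

    # The variable time window: from `days_back` days ago until `now`.
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    fmt = "%Y-%m-%d %H:%M:%S"

    return (
        "SELECT accountid, region, time_dt,\n"
        "       src_endpoint.ip AS src_ip, dst_endpoint.ip AS dst_ip\n"
        f'FROM "{DATABASE}"."{TABLE}"\n'
        f"WHERE (src_endpoint.ip = '{addr}' OR dst_endpoint.ip = '{addr}')\n"
        f"  AND time_dt BETWEEN TIMESTAMP '{start.strftime(fmt)}'\n"
        f"                  AND TIMESTAMP '{end.strftime(fmt)}'\n"
        "ORDER BY time_dt"
    )


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-range address, not a real indicator.
    print(build_ioc_query("203.0.113.7", days_back=30))
```

The sketch only builds the query text. Running it would mean submitting the string to Athena, for example through the console or an SDK call such as boto3's `start_query_execution`. Because the statement does not filter on account or region, a single query covers every account and region whose logs land in the table, which is the property Bienert described as making the lake suited to a "needle in a haystack" search.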