Implementing incident response in an organization’s delegated security account and leveraging automation to implement AWS security services in a master-member setup
About the Customer
The customer is a clearing house specializing in equity derivatives clearing, providing central counterparty clearing and settlement services to 16 exchanges.
Key Challenge/Problem Statement
The customer’s existing cloud security strategy consisted of several third-party tools that were implemented individually and did not feed into a centralized monitoring dashboard. In addition, no alerting was configured for security controls, findings were reviewed ad hoc, and findings were remediated manually. With thousands of security findings generated each day, the security team could not keep up, and many non-compliant resources persisted in the environment. Beyond the lengthy remediation times, the decentralized approach left the customer without visibility into security metrics across the whole AWS environment. This made auditing difficult, as the team had to piece information together from several different sources to prove compliance and pass third-party audits.
State of customer’s Business Prior to Engagement
The customer was originally using Dome9 as its enterprise cloud security tool for firewalls and endpoint protection. Many different third-party tools and compliance scanners were used to detect vulnerabilities and keep the cloud environment compliant. This caused tool sprawl and made the compliance tooling impossible to maintain at scale. Security engineers often had to switch back and forth between tools to deal with hundreds, and sometimes thousands, of security alerts every day. To overcome this operational challenge, the customer’s infosec leadership decided to implement AWS Security Hub for its cloud infrastructure and enable the “CIS AWS Foundations Benchmark” and “AWS Foundational Security Best Practices” standards to maintain a secure and compliant cloud environment. However, Security Hub was not implemented consistently from account to account, alerting was still not configured for critical findings, and the findings were not being centralized into Splunk as expected.
Proposed Solution & Architecture
To put the security baseline in place in each member account and automatically centralize the collection of findings to a single administrator Security Account, we implemented the solution outlined by the architecture below.
Figure-01
To replace the customer’s existing Dome9 solution with a robust, cloud-native IDS/IPS solution, our team enabled a set of AWS monitoring services for the customer’s AWS Organization and added Lambda functions to remediate security issues. A single AWS security account was created to centrally manage all security-related findings.
As part of our solution, the following services were deployed to a centralized security account to ensure that all the findings could be automatically forwarded to a single account.
- Security Hub: Single pane of glass to view all the findings from security and compliance tools across each account in the organization. Additionally, the CIS AWS Foundations Benchmark and AWS Foundational Security Best Practices standards were enabled to monitor the cloud security posture of the accounts across the organization.
- Remediation Lambdas: As soon as findings arrived in the administrator Security Hub, those that could be automatically remediated were sent to a remediation Lambda via custom EventBridge rules. Remediation actions that this Lambda could perform included, but were not limited to, isolating potentially compromised compute instances and blocking principals from unauthorized accounts that were trying to access resources.
- SNS: Alerting was implemented by automatically sending findings to the appropriate stakeholders via Slack as soon as they landed in Security Hub. This same flow was also used to automatically forward findings into the customer’s Splunk tenant so the cloud security and compliance findings could be viewed in the same dashboards as the on-premises findings.
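The routing described in the list above can be sketched roughly as follows. This is a minimal illustration and not the customer's actual code: the `AUTO_REMEDIABLE` mapping, the action names, the `ALERT_SEVERITIES` threshold, and the `triage` function are all hypothetical. The sketch assumes the event shape Security Hub sends to EventBridge for imported findings, where `detail.findings` holds findings in the AWS Security Finding Format.

```python
# Hypothetical mapping from ASFF resource type to a remediation action name.
# The real set of auto-remediable findings was specific to the customer.
AUTO_REMEDIABLE = {
    "AwsEc2Instance": "isolate_instance",
    "AwsS3Bucket": "block_external_principal",
}

# Assumed alerting threshold; the source only says "critical findings".
ALERT_SEVERITIES = {"CRITICAL", "HIGH"}


def triage(event):
    """Split the findings in a Security Hub EventBridge event into
    remediation tasks and Slack-style alert payloads."""
    remediations, alerts = [], []
    for finding in event.get("detail", {}).get("findings", []):
        severity = finding.get("Severity", {}).get("Label", "INFORMATIONAL")

        # Queue a remediation task for every resource type we know how to fix.
        for resource in finding.get("Resources", []):
            action = AUTO_REMEDIABLE.get(resource.get("Type"))
            if action:
                remediations.append({
                    "action": action,
                    "resource_id": resource["Id"],
                    "finding_id": finding["Id"],
                })

        # Build one alert per finding that meets the severity threshold.
        if severity in ALERT_SEVERITIES:
            title = finding.get("Title", "Untitled finding")
            account = finding.get("AwsAccountId", "unknown")
            alerts.append({"text": f"[{severity}] {title} (account {account})"})
    return remediations, alerts
```

In a deployment along these lines, a Lambda handler would call `triage(event)`, dispatch each remediation task, and publish each alert to an SNS topic. The `{"text": ...}` shape matches the JSON body that Slack incoming webhooks accept.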
Then, we distributed Security Tool deployment Terraform scripts to each account owner in the organization to automatically configure and deploy a security baseline with AWS-Native security tools. Since this security baseline was written in Terraform, the different application teams could add to it if they needed to add security functionality specific to their application. The tools and capabilities enabled by this baseline security Terraform script were the following:
- Security Hub: Member account Security Hub tenant that aggregates findings from all the security and compliance tools across the account, and automatically shares all of the findings with the centralized administrator Security Hub tenant in the Security Account.
- AWS Config: Enforce detective security and compliance controls in each account in the organization. Controls specific to the client’s auditing requirements were implemented.
- GuardDuty: Detect threats by continuously analyzing VPC Flow Logs, DNS logs, and CloudTrail events.
- IAM Access Analyzer: Detect resources across each account that are configured with overly permissive privileges and violate regulatory compliance controls.
- AWS Systems Manager: Provide automatic remediation on instances that have triggered a Security Hub rule. Automatic remediations include patching hosts with insecure packages, running ad hoc virus scans to confirm that a host is virus-free, and scanning host disks for sensitive information.
For Security Hub, the administrator security account had to invite each member account across the organization. After the invitations were accepted, the member accounts were configured to automatically send their Security Hub findings to the centralized administrator security account. With this setup, all key security findings were sent to this account for centralized monitoring and remediation. Then, with all the findings centralized in the Security Account, SNS could be leveraged to automatically send alerts for critical findings to Slack and Splunk.
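The invitation handshake could look roughly like the sketch below. It is an assumption about how such a flow might be scripted and not the scripts used in the engagement. The helper names `invite_members` and `accept_invitation` are hypothetical. The `securityhub` argument is expected to be a boto3 Security Hub client (`boto3.client("securityhub")`), and the calls used (`create_members`, `invite_members`, `list_invitations`, `accept_administrator_invitation`) are standard Security Hub API operations.

```python
def invite_members(securityhub, accounts):
    """Run with credentials for the administrator security account.

    `accounts` is a list of {"AccountId": ..., "Email": ...} dicts.
    Returns (invited_account_ids, failed_account_ids).
    """
    response = securityhub.create_members(AccountDetails=accounts)
    failed = {item["AccountId"] for item in response.get("UnprocessedAccounts", [])}

    # Only invite accounts that were successfully registered as members.
    to_invite = [a["AccountId"] for a in accounts if a["AccountId"] not in failed]
    if to_invite:
        securityhub.invite_members(AccountIds=to_invite)
    return to_invite, sorted(failed)


def accept_invitation(securityhub, administrator_id):
    """Run with credentials for a member account.

    Accepts the pending invitation from `administrator_id`, if one exists.
    Only the first page of invitations is inspected in this sketch.
    """
    for invitation in securityhub.list_invitations().get("Invitations", []):
        if invitation["AccountId"] == administrator_id:
            securityhub.accept_administrator_invitation(
                AdministratorId=administrator_id,
                InvitationId=invitation["InvitationId"],
            )
            return True
    return False
```

Passing the client in as a parameter keeps the two halves of the handshake easy to run under different credentials, and easy to exercise with a stub client in tests.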
The administrator security account also had custom actions that linked the findings to the remediation Lambdas to orchestrate automatic remediation. This enabled us to implement automatic remediation (via SSM) for findings that should always be remediated and did not require human approval.
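As an illustration of how a custom action might hand a finding to SSM, the sketch below extracts EC2 instance IDs from a Security Hub custom action event and runs a patch document against them. It is a sketch under stated assumptions, not the customer's remediation code: the handler and helper names are hypothetical, `AWS-RunPatchBaseline` is used only as an example of an AWS-managed SSM document, and cross-account access (assuming a role in the member account that owns the instance) is deliberately left out.

```python
def instance_ids_from_event(event):
    """Collect unique EC2 instance IDs from the findings in a
    'Security Hub Findings - Custom Action' EventBridge event."""
    ids = set()
    for finding in event.get("detail", {}).get("findings", []):
        for resource in finding.get("Resources", []):
            if resource.get("Type") == "AwsEc2Instance":
                # ASFF resource IDs for instances are ARNs ending in /i-...
                ids.add(resource["Id"].rsplit("/", 1)[-1])
    return sorted(ids)


def handler(event, context, ssm=None):
    """Hypothetical Lambda entry point for a 'patch instance' custom action."""
    instance_ids = instance_ids_from_event(event)
    if not instance_ids:
        return {"patched": [], "command_id": None}

    if ssm is None:
        # boto3 is available in the AWS Lambda Python runtime.
        import boto3
        ssm = boto3.client("ssm")

    response = ssm.send_command(
        InstanceIds=instance_ids,
        DocumentName="AWS-RunPatchBaseline",
        Parameters={"Operation": ["Install"]},
    )
    return {"patched": instance_ids, "command_id": response["Command"]["CommandId"]}
```

The optional `ssm` parameter exists only so the handler can be exercised with a stub client; when invoked by Lambda it falls back to a real boto3 client.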
AWS Services Used
- AWS Infrastructure Scripting – Terraform
- AWS Storage Services – S3
- AWS Compute Services – Lambda
- AWS Management and Governance Services – CloudWatch, CloudTrail, Config, Organizations, SSM
- AWS Security, Identity, Compliance Services – Security Hub, IAM, Key Management Service, GuardDuty, Amazon Inspector, IAM Access Analyzer
- AWS Application Integration Services – SNS, EventBridge
Third-party applications or solutions used
- Splunk
- Slack
Outcome Metrics
- 50% more compliance failures and vulnerabilities were reported throughout the organization compared to the third-party tool previously in place. Stage and Production environment issues were prioritized, a backlog was created to address these findings immediately, and many of the findings were automatically remediated.
- Automated remediation Lambdas significantly reduced the issue resolution timeframe. Automation with one-click invocation and instant notification via Slack and HPSM brought down the resolution timeline by 75%. Certain existing approval processes still need amendments to pair well with the new remediation solution. However, the overall remediation timeline for critical issues came down drastically.
- Security Hub and the underlying Config packs were the primary detective controls that reduced the AWS threat radius, while the other detective services mentioned above added another layer of security to further strengthen the security posture of the AWS infrastructure. This multi-faceted security setup, with 24/7 support from AWS, provided much better protection against incoming threats.
- Significant time reduction when analyzing aggregated findings in Security Hub in the centralized security account.
- Quick incident response times with custom actions (powered by Lambda and SSM) to automatically remediate critical issues.
- Config was implemented to enforce compliance and auditing requirements, and evidence was centralized so that the client could pass audits with less manual effort.
Summary
With this project, our team enabled a set of security services and created automated remediation actions that account owners could deploy in a self-service manner. The combination of network, application, and monitoring services alongside notifications and custom actions helped the customer set up a secure perimeter for all its AWS accounts. It helped build threat detection and detective controls in a cloud-native and cost-effective way, and allowed the customer to tailor them to its infosec policies. Preliminary tests were positive, with most vulnerabilities identified and reported. After the production rollout, this cloud-native solution will replace the existing third-party IDS/IPS system that has been deployed on AWS.