The Cloud Gets High Performance Security "Hardware"

September 28, 2019  •  Vishal Jain

In the ancient history of network security, there was a time when the job was simple and straightforward. Then apps and attackers got better at getting through simple port/protocol firewalls, and netsec got interesting. Network security devices (firewalls, IPS, proxies, etc.) were expected to do far more analysis (more complex pattern recognition, longer encryption keys, compressed payloads) to determine the nature and intent of traffic, while still passing the right packets at line rate with minimal and predictable latency. The difficulty of doing this led to the incorporation of ASICs (application-specific integrated circuits) into network security devices, at the cost of flexibility. With the development and adoption of FPGAs (field-programmable gate arrays), network security appliance manufacturers could get nearly all of the performance of ASICs with much of the flexibility of x86. But when the public cloud became the new enterprise datacenter, dedicated security hardware disappeared and everything went x86.

As the public cloud matured, the demand for more specialized compute became apparent. Driven by demand for artificial intelligence and machine learning applications, public cloud providers began offering FaaS (FPGA-as-a-Service), such as AWS EC2 F1, for customers to take advantage of. While enterprises won’t typically start coding their own specialized applications for FaaS (such as network security), FaaS does offer some of the same advantages that FPGAs offer in security hardware, namely better security with predictable (i.e., consistently fast) performance. The trick is to make this FPGA-powered security available to enterprises that don’t care about FPGAs but just want high-performance network security.

At Valtix, we’ve built the first cloud-native network security platform. In pioneering the marriage of cloud and netsec, we ran into a few new requirements, mostly around helping folks understand which apps they need to protect and helping them deploy the appropriate network security in their cloud accounts to protect those apps. This forced us to focus on a specific approach, one that enables customers to discover, deploy, and defend. As organizations look to protect more critical, performance-sensitive, high-growth apps in the cloud, we saw a need to protect those apps with higher-performance netsec, the kind we used to get only from dedicated, FPGA-powered network security devices. Fortunately, this coincided with the advent of FaaS, and we have been able to take advantage of it by building an FPGA-powered network security capability in the cloud.

Of course, as I mentioned above, we had to make it easy for organizations to adopt. Networking and security folks don’t have time to learn how to interface with FPGAs directly, nor would we expect them to figure out where and how FPGAs can be applied. So we built our dataplane using a pipelined architecture, with an x86 dataplane for x86-based instances and an x86/FPGA dataplane for FPGA-based instances, and our controller decides which to deploy, for what tasks, and how.

So how does our controller decide which to deploy? One key difference between the Valtix architecture and legacy appliance-based firewalls is that we have decoupled our data plane from our control plane. Our controller holds the full state of each firewall instance. So depending on traffic behavior and/or volume, latency requirements, and the availability of FPGA instances in a particular region and/or cloud, the controller chooses the right dataplane for that particular firewall instance.
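To make that decision concrete, here is a minimal Python sketch of how a controller could pick a dataplane from per-instance state. It is an illustration only, not the Valtix controller: the names `InstanceState`, `Dataplane`, and `choose_dataplane`, and the threshold values, are all assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Dataplane(Enum):
    X86 = "x86"
    X86_FPGA = "x86+fpga"


@dataclass
class InstanceState:
    """Hypothetical per-firewall state a controller might track."""
    throughput_gbps: float    # observed traffic volume
    latency_budget_ms: float  # latency the protected app can tolerate
    fpga_available: bool      # are FPGA instances offered in this region/cloud?


def choose_dataplane(state: InstanceState,
                     throughput_threshold_gbps: float = 5.0,
                     latency_threshold_ms: float = 1.0) -> Dataplane:
    """Pick a dataplane; both thresholds are illustrative assumptions."""
    if not state.fpga_available:
        # No FPGA instances in this region/cloud: x86 is the only option.
        return Dataplane.X86
    heavy_traffic = state.throughput_gbps >= throughput_threshold_gbps
    tight_latency = state.latency_budget_ms <= latency_threshold_ms
    return Dataplane.X86_FPGA if (heavy_traffic or tight_latency) else Dataplane.X86
```

Because the control plane is decoupled from the data plane, a decision like this can be re-evaluated whenever the observed state of a firewall instance changes, without touching the traffic path itself.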

Another question that came up during the initial architecture work was what to implement in x86 vs. what to implement in FPGA for the dataplane that runs on Amazon EC2 F1 instances. So we looked at what x86 does well vs. what FPGAs do well. Where x86 (a state machine) does well with sequential compute, FPGAs do better with parallel compute, so we use FPGAs to offload many network security tasks:

  • TLS acceleration (ECC, RSA)
  • Fast pattern search acceleration (for SNORT/ModSec rules)
  • Regex acceleration (SNORT/ModSec rules)
  • Decompression acceleration (to look into compressed payloads)
  • HTTP API parsing-based analytics acceleration (to fully parse a deeply nested JSON request body and get the full details of the API invocation)
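For readers less familiar with these tasks, the following Python sketch shows in plain software what decompression, fast pattern search, and regex matching look like when chained together. It is an assumption-laden illustration, not our dataplane and not the SNORT or ModSec rule format: the `Rule` type, the two sample rules, and the `inspect` function are hypothetical. The idea it shows is the common one of using a cheap literal search to pre-filter rules before running the more expensive regex.

```python
import gzip
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical, simplified rule: a cheap literal plus a full regex."""
    name: str
    fast_pattern: bytes  # lowercase literal that must be present
    regex: re.Pattern    # more expensive check, run only on candidates


# Illustrative rules only; real rule sets are far larger and richer.
RULES = [
    Rule("sqli-union", b"union", re.compile(rb"union\s+select", re.IGNORECASE)),
    Rule("path-traversal", b"../", re.compile(rb"(\.\./){2,}")),
]


def inspect(body: bytes, content_encoding: str = "") -> list[str]:
    """Return the names of rules that match the (decompressed) body."""
    if content_encoding == "gzip":
        body = gzip.decompress(body)  # look inside the compressed payload
    lowered = body.lower()
    hits = []
    for rule in RULES:
        # Fast pattern search first; the regex runs only if the literal is present.
        if rule.fast_pattern in lowered and rule.regex.search(body):
            hits.append(rule.name)
    return hits
```

Each stage is independent per rule and per payload, which is the kind of work that parallelizes well and is therefore a natural candidate for offload.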

As you can imagine, network security folks are excited about our development here. But given that there are multiple, mature network security players who have moved their appliances to the cloud, why hasn’t this been done before? Because it’s not their hardware. It’s the cloud providers’ hardware. To get the performance advantage of FPGAs in the public cloud (e.g., AWS F1), one needs to take a clean-slate approach. Your network security has to have a lockless, asynchronous architecture in x86, so that when work is offloaded to the FPGA, x86 can process another request (with minimal context switching). In a synchronous architecture, by contrast, the request simply waits for the FPGA-offloaded task to complete. Synchronous offload can actually make performance worse, because every request pays for the PCIe round trip while the CPU sits idle. So we saw it as imperative to completely rethink the architecture, rather than simply bolting on an FPGA offload for marketing reasons.
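The difference between the two styles can be sketched in a few lines of Python with `asyncio`. This is only an analogy under stated assumptions, not our implementation: `fpga_offload` is a hypothetical stand-in that simulates the PCIe round trip with a sleep, and `OffloadStats` exists purely to count how many requests are outstanding at once.

```python
import asyncio
from dataclasses import dataclass

FPGA_ROUNDTRIP_S = 0.01  # assumed stand-in for PCIe + FPGA processing time


@dataclass
class OffloadStats:
    in_flight: int = 0
    peak_in_flight: int = 0


async def fpga_offload(payload: bytes, stats: OffloadStats) -> bytes:
    """Hypothetical offload call; the sleep simulates waiting on the FPGA."""
    stats.in_flight += 1
    stats.peak_in_flight = max(stats.peak_in_flight, stats.in_flight)
    await asyncio.sleep(FPGA_ROUNDTRIP_S)  # the CPU is free for other requests here
    stats.in_flight -= 1
    return payload[::-1]  # placeholder for the real result


async def run_async(payloads: list[bytes], stats: OffloadStats) -> list[bytes]:
    # Asynchronous style: submit every request, then collect results in order.
    return await asyncio.gather(*(fpga_offload(p, stats) for p in payloads))


async def run_sync_style(payloads: list[bytes], stats: OffloadStats) -> list[bytes]:
    # Synchronous style: each request waits for the previous round trip to finish.
    return [await fpga_offload(p, stats) for p in payloads]
```

With four payloads, `run_async` has all four offloads outstanding at the same time, while `run_sync_style` never has more than one, so its total time grows with the number of round trips. The single-threaded event loop also needs no locks around the shared counters, which loosely mirrors the lockless design described above.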

So what’s the result? Well, I can tell you that we’re pretty happy with it. We are announcing our F1 Beta here at the Xilinx Developer Forum (XDF 2019). Have a look at the charts below (tested with x86 vs. F1 on AWS): we see highly consistent latency and much higher performance with less consumption of compute. All told, we get higher performance and more predictable latency, regardless of how intense the attacks got.

Want to try it? We’d love for you to see how it works to protect your apps. Our F1 support will be generally available beginning in November of 2019, and we are happy to set up an evaluation for your organization. 
