The main reason for these two implementations is the growing performance disparity between hot edge server flash and cold core spinning disk. I also mentioned that storage systems will have to become more adept at moving data from server to HDD to cloud. In fact, there is a growing trend to minimize the amount of on-premise HDD in order to avoid buying, installing, repairing, and powering it. At the beginning of the disk array era the storage controller would boot and recognize a static set of disks; now the storage controller must look "up" to the server and "out" to a cloud provider (as well as "down" to any on-premise HDD).

One possible implementation of cloud tiering involves storage system interfaces to cloud gateways. There is a great deal of algorithmic complexity in storage tiering, especially when integrating cloud gateways into the configuration.

In this post, however, I'd like to propose that tiering in the context of new data lake architectures allows us to envision a secure tiering solution that (a) starts with the application, (b) travels down through the server and HDD tiers, and (c) continues out to a cloud service provider if desired. If a new infrastructure is emerging in the industry, it makes sense to design secure data tiering and placement into that implementation. The application starts by specifying its SLA and security needs, and could in theory also specify a percentage of application data that can be stored off-premise. This approach (specifying security-centric SLAs) happens to be the focus of new research in Europe that I was briefed on last week: secure provisioning of cloud storage (and other cloud-based resources).
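As a rough sketch of what application-driven placement could look like, the Python fragment below models a policy that carries a latency SLA, an encryption requirement, and an off-premise cap, and then chooses among server flash, on-premise HDD, and cloud tiers. All of the names and fields here are hypothetical illustrations, not a real product API.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SERVER_FLASH = "server_flash"  # hot tier: flash in the application server
    ON_PREM_HDD = "on_prem_hdd"    # warm tier: on-premise spinning disk
    CLOUD = "cloud"                # cold tier: off-premise cloud provider


@dataclass
class TieringPolicy:
    """Application-declared SLA and security constraints (all fields hypothetical)."""
    max_read_latency_ms: float        # performance SLA; drives hot/cold classification upstream
    require_encryption_at_rest: bool  # security requirement for off-premise data
    max_offprem_fraction: float       # e.g. 0.3 -> at most 30% of bytes in the cloud
    bytes_total: int = 0              # running total of placed bytes
    bytes_in_cloud: int = 0           # running total of off-premise bytes


def place(policy: TieringPolicy, size: int, is_hot: bool,
          cloud_encrypts_at_rest: bool) -> Tier:
    """Choose a tier for one data item, honoring the policy's SLA and security terms."""
    policy.bytes_total += size
    if is_hot:
        return Tier.SERVER_FLASH  # the latency SLA keeps hot data on server flash

    # Cold data may travel off-premise only if the provider satisfies the
    # encryption requirement and the off-premise byte cap is not exceeded.
    secure_enough = cloud_encrypts_at_rest or not policy.require_encryption_at_rest
    under_cap = (policy.bytes_in_cloud + size
                 <= policy.max_offprem_fraction * policy.bytes_total)
    if secure_enough and under_cap:
        policy.bytes_in_cloud += size
        return Tier.CLOUD
    return Tier.ON_PREM_HDD


# Example: an application that tolerates 30% of its data off-premise.
policy = TieringPolicy(max_read_latency_ms=1.0,
                       require_encryption_at_rest=True,
                       max_offprem_fraction=0.3)
print(place(policy, size=1 << 20, is_hot=False, cloud_encrypts_at_rest=True))
```

The point of the sketch is the ordering: the application's declared constraints are consulted before any data leaves the premises, rather than being bolted on after a gateway has already moved the bytes.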
The SPECS project aims to develop and implement an open source framework for offering Security-as-a-Service. It relies on the notion of security parameters specified in Service Level Agreements (SLAs), and it also provides techniques to systematically manage their life cycle. This concept would allow a data center operator to articulate the current security level of storage implemented within their own data center and match it against the advertised security capabilities of cloud service providers. The SPECS research anticipates an environment in which data center operators have multiple choices of cloud storage provider and want guarantees not only on performance and reliability SLAs but on security as well. Note that SPECS is not limited to storage; it applies to a wider range of computing services offered by cloud service providers.
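To make the matching idea concrete, here is a minimal sketch of how required security parameters might be compared against advertised provider capabilities. The parameter names and provider entries are invented for illustration; they are not the SPECS project's actual schema or API.

```python
# Hypothetical security parameters a data center operator requires,
# and capabilities two (fictional) cloud providers advertise.
REQUIRED = {
    "encryption_at_rest": "AES-256",
    "encryption_in_transit": "TLS-1.2",
    "data_location": "EU",
}

ADVERTISED = {
    "provider_a": {"encryption_at_rest": "AES-256",
                   "encryption_in_transit": "TLS-1.2",
                   "data_location": "EU"},
    "provider_b": {"encryption_at_rest": "AES-128",
                   "encryption_in_transit": "TLS-1.2",
                   "data_location": "US"},
}


def satisfies(required: dict, capabilities: dict) -> bool:
    """A provider qualifies only if every required parameter is advertised verbatim."""
    return all(capabilities.get(name) == value for name, value in required.items())


eligible = [p for p, caps in ADVERTISED.items() if satisfies(REQUIRED, caps)]
print(eligible)  # ['provider_a']
```

A real SLA negotiation would rank security levels (a provider offering AES-256 should satisfy a requirement for AES-128, for example) rather than demand verbatim matches, but the exact-match rule keeps the sketch simple.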
This post is one of a series discussing the issues involved in building a data lake architecture. Tiering between server, HDD, and cloud closes out the second phase of the discussion on moving from 2nd to 3rd platform data center infrastructures. In an upcoming post I will touch on new management paradigms for a data lake implementation.

Steve

Twitter: @SteveTodd

EMC Fellow
