In a nutshell: today we announced and demonstrated “Project Caspian”. What is it?
I want to be clear – the hardware is the LEAST interesting part of Project Caspian (in fact, the commodity hardware is very similar to what you saw in VxRack). What’s more interesting is the software stack, which is designed with a “clean sheet” approach that looks exclusively at P3 workloads. Project Caspian “industrializes” a pure (directly from the trunk) OpenStack implementation into a turn-key solution. On stage, in the demo, we also showed Project Caspian’s roadmap: being a turn-key delivery platform for Cloud Foundry and all the major Hadoop distributions (Cloudera and Hortonworks/Pivotal ODP). It is a Rack CI offer, and one designed for webscale, cloud native applications.

Why care? Well, on one level, it’s really cool. It’s also an open source story at its core. On another level, it reflects the 2nd way that the federation is tackling these new workloads (the first way being the VMware VIO and “Cloud Native Apps” efforts), and is a proof point of our core tenet of “Choice”. Some will choose VIO + Photon + CF. Some will want a 100% open source model. On perhaps the most important level, it reflects an important new way of tackling net new workloads and generating innovation by industrializing open source for the enterprise, something many are tackling, and EMC is doing it head on.

Ok – read on for background!

In my experience, there’s an interesting “fork” occurring in how many customers look at infrastructure.
This is really a fork in the road – and the jury is out on which of these is the “right choice” (I suspect the answer varies). If you put together VMware’s announcements from the beginning of the year with today’s tech preview of Project Caspian, you can see how the federation is tackling BOTH of these routes.

For customers that see themselves in the first route… VMware Integrated OpenStack and VMware’s approach for “Cloud Native Applications” (see more on Project Photon here) represent the first of the two approaches. This approach starts from an abstraction model that assumes a kernel-mode VM (in which you can have everything, including containers). It also assumes the base persistence stratum is mostly a transactional layer (VSAN/ScaleIO). It’s the manifestation of the “One Cloud” approach. EMC will support that with hyperconverged offers in both open and flexible engineered systems (VxRack, which we demonstrated on Monday) and appliances (the EVO program).

For customers that see themselves in the second route… Project Caspian is the “pure platform 3” approach, and the answer when you start with a “clean sheet of paper”.

This picture is a visualization of the idea:

There’s no value judgement in this – Prius is awesome, so is Tesla. For some customers, evolution is the answer; for some, revolution.

There’s an important note that flows from that: nothing in Project Caspian focuses at all on apps that need infrastructure resilience. Put a “Pet” workload in it, and it will not do well. Caspian’s software stack is built exclusively for “Cattle”. It’s built using an “open source always” model. It views the low-level abstraction layer as some Nova instances, a lot of containers (Rocket, Docker, Diego), and some bare metal (for next generation data fabrics, which have their own abstraction).
It also has less transactional open SDS than VxRack – and a LOT of Object and HDFS via ECS and, in the future, DSSD. Object and HDFS tend to be the volume persistence layer for “pure P3” apps.

This customer decision tree looks something like this – and the “go left” or “go right” choices have no “value judgement”:

Also note that Project Caspian could very well fit into part of the “RACK” taxonomy of CI I talked about on Monday here. Project Caspian’s software stack is also really about scale. It’s just not optimal if you don’t have a fair amount of scale. It’s not that it can’t be small, it’s just not the sweet spot.

Also, it’s not just about scale. Remember, “RACKS” and “APPLIANCES” can both use hyper-converged storage/compute designs, but “RACKS” bias towards “Flexibility” (in other words, a broader variation in personas and hardware configurations) and “APPLIANCES” (even those targeting rack scale deployments) bias towards “Simplicity” (narrower variation in personas and hardware configurations). Project Caspian has to cover a broad range of more disaggregated compute/memory/persistence, because at web-scale people don’t use appliance form factors. Put it this way – it would have to be able to run on a broad range of the stuff that’s in the Open Compute Project.

Here’s the Project Caspian demo we did today:

As I noted, the hardware used in Project Caspian is not the main point. The main point is the software. The software is a cool story in itself – and has “OPEN” at its core:
What about the hardware? Here are 3 examples of Project Caspian builds, each with a different core/memory/persistence mix.

The orange one uses the next-generation version (Haswell/Broadwell based) of the 4 module/2U design used in the VSPEX Blue hardware. It would be good for general purpose, and would use a mix of ScaleIO and ECS Object/HDFS as its persistence layer (it has a moderate amount of storage/IOPS). The persona is a mix of OpenStack, CF, and a moderate amount of Hadoop/Object.

The yellow one targets a much denser core count, and a smaller amount of persistence capacity (but lots of IOPS via local SSDs). The persona mix is a large amount of OpenStack and CF, and a small amount of Hadoop/Object.

The blue one targets a persistence capacity design center, and you can see two things: 1) Project Caspian builds on the ECS appliance experience; 2) the next-generation ECS is actually a Project Caspian variation, one that is very capacity-centric. Also look at the crazy capacity density! Here the persona mix is a small amount of OpenStack and CF, and a large amount of Hadoop/Object.

In the future, we will also include DSSD in these configurations when the persona mix includes a lot of in-memory data fabric and hyper-transactional workloads. You can see the space in the middle of the racks (and the fact that we separate the racks into networking/failure domains) above, and if you look at the examples below, you can see that DSSD can fit right in there and use PCIe/NVMe connectivity to all the hosts in the rack. Hence DSSD D5 is “Rack Scale Flash”. You can see that these are IOPS/latency persistence layers that just melt faces.

Netting it out – Project Caspian has a laser focus: creating the industry’s best hardened and industrialized Platform 3 stack for customers who are going “full bi-modal”, with a full embrace of open source and commodity hardware models. It’s a “tech preview”, but we’re dead earnest.
The first customer council to bring customers into the inner fold will be in June, and expect more on Caspian a little later in the year.

In this new phase of the OpenStack, Cloud Foundry, and Hadoop communities and lifecycles, it’s a race to try to make these open-source models work well in the enterprise. There is a false meme out there that no one knows how to make money around open source software. That’s not true.
We’ve recently seen many of the early “industrialized OpenStack” offers (think Nebula) move on, and there is a need in the marketplace for people who make this easier for enterprises to consume. Enterprises who have tried to deploy and maintain OpenStack have highlighted how hard this can be. Those that have tried to deploy Cloud Foundry on premises have said “unbelievably awesome once it was up – but even harder to get running right than OpenStack”.

Project Caspian is our effort to create the best way to deploy an “industrialized” platform for platform 3 “cloud native apps” via a clean sheet design for vanilla OpenStack, Cloud Foundry, and the major Hadoop distributions. It will be interesting to see if the giants (as I’m sure others are working down this path) can make it work for “P3 purists”, and ultimately customers will choose. Those that turn to EMC will choose their path. Pure platform 3 = Project Caspian. “P3 on top of P2.5” will go VIO and VMware’s Cloud Native Apps efforts, and run CF and Hadoop on top of the vSphere big data extensions.

As a federation, we’re all in, and playing to win! Would love your input, your thoughts! Interesting stuff!
