FreeStor® Powered by FalconStor

The BeFree Blog

As an early pioneer and industry leader in innovative, software-defined storage solutions, we often have thoughts and expertise we would like to share. Here at FalconStor, we strive to provide IT organizations and customers with solutions that provide the flexibility to BE FREE. Our newest platform, FreeStor, is all about delivering the freedom and flexibility to manage storage sprawl and truly unify heterogeneous storage infrastructures. We also like to provide thought-provoking and alternative views on storage challenges, infrastructure, and the industry itself. Check back often for our latest thoughts and BE FREE to share your thoughts and comments. After all, ideas spark other ideas, and community discussion shapes cultures. Let’s share and learn together. | Sincerely – Gary Quinn – CEO

 

By Pete McCallum
14 Jul 2016

There was a time, not so long ago, that a company could afford to be an “<Insert Vendor Name Here> company.” There was a go-to supplier, go-to VAR, and a go-to product line. The go-to company provided training and lunches, and that built up loyalty. In all of that loyalty, the hope was that the go-to company would innovate regularly enough to continue providing value and maintain goodwill even after the sale. However, what I believe the consumer space is discovering is that trusted vendors didn’t live up to their end of the bargain. Perhaps it’s not the vendors themselves, but industry practices that need to change?

When I was running datacenters, and being entertained by vendors who wanted to sell me things, the big differentiator was always the extra-mile folks – usually technical resources who improved my craft a bit and didn’t just stop at teaching me about their product. I worked for an aerospace contractor who had to follow government rules about three bids, single sourcing, and diverse vendor strategies – mostly for security reasons. I was able to buy technology based on my business needs and trade technology coaching for purchasing tips with my preferred vendors. This practice helped me sort vendors very quickly into “providers” and “partners.”

Now that I’m out of that role, I see things a little differently. I know now why long-term vendor affinity can crush innovation. Without explicitly naming names, take one of the best-known technology providers of the past decade. Most of their revenue has traditionally been tied to years 4-5(+) of maintenance on a CAPEX basis. Typically, your initial investment is discounted, upgrades and expansions are at a premium (to make up for the discount on initial purchase), and maintenance runs at least 20% of initial purchase price annually for as long as you want support. If you do the math, you actually re-purchase your product every five years. This should not be an epiphany to anyone, as it has been a standard practice in the technology sector for decades.

Financials aside, this kind of practice causes some trickle-down effects on product innovation. If I stand to make 60-70% of my revenue in years 4+, why would I destabilize that base by asking a customer to upgrade or switch product lines sooner rather than later? My goal is to keep my customer happy on their current portfolio as long as possible. So how do I do this if I have to keep my hardware somewhat “flat” in capabilities? Software, of course. I can innovate all day long in my software as long as I only allow my software to speak with my hardware. Software also requires training and has both soft and hard loyalty due to skill, process, and product affinity. The effect of this practice is that you keep my hardware for years past its prime and lock yourself deeper into my revenue stream. So now, instead of using technology to support my business, I change my business to support my technology.

I believe that today, this paradigm is being rejected more and more, causing an interesting reaction in the client and vendor landscape. The CIO cries out from the wilds and shouts: “I have too many silos, too many processes, too many people, too many moving parts!! I need simplicity!!” And the industry reacts with glee: let’s give them convergence. So now the big companies band together to hold onto market share and bundle up end-to-end solutions with “cross-platform” orchestration. Datacenter in a box, we were promised. That’s all fine and dandy as long as you like, and only use, their boxes. Now we are even MORE tied into the revenue stream – and that “one throat to choke” requirement ends up being yours, Mr. CEO, as you allow your business to slowly get muscled by your technology providers.

We followed this trend with hyperconvergence, which took out the multiple vendors, built a consolidated, optimized reference architecture of best-in-class capabilities, and packaged it into little tiny boxes of lock-in. As long as we like the hypervisor, and don’t mind the restrictions of scale, and the lack of mobility, and have the ability to toss our whole old datacenter (admittedly not a bad choice for many…), then hyperconvergence is a great concept. But that’s a lot of very important “as long as you...” caveats.

Now we see the growth of open-source in the crazy juggernaut of OpenStack. Let’s get all the big players in the industry to contribute their best-in-breed feature differentiators to a “free” platform that gives everyone of any size an amazing platform of features and capabilities that will completely remove the need for your commercial software, and will tend to commoditize the value proposition of your hardware further into the ground. “[throat clears] sure. I’ll give you my best! [rolls eyes].” Technology socialism, for sure, but it does have lots of promise.

While I’m partially bashing on everything else, I have to make sure I include the cloud in my rant, right? Let’s get this definition crystal clear: the cloud is somewhere else, running some other technology, operated by someone I pay to maintain my data for the lowest possible cost with lots of contractual assurances. Heh. Yes, some clouds are puffier than others, but at some point, financials and competition will require cloud providers to push prices to the floor – and force lower standards. Am I saying you tend to get what you pay for? Yes. Yes, I am. The great thing about cloud providers, though, is that they can be a greenfield technology transition without that investment on the part of the customer, and it is about as simple as it gets to consume.

So the echo in the wind back to the CIO is this: “Now that you’ve asked, we’ve provided very little of value other than to offer you a place to offload the responsibility to someone else!!” And now the cloud provider has to deal with the vendor lock-in and silos and cost fighting, and the CIO fidgets at her desk with almost zero control over what happens to her business’s critical data.

What is it that we need, then? There is something of value going on here, even if you can’t see it under my somewhat cynical overview: there is something happening. Movement is taking place. New technologies are creeping in that change the paradigm of consumption and bring cloud-like models into on-premise IT. What convergence taught us is that orchestration across storage, networking, and compute is critical for IT operations. What hyper-convergence is teaching us is that focusing down into workload-optimized topologies and optimizing operations through analytics-infused automation is critical to IT operations. OpenStack is answering the call for hardware-agnostic, agile platforms that leverage software to provide the right personality of services for a given workload at a given stage of a workload’s requirements. The cloud is teaching us that perceived simplicity and optimizing efficiency for cost is possible and viable. And we’ve proven once and for all that the old way of vendor “techstortion” (technical extortion, anyone?) is finally not welcome in business anymore.

What is the direction things are moving, then? Unless your business is simply hobbled by indecision, freezing all technical assets until something makes sense again (we feel ya! you aren’t alone by any stretch), you have been looking at everything I have mentioned in this rant (er… blog!). You have small SAN, mid-SAN, server-SAN, all-flash array, good-ol’-boy vendors, NKOTBs (New Kids on the Block), hypervisor-based, containerized, SDS, SDx, traditional, modern, memory-resident, SaaS, cloud-enabled, gateway-ed, referenced, marketed, funded, and struggling vendors knocking on your door all day long asking you to listen. And you are listening, and making your choices. You are buying something, even if it’s only time. In the end, you will dabble. IT always has. We take bits of all the cookies on the plate and spit out the bad-tasting tech. We keep a handful of old standards and some of the weirder cookies we didn’t think we’d ever like.

What I’m saying is that you will end up buying or trying some of almost everything I have mentioned. Your gut is telling you that something needs to change – and I think I’ve shown you what is broken. The industry will react to what you keep and what you spit out. What you will need is a way to protect your data assets while you test drive and kick tires. You need subscription-based purchase plans that do not lock you into a return-on-investment timeline. If a technology does not provide value, you will simply stop using it and your money will go elsewhere. In order to facilitate this, you will need a platform that allows you to bridge your infrastructure across locations, form factors, vendors, technologies, providers, company names, and other traditional barriers – all while maintaining your processes, skills, and guarantees of service levels to your business. Yep, you need a chaos platform – a middleware between the uncomfortable known and the exciting unknown.

Some of you may be forced into a multi-vendor strategy for compliance or procurement practices, but all of us will be goaded into it simply because the old model is broken and there is simply too much to choose from these days that tangibly helps your business.

I was this customer. I had all of these challenges, and I saw these changes coming. I embraced a variable technology architecture with multiple locations, and chose a platform to bring it all together and solve the challenges of shrinking budgets, compliance issues, reductions in force, changes in IT leadership, mergers and acquisitions, migrations of technology, consolidations of infrastructure, monolithic vendor platform overturn, and other challenges.

With all the chaos you will be finding in your technology options today, I would highly recommend kicking the tires a bit on a product that will help you to make better sense of what you have, and enable you to safely constrain the chaos of what is to come. 

By Pete McCallum
14 Jul 2016

I was taking a wander through the internet and for some reason searched for “Windows 3.1 architecture” in Google. Perhaps I was feeling nostalgic for the days when I was first dabbling in computers, and swapping floppy disks instead of tossing a ball around with my friends. I think, more likely, that I was (and am) feeling that the current state of the datacenter is simply wading back into old territory and re-discovering the foundations of modern IT. Don’t get me wrong: this is not a bad thing! Far from it. But it does put a pause on technology movement for a little while. After you read this blog, open this link in another tab and read it ( https://en.wikipedia.org/wiki/Windows_NT_3.1 ). Focus on the “Operating System Goals” section. I had to laugh after doing some research on the difference between hypervisors and containers. I’m sure I could draw your attention back to similar legacy technologies like DEC/VAX architectures and we’ll all have a good chuckle about just how modern our hypervisors really are. All in all, the goals have been the same in our datacenter architecture: portability, reliability, and personality. I personally really love the last one.

So let’s just put it out there: there is nothing new in technology today. Erasure coding is an improvement on RAID. Calm down. I said it, and won’t take it back. All-flash is a hilarious marketing ploy to get you to think that slapping a bunch of flash chips into a disk form factor, and calling that same SAN you’ve been using all-new, is something exciting. Fast is exciting, but it is not new by a long stretch. Hypervisors are a new take on mainframe workload sharing and isolation of workloads. Containers really follow the Windows NT architecture of shared-kernel workload tokenization. But these new names!!! Oh, for the love of marketing. Converged and hyper-converged are just more compact reference architectures. Not to exclude my own product: virtualization is a detached HAL. There, I said it. Middleware for hardware with some protocol routing thrown in for portability.
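To make that erasure-coding jab concrete, here is a toy sketch of the shared math. Single-parity XOR is essentially what RAID-5 does; modern erasure codes (Reed-Solomon k+m schemes) generalize the same idea to survive multiple simultaneous losses. The block contents below are made up purely for illustration.

```python
# Toy single-parity "erasure code": the same XOR math RAID-5 uses.
# Real erasure coding (e.g. Reed-Solomon k+m) generalizes this to
# survive multiple simultaneous losses; the principle is identical.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data = [b"4KB-chunk-A!", b"4KB-chunk-B!", b"4KB-chunk-C!"]   # data blocks
parity = xor_blocks(data)                                    # parity block

# Simulate losing one data block, then rebuild it from the survivors.
lost_index = 1
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

assert rebuilt == data[lost_index]
print("rebuilt block:", rebuilt)
```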

Even Big Data is more of a social commentary than a datacenter “thing.” We generate and keep everything. Digital packrats. At one point, there was someone who had a handle on where everything was, but all of our movement in the direction of agility and automation has opened a Pandora’s box of data overload (background, if you are interested: https://en.wikipedia.org/wiki/Pandora%27s_box ). Analytics is a real thing, but by no means new. Enterprise service bus technologies and BI have been around for a long time, and are now, reactively, moving into the sprawl of infrastructure. Let’s talk about that, too.

One somewhat new aspect of technology is location, which falls directly under the portability goal of our modern IT thought. Once we erase the variability of technology choices, we are left with a distributed workforce, nearly infinite endpoints, a global presence, and nearly limitless choices in vendors and variants of technology interconnectedness. Whew. That’s a lot of Scrabble words. It is increasingly apparent to me that we are still struggling to adhere to our three goals: portability, reliability, and personality.

Someone asked me recently at a show what I would spend my money on if I were an IT Director today. My real reaction was: nothing. I wouldn’t spend a dime until something interesting comes along. But that is rather myopic. I would invest in any technology that provides portability of my data and agility in choice, protects my business, and matches my agenda and requirements. I would avoid any technology that tells me how and when to deploy, expand, or contract. I would avoid a technology that locks me into a given location. I would invest in anything that keeps track of what is growing or creeping through my data. I would invest in a product that talks to and watches over everything I have today, and has the vision to encompass future technology. Otherwise, what am I investing in?

Then this antagonist says to me: “Does your product do these things? Why should I settle for anything less?”

In order to answer that very deep question, we must dispassionately evaluate against some criteria. I cannot be agnostic if I am an evangelist, now can I? So I look at FreeStor and remove the marketing terminology and all the neat things we think it does. Let me assess it against the three goals set way back when: portability, reliability, and personality.

Will FreeStor accomplish portability of my data? On-premise and on-cloud? I would emphatically say “Yes!” Will FreeStor provision, protect, and mobilize any workload? I would also say “Yes!” to this one (should I mention we still support most flavors of UNIX in addition to x86?). Will FreeStor work with any current and future compute technology? Let’s face it, we can only build as fast as things are presented to us, but as much as possible, “Yes!” From physical to hypervisor to cloud, we can run in or around each of these topologies, with more being announced every day (so far no containers, but it’s coming!). Perhaps most importantly, can the product follow the dream of Software-Defined-anything and truly run on any platform without concern (see reliability)? Can I support my DevOps initiatives with consistent operations across vendor silos? I’m feeling pretty good about portability – it’s in the fiber of the product.

Will FreeStor provide reliability for my business? There has long been a misconception (misdirection?) in our industry that redundancy is the same thing as availability. Does having two or more copies of my data mean I’m safer? Does having a backup mean my business won’t be impacted by a disaster? Alternately, does reliability mean that when I go to install something new, it will work as advertised? From a business perspective, reliability may mean that when a customer asks for something to happen, I am ready to service the request in a timely fashion. Does my solution become more or less reliable if I rely on a 3rd party to operate? I think all of these are valid questions and scenarios, and FreeStor has been built from the ground up to address both redundancy and availability of data and business. A platform like ours has to be able to get lost things back, restore function, and avoid loss/downtime in the process.

Now we come to personality and what it means to you. Can FreeStor maintain multiple personalities to match and meet the personality of my business needs at this point in time? The term personality was used in that wiki article I pointed you to, to indicate an operating system able to support multiple different workloads that were once restricted to a specific (foreign) platform. Whoa. Can FreeStor support multiple personalities? If you’ve ever met our company leadership, you’d know that’s a “Yes!!!” All joking aside, FreeStor has been designed to adjust to individual workloads not only from a provisioning and protection (application/data-aware, anyone? Yes!) perspective but from a performance and availability perspective as well. FreeStor can operate as local disk, raw disk, cloud resources, or virtual disks, each with different profiles and capabilities, even to the same machine. FreeStor can jump protocols and form factors, operating as both virtual and physical resources as needed for the workload. So, yes, FreeStor has tons of personality.

I did make a comment outside of the big three that I thought I should pull together before moving on: the concept of data awareness. It has been said by many that one of the reasons storage hasn’t really moved as much as other realms of technology comes down to two key elements: it’s far easier to do storage wrong than it is right, AND data is just too dang important to mess with. Couple those two factors with the massive explosion in sheer volume of data, and the only reason we can find anything at all is the speed at which modern compute can operate. FreeStor added a layer of full-stack analytics into the storage layer to assist with all aspects of storage operations. While not innovation in and of itself, the goal is to make use of all the performance and utilization data, coupled with rich metadata, to optimize operations across any underlying or client environment. So not only do we know what our data has done and where it has been, we can determine where it should be before it needs to be there. A tall order for sure, but I believe it’s where storage needs to be.

All in all, we’ve found that while 3D NAND and 80Gb iSCSI are really neat improvements, I couldn’t look you in the eye and tell you that they will fundamentally change how you do business in and of themselves. I couldn’t tell you that building a Hadoop-based data warehouse will allow you to understand your business any better than if you ran a really great spreadsheet. But I can tell you that just because something works for eBay does not mean it will work at the scale of your local dentist’s office. There is room for all the little permutations and capabilities of technology. However, in the end, when you type your name into a field on a web form, or save that Word document, you need to know it is committed to something persistent, and that it will be there when you get back to it. No matter what technology that data is stored on, it must maintain portability, reliability, and personality – coupled with awareness – in order to be of value.

So in a blink of my eye, I respond to this show-goer at my booth: “I may not be all things to all people, but I can be something pretty amazing for you if you give me a shot to show you.”

And I think that’s pretty refreshing in today’s technology ecosystem.

By Farid Yavari – Vice President, Technology - FalconStor
07 Mar 2016

This year FalconStor attended the HMG CIO Summit of America in Manhattan, where we showcased FreeStor, our software-defined storage solution for enterprise storage abstraction and management. I also participated in a panel discussion titled “Identifying Future Trends that Will Impact the Enterprise.”  During the summit, the FalconStor team engaged with various CEOs and CIOs from different industry segments and gained valuable insight into what is top of mind for top management at these companies.

A widely noticeable trend at this summit was the appetite to migrate to public cloud solutions where internal IT organizations fall short in delivering the speed and cost points required to run an efficient and competitive business. Most CIOs were comfortable with moving a portion of their data processing workloads to the public cloud, and some hinted at projects to build a cost-effective private cloud infrastructure based on Open Compute (OCP) or other white-box solutions. There is also unrelenting pressure to deliver cost-efficient IT operations based on self-provisioning and an “everything as a service” business model.

The internal IT organizations of the future need to focus more on business integration and how their services and expertise can help the business run better and faster with shorter implementation cycles and instant provisioning capabilities. IT organizations of the future will be part of the business teams and look to solve IT problems from the business perspective.

Another noticeable trend was the momentum towards open source software. Quite a few enterprises were taking a serious look at how they could engage with and contribute to open source and utilize open source solutions in their data centers. The combination of white-box hardware and open source software is extremely appealing to some of the larger enterprises.

The Internet of Things (IoT) is also gaining momentum in most industry segments. It is used extensively in agriculture to help with precision planting and optimizing harvests.  IoT is also used in health care for monitoring and managing remote devices, and in consumer electronics, where it plays a big role in monitoring and tracking user behavior and data. The amount of data generated by IoT-enabled sensors and remote devices is unprecedented. Most IoT implementations are backed by either a private or a public cloud infrastructure that can provide the flexibility and availability required by these technologies.

Data management was a huge topic at the summit.  The need to sift through the raw data and convert useable portions of the data into actionable information is an enormous challenge for the enterprise.  Tiering of data plays a significant role at enterprise scale, and the possibility of utilizing cost-effective hardware generated a lot of interest with the CIOs.

During the summit, we saw a huge interest in FreeStor’s capabilities to enable utilization of commodity storage hardware while benefiting from our enterprise-grade data services. The fact that FreeStor can be implemented as a brownfield solution set it apart from most startup companies, whose solutions require a fresh greenfield deployment.

The 2016 CIO Summit of America was a huge success and provided valuable insight into the state of the IT industry as a whole.

By Farid Yavari – Vice President, Technology - FalconStor
17 Feb 2016

In our previous blog post, we covered how to build a disaggregated storage model out of commodity hardware for the lowest Total Cost of Ownership (TCO) in today's hyper-scale data centers. Deploying commodity hardware usually requires intelligent orchestration and management software like FreeStor® to provide an abstraction layer separating heterogeneous storage hardware from the applications.

FreeStor enables deployment of storage technologies with different capabilities, at various cost points, in a tiered model. With FreeStor, data is placed on the right storage media, which provides Service Level Agreement (SLA)-driven services to apps and achieves the lowest possible TCO. Combining heterogeneous commodity storage with intelligent, policy-driven software-defined storage (SDS) also provides flexibility in data migration, enables seamless tech refresh cycles, and allows independent scaling of the storage and server infrastructure.  As depicted in the diagram below, having a comprehensive Representational State Transfer (REST) Application Programming Interface (API) to interact programmatically with the Intelligent Abstraction® layer provides an industry-standard approach to managing the SDS environment, while enabling automation in deployment, monitoring, and management of the storage infrastructure.
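As a sketch of what "interact programmatically" can look like in practice, here is a hypothetical example of provisioning an SLA-tagged volume over a REST API. The endpoint paths, payload fields, tier names, and credentials are invented for illustration and are not FreeStor's documented API; they simply show the automation pattern.

```python
# Hypothetical sketch of driving an SDS abstraction layer over REST.
# Endpoint, fields, and tier names below are invented for illustration;
# consult the product's actual API reference for the real interface.
import requests

BASE = "https://sds-mgmt.example.com/api/v1"   # placeholder management endpoint
AUTH = ("admin", "secret")                      # placeholder credentials

def provision_tiered_volume(name, size_gb, tier, sla):
    """Ask the abstraction layer for a volume on a given storage tier."""
    payload = {
        "name": name,
        "size_gb": size_gb,
        "tier": tier,            # e.g. "flash", "hdd", "archive"
        "sla": sla,              # e.g. {"max_latency_ms": 5, "min_iops": 10000}
    }
    resp = requests.post(f"{BASE}/volumes", json=payload, auth=AUTH, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    vol = provision_tiered_volume(
        "oltp-db-01", size_gb=2048, tier="flash",
        sla={"max_latency_ms": 5, "min_iops": 10000},
    )
    print("provisioned:", vol)
```

The same pattern extends naturally to monitoring and lifecycle calls, which is what makes a REST interface useful for automation at hyper-scale.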

Some data protection and high availability (HA) capabilities aren't necessary in a scale-out storage architecture, because applications running in these environments often feature built-in resiliency and self-healing capabilities. Other features, such as application-aware snapshots, clones, deduplication, and compression, are tremendously valuable depending on the types of workloads.
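For a feel of why deduplication pays off for some workloads and not others, here is a toy content-hash estimate of a dedup ratio. Real dedup engines use variable-length chunking and persistent indexes; the chunk size and sample data here are illustrative only.

```python
# Minimal fixed-size-chunk deduplication estimate: hash each chunk and
# count how many unique chunks remain. Illustrates the idea only.
import hashlib
import os

def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    unique = {hashlib.sha256(c).digest() for c in chunks}
    return len(chunks) / max(len(unique), 1)

# A workload full of repeated blocks dedupes well...
repetitive = b"A" * (4096 * 100) + b"B" * (4096 * 100)
# ...while already-unique (compressed or encrypted) data does not.
random_like = os.urandom(4096 * 200)

print("repetitive data :", dedup_ratio(repetitive))   # ~100x
print("random-like data:", dedup_ratio(random_like))  # ~1x
```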

Private and public cloud capabilities, and having the right drivers to interface seamlessly with the cloud, are essential features to look for when evaluating FreeStor or any other software-defined storage solution. Being able to place data in the cloud, and access it as needed, is a requirement in most hyper-scale data centers. Complete storage abstraction software needs to operate in virtualized environments as well as bare-metal implementations, supporting block, file, and object protocols.

There are many challenges involved in architecting a resilient, scalable storage infrastructure for the next generation of hyper-scale modern data centers. Storage disaggregation for scaling and tiering purposes needs to address density, cost, and performance for various use cases. FreeStor helps companies get the most out of these disaggregated storage models, and meet the heightened scrutiny of their TCO. FreeStor users recently surveyed by IDC reported an average savings of $243,000 in annual storage costs, representing a total return on investment of 448%, and a payback period of 5.5 months. For hyper-scale environments, it's hard to imagine a more efficient way to control costs.

By Farid Yavari – Vice President, Technology - FalconStor
09 Feb 2016

Today, in most data centers, cloud, NoSQL (no Structured Query Language), and analytics infrastructures have largely been deployed on a direct-attached storage (DAS) architecture, generally as a Total Cost of Ownership (TCO)-driven deployment.

The DAS approach binds the compute and storage resources together, preventing independent scaling and tech refresh cycles. The converged DAS model works very well at smaller scale, but as the infrastructure grows to a substantial size, wasted compute or storage can greatly affect the TCO of the environment. Since the DAS model is constrained by the available slots in a server, scale is limited and often quickly outgrown. In some compute-heavy environments, there may be enough DAS allocated to the servers, but the workload needs more Central Processing Units (CPUs), so some of the allocated DAS stays unused when additional nodes are added.  In addition, since the compute infrastructure is usually on a more aggressive tech-refresh cycle than the storage, converging them together in a single solution limits the flexibility of the tech refresh. There is a trend to disaggregate at least the warm, cold, and archive data from the compute capacity, and to use storage servers in separate racks as Internet Small Computer System Interface (iSCSI) targets to carve out the storage capacity. Hot data, especially if it resides on Solid State Disks (SSDs), is not easily moved to a disaggregated model because of network bandwidth and throughput requirements.

The disaggregated iSCSI storage servers are basically commodity servers with just enough compute to drive the input/output (IO), and a large amount of storage to act as a pool of dense capacity. They can contain high-performance SSDs, Hard Disk Drives (HDDs), or extremely low-cost, low-performance solutions such as Shingled Magnetic Recording (SMR) drives, depending on the workload performance and price requirements. In some SMR-based storage servers, a very thin layer of Non-Volatile Dual In-line Memory Module (NVDIMM) is used as a buffer to convert random write IOs to sequential ones for better efficiency.

Some high-performance storage servers accommodate up to 240 terabytes (TB) of all-flash capacity sitting on a 12G Serial Attached SCSI (SAS) backend in 2 Rack Units (RUs), with two separate X86 servers in the same chassis acting as “controllers” and a total of four 40-Gigabit (40G) Ethernet connections (two on each server). There are other examples of very low-cost, all-HDD storage servers with up to 109 6TB 3.5” Serial Advanced Technology Attachment (SATA) drives and two single-core X86 controllers with 10G Ethernet connections to the network in a 4 RU footprint.

Carving out iSCSI target logical unit numbers (LUNs) in a storage rack and presenting them to various initiators in a different compute rack is a valid disaggregated model for storage in a scale-out architecture. In some instances, using iSCSI Extensions for RDMA (iSER) with routable Remote Direct Memory Access (RDMA) can further speed up the throughput and input/output operations per second (IOPS) of the architecture. There is an added cost for the network upgrade that needs to be accounted for, usually around 20-25 percent of the total cost of the solution. The storage network needs a minimum of 40G connectivity on the storage servers and 10G connectivity on the initiator side. The network switches need to have extra-large buffers to prevent packet drops, and in many cases priority flow control (PFC), explicit congestion notification (ECN), and quantized congestion notification (QCN) become necessary.
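Putting the figures above side by side, here is a quick back-of-the-envelope sizing sketch. The density numbers and the 20-25 percent network adder come from the text; the 5 PB capacity target and the base hardware cost are assumptions made purely for illustration.

```python
# Back-of-the-envelope sizing for the disaggregated storage servers
# described above. The 5 PB target and $2M base cost are illustrative
# assumptions; the density and network-adder figures come from the text.

def tb_per_ru(total_tb, rack_units):
    return total_tb / rack_units

flash_density = tb_per_ru(240, 2)          # all-flash server: 240 TB in 2 RU
hdd_density   = tb_per_ru(109 * 6, 4)      # HDD server: 109 x 6 TB in 4 RU

target_tb = 5 * 1024                        # assumed 5 PB raw capacity target
print(f"flash density: {flash_density:.0f} TB/RU, "
      f"HDD density: {hdd_density:.1f} TB/RU")
print(f"2RU flash servers needed: {target_tb / 240:.0f}")
print(f"4RU HDD servers needed:   {target_tb / (109 * 6):.0f}")

# Network upgrade typically adds ~20-25% to the total solution cost.
base_cost = 2_000_000                       # assumed storage hardware cost ($)
print(f"network adder: ${0.20 * base_cost:,.0f} - ${0.25 * base_cost:,.0f}")
```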

There are many ways to build a disaggregated storage model depending on use cases and requirements. In our next blog, we will cover how a disaggregated model benefits from a properly architected software platform, to gain not only value and utility, but essential features for the applications and the driving business needs.

By Farid Yavari – Vice President, Technology - FalconStor
02 Feb 2016

Storage has finally become an interesting field, full of innovation and change, addressing growing new requirements for storage flexibility, density, and performance. The falling prices of flash and the introduction of various flavors of storage-class memory, combined with an increasing appetite for commoditization of data center infrastructure, have helped fuel innovation in how data is stored and accessed.

Companies are faced with the challenges of how to store their ever growing data efficiently, at a cost point that is palatable to CTOs and CFOs, while keeping the right levels of performance and SLAs in order to provide storage services to end users and applications. At the same time, internal IT organizations are facing the challenge of competing with flexibility and price points offered externally through public clouds.

Two new trends have emerged in architecting solutions for next-generation “hyper-scale” data centers and storage infrastructures that must grow on demand to meet compute, memory, and storage requirements. These requirements include on-demand provisioning, instant capacity management, and flexibility to scale each individual component independently, driving cost efficiency and a direct impact on the TCO.

First, as outlined in the table below, there are legacy SAN environments running transactional OLTP workloads, primarily based on Fibre Channel and NFS, with high-performance SLA targets (greater than ~500K IOPS, less than ~5ms response time to the application). This environment is built on storage appliances and SAN installations with complete HA capabilities that provide data protection and service resiliency to the applications. The growth rate of the traditional SAN environment compared to other storage infrastructures is relatively low, and its impact on revenue is high enough to justify paying the premium for brand-name storage technologies that come with all the HA and data protection capabilities. Understandably, many companies are unwilling to try groundbreaking technologies within an OLTP infrastructure, as stability, security, and availability are the primary goals for this environment.

The second architecture, described in the three columns to the right of the table, and arguably the fastest growing segment of every data center, is the scale-out environment running NoSQL, cloud, and big data workloads. From the storage perspective, these environments usually run in a direct-attached storage (DAS) or disaggregated storage model based on various protocols such as iSCSI, PCIe, or NVMe. The scale of the storage infrastructure, especially for big data analytics, can reach hundreds of petabytes, which makes these environments extremely TCO-driven. Many applications running in these environments have built-in resiliency, anticipate hardware failures, and can self-heal at the application layer. The document and key-value stores, as well as analytics applications, feature server- and rack-aware, replication-based data resiliency to guard data against hardware failures. When data protection and self-healing features are handled at the app layer, the need to build HA features into the storage layer is eliminated, which opens the door to utilizing consumer-grade, commodity hardware that can fail without impact to service availability.
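As a generic illustration of that rack-aware, replication-based resiliency (not any particular product's placement algorithm), here is a toy placement sketch that simply refuses to put two copies of the same data in the same rack, so a rack-level failure cannot take out every replica.

```python
# Toy rack-aware replica placement: spread copies across distinct racks
# so a rack-level failure cannot take out all replicas. Generic sketch,
# not any specific product's algorithm; node and rack names are made up.
import random

def place_replicas(nodes, replication_factor=3):
    """nodes: list of (node_id, rack_id). Returns chosen node_ids."""
    by_rack = {}
    for node_id, rack_id in nodes:
        by_rack.setdefault(rack_id, []).append(node_id)
    if len(by_rack) < replication_factor:
        raise ValueError("not enough racks for rack-level fault tolerance")
    racks = random.sample(list(by_rack), replication_factor)
    return [random.choice(by_rack[r]) for r in racks]

cluster = [("n1", "rackA"), ("n2", "rackA"),
           ("n3", "rackB"), ("n4", "rackB"),
           ("n5", "rackC"), ("n6", "rackC")]
print(place_replicas(cluster))   # e.g. ['n2', 'n3', 'n5'] - one per rack
```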

In future blogs, we’ll take a closer look at these trends in hyper-scale data centers, and how to achieve desirable TCO as well as resiliency and performance.

By Pete McCallum – Director, Data Center Solutions Architecture - FalconStor
14 Oct 2015

There was a time, not so long ago, when a storage administrator actually had to know something in order to do their job. There was no automation, auto-tiering, virtualization, API sets, QoS, or analytics; all we had were metaLUNs, concatenated metaLUNs, extent management, and RAID sets. We used to sit at the same lunch table as the UNIX guys who had to write code to open a text editor, and who had never EVER used a mouse.

Yes, it used to be that when something broke or started to slow down, we would fix it by actually going into a console or shell and typing some magic commands to save the day.

These days, managing storage is a very different proposition. We have such awesome capabilities emerging from software-defined platform stacks, such as IO-path QoS, hypervisor-and-cloud-agnostic protocols, and scale-out metadata - just to name a few. With all of these advancements, one would tend to think the days of the storage administrator have gone away. And I would tend to agree to some extent.

No longer is the storage administrator really concerned with finite volume management and provisioning. Today, storage performance almost manages itself. Thin provisioning is less about capacity optimization and more about data mobility. And there is almost as much data about our data as there is data.

In some ways we have converted storage administration into air-traffic control: finding optimal data paths and managing congestion of IO as things scale beyond reason. This is where analytics really comes into play.

In all aspects of IT, administration is taking a back seat to business integration, where knowing what has happened (reporting), plus what is happening (monitoring), starts to generate knowledge (analytics) about the business itself. When we add predictive analytics, we add the ability to make not only technology decisions but, ostensibly, business decisions too, which can make a huge difference in meeting market demands and avoiding pitfalls. This moves IT (as well as storage) out of reactive mode and into proactive mode, which is the number one benefit of predictive analytics.

Let’s see how this applies to a business-IT arrangement through a real world example: month-end close-out of books in a large company. In the past, an IT department would provide infrastructure that met the worst-case scenario of performance impact: So, despite having a 3000 IOPS requirement for 27 days of the month, the 35,000 IOPS month-end churn (for about eight hours) pushed for an all-flash array at 4x the cost of spinning disk. Because the volumes require a tremendous amount of “swing space” as journals fill and flush, reporting is run against copies of data, and Hadoop clusters scale up to analyze the data sets, almost a PB of storage capacity is required to support 200TB of actual production data. All of this is thick-provisioned across two datacenters for redundancy and performance in case of a problem or emergency.

Most of this data would be made available to the business through reporting and monitoring, which would allow an IT architect to decide on a storage and server platform that would handle this kind of load. Manual or semi-manual analytics of many different consoles and systems would merge the data (perhaps into a spreadsheet), where we would find that (apologies if my math is off a little; a rough sanity check follows the list below):

  • 30% of all data load happens on one day of the month.
  • 90% of the “storage sprawl” is used for other than production data. Of the remaining 10% used for production data, perhaps 2% of that space actually requires the performance.
  • Cost/TB/IOPS is skewed to fit 10% of the capacity (or .2% for real!), and 30% of the total load, at 8-20x the cost.
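Here is that rough back-of-the-envelope check of the first two bullets, assuming the close-out runs at the peak rate for roughly one full day and that the ~1 PB footprint is duplicated across the two datacenters (both assumptions on my part; the post only gives approximate figures).

```python
# Rough check of the month-end example: treat the close-out as one full
# day at the peak rate, and assume the ~1 PB footprint is mirrored across
# both datacenters. Assumptions for illustration only.

steady_iops, peak_iops = 3_000, 35_000
steady_days, peak_days = 27, 1

peak_share = (peak_iops * peak_days) / (peak_iops * peak_days +
                                        steady_iops * steady_days)
print(f"share of monthly IO on the close-out day: {peak_share:.0%}")    # ~30%

production_tb = 200
provisioned_tb = 1_000 * 2          # ~1 PB, thick-provisioned at two sites
non_production = 1 - production_tb / provisioned_tb
print(f"capacity holding non-production copies:  {non_production:.0%}")  # 90%
```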

There are far more correlations of data that can be made – and they are obviously actionable and meaningful to the business. For example, one could:

  • Right-size the performance load to the actual requirements of the dataset, rather than incurring tremendous expense to meet the worst-case scenario.
  • Manually shift storage performance tiers prior to month-end (or automatically if the storage platform allows) – see the sketch after this list.
  • Thin provision or use non-volatile, mountable snapshots for handling data mining and “copy data” to reduce storage sprawl.
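As a sketch of that tier-shift idea, here is a hypothetical scheduler that promotes the close-out volumes to flash the day before month-end and demotes them once the books are closed. The volume names are invented, and the migrate_volume() call is a stand-in for whatever API or CLI your SDS platform actually exposes.

```python
# Hypothetical "shift tiers before month-end" scheduler. The volume names
# are illustrative and migrate_volume() is a placeholder for your
# platform's real migration API or CLI.
import datetime

CLOSEOUT_VOLUMES = ["fin-journal-01", "fin-reporting-01"]   # illustrative names

def migrate_volume(volume: str, tier: str) -> None:
    # Placeholder: call your platform's migration API/CLI here.
    print(f"migrating {volume} -> {tier}")

def last_day_of_month(d: datetime.date) -> int:
    nxt = (d.replace(day=28) + datetime.timedelta(days=4)).replace(day=1)
    return (nxt - datetime.timedelta(days=1)).day

def tier_policy(today: datetime.date) -> None:
    if today.day == last_day_of_month(today) - 1:   # day before close-out: promote
        for vol in CLOSEOUT_VOLUMES:
            migrate_volume(vol, "flash")
    elif today.day == 2:                            # close-out done: demote
        for vol in CLOSEOUT_VOLUMES:
            migrate_volume(vol, "hdd")

tier_policy(datetime.date.today())
```

Run daily from a scheduler, a policy like this keeps the expensive tier busy only during the eight hours a month that actually need it.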

All of these are actionable through a good virtualization platform (like FreeStor) and analytics on platform and application metadata. If we add a truly heterogeneous SDS platform (like FreeStor) that can operate across different performance and platform tiers of storage, we start gaining a breadth of insight into the infrastructure that surpasses anything an admin could reasonably wrap their day around. However, because of the sheer volume and complexity of capabilities, automation and foresight MUST be imbued into the control plane.

This is where intelligent predictive analytics comes in: It’s not about seeing into the future as much as it is correlating events from the past with current events to adjust capabilities in the present. If I know all the capabilities of my targets (performance, capacity, cache, storage layout for read/write optimization, etc.), and I know the trends in requirements from the source applications, AND I know the capabilities and features of the SDS platform (like FreeStor), then I should be able to correlate events and occurrences into policy-based actions that reconcile security, performance, protection, and cost SLAs with actual point-in-time events in the system. I can then recommend or automate adjustments to IO paths, storage targets, DR strategies, and new operational requests through intelligent predictive analytics.
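To make that loop concrete, here is a toy example that trends recent IOPS samples and flags a tier or path change before an assumed SLA ceiling is breached. The window, threshold, and recommendation text are all illustrative; a real analytics engine correlates far richer metadata than a single counter.

```python
# Toy predictive-analytics loop: fit a simple linear trend to recent IOPS
# samples and flag a tier/path change before the SLA ceiling is reached.
# Thresholds, window, and the recommendation itself are illustrative.

def recommend_action(iops_samples, sla_ceiling=30_000, horizon=3):
    """iops_samples: most-recent-last list of per-interval IOPS readings."""
    n = len(iops_samples)
    if n < 2:
        return "collect more data"
    # Least-squares slope over sample index (IOPS change per interval).
    mean_x, mean_y = (n - 1) / 2, sum(iops_samples) / n
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in enumerate(iops_samples)) \
            / sum((x - mean_x) ** 2 for x in range(n))
    projected = iops_samples[-1] + slope * horizon
    if projected > sla_ceiling:
        return (f"projected {projected:.0f} IOPS in {horizon} intervals: "
                f"pre-stage volumes on a faster tier / add an IO path")
    return "within SLA: no action"

print(recommend_action([8_000, 12_000, 17_000, 23_000]))
```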

All this boils down to operational efficiencies for the business, cost savings in key infrastructure purchasing decisions, better SLA management for business workloads, faster conversion of data into information, and faster time-to-value. I know these are big phrases and promises, but we see it every day. No longer is it enough to be an administrator or an infrastructure architect. No longer is it enough for the CIO to manage a budget and hope systems don’t go down. These days, every aspect of IT is part of the business revenue stream and is a partner in making businesses profitable and efficient. Predictive analytics is a key enabler for this new requirement.

By Pete McCallum – Director, Data Center Solutions Architecture - FalconStor
21 Aug 2015

Let’s face it: embracing new storage technologies and capabilities, and upgrading to new hardware, often results in added complexity and cost. The reality is that when IT equipment, platforms, and applications do not integrate with one another, the resulting “sprawl” of storage islands and silos on disparate systems can be costly, risky, disruptive, and time-consuming.  But it does not have to be that way.

Few organizations have the luxury of performing a massive infrastructure replacement or maintaining completely identical infrastructures for primary and secondary storage. Hardware/platform incompatibility, different system generations, different architectures, and different media types can compromise even the most diligent efforts at protecting and replicating business critical data.

A properly architected software-defined storage approach can ease many of these integration and management pains.  Software-defined storage implemented at the network fabric layer, abstracted from the underlying hardware, bypasses storage sprawl issues because it standardizes all tools, data services, and management.

Horizontal, software-defined storage deployed across the infrastructure in a common way should accommodate storage silos in geographically dispersed data centers, locally on different storage systems, or across physical and virtual infrastructures. Software-defined storage eliminates the accumulation of point solutions and regards all storage as equal.  This enables the delivery of common data services like migration, continuity, recovery, and optimization that can be executed consistently across the entire storage infrastructure. That reduces complexity and the number of silos to manage, and lowers the cost of licensing data services array by array.

The key to solving the problem is not to solve it at all, but to work with it, through a truly horizontal software-defined storage platform that can marry unlike infrastructures, including arrays, servers, hypervisors, and the private or hybrid cloud.  It’s time to move the industry forward and BE FREE to eliminate the legacy of silos and infrastructure complexity.

