Pure Storage claims that it has developed the IT industry’s first integrated AI-ready infrastructure for deep learning. That may be true, but artificial intelligence (AI) is not new – far from it. Marketers have many phrases for making even an old piece of technology sound brand new, or for placing a vendor on the first rung of a given market’s ladder. In reality, a vendor making such a claim may not be the first. Then again, it might be.
AI has been the talk of the town for quite a while now, and it’s not all about storage: there are also solutions that use machine learning and AI for a wide range of purposes, including WAN data acceleration. So what, if anything, has changed to support Pure Storage’s claims? After all, many companies say their technology is driven by integrated AI and machine learning for a plethora of purposes.
Not ‘one thing’
Tony Lock, Director of Engagement & Distinguished Analyst at Freeform Dynamics, says: “Artificial intelligence, machine learning (ML) and deep learning (DL) systems all require IT infrastructure on which to run. The important thing to recognise is that each of these technologies is not ‘one thing’.”
They are not one thing in his view because you could have “very different requirements in terms of CPU, storage and networking.” This depends on the specific details of the workload. He explains that DL and ML have two components: training, or learning, and model execution. “The learning phase may involve feeding very large datasets to build the model, while the execution phase may then need to run in a very tight environment in a lot of scenarios or may also have to handle large datasets”, he says.
So, he thinks that the key challenge is to “use building blocks of IT Infrastructure suited to each workload, using a phase of the project, rather than design and assemble bespoke solutions each time.” He adds: “The biggest change is that AI, ML, and DL are moving out of R&D and into mainstream usage. Mainstream infrastructure that uses demand consistently can be put together quickly without having to design unique systems every time.”
To design infrastructure and new systems every time would be unjustifiably costly in both time and money, and completely unnecessary. For efficiency’s sake, there needs to be a large measure of flexibility. He explains that several vendors are bringing to market infrastructure systems that are designed specifically for AI and ML workloads. He says these can be adapted to meet specific requirements and uses without much need to “research, design and build each one individually, as they are also fully supported, and a matter of great importance for everyday enterprise usage.”
Not just storage
It’s also important to recognise that AI isn’t just about storage. As with all AI and big data applications, the greater the number of data points, the better the outcomes. The freshness of the data has an impact on this too, and WANs have long been a problem in the computing and data world. That problem has become even more pronounced now that increased bandwidth is available at a lower cost per Mb.
Organisations are also now collecting increasing amounts of data from further and further afield and, because of the latencies involved, it is becoming increasingly difficult to realise the true performance of these WAN connections. Even a modest latency of 10ms can have a dramatic effect on throughput, cutting it by 70% or more of its full potential. By using AI and parallelisation techniques to mitigate the effects of latency and to manage the data on the WAN, it becomes possible to bring the utilisation of the WAN connection back up to 98% of capacity.
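To see why a modest amount of latency can hurt so much, consider a single TCP stream: it can have at most one window of unacknowledged data in flight per round trip, so its throughput is capped at the window size divided by the round-trip time, whatever the speed of the link. The short Python sketch below illustrates that bound and why running streams in parallel helps. It is illustrative only: the 64 KiB window, the link speeds and the reading of 10ms as a round-trip time are assumptions chosen for the example, not measurements, and the function names are hypothetical rather than part of any product.

```python
def max_stream_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one TCP stream: one window of data per round trip."""
    rtt_s = rtt_ms / 1000
    return window_bytes * 8 / rtt_s / 1e6


def link_utilisation(link_mbps: float, window_bytes: int,
                     rtt_ms: float, streams: int = 1) -> float:
    """Fraction of the link that `streams` parallel streams can fill."""
    per_stream = max_stream_throughput_mbps(window_bytes, rtt_ms)
    achievable = min(link_mbps, streams * per_stream)
    return achievable / link_mbps


if __name__ == "__main__":
    WINDOW = 64 * 1024   # assumed 64 KiB window (no window scaling)
    RTT_MS = 10          # assumed round-trip time

    # One stream is capped at roughly 52 Mb/s, so a faster link changes nothing.
    for link in (300, 1_000, 10_000):   # link speeds in Mb/s
        used = link_utilisation(link, WINDOW, RTT_MS)
        print(f"{link:>6} Mb/s link, 1 stream: {used:.1%} utilised")

    # Parallel streams each carry their own window, so the total scales up
    # until the link itself becomes the limit.
    for n in (1, 4, 16, 20):
        used = link_utilisation(1_000, WINDOW, RTT_MS, streams=n)
        print(f"1000 Mb/s link, {n:>2} streams: {used:.1%} utilised")
```

Real TCP stacks can negotiate much larger windows, and packet loss changes the picture further, so the exact figures will differ in practice. The point of the sketch is only that the ceiling is set by latency rather than by bandwidth.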
Poor WAN speed isn’t just about latency. It is a combination of many factors, with another key element being the effects of packet loss on the WAN. As with latency, it doesn’t take much packet loss to make a huge difference to throughput: even a modest 0.1% has a considerable effect, and at 0.5% the effect is dramatic. When latency and packet loss are combined, the efficiency of the WAN can drop below 1%.
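One widely quoted rule of thumb for this combined effect is the Mathis approximation, which estimates the steady-state throughput of a single classic (Reno-style) TCP stream as roughly (MSS / RTT) × (C / √p), where p is the packet loss rate and C is about 1.22. The Python sketch below applies it. The formula is a simplified model, not a measurement from the article, and the 1,460-byte segment size, the 10ms round-trip time and the 10 Gb/s link are assumptions chosen for illustration.

```python
import math

# Constant from the Mathis et al. approximation, sqrt(3/2), about 1.22.
MATHIS_CONSTANT = math.sqrt(3 / 2)


def mathis_throughput_mbps(mss_bytes: int, rtt_ms: float,
                           loss_rate: float) -> float:
    """Approximate throughput of one loss-limited TCP stream, in Mb/s."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in the range (0, 1]")
    rtt_s = rtt_ms / 1000
    bits_per_second = (mss_bytes * 8 / rtt_s) * (MATHIS_CONSTANT / math.sqrt(loss_rate))
    return bits_per_second / 1e6


if __name__ == "__main__":
    MSS = 1460          # assumed segment size in bytes
    RTT_MS = 10         # assumed round-trip time
    LINK_MBPS = 10_000  # assumed 10 Gb/s WAN link

    for loss in (0.001, 0.005):   # 0.1% and 0.5% packet loss
        mbps = mathis_throughput_mbps(MSS, RTT_MS, loss)
        share = min(mbps, LINK_MBPS) / LINK_MBPS
        print(f"loss {loss:.1%}: about {mbps:.0f} Mb/s, "
              f"{share:.2%} of the link")
```

Under these assumptions a single stream manages only tens of Mb/s, a small fraction of a fast link. Because throughput falls with the square root of the loss rate and in direct proportion to latency, longer distances make the shortfall worse.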
Lock thinks: “WAN acceleration may be a requirement for specific use cases, for instance, if large data sets need to be moved a lot, or for model training.” He adds that latency may be important for some systems but not for others, and he rightly points out that “latency is a natural consequence of physics; so as bandwidth increases it allows more data to be moved per second, but this does not move data more quickly”. He therefore finds that the best way to mitigate latency is to put the data processing close to the action.
The first reaction to poor performance over networks is to add more bandwidth. That worked for early WAN connections, where the bandwidth was low compared to the WANs now commonly available. Today, adding bandwidth typically resolves very little. For example, a connection with WAN speeds anywhere from 300Mb/s to 10Gb/s and beyond, with latency over 10ms, will show no improvement if extra bandwidth is added.
Selling extra bandwidth is only good for the carriers’ revenues; they will sell your unused bandwidth to other customers (a bit like airlines that overbook flights), but you will see diminishing returns on your investment as you increase the bandwidth. That is not the end of the story, I’m afraid: just pumping as much data down the WAN link as possible can have a negative effect. As soon as you ‘overfill the pipe’, devices along the route will start dropping packets, leading to packet loss.
Edge computing is certainly one way to address latency and packet loss, but WAN data acceleration goes a step further by mitigating their impact on data moved at speed and over distance. Even big data analysis needn’t be performed at the edge. With solutions such as PORTrockIT, data can be processed, moved and analysed even when datacentres, or more specifically the data sources, are located thousands of miles apart.
Latency is misunderstood
The trouble is that not everyone seems to comprehend the true impact and nature of latency and packet loss. I’ll explain why. For the past 15 years or so, organisations have used WAN optimisation techniques to improve the performance of applications across the WAN by using local caching and data deduplication techniques. These have been hiding the effects of WAN latency and packet loss from the user. So, even the most experienced IT professional may not have fully worked out their impact on their datacentres, systems, networks and applications.
The nature of data has changed dramatically over the last couple of years. Organisations are now transferring increasing amounts of data that contain rich media, compressed data or encrypted data. These do not lend themselves to optimisation with data deduplication techniques. It is therefore only now that the effects of latency and packet loss are rearing their ugly heads.
To maximise the flow of data across the WAN you must manage the data flow, to avoid the all too common situation drivers find themselves in on freeways or motorways: the concertina effect. Everything stops and then starts again, leaving large gaps in the road with no cars on them.
AI, ML and DL predictions
There will always be fads. It was once about the cloud, and before that, it was “I” everything. Throwing AI into the equation is not the answer. The fundamentals need to be changed first, before adding AI. For example, the European jet fighter, the Typhoon, has fantastic manoeuvrability but it is impossible to fly without computer control. Without ‘AI’, the plane becomes inherently unstable in flight.
Lock nevertheless predicts that AI, machine learning and deep learning are likely to evolve very rapidly over the next decade. “Its use will become more widespread and embedded in many processes, perhaps without even being recognised”, he says.
Whatever the future holds, here are my top 5 tips for using AI to accelerate WAN infrastructure:
- If you are using application data optimisation and your WAN is below 100Mb/s, consider staying with WAN optimisation.
- If your data is compressed, already deduped or encrypted, investigate WAN data acceleration.
- WAN acceleration will allow you to use higher-latency, noisy, lower-cost WAN connections, rather than dedicated low-latency, low-loss WAN connections.
- For large data volumes and richly-compressed, encrypted data, WAN data acceleration is key.
- Think differently – if distance and performance are no longer an inhibitor to WAN performance, then how can you change your digital strategy to include the use of the WAN as an enabler?
AI and machine learning can be used to accelerate WAN infrastructure by analysing its performance and automatically making network administrative adjustments, which removes human error. This mitigates the effects of latency and packet loss. In turn, it will accelerate your data, reducing transfer times from days to hours or minutes. It will also reduce the buffering experienced by organisations that use or consume rich media.
By David Trossell, CEO and CTO of Bridgeworks.