Is DNA the Next Generation of Data Backup?
May 2020 by Michael Cade, Senior Global Technologist, Veeam
As more of our work and personal lives have become digital, we’ve seen a staggering growth in the amount of data we’re generating, storing and accessing. According to various studies, Google processes 3.5 billion searches every day, while 4.3 million videos are watched on YouTube. More than 350 million photos are uploaded to Facebook every day. By 2025, it’s estimated that 463 exabytes of data will be created each day globally. And with around 40% of the world’s population yet to come online, the amount of data we’ll need to store and manage will skyrocket further.
Data is now the common denominator which sits across everything organisations do. Whether it’s driving the day-to-day activities we all take for granted or providing the new insights which shape our thinking around some of humanity’s biggest questions, data augments and empowers human intelligence.
With all of this in mind, it’s likely that we’ll need to fundamentally reconsider the data storage technologies we currently have on hand. The staggering amount of data we’re generating is already causing challenges, with data centre technologies requiring significant power and cooling, as well as ongoing maintenance and monitoring. We could be moving towards a huge bottleneck in available capabilities, as both the volume of data and the speed at which it must be accessed increase further. What’s more, hardware such as servers, hard drives and flash storage degrades over time. It may seem unlikely at first, but there’s much we can learn from the natural world about data storage. The medium here is DNA, and when it comes to preserving and archiving vital information, it has an unbeatable track record.
Nature’s storage medium
One alternative to our current storage devices could be DNA-based data storage. DNA has two big advantages: it is ultra-compact and, thanks to its primary role in creating life, easy to replicate. One gram of DNA could potentially hold as much as 455 exabytes of data, according to New Scientist. That’s roughly as much as the 463 exabytes expected to be created worldwide each day by 2025. And while DNA is itself quite fragile, when stored in the right conditions it can be incredibly stable. Fossilised remains thousands of years old have been found with DNA still intact. The longevity of cassettes and CDs just doesn’t compare, and so from an archiving and backup perspective, it could be the perfect material.
Progress on the technology has been extremely promising, with Microsoft and University of Washington researchers last year developing the world’s first DNA storage device that can carry out the entire process automatically. Using the device, researchers encoded the word ‘hello’ on to DNA, and were able to convert it back to data readable by a computer.
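To make the idea of writing ‘hello’ to DNA and reading it back more concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per nucleotide (A, C, G, T), which is an illustration only and not the scheme the Microsoft and University of Washington researchers used. Practical systems add error correction and avoid sequences that are hard to synthesise or read. The function names are hypothetical.

```python
# Illustrative mapping only: two bits per nucleotide.
# This is an assumption for the example, not the encoding used in the research.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Turn bytes into a string of nucleotide letters, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Reverse the mapping: nucleotide letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_to_dna(b"hello")
print(strand)                   # 20 bases for a 5-byte message
print(decode_from_dna(strand))  # b'hello'
```

Under this toy mapping each byte becomes four bases, so the 5-byte message occupies a strand of just 20 nucleotides.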
From DNA to glass
In the race to find the data storage medium of the future, glass is another material in the running. Microsoft’s Project Silica, for example, is a proof of concept that uses quartz glass as a storage medium. Lasers permanently change the structure of the glass, making it possible to store data that can then be read by machine learning algorithms. By taking up a fraction of the space, and not requiring the climate-controlled storage or other regular maintenance of typical storage media, it holds immense promise for archiving and backup activity.
But while techniques might be steadily improving, the time and cost of encoding and decoding the information need to come down before DNA data storage can be used commercially. Scientists have been experimenting with storing digital data in DNA since 2012, yet it still took 21 hours for that 5-byte ‘hello’ message to be written and then read back out. However, progress is steady: sequencing a human genome cost $100m in 2001, whereas today all it takes is two days and $1,000.
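A quick back-of-the-envelope calculation puts these figures in perspective. The sketch below uses only the numbers quoted above (5 bytes, 21 hours, $100m and $1,000), and the variable names are purely illustrative.

```python
# Figures taken from the text above; names are illustrative.
MESSAGE_BYTES = 5
ELAPSED_HOURS = 21

# Effective write-and-read throughput of the automated 'hello' demonstration.
bits_per_second = MESSAGE_BYTES * 8 / (ELAPSED_HOURS * 3600)

# How much cheaper human genome sequencing has become since 2001.
cost_reduction = 100_000_000 / 1_000

print(f"{bits_per_second:.6f} bits per second")
print(f"{cost_reduction:,.0f}x cheaper")
```

That works out to roughly 0.0005 bits per second for the round trip, set against a 100,000-fold fall in sequencing cost.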
The business of backup could be transformed by DNA. Archives and data centres, and their immense physical footprints, could be eliminated. The sum of the world’s knowledge may well one day be stored on something you need a microscope to observe. And as we generate even more data, and reach the limit of our current storage technologies, the value of powerful alternatives will only become greater. Today’s complex backup efforts could be reduced to a single record, created once, that lasts well beyond any living memory. The next generation of storage technology is in some ways already here – we just need to learn how to harness it.