A lot has changed in data storage and unstructured data management over the past 12 months. Cloud storage strategies are in the spotlight as costs rise and IT budgets tighten in uncertain economic times. Generative AI is creating new data storage and management requirements. Data migration is becoming more complex, yet more necessary, in an era of data center consolidation. And IT organizations are under intense pressure to contain costs and deliver greater value from data.
AI will enrich unstructured data to achieve better business results
The volume of unstructured data is huge, yet much of it still goes unused for several reasons: it is difficult and expensive to search, classify, segment, and move to AI engines and analytics tools. As AI tools and services mature and become accessible to many organizations, not just the largest and deepest-pocketed ones, there is a growing need to put this data to work to generate new business value.
Here’s the problem: researchers and data scientists who want to feed data to AI have no easy way to do so safely. Today it means writing scripts by hand, which takes days or weeks of work. In addition, AI and ML technologies are still prone to inaccuracy, bias, and false results.
We foresee increased demand for solutions that create a workflow in which AI can quickly find the data it needs, enrich it, and validate the results. Such a process might start with an AI tool that scans a cloud data lake or data center to find the right types of data for a project, such as all mammograms from 2022. The AI then enriches the metadata by scanning the contents of the files and assigning labels to them (e.g., “contains token X for later diagnosis”), and returns a data set that a human can check to confirm the result is correct. Unstructured data management that supports searching a global file index, and that can connect via API to AI tools to further identify and enrich the data, is invaluable: it saves time and improves the efficiency and accuracy of AI projects.
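The find, enrich, and validate workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FileRecord` structure, the in-memory `index`, the file paths, and the `stub_classifier` function are all hypothetical stand-ins for a real global file index and a real AI service, neither of which the article specifies.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One entry in a hypothetical global file index."""
    path: str
    file_type: str
    year: int
    tags: set = field(default_factory=set)


def find_candidates(index, file_type, year):
    """Step 1: query the index for files matching the project's criteria."""
    return [r for r in index if r.file_type == file_type and r.year == year]


def enrich(records, classify):
    """Step 2: add the labels returned by a classifier to each record's metadata."""
    for record in records:
        record.tags.update(classify(record))
    return records


def review_queue(records, label):
    """Step 3: return the labeled subset for a human to check."""
    return [r for r in records if label in r.tags]


def stub_classifier(record):
    # Placeholder for a real AI service call (an assumption for this sketch):
    # it simply labels files whose path contains "flagged".
    return {"contains-token-x"} if "flagged" in record.path else set()


# Hypothetical index contents, invented for illustration only.
index = [
    FileRecord("/scans/2022/a_flagged.dcm", "mammogram", 2022),
    FileRecord("/scans/2022/b.dcm", "mammogram", 2022),
    FileRecord("/scans/2021/c_flagged.dcm", "mammogram", 2021),
    FileRecord("/docs/2022/report.pdf", "report", 2022),
]

candidates = find_candidates(index, "mammogram", 2022)
enriched = enrich(candidates, stub_classifier)
to_review = review_queue(enriched, "contains-token-x")
print([r.path for r in to_review])
```

In a real deployment the classifier would be a call to an AI tool over an API, and the review queue would feed a human validation step before the labels are trusted.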
From Cloud First to Data First
During the height of the global pandemic, cloud-first strategies were all the rage, but today they are being reconsidered. IT organizations are building flexible, hybrid cloud and multi-cloud environments using multi-vendor technologies to suit different workloads. Some organizations have gotten burned by the cloud, finding that not only are they not saving enough, but sometimes they are even spending more than if they had kept the data in-house.
There are many reasons for this, but the bottom line is that the expected savings from moving most or all workloads to the cloud have not materialized. IT organizations will choose from the variety of data storage options on the market, whether on-premises or in the cloud, based on performance, cost, and data security requirements throughout the data lifecycle. The ability to move data easily as requirements change or better technology becomes available is paramount.
Therefore, data management tools that allow you to move huge volumes of unstructured data without vendor lock-in will become increasingly valuable.
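As a rough illustration of choosing storage by performance, cost, and security requirements, the sketch below picks the cheapest option that satisfies a workload's constraints. The option names, latency figures, and prices are hypothetical values invented for the example, not real products or quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    name: str
    location: str             # "on-premises" or "cloud"
    latency_ms: float         # typical access latency (illustrative)
    cost_per_tb_month: float  # illustrative figure, not a real price
    encrypted: bool


def choose_storage(options, max_latency_ms, require_encryption):
    """Return the cheapest option meeting the latency and security needs, or None."""
    eligible = [
        o for o in options
        if o.latency_ms <= max_latency_ms and (o.encrypted or not require_encryption)
    ]
    return min(eligible, key=lambda o: o.cost_per_tb_month, default=None)


# Hypothetical catalog of storage options.
options = [
    StorageOption("onprem-nas", "on-premises", 2.0, 30.0, True),
    StorageOption("cloud-standard", "cloud", 20.0, 20.0, True),
    StorageOption("cloud-archive", "cloud", 5000.0, 2.0, False),
]

active = choose_storage(options, max_latency_ms=50, require_encryption=True)
cold = choose_storage(options, max_latency_ms=10_000, require_encryption=False)
print(active.name, cold.name)
```

Because requirements shift over the data lifecycle, the same data may pass through several of these options over time, which is why the ability to move it without lock-in matters.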
Migration of unstructured data will become more intelligent and automated
Enterprise data migration has traditionally been a complex, manual task that requires extensive professional services, especially when huge volumes of unstructured data are involved. Automation and AI will change this by enabling intelligent, efficient, and adaptive data migration that no longer needs constant oversight from IT managers.
These tools will be able to detect problems on the fly and fix them on their own. As they learn, advanced migration planning tools will recommend optimal storage tiers for different workloads and use cases. This is a timely development because data migration depends on each customer’s changing environment: its firewalls, network connections, and security configurations. Enterprise customers will look for solutions that deliver orders-of-magnitude faster migrations with better long-term results and fewer data losses, errors, and security risks.
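One small building block of a self-correcting migration is copying a file, verifying it with a checksum, and retrying automatically when something goes wrong. The sketch below uses only the Python standard library; the `migrate_file` helper, its `max_attempts` parameter, and the temporary paths are assumptions made for illustration and do not represent any vendor's tool.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(src, dst, max_attempts=3):
    """Copy src to dst, verify the checksum, and retry on failure.

    Returns the number of attempts used; raises RuntimeError if all fail.
    """
    expected = sha256(src)
    for attempt in range(1, max_attempts + 1):
        try:
            Path(dst).parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            if sha256(dst) == expected:
                return attempt
        except OSError:
            pass  # e.g. a transient network or permission error; try again
    raise RuntimeError(f"could not migrate {src} after {max_attempts} attempts")


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "source" / "data.bin"
    src.parent.mkdir()
    src.write_bytes(b"example payload")
    dst = Path(tmp) / "target" / "data.bin"
    attempts = migrate_file(src, dst)
    copied_ok = dst.read_bytes() == b"example payload"
    print(attempts, copied_ok)
```

A production tool would add backoff between retries, logging, and handling for environment-specific failures such as firewall or permission changes, but the verify-then-retry loop is the core idea.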
Developing a career in storage technology will require mastering FinOps and cross-silo skills
Given these trends, IT storage professionals will need to gain additional knowledge and experience to operate more cost-effectively, more efficiently, and in line with business needs. FinOps will become part of the storage architect’s skill set in 2024. As storage becomes more software- and service-centric, managing hardware becomes less necessary. As a result, storage professionals will spend most of their time managing vendors and contracts and delivering secure, cost-effective data services to departments and users.
Enterprises are also moving away from single-vendor environments, so storage administrators must be able to work across different technologies rather than specialize in one platform. This requires broader skills and knowledge in networking, security, cloud architecture, cost modeling, and data analytics.
Finally, job titles related to storage will give way to data-related titles, such as “insight engineer” or “data management architect.” In mature infrastructure teams, storage managers will work more closely with data science and AI teams to acquire AI-ready infrastructure and to develop plans for classifying data and feeding it into analytics platforms.