How VAST Data’s Platform Is Removing Barriers To AI Innovation
Faster access to more data regardless of where the data resides will accelerate the adoption and success of AI-driven applications, solutions, and discoveries.
I recently had the opportunity to speak with Renen Hallak, Founder and CEO of VAST Data, about their new unified data platform for AI. VAST made waves in 2019 with the release of their VAST DataStore, a highly performant and scalable all-flash storage system. However, as I learned from Renen, storage was only the opening act in VAST's grander vision to become an AI data platform.
With the hype and investment around AI reaching astronomical levels, the demands on infrastructure are greater than ever. VAST aims to eliminate common compromises around performance, scale, geography, and ease of use to unlock AI's potential. On August 1st, VAST unveiled its expanded data platform, comprising a new database and compute capabilities alongside its flagship VAST DataStore.
The VAST Data Journey Started With a Revolutionary Architecture
VAST's journey began in 2016 with the creation of an innovative architecture called Disaggregated Shared Everything (DASE). According to Renen, VAST's goal from the outset was to provide AI algorithms with unfettered access to more data more quickly.
DASE completely reimagines data center design by separating storage and computing into independent resource pools that can scale in parallel. This eliminates bottlenecks like cache coherence and metadata management that restrict scale-out architectures. VAST also developed new shared data structures and protocols enabling consistent, efficient data access across the disaggregated environment.
As a result, DASE delivers previously unattainable performance at scale. It empowers AI workloads to rapidly analyze immense datasets in ways not possible with traditional infrastructure. By combining more data, faster access, and direct connectivity to analog and digital data sources, VAST believes DASE will unlock new algorithm breakthroughs.
VAST DataStore: High-Speed Unstructured Data Repository
Built on DASE, VAST's flagship product is the VAST DataStore, released in 2019. The VAST DataStore condenses SAN and NAS capabilities into a unified all-flash system specialized for unstructured data.
Leveraging the parallelism of DASE, the VAST DataStore cost-effectively offers file, object, and HPC storage using only flash memory. There is no need for a separate flash performance tier with slower disks handling capacity in the background. All data enjoys rapid, random access.
The VAST DataStore efficiently handles unstructured data at exabyte scale through standard interfaces like NFS, SMB, and S3. Behind the scenes, DASE stores data in tiny elements accessed in parallel by compute resources. Features like deduplication, compression, snapshots, and QoS are implemented in real time via DASE's persistent write buffer.
New VAST DataBase and VAST DataEngine Expand Capabilities
Building on the VAST DataStore's success, VAST Data recently announced their expanded platform, introducing the VAST DataBase and VAST DataEngine. Together with the VAST DataStore, these form a unified environment for data-centric AI spanning ingestion, storage, processing, and querying.
The VAST DataBase leverages DASE to deliver a hyperscale database for both transactional and analytical workloads. Using an innovative columnar format, the VAST DataBase reduces data sizes for lightning-fast query performance at scale. DASE allows simultaneous OLTP inserts and OLAP queries with no tradeoffs. The database also serves as a metadata catalog across unstructured data in the VAST DataStore.
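To make the columnar idea concrete, here is a minimal Python sketch of why storing values column by column tends to shrink data and speed up analytical scans. It is an illustration of the general technique only, not VAST's actual format: the sample rows and the helper names `to_columns`, `run_length_encode`, and `column_sum` are all hypothetical.

```python
from itertools import groupby

# Hypothetical sample data, stored the way a row-oriented system would see it.
SAMPLE_ROWS = [
    {"region": "eu", "status": "ok", "latency_ms": 12},
    {"region": "eu", "status": "ok", "latency_ms": 15},
    {"region": "eu", "status": "error", "latency_ms": 40},
    {"region": "us", "status": "ok", "latency_ms": 11},
    {"region": "us", "status": "ok", "latency_ms": 13},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs.

    Columns often hold long runs of the same value, which is one reason
    a columnar layout compresses well.
    """
    return [(value, len(list(group))) for value, group in groupby(values)]


def column_sum(columns, name):
    """An analytical query touches only the one column it needs."""
    return sum(columns[name])
```

Here the five `region` values collapse to two run-length pairs, and summing `latency_ms` never reads the `region` or `status` columns. A production columnar engine layers far more on top (encoding choices, statistics, indexing), none of which is shown here.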
The VAST DataEngine enables processing data workloads directly within the global data fabric. It can optimize task placement based on factors like data locality and cost. Developers can create recursive compute loops triggered by data events anywhere in the fabric. This continuous processing paradigm supercharges data-driven AI workflows.
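As a rough illustration of placement driven by data locality and cost, the Python sketch below picks the cheapest site for a task by adding a compute price to the cost of moving data that is not already local. This is an assumption-laden toy, not the VAST DataEngine's scheduler: `Site`, `Task`, `placement_cost`, `choose_site`, and every price are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    compute_cost: float          # hypothetical price to run the task here
    transfer_cost_per_gb: float  # hypothetical price to pull remote data in


@dataclass(frozen=True)
class Task:
    name: str
    data_site: str  # where the task's input data currently lives
    data_gb: float


def placement_cost(task, site):
    """Compute cost plus a data-movement penalty when the data is remote."""
    if site.name == task.data_site:
        return site.compute_cost
    return site.compute_cost + task.data_gb * site.transfer_cost_per_gb


def choose_site(task, sites):
    """Return the site with the lowest total cost for this task."""
    return min(sites, key=lambda site: placement_cost(task, site))
```

With an on-prem site priced at 5.0 and a cloud site priced at 2.0 plus 0.09 per GB moved, a task with 100 GB of on-prem data stays on-prem, while a task with 10 GB is cheaper to run in the cloud. A real scheduler would weigh more factors, such as latency, capacity, and policy.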
VAST DataSpace: Limitless Data Fabric Powering AI Innovation
Tying everything together is VAST DataSpace, a global namespace unifying data silos across on-prem, cloud, and edge locations. This groundbreaking data accessibility allows apps to harness data without central ownership. Instead of moving data to compute, compute comes to the data for optimal efficiency.
With a unified data fabric removing traditional limitations, exciting new AI use cases emerge. VAST customer Pixar revolutionized animated film production through globally shared datasets. Online travel giant Agoda uses VAST to power its entire big data and machine learning pipeline.
By eliminating compromises around data access, VAST Data is pioneering the next evolution of AI infrastructure. Performance, scale, geography, and ease-of-use barriers are collapsing, allowing enterprises to focus on innovations rather than infrastructure. VAST Data is unlocking a new era where ideas, not technology constraints, determine the boundaries of AI innovation.
The Possibilities With the Unified VAST Data Platform
The capabilities enabled by VAST Data's unified platform are diverse, spanning real-time analytics, model training, database applications, and more. Let's explore some use cases:
Real-Time Analytics
For real-time analytics, the VAST DataStore offers ultra-fast access to vast amounts of unstructured data. The VAST DataBase facilitates ad hoc analytical queries across billions of rows of structured data. Bringing these together in VAST DataSpace allows for rapid analysis correlating unstructured and structured data streams.
Continuous Model Training
The VAST DataEngine enables continuous model training workflows. As new unstructured data lands in the VAST DataStore, events trigger model training jobs to execute in VAST DataSpace using the latest data. Results get written for immediate inference access.
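The event-driven loop described here can be sketched in a few lines of Python. This is a minimal in-process illustration under stated assumptions, not the VAST DataEngine API: `EventBus`, `MeanModel`, `build_pipeline`, and the `"data_landed"` event name are hypothetical, and a running mean stands in for a real model.

```python
from collections import defaultdict


class EventBus:
    """Tiny publish/subscribe hub standing in for a data-event service."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


class MeanModel:
    """Toy 'model': a running mean that is updated incrementally."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, values):
        for value in values:
            self.count += 1
            self.mean += (value - self.mean) / self.count


def build_pipeline():
    """Wire new-data events to a training step that publishes its result."""
    bus = EventBus()
    model = MeanModel()
    published = {}  # stands in for wherever inference reads results from

    def on_new_data(payload):
        model.update(payload["values"])
        published["latest"] = {"mean": model.mean, "trained_on": model.count}

    bus.subscribe("data_landed", on_new_data)
    return bus, published
```

Each call such as `bus.publish("data_landed", {"values": [1.0, 3.0]})` retrains on the new values and refreshes `published["latest"]`, so a reader of that store always sees the most recent result. In a real deployment the handler would launch a training job rather than run inline.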
Cloudbursting
To scale analytics or training workloads, VAST DataSpace can burst into the public cloud while maintaining a unified global namespace. This allows leveraging cloud resources for extra capacity without data migration.
Hyperscale Database
The VAST DataBase's simultaneous OLTP and OLAP support at extreme scale provides an ideal foundation for large-scale transactional applications that also require analytical insights.
Data Lakes
For data lake needs, the VAST DataStore offers a centralized repository for all enterprise data. The VAST DataBase provides a metadata catalog of data assets. VAST DataSpace ties everything together into a cohesive environment.
In summary, the unified nature of the VAST Data platform lends itself to a wide range of data-intensive use cases, from real-time analytics and continuous model training to hyperscale databases and data lakes.
The Road Ahead for VAST Data
VAST shows no signs of slowing down. The company recently raised $210 million at a $3.7 billion valuation. VAST is aggressively expanding, including the launch of a new R&D facility focused on advancing DASE technologies.
Areas where VAST is innovating include:
- Making DASE accessible as a composable data services fabric
- Expanding global file system capabilities
- New data reduction techniques like DNA compression
- Optimizations for AI/ML, GPGPU workloads
- Zone storage tiering for low-latency data access
- Hybrid and multi-cloud data management
Additionally, Renen hinted at expanding VAST's market focus beyond AI and analytics into emerging areas like ML Ops, the metaverse, and Web 3.0.
It's an exciting time to watch how pioneers like VAST Data reshape the limits of what's possible with data. As innovations in AI and next-generation applications create immense data demands, the companies fulfilling these infrastructure needs will power the most groundbreaking advancements.