WEKA Debuts New Solution Blueprint to Simplify AI Inferencing at Scale

News provided by

WekaIO

19 Nov, 2024, 15:06 GMT

WARRP Reference Architecture Provides Comprehensive Modular Solution That Accelerates the Development of RAG-based Inferencing Environments

ATLANTA and CAMPBELL, Calif., Nov. 19, 2024 /PRNewswire/ -- From Supercomputing 2024: WEKA, the AI-native data platform company, debuted a new reference architecture solution to simplify and streamline the development and implementation of enterprise AI inferencing environments. The WEKA AI RAG Reference Platform (WARRP) provides generative AI (GenAI) developers and cloud architects with a design blueprint for the development of a robust inferencing infrastructure framework that incorporates retrieval-augmented generation (RAG), a technique used in the AI inference process to enable large language models (LLMs) to gather new data from external sources.

Introducing WARRP (PRNewsFoto/WekaIO)

The Criticality of RAG in Building Safe, Reliable AI Operations
According to a recent study of global AI trends conducted by S&P Global Market Intelligence, GenAI has rapidly emerged as the most highly adopted AI modality, eclipsing all other AI applications in the enterprise.[1]

A primary challenge enterprises face when deploying LLMs is ensuring they can effectively retrieve and contextualize new data across multiple environments and from external sources to aid in AI inference. RAG is the leading technique for AI inference, and it is used to enhance trained AI models by safely retrieving new insights from external data sources. Using RAG in the inferencing process can help reduce AI model hallucinations and improve output accuracy, reliability and richness, reducing the need for costly retraining cycles.

However, creating robust production-ready inferencing environments that can support RAG frameworks at scale is complex and challenging, as architectures, best practices, tools, and testing strategies are still rapidly evolving.
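
To make the retrieval step described above concrete, the following is a minimal, illustrative sketch of RAG in Python. It is not taken from WARRP or from any WEKA, NVIDIA, Run:ai, or Milvus software. The in-memory document list, the bag-of-words similarity, and the function names (`retrieve`, `build_prompt`) are assumptions chosen for illustration; a production pipeline would typically use an embedding model and a vector database instead.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an external data source (assumption for illustration).
DOCUMENTS = [
    "WARRP is a reference architecture for RAG-based inferencing environments.",
    "Retrieval-augmented generation retrieves external data at inference time.",
    "Kubernetes schedules containers across a cluster of machines.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text):
    """Represent text as a bag-of-words term-count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend retrieved context to the question before it is sent to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("How does Kubernetes schedule containers?", DOCUMENTS))
```

The point of the sketch is that the model itself is unchanged: new information reaches it through the prompt at inference time, which is why RAG can reduce the need for retraining.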

A Comprehensive Blueprint for Inferencing Acceleration
With WARRP, WEKA has defined an infrastructure-agnostic reference architecture that can be leveraged to build and deploy production-quality, high-performance RAG solutions at scale.

Designed to help organizations quickly build and implement RAG-based AI inferencing pipelines, WARRP provides a comprehensive blueprint of modular components for developing and deploying a world-class AI inference environment optimized for workload portability, distributed global data centers and multicloud environments.

The WARRP reference architecture builds on WEKA® Data Platform software running on an organization's preferred cloud or server hardware as its foundational layer. It then incorporates class-leading enterprise AI frameworks from NVIDIA (including NVIDIA NIM™ microservices and NVIDIA NeMo™ Retriever, both part of the NVIDIA AI Enterprise software platform), advanced AI workload and GPU orchestration capabilities from Run:ai, and popular commercial and open-source data management software technologies such as Kubernetes for data orchestration and Milvus Vector DB for data ingestion.

"As the first wave of generative AI technologies began moving into the enterprise in 2023, most organizations' compute and data infrastructure resources were focused on AI model training. As GenAI models and applications have matured, many enterprises are now preparing to shift these resources to focus on inferencing but may not know where to begin," said Shimon Ben-David, chief technology officer at WEKA. "Running AI inferencing at scale is extremely challenging. We are developing the WEKA AI RAG Architecture Platform on leading AI and cloud infrastructure solutions from WEKA, NVIDIA, Run:ai, Kubernetes, Milvus, and others to provide a robust production-ready blueprint that streamlines the process of implementing RAG to improve the accuracy, security and cost of running enterprise AI models."

WARRP delivers a flexible, modular framework that can support a variety of LLM deployments, offering scalability, adaptability, and exceptional performance in production environments. Key benefits include:

  • Build a Production-Ready Inferencing Environment Faster: WARRP's infrastructure and cloud-agnostic architecture can be used by GenAI developers and cloud architects to streamline GenAI application development and run inferencing operations at scale faster. It seamlessly integrates with an organization's existing and future AI infrastructure components, large and small language models, and preferred server, hyperscale or specialty AI cloud providers, giving organizations exceptional flexibility and choice in architecting their AI inference stack.
  • Hardware, Software, and Cloud Agnostic: WARRP's modular design supports most major server and cloud service providers. The architecture enables organizations to easily achieve workload portability without compromising performance by allowing AI practitioners to run the same workload on their preferred hyperscale cloud platform, AI cloud service, or on-premises server hardware with minimal configuration changes. Whether deployed in a public, private, or hybrid cloud environment, AI pipelines demonstrate stable behavior and predictable results, simplifying hybrid and multicloud operations.
  • End-to-End AI Inferencing Stack Optimization: Running RAG pipelines can be highly demanding, especially when dealing with large model repositories and complex AI workloads. Organizations can achieve significant performance improvements by integrating the WEKA Data Platform into their AI inferencing stack, particularly in multi-model inference scenarios. The WEKA Data Platform's ability to load and unload models efficiently further accelerates token delivery for user prompts, especially in complex, chained inference workflows involving multiple AI models.

"As AI adoption accelerates, there is a critical need for simplified ways to deploy production workloads at scale. Meanwhile, RAG-based inferencing is emerging as an important frontier in the AI innovation race, bringing new considerations for an organization's underlying data infrastructure," said Ronen Dar, chief technology officer at Run:ai. "The WARRP reference architecture provides an excellent solution for customers building an inference environment, providing an essential blueprint to help them develop quickly, flexibly and securely using industry-leading components from NVIDIA, WEKA and Run:ai to maximize GPU utilization across private, public and hybrid cloud environments. This combination is a win-win for customers who want to outpace their competition on the cutting edge of AI innovation."

"Enterprises are looking for a simple way to embed their data to build and deploy RAG pipelines," said Amanda Saunders, director of Enterprise Generative AI software, NVIDIA. "Using NVIDIA NIM and NeMo with WEKA will give enterprise customers a fast path to develop, deploy and run high-performance AI inference and RAG operations at scale."

The first release of the WARRP reference architecture is now available for free download. Visit https://www.weka.io/resources/reference-architecture/warrp-weka-ai-rag-reference-platform/ to obtain a copy.

Supercomputing 2024 attendees can visit WEKA in Booth #1931 for more details and a demo of the new solution. 

Supporting AI Cloud Service Provider Quotes

Applied Digital
"As companies increasingly harness advanced AI and GenAI inferencing to empower their customers and employees, they recognize the benefits of leveraging RAG for greater simplicity, functionality and efficiency," said Mike Maniscalco, chief technology officer at Applied Digital. "WEKA's WARRP stack provides a highly useful reference framework to deliver RAG pipelines into a production deployment at scale, supported by powerful NVIDIA technology and reliable, scalable cloud infrastructure."

Ori Cloud
"Leading GenAI companies are running on Ori Cloud to train the world's largest LLMs and achieving maximum GPU utilization thanks to our integration with the WEKA Data Platform," said Mahdi Yahya, founder and chief executive officer at Ori Cloud. "We look forward to working with WEKA to build robust inference solutions using the WARRP architecture to help Ori Cloud customers maximize the benefits of RAG pipelines to accelerate their AI innovation."

Yotta
"To run AI effectively, speed, flexibility, and scalability are required. Yotta's AI solutions, powered by NVIDIA GPUs and built on the WEKA Data Platform, are helping organizations to push the boundaries of what's possible in AI, offering unparalleled performance and flexible scale," said Sunil Gupta, chief executive officer at Yotta. "We look forward to collaborating with WEKA to further enhance our Inference-as-a-Service offerings for natural-language processing, computer vision, and generative AI leveraging the WARRP reference architecture and NVIDIA NIM microservices."

About WEKA 
WEKA is architecting a new approach to the enterprise data stack built for the AI era. The WEKA® Data Platform sets the standard for AI infrastructure with a cloud and AI-native architecture that can be deployed anywhere, providing seamless data portability across on-premises, cloud, and edge environments. It transforms legacy data silos into dynamic data pipelines that accelerate GPUs, AI model training and inference, and other performance-intensive workloads, enabling them to work more efficiently, consume less energy, and reduce associated carbon emissions. WEKA helps the world's most innovative enterprises and research organizations overcome complex data challenges to reach discoveries, insights, and outcomes faster and more sustainably – including 12 of the Fortune 50. Visit www.weka.io to learn more or connect with WEKA on LinkedIn, X, and Facebook.

WEKA and the WEKA logo are registered trademarks of WekaIO, Inc. Other trade names used herein may be trademarks of their respective owners. 

[1] 2024 Global Trends in AI, September 2024, S&P Global Market Intelligence

Photo - https://mma.prnewswire.com/media/2561543/4304845.jpg
Logo - https://mma.prnewswire.com/media/1796062/WEKA_v1_Logo.jpg

Copyright © 2025 PR Newswire Europe Limited. All Rights Reserved. A Cision company.