James Din
March 18, 2025 21:23
NVIDIA has introduced Mission Control, an AI data center management platform that enhances the operation of AI factories with advanced orchestration and automation, as announced at the NVIDIA GTC conference.
NVIDIA has announced its latest innovation, Mission Control, a comprehensive operations and orchestration software platform designed to streamline the management of AI data centers. According to the NVIDIA blog, the software, announced at the NVIDIA GTC global AI conference, is intended to automate and enhance the complex processes involved in running AI factories.
Transforming AI factory operations
Mission Control is set to transform AI factory operations by helping NVIDIA Blackwell-based systems run efficiently from pretraining through post-training. Enterprises can switch seamlessly between training and inference workloads, which allows resource allocation to be optimized dynamically. This capability matters most to businesses that need to turn their data into actionable insights quickly.
The software integrates NVIDIA Run:ai technology, which enhances workload orchestration and increases infrastructure utilization by up to five times. Autonomous recovery capabilities, supported by rapid checkpointing and automated tiered restarts, promise up to 10 times faster job recovery, significantly improving AI training and inference efficiency.
Strengthening infrastructure management
The design of Mission Control focuses on minimizing the time that companies spend managing their AI infrastructure. It automates every aspect of AI factory operations, from configuring deployments to managing developer workloads. By predicting and identifying sources of downtime and inefficiency, it aims to save time, energy and costs.
The platform offers several advantages, including simplified cluster setup, seamless workload orchestration, energy-optimized power profiles, and customizable dashboards. These features help businesses maintain uninterrupted operations while optimizing performance.
Collaboration with major system manufacturers
Major system manufacturers such as Dell, HPE, Lenovo and Supermicro plan to integrate NVIDIA Mission Control into their products. This integration allows businesses to scale AI models easily and turn data into actionable insights faster than before. For example, Dell is including Mission Control in its AI Factory solutions, and HPE will offer it with its NVIDIA Grace Blackwell systems.
Availability and future prospects
NVIDIA Mission Control is currently available on NVIDIA DGX GB200 and DGX B200 systems. GB200 NVL72 systems with Mission Control are expected to be available soon from global providers such as Dell, HPE, Lenovo and Supermicro. Additionally, NVIDIA's Base Command Manager software is now free to use within a limited scope, providing a cost-effective option for AI cluster management.
As NVIDIA continues to strengthen its AI offerings, Mission Control represents a critical step toward making sophisticated AI infrastructure more accessible and efficient for industries around the world.