Database Software boosts big data analytics.

Press Release Summary:



With gNet(TM) for Apache Hadoop, EMC® Greenplum® Database 4.2 enables parallel import and export of all data from Hadoop. Integration with EMC Data Domain deduplication storage systems via EMC Data Domain Boost results in fast, inline deduplication with up to 26.3 TB/hr throughput and backup of over 173 TB in less than 8 hr. Release 4.2 enables turnkey in-database analytics via Greenplum Extensions, which can be downloaded from EMC Subscribenet and installed using Greenplum Package Manager.



Original Press Release:



New EMC Greenplum Database Enhancements Boost Big Data Analytics



Updates Include High-Performance gNet for Apache Hadoop, EMC Data Domain Boost, Turnkey In-Database Analytics and Faster Migration to Greenplum; Introduces Greenplum Command Center

HOPKINTON, Mass., --

News Summary:

-- The new EMC® Greenplum® Database 4.2, significantly enhancing Big Data integration and database manageability and performance, includes a high-performance gNet(TM) for Apache Hadoop; simpler, scalable backup with EMC Data Domain® Boost; faster migrations to Greenplum; turnkey in-database analytics; and targeted performance optimization.

-- gNet combined with Hadoop integration enables quicker analysis and insights through high-performance parallel import and export of compressed and uncompressed data from Hadoop clusters.

-- Greenplum Database now includes simpler, scalable backup with Data Domain Boost integration. This new feature dramatically increases aggregate throughput, reduces costs by reducing the amount of data transferred over the network and increases simplicity by eliminating the need to create and manage virtual drives.

-- New Greenplum language and compatibility enhancements streamline support of third-party tools and make migrations to Greenplum from Oracle faster and simpler.

-- Greenplum Command Center, the first web-based Big Data infrastructure management console, provides a unified interface for monitoring and administration of all Greenplum products.

Full Story:

EMC Corporation (NYSE:EMC) today announced version 4.2 of EMC Greenplum Database, bringing to the industry-leading platform for in-database analytics new levels of Big Data integration, database manageability and performance. That means customers can run massive-scale mission-critical analysis even more easily and rapidly, thus further boosting their analytic productivity, business value and business decision-making prowess. Sitting at the heart of the EMC Greenplum family of products, Greenplum Database 4.2 includes a high-performance gNet for Hadoop; language and compatibility enhancements for faster migrations to Greenplum; simpler, scalable backup with EMC Data Domain Boost; an extension framework and turnkey in-database analytics; and targeted performance optimization.

In order to expand the range of solutions that can be created for data integration and processing and to run queries for mission-critical complex analysis, customers seek the most efficient and flexible data exchange between Greenplum Database and Hadoop, in addition to the existing parallel data access. To address this, Greenplum 4.2 now enables high-performance parallel import and export of all data (compressed and uncompressed) from Hadoop using gNet for Hadoop, a parallel communications transport. This achievement represents the industry's first direct query interoperability between Greenplum Database and Hadoop.
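As a rough illustration of what parallel import of compressed and uncompressed data involves, the sketch below loads several file splits concurrently and decompresses only those that need it. It is not gNet or any Greenplum API: the split names, the `.gz` naming convention, the comma-delimited rows and the `read_split` and `parallel_import` helpers are all assumptions made for the example, and in-memory byte strings stand in for files on a Hadoop cluster.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def read_split(split):
    """Parse one file split into rows (hypothetical helper)."""
    name, payload = split
    # Assumption: compressed splits are gzip files marked by a .gz suffix.
    raw = gzip.decompress(payload) if name.endswith(".gz") else payload
    return [line.split(",") for line in raw.decode("utf-8").splitlines()]


def parallel_import(splits, workers=4):
    """Read all splits concurrently and return their rows in split order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(read_split, splits))  # map() preserves input order
    return [row for part in parts for row in part]


# In-memory stand-ins for one uncompressed and one compressed split.
splits = [
    ("part-00000", b"1,alice\n2,bob\n"),
    ("part-00001.gz", gzip.compress(b"3,carol\n4,dave\n")),
]
rows = parallel_import(splits)
```

Each worker handles a disjoint split, so adding workers raises throughput until the source or the network becomes the bottleneck.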

A key new Greenplum Database feature is the advanced integration with EMC Data Domain deduplication storage systems via EMC Data Domain Boost, resulting in significantly faster, more efficient backup (10 to 30x data reduction on average). This integration distributes parts of the deduplication process to Greenplum database servers, enabling them to send only unique data to the Data Domain system. That dramatically increases aggregate throughput, reduces the amount of data transferred over the network and eliminates the need to create and manage virtual drives. The result is fast, inline deduplication with up to 26.3 TB/hour of throughput and backup of more than 173 TB in less than eight hours.
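The idea of sending only unique data can be sketched in a few lines. This is a minimal illustration of source-side deduplication in general, not the Data Domain Boost protocol: the `BackupTarget` class, the `backup` and `restore` functions, the fixed 4-byte chunk size and the use of SHA-256 fingerprints are all assumptions chosen to keep the example small.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the demo data is readable


def chunks(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks (an assumption of this sketch)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


class BackupTarget:
    """Hypothetical stand-in for a deduplicating storage system."""

    def __init__(self):
        self.store = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints):
        """Return the fingerprints the target does not hold yet."""
        return {fp for fp in fingerprints if fp not in self.store}

    def put(self, fingerprint, chunk):
        self.store[fingerprint] = chunk


def backup(data, target):
    """Fingerprint chunks on the source and ship only unknown ones.

    Returns the recipe (ordered fingerprints) and the payload bytes sent.
    """
    recipe, pending = [], {}
    for chunk in chunks(data):
        fp = hashlib.sha256(chunk).hexdigest()
        recipe.append(fp)
        pending[fp] = chunk
    sent = 0
    for fp in target.missing(pending):
        target.put(fp, pending[fp])
        sent += len(pending[fp])
    return recipe, sent


def restore(recipe, target):
    """Rebuild the original bytes from the stored chunks."""
    return b"".join(target.store[fp] for fp in recipe)


target = BackupTarget()
data = b"AAAABBBBAAAACCCC"
recipe, sent = backup(data, target)                # repeated AAAA is sent once
recipe2, sent2 = backup(b"AAAABBBBDDDD", target)   # only DDDD is new
```

In this sketch the first backup ships 12 of 16 bytes and the second ships 4 of 12, because chunks already held by the target never cross the network.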

Addressing database manageability and performance, Greenplum Database delivers an agile, extensible platform for in-database analytics, leveraging the system's massively parallel architecture. With Release 4.2, Greenplum enables turnkey in-database analytics via Greenplum Extensions, which can be downloaded from EMC Subscribenet and installed using the new Greenplum Package Manager, a utility that ensures automatic installation and updates of functional extensions to simplify the task of enabling and managing advanced in-database functionality across a cluster. Release 4.2 also supports dynamic partition elimination and query memory optimization. Dynamic partition elimination drastically reduces the data scanned for a query; together, the two features significantly accelerate query processing and allow for more concurrency.
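A small sketch can show why partition elimination reduces the data scanned. It is a toy model rather than Greenplum's planner or executor: the `Partition` class, the range partitioning by day and the `build_partitions` and `scan` functions are assumptions for the example. The point is that when the filter keys become known only at run time, partitions whose range contains none of them are skipped entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    low: int    # inclusive lower bound of the partition key
    high: int   # exclusive upper bound of the partition key
    rows: list = field(default_factory=list)


def build_partitions(rows, width):
    """Range-partition rows on their first column (hypothetical layout)."""
    parts = {}
    for row in rows:
        low = (row[0] // width) * width
        parts.setdefault(low, Partition(low, low + width)).rows.append(row)
    return [parts[k] for k in sorted(parts)]


def scan(partitions, keys):
    """Scan only partitions that can contain one of the run-time keys.

    Returns the matching rows and the number of rows actually read.
    """
    keys = set(keys)
    selected = [p for p in partitions
                if any(p.low <= k < p.high for k in keys)]
    scanned = sum(len(p.rows) for p in selected)
    matches = [r for p in selected for r in p.rows if r[0] in keys]
    return matches, scanned


rows = [(day, day * 10) for day in range(100)]   # (day, amount)
partitions = build_partitions(rows, width=10)    # ten partitions of ten days
matches, scanned = scan(partitions, keys=[5, 42])
```

Here two of the ten partitions are read, so 20 rows are scanned instead of 100 while the same two matching rows are returned.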

Greenplum Command Center

-- Greenplum Command Center, the first web-based Big Data infrastructure management console, provides a unified administrative and real-time/historical health-monitoring dashboard for all currently available Greenplum products.

-- Supported Greenplum Database administrative operations include starting, stopping, and initializing Greenplum Database; searching, prioritizing, or canceling any query; and recovering and rebalancing data mirrors.

-- Initial release of Greenplum Command Center is available with Greenplum Data Computing Appliance version 1.2.

EMC Greenplum Database version 4.2 and the Greenplum Command Center are available now.

Greenplum Quote

Scott Yara, Senior Vice President of Products, Greenplum, a division of EMC

"The EMC Greenplum Database continues to be at the core of driving Big Data insights and decisions for our customers. As more organizations create data-driven cultures, the Greenplum Database's shared-nothing, massively parallel processing (MPP) makes business intelligence and analytical processing much faster. It is this analytic productivity that is the real benefit of the database and is something we're proud to offer."

Additional Resources

-- Data Sheet: EMC Greenplum Database

-- White Paper: Advanced Cyber Analytics with Greenplum Database

-- Data Domain Boost Software

-- Video: The Greenplum Unified Analytics Platform

-- Video: The Big Data Opportunity with Pat Gelsinger, President and COO of EMC

-- Press Release: EMC Drives Evolution of Big Data Analytics and Business Agility with Breakthrough Greenplum Unified Analytics Platform

-- Connect with EMC via Twitter, Facebook, YouTube, LinkedIn and Greenplum

About EMC

EMC Corporation is a global leader in enabling businesses and service providers to transform their operations and deliver IT as a service. Fundamental to this transformation is cloud computing. Through innovative products and services, EMC accelerates the journey to cloud computing, helping IT departments to store, manage, protect and analyze their most valuable asset--information--in a more agile, trusted and cost-efficient way. Additional information about EMC can be found at www.EMC.com.

EMC, Greenplum, gNet, and Data Domain are trademarks or registered trademarks of EMC Corporation in the U.S. and other countries. All other trademarks are the property of their respective owners.

Web Site: www.emc.com
