Cloudera Democratizes Apache Hadoop for Enterprise End Users With Open Source, Interactive Search

June 2013 by Marc Jacob

From The Economist Information Forum in San Francisco, Cloudera announced the public beta of Cloudera Search, the industry’s first fully integrated search engine for interactive exploration of data stored in the Hadoop Distributed File System (HDFS) and Apache HBase(TM). Cloudera Search is the latest in a series of Cloudera innovations designed to simplify Hadoop and extend its use to more departments of an organization. Powered by the leading open source search engine, Apache Solr(TM), it enables anyone within an organization to perform interactive, natural language keyword searches and faceted navigation on data stored in Hadoop, without additional training or advanced programming knowledge.

Cloudera Search was developed to address a rapidly emerging need. As enterprises’ Hadoop deployments mature and become the primary repositories for more and more kinds of data, the question is how to combine and refine that data better and more quickly in a single, integrated platform. At its core, Cloudera Search incorporates Apache Solr and other search-related open source projects to support a comprehensive big data infrastructure, and to alleviate the significant costs of maintaining the disparate systems that many enterprises currently depend on to execute search queries.

Cloudera Search gives the enterprise simpler data exploration, so users can drill deeper into data using full-text and faceted search to solve critical business problems in real time. Cloudera’s search solution combines Solr, an established, feature-rich, open source search platform whose extensible APIs ease integration with production legacy systems, with CDH integration that addresses many of the common pain points of standalone search solutions for Hadoop. Through the robust failover features available in SolrCloud (Solr 4), Cloudera Search delivers the same feature set as the Solr search platform with more scalable indexing and query serving than was previously possible.
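To make "full-text and faceted search" concrete, here is a minimal sketch of how a client might build a query against Solr's standard select handler. The request parameters (`q`, `rows`, `wt`, `facet`, `facet.field`, `facet.mincount`) are standard Solr parameters; the host name, collection name, field names, and the `build_facet_query_url` helper are hypothetical and chosen only for illustration. The sketch only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_facet_query_url(base_url, collection, keywords, facet_fields, rows=10):
    """Return a Solr select URL that combines a keyword search with field facets.

    base_url, collection and facet_fields are placeholders for illustration;
    a real deployment would use its own host, collection and schema fields.
    """
    params = [
        ("q", keywords),          # free-text keyword query
        ("rows", str(rows)),      # number of matching documents to return
        ("wt", "json"),           # ask for a JSON response
        ("facet", "true"),        # turn faceting on
        ("facet.mincount", "1"),  # omit facet values that match no documents
    ]
    # One facet.field parameter per field whose values should be counted.
    params.extend(("facet.field", field) for field in facet_fields)
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, collection and fields.
    print(build_facet_query_url(
        "http://search.example.com:8983/solr",
        "logs",
        "disk failure",
        ["hostname", "severity"],
    ))
```

The response to such a request would carry the matching documents together with per-value counts for each facet field, which is what lets a search interface offer "drill down" links next to the results.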

Like Cloudera Impala, the industry’s first open source, interactive SQL query engine for Hadoop, Cloudera Search extends the reach and capability of Cloudera Enterprise, the definitive Platform for Big Data. Cloudera is now making it possible for enterprises to "unaccept the status quo" imposed by closed source solutions vendors and benefit from the superior economics and unparalleled opportunity of Hadoop as a central, enterprise data platform that addresses the challenges and opportunities presented by big data.

Beyond SQL: Now Everyone Can Benefit from Hadoop

As enterprises increasingly look for ways to derive greater value from all their data, a pervasive challenge has emerged: how to make all data available and consumable beyond IT departments, so it can be more widely leveraged across an entire organization. Cloudera’s search solution expands the data exploration capabilities of Hadoop with faceted navigation and full-text search, so users can more quickly find data for processing and analysis. Cloudera Search puts the power of data discovery into the hands of non-technical teams, enabling line-of-business and everyday users to interact with data and uncover relevant correlations through a familiar, easy-to-use search interface. Companies can provide secure access to a centralized data repository, make it available to anyone who wants to derive insight from it, and consolidate search and Hadoop cluster investments into one complete solution with unified management and control through Cloudera Manager.

Beyond Batch: Real-Time Interaction with Data in Hadoop

Cloudera Search provides enterprises with scalable indexing options for big data and extends the Apache Solr project to offer near real-time document processing and indexing of data in transit to Hadoop and other storage endpoints. Data is immediately available to Search and to other Hadoop computing frameworks, such as Apache Hive(TM) and Cloudera Impala. Cloudera Search also provides linearly scalable, on-demand batch indexing for large data stores within Hadoop and, with the introduction of its GoLive feature, can incorporate incremental index changes while avoiding costly downtime.
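As a rough illustration of what "near real-time" indexing means at the Solr level, the sketch below prepares a plain Solr JSON update request that uses the standard `commitWithin` parameter, which asks Solr to make the documents searchable within the given number of milliseconds. This is not the Flume-based pipeline described in this announcement, only a generic Solr update request; the host, collection, document fields, and the `build_update_request` helper are hypothetical. The function prepares the request without sending it.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, documents, commit_within_ms=1000):
    """Return (url, headers, body) for a Solr JSON update request.

    commitWithin is a standard Solr update parameter; everything else here
    (host, collection, document fields) is a placeholder for illustration.
    """
    if commit_within_ms <= 0:
        raise ValueError("commit_within_ms must be positive")
    # Ask Solr to make the documents visible to searches within this window.
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url.rstrip('/')}/{collection}/update?{query}"
    headers = {"Content-Type": "application/json"}
    # Solr accepts a JSON array of documents as the request body.
    body = json.dumps(list(documents)).encode("utf-8")
    return url, headers, body


if __name__ == "__main__":
    # Hypothetical event document.
    url, headers, body = build_update_request(
        "http://search.example.com:8983/solr",
        "logs",
        [{"id": "evt-1", "message": "disk failure on node 7"}],
        commit_within_ms=500,
    )
    print(url)
    print(body.decode("utf-8"))
```

A smaller `commitWithin` value shortens the delay before new events appear in search results, at the cost of more frequent commits on the server.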

Cloudera Search Feature Highlights

Cloudera Search is specifically designed to support business users in their quest to locate relevant data in Hadoop quickly and efficiently, for further processing and analysis. Cloudera Search is fully integrated with the CDH platform. Key features include:

Scalable, Reliable Index Storage in HDFS: integrates index storage and serving directly into HDFS

Batch Indexing via MapReduce: creates indexes for data stored in HDFS and HBase, with the scalability and robustness of MapReduce

Real-time Indexing at Collection: makes an event searchable as it is stored in Hadoop, through near real-time indexing features powered by Apache Flume(TM)

Easy Interaction and Data Exploration via Cloudera Hue: provides a plug-in application for Hue, with easy-to-install capabilities for standard Hue servers, to query data, view result files, and explore results through facets

Simplified Field Extraction and Cross-Platform Data Processing: allows quick and easy field extraction from any data stored in HDFS using optimized Hadoop file formats, such as Apache Avro(TM), avoiding the pain that many standalone search solutions impose, and promotes reusable configurations and processing activities with the new processing framework, Cloudera Morphlines

Unified Management and Monitoring with Cloudera Manager: provides a centralized management and monitoring experience that makes it as easy to deploy, configure, and monitor search services as it is to manage CDH deployments and other services on the Hadoop cluster

Product Availability

The first in the market to ship code, Cloudera Search is immediately available as a supplemental module for Cloudera Enterprise subscribers.

