Big Data is coming for you. Software that
captures lots of data and uses it to make predictions has mostly been
the province of engineers skilled in arcane databases and statisticians
capable of developing complex algorithms. As the business gets bigger,
however, software makers are domesticating their products in the hope
they will prove attractive to a broader population.
Cloudera, which offers a popular version of the open source database called
Hadoop, released software on Wednesday that makes it possible to run
queries through a more mainstream SQL query-language interface. SQL,
thanks to its adoption by Oracle, Microsoft and others, is known to
millions of business analysts. “This enables us to talk to a whole
other class of customer,” said Mike Olson, the chief executive of
Cloudera. “The knock against Hadoop was that it is too complex.”
There is a reason for that. Hadoop is one of several so-called unstructured
databases that were created at Yahoo and Google, after those two
companies found they had previously unimaginable amounts of data about
activities like people’s Web-surfing habits. Put into databases designed
to handle this unstructured data, then analyzed, this information
was valuable for figuring out things like what advertisement to put in
front of each individual Web surfer. Now, with more commerce,
content and social behavior online, Hadoop-like systems are valuable to
mainstream corporations. Cloudera, which was formed by veterans of
Google, Yahoo and Oracle, was among the first to make a commercial
management product to go with Hadoop, which is an open source product.
Cloudera’s new SQL offering, named Impala, is based on a project called Dremel
that began inside Google. Mr. Olson said Google had released papers on
Dremel, but Cloudera was the first to make a public version. Like
Hadoop itself, Impala will be open source, and Cloudera will make money
from subscriptions to its management software. The Hadoop product was
also improved, Mr. Olson said, so complex queries could now be performed
up to 30 times faster.
This is not the only way companies are
trying to reach more Big Data customers. Last week Teradata released a
no-cost trial version of a combination database-analysis program that is
capable of handling traditional SQL queries as well as larger data
analysis work.
The product, which comes from Teradata’s
acquisition of Aster Data, has more than 50 analytical functions,
including social network analysis and fraud detection. The target
audience includes business analysts as much as highly trained data
scientists. It comes with tutorials, presumably in the hope that
prospective customers will love the test product enough to buy a
full-featured production version.