Informatica: Businesses struggling to transform Big Data from ‘art’ into ‘science’


Making Big Data analysis available to enterprises without the need for costly experts is crucial to creating “mainstream adoption”

With an explosion in big data, firms outside the tech elite are struggling to make sense of the abstract mass of information and whittle it down to something useful.

According to Girish Pancha, chief product officer of Informatica, the challenge for big data at the moment is to let the technology trickle down to a larger number of companies.

“From an adoption perspective we still have a little bit to go,” Pancha told ChannelBiz. “We are still at an early adopter stage.”

“It is still in the process of being converted from being an artform to a science.”

Despite the creation of tools such as Hadoop to deal with the data explosion, which Pancha says is being driven by cloud, social networks and mobile computing, the level of expertise is still not quite there, he warns.

“From a technology stage some things have been proven out so far such as basic cost effectiveness of Hadoop,” he says.

“It came out of Google five or six years ago and has been used by Facebook and other large internet companies.”

What has not been proven, however, is the ability to make the technology enterprise grade for the rest of the Global 2000 companies, he says, with the technology “immature” for certain uses.

“The internet companies have a lot of highly paid technologists that can do this,” Pancha continues.

“But if you want to get it adopted by IT, which is where it needs to be for most business, then the technology is somewhat immature.”

That is the focus of Informatica’s release of the Informatica Platform 9.5, which is partly aimed at making Hadoop enterprise grade and available to businesses without a data scientist close at hand.

Access to staff skilled in using Hadoop has been a sticking point in pushing forward big data analytics, according to the likes of IDC, and the complexity of the system is in danger of making widespread use prohibitively expensive, with customers needing access to vendor staff.

Pancha says that the key barrier to adoption is productivity on a “broader scale”.

“Today Hadoop is very much a programming language environment, you have to use things like Java or other Hadoop specific scripting languages.”

That’s okay if you want to prototype things, but if you want to make sure that it can all be used by different staff in future, then there are problems, he says.

So far a lot has been done by data scientists, who are “a little bit few and far between”, but within a year or so an average data engineer will be able to do the same, he says.

“You won’t need these hard to find, high paid data scientists,” he says, “this will become mainstream very quickly.”

It may still be a little while before this happens, of course, with many firms only really pushing prototype systems in development at the moment.

Asked whether this will be down to improved skills or to better automation in software, Pancha says it is “both”.

He points to the introduction of data warehouses, in which Informatica previously specialised, as an example of how a technology can be taken up once it reaches a tipping point.

“If you rewind, there was the same question around data warehouses – they did the same sort of stuff around transaction information. A methodology got created.”

“At first data warehouses failed a lot because people were trying to do it on their own and reinvent the wheel, but along the way people said ‘here is how you model a data warehouse’, ‘here is how you would access the information’.”

“It got codified along the way and people just take it for granted. I think this is the same sort of thing.”

“Those who have been trained to deal with things like data warehousing will find this not too much of a stretch to deal with these new technologies and algorithms to provide the next level of value when it comes to interaction data.”

 
