Big Data

Why Social Analytics Are So Hard, But So Vital

By now everyone has heard the story of how Target in the US was able to figure out that a teenage girl was pregnant before her father did (http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/).


Controversy from that specific case aside, being able to engage with customers at that kind of depth is the holy grail for a modern, successful business. But many struggle, because while collecting and storing customer data is relatively easy, making sense of that data in a way that benefits the business is a far bigger challenge.

From a marketing point of view, social analytics is incredibly useful for reasons beyond community management and engagement. Increasingly, marketers need to be accountable for their investment decisions and the ROI from campaigns to justify their spending. The continued shift in media spend from traditional to digital media demonstrates that marketers are increasingly wary of booking ads without a clearly defined way to analyse the impact of the campaign. Digital channels offer rich analytics, and so also offer the marketing team the kind of accountability they’re looking for when working out how to measure the effectiveness of digital marketing efforts, and of social media in particular.

The tools on the market for collecting useful data about how your customers are engaging with you digitally are plentiful, and using that data in a constructive manner can provide a clear competitive advantage, because you will know more about your customers and what they want. It takes the guesswork out of marketing and replaces it with meaningful information. That is, of course, if you’re aware of the objectives of the project before embarking on it. Many social initiatives fail from the outset because the organisation isn’t clear about what it is trying to achieve. Many marketers will convince themselves that they are driving brand ‘engagement’ because they are accumulating followers and likes. But how is that engagement translating into leads, transactions and revenue?

That’s not to say there is any kind of behaviour that can’t be measured through social analytics; with digital channels just about everything is measurable in some way. Brand metrics are important, but so too are direct response measures, and in reality these are easier to quantify. You need to make sure you are capturing the right data, based on a set of objectives defined before embarking on a campaign.

After capturing the data, the next step is to understand how effective the social campaigns have been in engaging with customers. This goes beyond measuring clicks, followers and fans; you’ll want insights into how your investment in social is driving leads, transactions and revenue. From there the goal should be to measure the results against those goals, demonstrate to management that the campaign strategy is working, and then secure the necessary resources to make further investments in the strategy.
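
To make the step from clicks to revenue concrete, here is a minimal sketch of the kind of roll-up described above. It assumes a hypothetical campaign_events.csv export with one row per tracked interaction and invented column names (campaign, clicks, leads, transactions, revenue, spend); it is an illustration, not a prescribed tool.

    import pandas as pd

    # Hypothetical export of tracked social interactions; column names are assumptions.
    events = pd.read_csv("campaign_events.csv")

    summary = events.groupby("campaign")[["clicks", "leads", "transactions", "revenue", "spend"]].sum()

    # Go beyond raw clicks and follower counts: conversion and return on spend.
    summary["lead_rate"] = summary["leads"] / summary["clicks"]
    summary["conversion_rate"] = summary["transactions"] / summary["leads"]
    summary["roi"] = (summary["revenue"] - summary["spend"]) / summary["spend"]

    print(summary.sort_values("roi", ascending=False))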

Organisations should also be aware that data collection and analysis needs to be an ongoing process. It’s not enough to run a data collection campaign once and leave it at that, as a smart analytics strategy is continuously being optimised and improved on. Equally, it’s important to check and double-check that the recording of data is accurate. Clean, accurate data is the only data of value to an organisation. Bad data is a PR nightmare.

Social data and analytics offer benefits across the business, and making effective use of your customers’ data will drive a more efficient and effective operation across the entire organisation. It’s not necessarily easy to achieve, but the value it brings to the organisation is essential.

Written by Tristan Sternson, Managing Director of InfoReady.

6 Common Misconceptions About Big Data

1. Does Big Data Change the Definition of Knowledge?
I believe Big Data deepens what knowledge means to us. Previously we were only scratching the surface of customer/consumer insight. Big data allows us, through advanced learning techniques, to know more and to leverage that knowledge to create more relevant marketing.

2. Are Claims to Objectivity and Accuracy Misleading?
What does this mean, and what claims are being referred to? If it means that the outcome is more objective and accurate, then yes, those are the benefits and outcomes of Big Data. Are these claims misleading? The old adage of rubbish in, rubbish out still applies: if the data is of good quality, then the knowledge derived from it will be better.

3. Is Bigger Data Always Better Data?
The bigger danger is paralysis through analysis. There are four V’s of Big Data: Volume, Variety, Velocity and Veracity.
Volume and velocity shouldn’t be an issue with today’s tools and platforms; with the advent of new storage and query technologies these are less of a concern, although many organisations will not be able to leverage them due to limitations in skills, budgets, appetite for change and so on. Variety is always an issue, as different types of data are being generated, and the key is the ability to pull them together and standardise them to create a level playing field for analysis. Identifying the veracity of Big Data often relies on experience, which is another skills issue: there just aren’t enough people qualified to operate in this space.

4. Taken Out of Context, Does Big Data Lose its Meaning?
Yes, as there are various definitions of what big data is, and it will have different meanings for different people and organisations. Big Data is often hijacked and used as a buzzword.

5. Collection: Because it is Accessible Does it Make it Ethical?
There is a strong data privacy compliance environment in Australia which deals with this issue. At the end of the day we all have to put our customer hat on and think about how specific uses of big data would sit with us. There are obvious exceptions where the greater good of society is concerned (e.g. law enforcement through the police, the ATO and so on), but this is also covered by existing legislation. Also, consumers are aware of marketing and expect organisations to use their information to promote products and services; they effectively allow organisations to do this by giving consent to the use of their personal and behavioural information.

6. Does Limited Access to Big Data Create New Digital Divides?
Not really. The belief is that if we know more about consumers then we can create more relevant engagements. It’s more of a continuum where there will be varying levels of effectiveness, potentially based on an organisation’s level of investment in this area.

DataCon is an evolution of the BI & Big Data Conference, which was held for the first time as part of the CeBIT Australia 2013 program and was among the best attended of the eight separate business conferences at the event. Headlining the top-line international speakers at the event was Obama for America 2012 Chief Data Scientist Rayid Ghani, who presented on how the Presidential campaign’s data strategy evolved over its brief life, and how the campaign’s data team managed its experimental and production data strategies.

To read more about DataCon @ CeBIT Australia 2014 please visit: http://www.cebit.com.au/datacon-conference-2014

 

Nalini Kara

Product Director Data Services – Salmat Digital.

Nalini leads Salmat Digital’s Data Services team. Her experience as a Data Analyst and Modeller goes back over 20 years. In that time she has worked in both Australia and the UK, where she has successfully led teams in the development of bespoke segmentation frameworks as well as predictive models built on large transaction and demographic data sets.

 

6 Big Data Predictions for 2014

Australia might be at the end of the mining boom, but it’s just kicking off the data mining boom. Big data is going to be a truly hot topic in 2014, with IDC predicting that spend on big data technologies worldwide will grow by 30 per cent over the next year. Interest in big data and the other technologies IDC calls “third-platform technologies”, such as cloud, mobile and social networking, will account for 89 per cent of the growth that will see total global IT spend top $2.1 trillion in 2014.

What’s most exciting about big data is that many organisations are not even coming close to utilising data to its fullest extent. As we move into 2014, big data is only going to grow as an opportunity for competitive advantage, but it must first address five key challenges that have been holding back its growth.

Big data will need to overcome the data challenges of business units outside of IT – The current idea that big data is solely the responsibility of the IT team is one that organisations need to push out of the business in 2014. As Gartner predicts, by the end of the decade technology spending outside of the IT department will account for 90 per cent of total spend. Big data will be a major driver of that, as it offers clear and definable value to all parts of the business. To keep on top of the trend, organisations will want to work on educating different teams within the business on how to draw meaningful insights from big data and then execute on them.
Big data will mandate that IT becomes part of all business units – Aside from enabling an organisation’s business units to access data, in 2014 we will see big data become a core part of those business units’ functions, and as a result they will require IT staff of their own. Gartner’s prediction that more IT spend will be made in business units outside of IT than within the IT team will also start to come true here, as these other business units start to recruit IT staff of their own.

Big data will need to start justifying itself via a business case – Most organisations are collecting data right now, but they’re not necessarily doing enough with it to articulate the value around it. In 2014 it will become more important for businesses to make a strong business case around the data they’re collecting in order to justify continued investment in big data solutions.

Big data has three unique selling points – volume, velocity, and variety. These three ‘V’s can be linked to a fourth – value, which occurs when business value is linked to each of the other ‘V’s. In other words, in 2014 the business case for big data will involve the collection of data on customers’ devices, preferences, activities, locations, and interests, and this data will be analysed in real time to ensure a positive customer experience. The velocity is critical because the value of information decreases sharply a short period after it’s gathered. It’s impossible to make an effective business case for big data without meeting each of these selling points, so much of the focus in 2014 for businesses needs to be on linking the three ‘V’s to find value for the organisation.

Big data will need to do more than manage social media – A lot of the energy around big data in 2013 and years past has been around social media; gathering interactions and using them to better target customers with tailored content. But there is much more to be done. How many retailers are making use of data gathering to understand how customers are moving through their aisles, for instance? In 2014 we can expect to see big data’s role move well beyond social media for the innovative companies out there.

Big data will need to become more integrated with risk management practices – There are still too many instances of big data-based marketing campaigns backfiring and causing lasting damage to organisations and brands. One example is Target’s now-famous marketing email that predicted a woman’s pregnancy before she had told her parents. A more recent example of a social campaign gone wrong was JP Morgan’s initiative to run a Twitter Q & A with the bank’s Vice Chairman, Jimmy Lee. Misjudging its social media profile entirely, the bank drew over 6,000 very angry tweets from consumers and was forced to cancel the Q & A. Big data can be used to measure community attitudes and assess the risk profiles of marketing activities, and in 2014 it will be more important than ever for organisations to include big data analytics as part of any due diligence process.

DataCon is an evolution of the BI & Big Data Conference, which was held for the first time as part of the CeBIT Australia 2013 program and was among the best attended of the eight separate business conferences at the event. Headlining the top-line international speakers at the event was Obama for America 2012 Chief Data Scientist Rayid Ghani, who presented on how the Presidential campaign’s data strategy evolved over its brief life, and how the campaign’s data team managed its experimental and production data strategies.

To read more about DataCon at CeBIT Australia 2014 please visit: http://www.cebit.com.au/datacon-conference-2014

This opinion piece was written by Tristan Sternson, Managing Director of InfoReady, a leading business intelligence IT services provider. To read more about InfoReady please visit: infoready.com.au

 

 

Fifth Quadrant joins forces with Ambiata to put Customer Experience first

Sydney, 9th October 2013 – Fifth Quadrant, the industry leader in Customer Experience strategy, design and research, has teamed with Ambiata, a leading provider of Big Data solutions, to assist organisations in developing a strategy to implement holistic, enterprise-wide, data-driven customer experience solutions.

 

“The commoditisation of services over the last few decades has seen the customer put in pipelines, segmented, targeted, branded and given terms and conditions. It has been near impossible for organisations to treat customers as people and to personalise experiences for them. We also know that if organisations can personalise experiences, customers will spend at least 14% more with them. To remain viable organisations are going to need to reconsider their customer related platforms and analytics and give Customer Experience a seat at the C-level table. Mature Customer Experience organisations have well developed data analytics strategies and have Big Data Analytics on their roadmap,” says Dr Catriona Wallace, Customer Experience Futurist and CEO of Fifth Quadrant.

 

Fifth Quadrant and Ambiata have teamed to develop a consulting program to assist organisations in developing an omni-channel, data-driven customer experience strategy in order to achieve customer personalisation and deliver to a Segment of One (SoO). This strategy enables organisations to identify how to better leverage data-driven customer experience to deliver on business objectives and corporate strategic goals.

 

NRMA Motoring & Services (NRMA) is one organisation that has benefited from the consulting program. Fifth Quadrant and Ambiata recently undertook a diagnostic assessment of the organisation’s data assets across its motoring business and complementary Group businesses, mapping its current state landscape and optimal future state from a data asset perspective.

 

“NRMA Corporate Strategy has a particular focus on the use of data as a strategic valuable asset that can be leveraged to Grow Motoring, Grow Assistance and Grow Relationships and thereby add value to our members and customers. We are just laying the foundation for something really important for the NRMA in the years ahead,” says Romesh Lokuge, General Manager of Business Intelligence at NRMA.

 

“The majority of organisations see information management as a cost of operating their business, they don’t yet realise that data is in fact an asset that can be strategically leveraged to provide a personalised and consistently delightful experience for each of their customers that directly drives business growth,” says Dr Rami Mukhtar, Data Scientist and CEO of Ambiata.

 

About Fifth Quadrant

Established in 1998, Fifth Quadrant Pty Ltd is a Customer Experience Strategy, Design, Research and Analyst organisation. Based in North Sydney, Fifth Quadrant works with enterprise and government to provide management consulting; customer experience strategy development and execution; customer experience research, co-creation and design; advanced data analytics and modelling; education and consulting on Big Data and data strategies; diagnostic assessments of operations and technology; and improvement of operations; providing services across Australia, Asia and North America. Fifth Quadrant is the 2013 Telstra NSW Business of the Year. www.fifthquadrant.com.au

 

About Ambiata

Ambiata is on track to be in NICTA’s next wave of spin out companies. NICTA is Australia’s leading ICT research organisation. Ambiata’s customer personalisation service is built on NICTA’s world leading research in Machine Learning and Big Data analytics technologies. www.ambiata.com

 

Cloudera aims to hit US $ 1 billion not only in market cap but also in revenue

Mike Olson


 

Those of you involved in Big Data will be familiar with Cloudera, a leading brand in the Apache Hadoop-based software and services space.

But for those who aren’t, here’s a quick description: Established in 2008, the Palo Alto, California-based Cloudera offers an integrated Big Data platform comprising software, support, training and professional services. The platform has open source Apache Hadoop software at its core and includes additional value-added software that helps enterprises deploy and use the open source platform on critical business problems they can attack with big data. Cloudera allows customers to store, process and analyse data reliably, securely and inexpensively, offering a data platform that enables enterprises and organisations to look at all their data — structured as well as unstructured.

The BigInsights team, led by Chief Executive Officer (CEO) Raj Dalal, Partner Haima Prakash and Chief Technology Officer (CTO) David Triggs, caught up with Cloudera’s co-founder, Chairman and Chief Strategy Officer, Mike Olson, on his recent visit to Sydney, Australia as part of BigInsights’ Big Data Vendor Landscape Study. Over lunch, Mike delved into several topics around Big Data, including Cloudera’s present and future strategies.

In the coming weeks, BigInsights will be bringing you a three-part series based on excerpts from this extended conversation between the BigInsights team and Mike.

 

 

Cloudera’s Vision For The Future

 

Cloudera’s commitment to an open source data platform is absolute, but it will continue to innovate on top for administration, and even for business value over the long term, said Mike Olson, CSO of Cloudera.

“We believe we’ve got an opportunity not only to hit a billion dollars in market cap, but to cross through a billion dollars in revenue and that only happens if we’re delivering serious value to customers and they keep coming back.

“We want a differentiated product set that allows us to generate the revenue that permits us to invest back in the open source platform. The collection of late market entrants, and especially the larger companies that have stepped in, don’t have the representation in the open source community to really drive (on) that road map. Some of the emerging venture backed companies don’t have IP of their own,” he said.

Here are snapshots of some of the answers that Mike gave to questions posed by the BigInsights team:


 

Cloudera’s differentiation and leadership strategy in both direct and OEM markets worldwide?

 

Mike:

Our job is to deliver customer success. Storage, data processing, data analytics, that’s fundamentally what the CIOs want to be open source. They’re afraid of bad vendor behavior from decades of experience and they want to know that the substrate, just like the operating system that they rely on now, Linux, is insulated from bad vendor behaviors. So our commitment to an open source data platform is absolute…

We saw early an opportunity to deliver a unique differentiated IP as Cloudera addressed common problems. We invented Cloudera Manager and we’ve been working on it for three years, it’s our IP. It’s best of breed and best in market. We recently introduced our BDR solution for backup and disaster recovery and Data Navigator which is basically audit logging and compliance reporting for data access.

We’ve innovated in security both in open source and in our management infrastructure. We think that the combination gives us the best and most capable platform. We’re the leader among those open source providers in the volume of software that we give away. No one writes and gives away more open source software for the Hadoop system than we do.

We think our strategy is the right one: A hybrid IP strategy aimed at customer success that allows us to craft these long lasting relationships. We’ve got an annual subscription business. If we can’t get you to come back every single year because you love the services and you’re profiting from your data, we’re doomed. So we think our customers are insulated from that bad vendor behavior as a result.

 

 

Evolution of the Big Data market 

 

Mike:  What really has to happen is that we need applications focused on real business use cases. I love the technology, but nobody buys the technology. Everybody buys the solution to their business problems. We are now, by the way, seeing applications emerge that do exactly that. So Amdocs, for example, builds a churn management application that they sell into mobile providers on top of Cloudera’s platform and those mobile providers don’t actually realize that they’re buying Cloudera. That’s what we need to see happen broadly in the market.

 

 

Do you see Cloudera as a Hadoop company or going beyond that?

 

Mike: Hadoop has grown beyond what Google originally designed. The name is going to expand to cover the Big Data platform of the future; it’s just too great a name to abandon, right? But what we ship today is much larger than what Hadoop was when Facebook and Yahoo! and others collaborated on it in the very earliest days… what Doug Cutting created.

 

Impala and Innovations in Big Data processing on Hadoop beyond MapReduce

 

Mike: When we started we saw enormous opportunity in Big Data, beyond the software that then existed. When Google invented this new platform, it invented two things: a storage layer that could take any kind of data in enormous volume very, very cheaply, right? And on top of that storage infrastructure it delivered a new engine for analyzing that data. That engine was called MapReduce. You gang a whole bunch of computers together and you take advantage of all their disks, but you also take advantage of all of their CPUs, and you push your analytic jobs down to run on those servers right on the data; you don’t need to move the data out. That was transformational and worked miracles for the web properties. But look, not every single business problem can be solved by MapReduce.
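
As a rough illustration of the MapReduce idea Olson describes, here is a toy, single-process sketch of my own (not Cloudera’s code): the map step emits key/value pairs from each record where it lives, and the reduce step combines the values for each key.

    from collections import defaultdict

    def map_phase(record):
        # Emit (key, value) pairs from one record; here, a word count over a line of text.
        for word in record.split():
            yield word.lower(), 1

    def reduce_phase(key, values):
        # Combine every value emitted for a single key.
        return key, sum(values)

    def mapreduce(records):
        intermediate = defaultdict(list)
        for record in records:                     # on a real cluster this loop is spread across many nodes
            for key, value in map_phase(record):
                intermediate[key].append(value)
        return dict(reduce_phase(k, v) for k, v in intermediate.items())

    print(mapreduce(["move the code to the data", "not the data to the code"]))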

Cloudera announced Impala, which is a high-performance, interactive SQL engine running natively in a distributed way on your big data Hadoop infrastructure to leverage the investment in SQL and the broad knowledge of that language. Impala is just an engine that goes to the data. It doesn’t take advantage of any of the MapReduce infrastructure. [It is] an entirely separate scale-out database engine the way you would design it in 2013, a query processing engine, and we know how to build distributed query processors. So that’s what we’ve built.
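
From a client’s point of view, “SQL running natively on the Hadoop data” looks much like any other database connection. The sketch below uses the open source impyla driver for Impala; the host, table and column names are placeholders invented for illustration, not anything from the interview.

    from impala.dbapi import connect   # open source impyla client

    # Placeholder coordinator host; 21050 is Impala's default client port.
    conn = connect(host="impala.example.com", port=21050)
    cur = conn.cursor()

    # The query runs where the data already lives in Hadoop; nothing is extracted first.
    cur.execute("""
        SELECT customer_id,
               COUNT(*)    AS events,
               SUM(amount) AS total_spend
        FROM   web_transactions
        WHERE  event_date >= '2013-01-01'
        GROUP  BY customer_id
        ORDER  BY total_spend DESC
        LIMIT  10
    """)

    for row in cur.fetchall():
        print(row)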

We recently announced the availability of Cloudera Search. We took SolrCloud, the document indexing and search engine, and we made it run in the same way, massively parallel, on all of those servers — each of them looking at its own little fragment of the data. With our partner SAS, we’ve helped them redesign their numerical analysis engine so that it can run in a data-parallel way, and you wind up installing the SAS engine on all the nodes in a Hadoop cluster; now you can ask numerical analytic questions of a petabyte of data in no time at all.

The real insight here is that the big scale-out store gives you a way to push different engines down to the data. Our vision is, you want five or 10 or 50 different engines that go visit the data… so the real platform of the future is going to support a variety of ways of getting at the identical data. You’d like to be able to search for a data set of interest using Cloudera Search, and then use machine learning and analytics and MapReduce to produce a derived table that you then query with Impala, right?

 

Impala and differentiating against other SQL on Hadoop initiatives?

 

Mike: In the last four or five years we have driven innovation on the platform. We were the first vendor in this space, we were the first vendor with a Hadoop distribution, we were the first company to add HBase for NoSQL scale-out data delivery at web speed. We announced and delivered Impala to the market, we’re the only company today with Search, the only company integrating proprietary products from established vendors like SAS.

We have explained to our competitors and to the market at large what the Big Data platform of the future looks like, and we’re flattered by the fact that they’ve acknowledged that we’re right. So Stinger was not a forward-looking announcement; it was a reaction. The Drill announcement from MapR was likewise made with full awareness of the work that we were doing at Cloudera. Our job is to continue to innovate on the platform to drive it forward, but more significantly to be sure that our customers are successful with Big Data. Having the coolest, most capable, most “feature-full” platform is of no value if we’re not solving meaningful business problems for C-level executives.

…we want to grow for a long time, to remain independent. We believe we’ve got an opportunity… to cross through a billion dollars in revenue, and that only happens if we’re delivering serious value to customers and they keep coming back. So, innovation, yes, absolutely, and we think we’ll continue to do that. More significantly, though, real business value for real, important problems.

More in Part 2.

DataCon takes place from 9 – 10 October 2013 at the Hilton Sydney.

Interview with Dr Eugene Dubossarsky from Contexti


CeBIT Australia recently interviewed Eugene Dubossarsky, Chief Data Scientist of our Gold DataCon sponsor Contexti, on his thoughts on Big Data.


1. Can you tell us a little about your background and your current role?

As Chief Data Scientist of Contexti, I have two main jobs, each of which is a conversation. The first, and most important, is the conversation with the client. This is a delicate and ongoing conversation, where I need to do many things: help the client figure out what they want, which may also require gently educating them in the process, introducing them to the power of data analytics and, most importantly, to their own role and power in the process.

As a Data Scientist, my role is to let the data tell its story. A good engineer gives data its voice, and my job is to listen to that voice. We thus each have our own role to play. My job is to explore the undiscovered country that is data, grab the nuggets of value, identify the hidden risks to avoid, and try to see how it all fits into a big picture that allows a view of the future, a crystal ball that can help chart a way forward. The cool stuff you hear about, such as machine learning, data visualisation, “big data” and other buzzwords, all features too. But these are just tools.

People, data, and the stories they tell – this is the main part of the job.


2. As we know, the term Big Data can be ambiguous and mean a lot of different things to different people. What is your take on Big Data?

It certainly did the job! This is the one buzz phrase that put “it” on the map, whatever we choose to call “it”. My only point is that “it” needs to be big enough to include “small data” (if we think of “big data” as terabytes), and what I call “tacit data”, which is usually the most important data, but lives in people’s heads rather than electronic databases. Of course, getting tacit data out is possible and desirable, but that is an entirely different story for another time…
So, with regard to Big Data: for me, the “Big” is not just “Size” or “Speed”. Far more importantly, it is “Big” in terms of “Value”, “Credibility” and “Transformational Potential”. It may also be “small” in size, but require “Big” tools in terms of sophistication, computational power and effort on the part of engineers and scientists to realise this value.
My own term for the latter category is “Big Crunch” – the data itself may be small or medium, but “Big Data” tools make extracting the value possible. These are actually the techniques I find myself using the most.
Volume-wise, actual “Big Data” terabytes in size is not the most valuable kind of analysis for most Australian companies. But “Big Crunch” certainly is.

3. Which industries do you see this type of analytics benefiting the most and why?

Name one that doesn’t. I am most excited about the growth of analytics in SMEs. In practice, analytics is vital for organisations facing real competition, real ongoing, disruptive change in their industry, real risks and real uncertainty. This is true for most privately-owned SMEs. Ironically, many of the organisations that can most afford analytics in Australia probably need it least, but this is where most of the buzz is in the industry.
For me, the real question is: “can your company truly afford to survive without analytics?” The answer for most quantitative hedge funds is “of course not”. I leave it for the reader to identify areas where this might not be the case.

4. What do you think are organisations’ biggest problems when trying to start a big data project?

Most people don’t realise what they are getting into with data science: it is, if anything, even more powerful than they thought, but they underestimate the amount of personal investment and change required to realise value.
The biggest misconceptions are around the very nature of analytics, and specifically this thing called “data science”, which is a far more helpful term than “big data”.
In essence, analytics should be about exploration, with engineering/building playing a supportive role, albeit a vital one.
Analytics is about exploring, not building. For a scientist, data is a rich land of mysteries, and the process is a conversation. A scientist welcomes the unknown. For an engineer, data is a commodity, and the focus is on the tools that move and process it. The unknown is to be shied away from and controlled; things must work perfectly. And this is necessary too, so that the scientist may play his part.

Engineers work on “projects” by the way. This model is less appropriate for scientists, as are conventional project management methodologies. Those are also a great way to kill the value of an analytics project, and I have seen this tragedy unfold more than once.

Organisations that get this achieve enormous benefits. Organisations that don’t will fail, dissolve the analytics function and start again, only to fail again because the key misconception has not been addressed.

The other major problem is executives underestimating how much personal investment they must make in analytics. Investing in analytics is like investing in a gym membership or an education: you are not paying to make something go away; you are paying to get a whole lot busier at something that will transform you fundamentally. The executive suite can no more outsource the analytics function than I can outsource my gym workout. I wish I could…

Nevertheless, the view persists that analytics is an IT function, primarily concerned with engineering (building, maintaining), that data is a commodity and that the whole thing has little or nothing to do with the lives of important people in the organisation – these are the biggest challenges to organisations coming to terms with big data.

5. How should an organisation go about even starting a Big Data project?

  1. It isn’t a project, it is an exploration.
  2. Invest in experimentation, not fixed projects. Accept that there may be no value at all in the first six months.
  3. The executive sponsor is the number one fan, supporter, client and leader of the analytics team.
  4. “Invest in smarts” – hire smart people, bring in smart advisers, consultants, trainers.
  5. Don’t waste a cent on software until you know exactly what you need and why, having tried a great many things with open source. Open source is good enough to begin with, especially when you are still trying to figure out what to do with your data.
  6. Don’t be embarrassed that you have no idea what to actually do with your data, or how it leads to value. Just about everyone else is in the same boat.

6. What are some of the tools and technologies that can be employed for a big data project?

There are three levels to this, only one of which is actual “tools”, i.e. IT products. The three levels are:

  1. Business applications – which often require very significant customisation, although they may also be quite similar to applications in other industries/organisations. Or they may be relatively well-known things like customer retention for telcos or insurance claims analysis.
  2. Conceptual tools – these are the things missing in the toolkits of most people with an IT background who make the transition to “data science”. This includes the whole kit bag of machine learning, statistics, visualisation, network analysis and lots of other mathematical/conceptual/computational tools, tricks, methods and maps. While these are indeed embodied in specific software, the conceptual/mathematical understanding of these tools, their applicability to real-world problems, and the ability to use them broadly and in new scenarios sets the true data scientist apart from a hacker.
  3. Software – this is the least important layer. If people are not sure what they need, they should start with something like R. And of course there are many other open source tools out there. If they can see for themselves where open source tools are not up to the job, then they are “educated buyers”, with a clear need and agenda, and should consider commercial tools. I have never seen any new data analytics function that did not need to come to terms with its own needs and data first, and where R was not a sufficient starting point. (A minimal sketch of this kind of open source first pass follows this list.)
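
As a minimal illustration of the “try open source first” advice above, the sketch below does a first exploratory pass over a data set. Dubossarsky suggests R; this uses Python’s open source pandas instead, purely for illustration, and the file and column names are hypothetical.

    import pandas as pd

    df = pd.read_csv("customer_events.csv", parse_dates=["event_time"])

    # Get a feel for the data before committing to tooling, projects or vendors.
    print(df.shape)                                          # how much data is there?
    print(df.dtypes)                                         # what kinds of fields?
    print(df.isna().mean().sort_values(ascending=False))     # where is it incomplete?
    print(df.describe(include="all"))                        # rough distributions

    # One cheap, concrete question: what share of customers come back more than once?
    repeat_share = (df.groupby("customer_id")["event_time"].count() > 1).mean()
    print(f"Repeat customers: {repeat_share:.1%}")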

About Contexti | Big Data Analytics

Contexti is a premier Big Data Analytics company.
We help customers drive growth, accelerate innovation and create competitive advantage.
With expertise in data-driven strategy, Hadoop and NoSQL technologies and advanced Data Science methods, we provide specialist consulting, training and managed services.
In short, we Create Value from Data™.

www.contexti.com

Contexti | Big Data Analytics will be leading a workshop at DataCon. Join John Zantey, VP & CTO of Contexti, at DataCon from 9 – 10 October 2013 at the Hilton Sydney.

Harper Reed, CTO, Obama for America Campaign 2012, delivers his keynote address at CeBIT 2013

Harper discusses how the use of Big Data analytics and Cloud services greatly assisted the Obama campaign in winning the 2012 election. He goes on to discuss how organisations can get value from their data and what they need to do to get there.

Watch the video here

CSIRO enters commercial Big Data

Australia’s flagship government research agency – the CSIRO – has ramped up its Big Data project work on behalf of both public and private sector clients, calling it the most revolutionary and potentially transformative IT technology to emerge in years.

The CSIRO increasingly offers its services on commercial terms, and is using its strength in high-end mathematics and computing to help Australian customers take advantage of Big Data capabilities.

Alan Dormer, the Government and Commercial Services theme leader within the CSIRO’s Digital and Productivity Services Flagship, says Australia is still a long way behind the US in extracting the potential that Big Data can offer. But he says it is a huge area of focus, and that Australia is following its usual path as an enthusiastic early adopter.

Mr Dormer will present at the CeBIT Global Conferences’ Big Data event being held in Sydney on October 31 – November 1. He will outline the Big Data impact on the economy and society, running through a host of project examples that the CSIRO has been engaged in.

Mr Dormer says sophisticated commercial and government users in Australia are well aware of the generic capabilities of Big Data projects. But until it is applied to a specific organisation, it is difficult to get a read on its value.

“It’s not the product that’s the issue. It’s the customer knowing what to do with it,” Mr Dormer said.

“We basically do research in this area (on behalf of clients). So as people see what’s possible, they tend to get more ambitious in terms of (what they want to do with) Big Data,” he said.

“The appetite grows with the eating (for clients). And we have the largest concentration of mathematicians in the country, so we have been pretty active in this space.”

The CSIRO projects are at the sharp end of research, rather than commercial implementations or consulting. These projects usually involve its officers writing applications specific to the project, and sometimes taking advantage of the supercomputing facilities at CSIRO.

Among the projects the CSIRO has engaged with so far, the focus has been on government services (including looking at how people interact with services), financial services (including payment security and fraud) and disaster management.

Customers have included the federal Department of Human Services, the United Nations and AusAID, and NSW Fire and Rescue.

Mr Dormer says the agency is now starting to focus on the retail sector, an area where he said Big Data will have an enormous impact. “It’s going to make a huge difference to the way that people deal with customers – and that includes clients, or citizens, depending on the organisation dealing with them.”

See Alan Dormer, Science Leader, Government and Commercial Services, CSIRO, at the Big Data Conference in Sydney on
31 October – 1 November 2012.

 

 

 

Splunk on Volume, Variety, Velocity

Australia’s corporate sector has taken to the burgeoning Big Data market in the same fashion it has taken to other waves of new technology: as an early, innovative adopter, according to Daniel Miller, the local country manager of Nasdaq-listed big data veteran Splunk.

Big Data issues are not just about data volume. The variety of different kinds of data and the speed at which it is growing are added complexities. It is unstructured and ballooning.

Consider that every 60 seconds Google serves more than 694,445 search queries, or that 600 videos are uploaded to YouTube, adding more than 25 hours of content, or that 168,000,000 emails are sent.

Or that every 60 seconds 695,000 status updates are published on Facebook, along with 79,364 wall posts and 510,040 comments. And those numbers don’t include all the machine-generated data that results from each of those human creations.

Set up in Australia just two-and-a-half years ago with a single employee, Splunk has been one of the most active technology transfer drivers for Big Data in this country. It has quietly created a significant, fast-growing subsidiary with nine employees and a trajectory that will double head count again in the next 12 months.

“Australia has always been an early innovative adopter of new technology, and that has been no different with Big Data,” Miller said.

Splunk is bringing one of its senior commercial and technical experts to Australia to speak at the Big Data Conference in Sydney on October 31 to November 1, an event presented by the CeBIT Global Conferences group.

But where most of the Big Data vendors and services providers tend to be consultants on the Google-inspired Hadoop and MapReduce systems, Splunk brings different products and different value propositions to its customers. It has its own distributed indexing architecture and its own search language.

Like the rest of the Big Data sector, Splunk focuses on unstructured data – but its real focus is in the massive volumes of machine-generated data. According to research group IDC, about 90 per cent of the data in today’s organisations is machine generated – by websites, applications, servers, networks, mobile devices and the like.

The whole value proposition of Big Data rests on the notion that these massive pools of unstructured data hold tremendous value, Mr Miller says. The trick to unlocking that value rests in being able to handle such massive data volumes, being able to handle lots of different types of data, and being able to handle the sheer speed at which new data is being generated.

Splunk’s Enterprise product collects, monitors, indexes and analyzes the machine data generated by IT applications and infrastructure – physical, virtual and in the cloud. This machine data is massive in scale and contains a definitive record of all transactions, systems, applications, user activities, security threats and fraudulent activity. This data is largely untapped; Splunk helps organisations unlock its value.
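
To make “machine data” concrete, here is a toy sketch of my own (not Splunk’s product or search language) that parses web-server access logs, one common kind of machine-generated record, into events that can be counted and trended; the log format and file name are assumptions.

    import re
    from collections import Counter

    # Common-log-format style line, e.g.: 10.0.0.1 - - [31/Oct/2012:10:00:00 +1100] "GET /cart HTTP/1.1" 500 123
    LOG_LINE = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
        r'(?P<status>\d{3}) (?P<size>\d+|-)'
    )

    status_counts = Counter()
    with open("access.log") as fh:
        for line in fh:
            match = LOG_LINE.match(line)
            if match:
                status_counts[match.group("status")] += 1

    # A first, trivial analysis: how often are requests failing on the server side?
    errors = sum(n for status, n in status_counts.items() if status.startswith("5"))
    total = sum(status_counts.values())
    print(f"5xx error rate: {errors / max(total, 1):.2%}")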

Mr Miller says the Splunk platforms differ from competitors in that they contain an in-built dashboard and tools, as well as specialist visualization features that simplify the trending and results information generated by vast pools of raw data.

As the largest information technology user in the country, the Federal Government is expected to become a big user of Big Data tools, and will be a focus for Splunk.

Mr Miller says local customers are often reluctant to talk about precisely what they are using Splunk tools for – sometimes because they are still in test “suck it and see” mode, and sometimes because they don’t want to reveal a competitive advantage.

But the company has made strong inroads in the university sector – especially as Big Data systems are well suited to development and research environments – and Mr Miller expects strong growth among public sector customers in the coming year.

See the latest in Big Data Innovations and trends at the Big Data Conference in Sydney on 31 October – 1 November 2012.