Asking the RIGHT questions

Carla Gentry
Data Scientist

Each time I talk to someone about analytics, I ask the same question: “What is your ultimate goal with this project?” Often it is to increase sales or reduce turnover. Of course, this isn’t usually what’s said; initially all I get is a panicked look that says, “We can’t get what we want out of our database. It isn’t working right… how do we fix it?”

Typically, there is nothing wrong with the database: the master and all its clones are just as they were designed to be, the variables are entered correctly and the reporting functions are pulling exactly what they were coded and designed to pull.

So what’s the issue?

Example: HR data. Every HRIS or ATS database contains different information (employee addresses, phone numbers, salaries, benefits, and the like), but how much of that information is connected? What I mean is: is there a unique identifier that connects each table or database to the others? If I want to select all the people who might retire in the next 5 years, complete with demographic, sales and personal information, in order to create an organizational plan, can I accomplish this with my current database? By design, relational databases are just that: “relational.” Therefore, everything should flow if it is set up correctly at the very beginning.
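To make the idea of a connecting key concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table names, the employee_id column, the sample rows, the retirement age of 65 and the 2015 reference year are all assumptions made for illustration, not details of any particular HRIS or ATS.

```python
import sqlite3

# Hypothetical schema: both tables share the same employee_id key,
# which is what lets HR data and sales data be joined at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        birth_year  INTEGER
    );
    CREATE TABLE sales (
        employee_id INTEGER REFERENCES employees(employee_id),
        amount      REAL
    );
    INSERT INTO employees VALUES
        (1, 'Ana', 1952), (2, 'Ben', 1985), (3, 'Cy', 1953), (4, 'Dee', 1954);
    INSERT INTO sales VALUES (1, 1200.0), (1, 800.0), (2, 500.0), (3, 300.0);
""")

RETIREMENT_AGE = 65  # assumption: a single fixed retirement age
AS_OF_YEAR = 2015    # assumption: fixed so the example is reproducible
HORIZON_YEARS = 5    # "the next 5 years" from the text


def retiring_soon(connection):
    """Return (id, name, total sales) for people reaching retirement age in the horizon."""
    query = """
        SELECT e.employee_id, e.name, COALESCE(SUM(s.amount), 0) AS total_sales
        FROM employees AS e
        LEFT JOIN sales AS s ON s.employee_id = e.employee_id
        WHERE e.birth_year + ? <= ? + ?
        GROUP BY e.employee_id, e.name
        ORDER BY e.employee_id
    """
    params = (RETIREMENT_AGE, AS_OF_YEAR, HORIZON_YEARS)
    return connection.execute(query, params).fetchall()


rows = retiring_soon(conn)
for row in rows:
    print(row)
```

The LEFT JOIN keeps employees who have no sales rows, so a missing link shows up as a zero total rather than a silently dropped person. Without the shared key, the question could not be answered from these tables at all.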

Which brings us back to asking the right questions—which might look a bit like these:

  • What is it really that I want to be able to answer with my data collection?
  • Structured vs. unstructured data: am I asking the right questions and giving respondents fixed choices (for example: a) agree, b) disagree, c) unsure), or offering a space to add comments (be careful of this)? Can I work with unstructured data?
  • Are you offering incentives (a 5 dollar iTunes or Starbucks card) to employees or candidates to “complete additional info” so as to glean a more complete picture of your customer?
  • Do I want to link social data to what I collect from employees? (3rd-party sign-in via Facebook, Twitter, Google+, etc.)
  • Are you including IT in your business meetings?

A gap analysis usually reports on what is missing, but it doesn’t have to be this way (reactive and not proactive). If you ask the right questions initially, your design and results will reflect this. Know what you are trying to accomplish.

If you say “my database is broken,” but what you really mean is “I need to be able to sustain sales throughout the next five years; I’m concerned with increasing what we have in the pipeline,” well, ask for help. Make sure you have the correct data to indicate your sales reps’ selling habits, including seasonality. What data does your sales department have that can help you answer these and many other questions?

Do you have store performance or other line-of-business performance data to help form a more three-dimensional view of your candidates or employees? A data-centric view is so valuable, but unless you ask the right questions when collecting your data and setting up your database, you may end up trying to build a predictive model using only name, address and phone number! As Chief Engineer Montgomery “Scotty” Scott of the USS Enterprise might say, “I can’t do it, Captain!”

Go well armed on your journey towards predictive analytics and remember to always ask the right questions!

Should We Be Lowering The Social Media Marketing Bar?

However, after almost a decade of social networking, the gap between the “experts” and the average brand or marketer is widening. I therefore believe the current path isn’t resolving the complexities faced by marketers and is only serving to perpetuate the massive learning curve. Furthermore, I think that the majority will continue to be left behind, whether they give up, run out of time and resources, or keep on trying without realizing the promised results.

BundlePost

Yes, we should. Now let me explain…

In my recent post entitled Top 2015 Social Media Predictions – Disruptive Technologies, I covered one of the important disruption areas to watch this year: General Social Media Marketing. In fact, it was the number one item listed in my 2015 predictions. Specifically, I was referring to making it easier to implement social media, get results and be effective. The actual prediction was as follows:

“As social media marketing becomes more and more complex, new technology is required to make it easier, regardless of user experience, knowledge or skill. This is a requirement for the industry whose time has come.”

The Problem:

The social media marketing industry is incredibly complex. Marketers, brands and individuals are attending events and classes, reading articles and buying books at a massive pace, trying to understand what to do. At the same time a handful of social media speakers, authors and…

View original post 598 more words

Riding with the Stars: Passenger Privacy in the NYC Taxicab Dataset

Hmmm, interesting -> Applying Differential Privacy
So, we’re at a point now where we can agree this data should not have been released in its current form. But this data has been collected, and there is a lot of value in it – ask any urban planner. It would be a shame if it was withheld entirely.

In my previous post, Differential Privacy: The Basics, I provided an introduction to differential privacy by exploring its definition and discussing its relevance in the broader context of public data release. In this post, I shall demonstrate how easily privacy can be breached and then counter this by showing how differential privacy can protect against this attack. I will also present a few other examples of differentially private queries.
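As a rough illustration of what a differentially private query can look like, here is a minimal Python sketch of a noisy count using the Laplace mechanism, one standard way to achieve differential privacy. The excerpt does not show the original post's own queries, so the function names, the made-up trip records and the epsilon value below are all assumptions for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centred on zero."""
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Made-up records standing in for taxi trips (not real data).
trips = [{"tip": 0.0}, {"tip": 2.5}, {"tip": 0.0}, {"tip": 4.0}, {"tip": 1.0}]

# Seeded only so the example is repeatable; a real release would not fix the seed.
noisy = private_count(trips, lambda trip: trip["tip"] > 0, epsilon=0.5,
                      rng=random.Random(42))
print(noisy)
```

A smaller epsilon means more noise and stronger privacy; a larger one means answers closer to the true count. Averaged over many draws the noise cancels out, which is why aggregate questions can still be answered usefully.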

The Data

There has been a lot of online comment recently about a dataset released by the New York City Taxi and Limousine Commission. It contains details about every taxi ride (yellow cabs) in New York in 2013, including the pickup and drop off times, locations, fare and tip amounts, as well as anonymized (hashed) versions of the taxi’s license and medallion numbers. It was obtained via a FOIL (Freedom of Information Law) request earlier this year and has been making waves in the…

View original post 2,314 more words

Hands on with Watson Analytics: Pretty useful when it’s working

If I have one big complaint about Watson Analytics, it’s that it’s still a bit buggy — the tool to download charts as images doesn’t seem to work, for example, and I had to reload multiple pages because of server errors. I’d be pretty upset if I were using the paid version, which allows for more storage and larger files, and experienced the same issues. Adding variables to a view without starting over could be easier, too.

Gigaom

Last month, IBM made available the beta version of its Watson Analytics data analysis service, an offering first announced in September. It’s one of IBM’s only recent forays into anything resembling consumer software, and it’s supposed to make it easy for anyone to analyze data, relying on natural language processing (thus the Watson branding) to drive the query experience.

When the servers running Watson Analytics are working, it actually delivers on that goal.

Analytic power to the people

Because I was impressed that IBM decided to offer a cloud service using the freemium business model (and carrying the Watson branding, no less), I wanted to see firsthand how well Watson Analytics works. So I uploaded a CSV file including data from Crunchbase on all companies categorized as “big data,” and I got to work.

Seems like a good starting point.

Choose one and get results. The little icon in…

View original post 433 more words

The Four Horsemen Of The Cyber Apocalypse

These “Four Horsemen” point us to the components we can expect to see used by hackers in 2015: exploits in unpatchable systems; recycled malware hidden imperceptibly; and human error. Studying these harbingers could very well save us from a potential cyber catastrophe.

How big data got its mojo back

Big data never really went anywhere, but as a business, it did get a little boring over the past couple years.

Gigaom

Big data technologies (and not just Hadoop) proved harder to deploy, harder to use and were a lot more limited in scope than all the hype suggested. Machine learning became the new black as startups infused it into everything, but most often marketing and sales software. So much ink and breath were wasted trying to define (or disprove) the idea of data science, probably because the tools of the trade were still so foreign to most people.

But while the early days of the big data movement hinted at greatness, it’s probably fair to say they didn’t deliver — even if the resulting tools were very useful and very necessary to set the stage for things to come. And, realistically, many companies still haven’t adopted these technologies or these techniques.

Things are changing…

View original post 400 more words

The beginning of the end for email

Perhaps the biggest sign yet of the change at hand comes from Germany, which has called for an “anti-stress regulation” that would, among other things, ban employers from contacting employees after hours. Chancellor Angela Merkel has criticized the law and stopped it from moving forward for now, but German leaders have long been concerned about the growing tendency for technology to allow work to encroach on employees’ private lives.

Fortune

Along with global warming, the Ebola virus, and gridlock politics, this year, for me at least, something far less life- and society-threatening also spiraled out of control: email.

It was long ago invented as something to make us more productive. But what productivity expert would ever say that it’s a good thing that instead of working, we now “answer email?” Or that on some days, I am wary to leave my desk to head into a meeting because it means taking my finger off the dike and knowing I will return to a flood of boldfaced new messages waiting patiently for my total attention?

Some people strive for “inbox zero.” But like many people, I now get so much spam and unsolicited pitches that if I were to adopt such a goal, I would spend the entirety of every workday doing nothing but deleting emails. To keep up with this…

View original post 527 more words

Using Gamification to Build Communities and Create Leads #SocialSelling

But let us take a step back here. Who are we trying to sell to? Let us not forget, you know the prospects and customers in your forecast. The people you are trying to speak to are those who are “thinking” about buying. But we can go one better than that! If we have a community, we can pull in those who haven’t thought about buying: your competitors, or prospects who might not even know they have a problem OR that your solution exists.

Blog by @Timothy_Hughes

Gamification is a very trendy term right now, but what does it mean?

The main purpose for employing Gamification is to incentivize users and improve their engagement. In devising a “game” you work out the particular behaviors you want and then give points when people exhibit those behaviors.

For example, at a recent conference I attended, an App was provided to download. Gamification was “provided” by offering points for leaving speaker feedback, looking at the exhibitors’ details, etc. The organisers then hoped that people’s natural competitive streak would mean they would compete (results were published on a “Leaderboard”). To get further up the Leaderboard you needed to leave more speaker and session feedback, all good behaviors.
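The points-and-Leaderboard mechanic described above can be sketched in a few lines of Python. This is only an illustration: the behavior names, the point values and the attendee names are invented, and the conference App's real scoring rules are not known.

```python
from collections import Counter

# Hypothetical point values for the behaviors the organisers want to encourage.
POINTS = {
    "speaker_feedback": 10,
    "session_feedback": 10,
    "view_exhibitor": 2,
}


def leaderboard(events, points=POINTS):
    """Turn (user, behavior) events into a ranking, highest score first."""
    scores = Counter()
    for user, behavior in events:
        # Behaviors that are not in the table simply earn nothing.
        scores[user] += points.get(behavior, 0)
    # Sort by score descending, then by name so ties are ordered predictably.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented activity log for three attendees.
events = [
    ("ana", "speaker_feedback"),
    ("ben", "view_exhibitor"),
    ("ana", "view_exhibitor"),
    ("ben", "session_feedback"),
    ("cy", "view_exhibitor"),
]

board = leaderboard(events)
for rank, (user, score) in enumerate(board, start=1):
    print(rank, user, score)
```

The design choice that matters is the points table: whatever earns the most points is the behavior people will repeat, so it should list the behaviors you actually want.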

Gamification is for Trendy Marketing People; it has no Place in Sales, Right?

I recently had a demonstration from a Gamification company called Rise (previously called Leaderboard) and it got me thinking. This is all…

View original post 689 more words

Big Data and Intuition: The Future of Marketing

Technology isn’t only getting faster, it’s getting smarter. Computers are able to recognize and learn from patterns and make changes in real-time. Their improved analytic and decision-making abilities now allow them to outperform humans in areas such as medical diagnosis and customized marketing campaigns.

Gigaom

However, it’s hard for marketers to embrace data analysis when they’ve trusted their own gut to fuel decisions for so long. It’s a point of pride for many. The problem is, the strategy frequently fails. A 20-year study of political pundits found that they were only as accurate as a coin toss, suggesting that successful “intuitive” decisions are often a lucky guess.

On the other hand, a McKinsey study found that companies who put data at the center of marketing and sales decisions improve marketing ROI by 15% – 20%. Data-driven personalization, in particular, can lift sales 10% or more. For example, Bank of America…

View original post 102 more words

Top 10 Steps to a Pragmatic Big Data Pipeline

As you know, Big Data is capturing lots of press time. Which is good, but what does it mean to the person in the trenches? Some thoughts… as a Top 10 List:

My missives

[update 11/25/11: A copy of my guest lecture for Ph.D. students at the Naval Postgraduate School, The Art Of Big Data, is at Slideshare]

10. Think of the data pipeline in multiple dimensions rather than as a point technology, and evolve the pipeline with a focus on all aspects of the stages

  • While technologies are interesting, they do not work in isolation, and neither should you think that way
  • Dimension 1: Big Data (I had touched upon this in my earlier blog “What is Big Data anyway”). One should not only look at the Volume-Velocity-Variety-Variability but also at the Connectedness and Context dimensions.
  • Dimension 2 : Stages – The degrees of separation as in collect…

View original post 1,011 more words