Asking the RIGHT questions

Carla Gentry
Data Scientist

Each time I talk to someone about analytics, I ask the same question: “What is your ultimate goal with this project?” Often it is to increase sales or reduce turnover. Of course, this isn’t usually what’s said; initially all I get is a panicked look that says “we can’t get what we want out of our database; it isn’t working right… how do we fix it??”

Typically, there is nothing wrong with the database: the master and all its clones are just as they were designed to be, the variables are entered correctly and the reporting functions are pulling exactly what they were coded and designed to pull.

So what’s the issue?

Example: HR data. Every HRIS or ATS database contains different information (employee addresses, phone numbers, salaries, benefits, and the like), but how much of that information is connected? What I mean is: is there a unique identifier that connects each table or database to the others? If I want to select all the people who might retire in the next 5 years out of a database, complete with demographic, sales and personal information, in order to create an organizational plan, can I accomplish this with my current database? By design, relational databases are just that: “relational.” Therefore, everything should flow, if it is set up correctly in the very beginning.
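To make the retirement question concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything specific in it is an assumption made for illustration, not something taken from a real HRIS: the `employees` and `sales` tables, the shared `employee_id` key, the column names, and a retirement age of 65. The point is only that the question becomes a single join once every table carries the same unique identifier.

```python
import sqlite3
from datetime import date

RETIREMENT_AGE = 65  # assumption: not specified in the article
HORIZON_YEARS = 5    # "the next 5 years"


def build_demo_db():
    """Create a tiny in-memory database with hypothetical HR and sales tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (
            employee_id INTEGER PRIMARY KEY,  -- the shared unique identifier
            name        TEXT NOT NULL,
            birth_date  TEXT NOT NULL         -- ISO format, YYYY-MM-DD
        );
        CREATE TABLE sales (
            employee_id INTEGER REFERENCES employees(employee_id),
            amount      REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [(1, "Ana", "1962-03-14"), (2, "Ben", "1985-07-01"), (3, "Cy", "1958-11-30")],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, 1200.0), (1, 800.0), (2, 500.0)],
    )
    return conn


def retiring_soon(conn, today):
    """Employees who reach RETIREMENT_AGE within HORIZON_YEARS of `today`
    (including anyone already past it), with their total sales."""
    # Born on or before this date => at least RETIREMENT_AGE by today + horizon.
    # ISO date strings compare correctly as plain text.
    cutoff_year = today.year + HORIZON_YEARS - RETIREMENT_AGE
    cutoff = f"{cutoff_year:04d}-{today.month:02d}-{today.day:02d}"
    return conn.execute(
        """
        SELECT e.employee_id, e.name, e.birth_date,
               COALESCE(SUM(s.amount), 0) AS total_sales
        FROM employees AS e
        LEFT JOIN sales AS s ON s.employee_id = e.employee_id
        WHERE e.birth_date <= ?
        GROUP BY e.employee_id, e.name, e.birth_date
        ORDER BY e.birth_date
        """,
        (cutoff,),
    ).fetchall()


if __name__ == "__main__":
    for row in retiring_soon(build_demo_db(), date(2024, 1, 1)):
        print(row)
```

If the sales table had no `employee_id` column (or used a different, unmapped identifier), the join above would have nothing to match on, which is exactly the situation that gets reported as “the database is broken.”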

Which brings us back to asking the right questions—which might look a bit like these:

  • What is it really that I want to be able to answer with my data collection?
  • Structured vs. unstructured data: am I asking the right questions and giving respondents fixed choices (for example: a) agree, b) disagree, c) unsure), or am I offering a space to add comments (be careful of this)? Can I work with unstructured data?
  • Are you offering incentives to employees or candidates to “complete additional info” so as to glean a more complete picture of your customer? (A 5 dollar iTunes or Starbucks card, for example.)
  • Do I want to link social data to what I collect from employees? (Third-party sign-in via Facebook, Twitter, Google+, etc.)
  • Are you including IT in your business meetings?

A gap analysis usually reports on what is missing, but it doesn’t have to be this way (reactive and not proactive). If you ask the right questions initially, your design and results will reflect this. Know what you are trying to accomplish.

If you say “my database is broken,” but what you really mean is “I need to be able to sustain sales throughout the next five years; I’m concerned with increasing what we have in the pipeline,” well, ask for help. Make sure you have the correct data to indicate your sales reps’ selling habits, including seasonality. What data does your sales department have that can help you answer these and many other questions?

Do you have store performance or other line-of-business performance data to help form a more three-dimensional view of your candidates or employees? A data-centric view is so valuable, but unless you ask the right questions when collecting your data and setting up your database, you may end up trying to build a predictive model using only name, address and phone number! As Chief Engineer Montgomery “Scotty” Scott of the USS Enterprise might say, “I can’t do it, Captain!”

Go well armed on your journey towards predictive analytics and remember to always ask the right questions!

Data mining – Is it for everyone?

When I graduated from college, I spent years trying to convince everyone of the
wonders of data mining and econometrics. Now it seems data mining is nothing
more than a buzzword, the social media gravy train that everyone is hopping on
board. Unfortunately for them and fortunately for me, analysis and data mining
are not for the faint of heart. Terabytes and, yes, even petabytes of data are
no joke, and data CANNOT be continuously updated without an ETL (extract,
transform and load) process, which no one is talking about. Wait till everyone
finds out how long it takes to load data; getting it perfect is an even bigger
challenge (METADATA ROCKS). I feel confident that someone like me will always
be in demand, but my concern at the moment is how much time will be wasted
while everyone figures out that data mining and analysis are not as easy as
they think, no matter what tools they have. Google Analytics is free, but not
everyone is using it correctly, and data mining will suffer the same pain.
Data will be lost forever because some novice user did something wrong in the
coding of the ETL or analysis process. But alas, all Analytical Solution and I
can do is wait for experience to win out over hype. Wish me luck!

Data_Nerd