Many data scientists say “you spend 90% of your time preparing your data for analysis.” Yes!
But data science is a real and important discipline. One way it differs from statistics (which is itself an important tool for data scientists) is that data science involves a lot more programming, a lot more work on data architecture, a lot more tools, and a lot more domain and client empathy. Statisticians already do a lot of programming, but data scientists can end up doing even more. I would say one of the assumptions of data science is that there is a client (either real or imagined) that the data scientist is working for, similar to the customer role in agile development. Data scientists also tend to use a large number of tools: you can start with R, but depending on your client's needs you may eventually need to work with many more. We feel that there is a significant gap in the teaching of the gestalt of data science, a gap that “Practical Data Science with R” fills.
See on www.win-vector.com