> A boot camp probably won't tell the process to make sure a linear model is the right choice.
Very true. On the other hand, it's a pleasant rarity when I see positions that appear to index more heavily on, "how well is this person able to conceptualize the problem and choose an appropriate method?" than "can this person do X?"
Lots of folks can do X; fewer can conceptualize a research question and choose the appropriate X; even fewer can carry out the X and communicate robustly what it means.
The latter two start to get into squishy territory, but also are where the value is. They also seem to get the least focus in advertising / recruiting / interviewing data scientists.
It reminds me of studying evaluation methods in planning. One that people are really familiar with (at least anecdotally) is cost-benefit analysis. Conceptually, it's very simple. The problem is that the costs and benefits that are hardest to measure are very often NOT measured. And they're very often the sorts of things that people find the most important. So, you end up with an answer that encodes a ratio of easily measured things rather than important things.
So too with data science. Easier to check whether someone can remember basic probability rules and carry out a linear regression than it is to diagnose whether someone can reason carefully about an amorphous business problem.
Very true. On the other hand, it's a pleasant rarity when I see positions that appear to index more heavily on, "how well is this person able to conceptualize the problem and choose an appropriate method?" than "can this person do X?"
Lots of folks can do X; fewer can conceptualize a research question and choose the appropriate X; even fewer can carry out the X and communicate robustly what it means.
The latter two start to get into squishy territory, but also are where the value is. They also seem to get the least focus in advertising / recruiting / interviewing data scientists.
It reminds me of studying evaluation methods in planning. One that people are really familiar with (at least anecdotally) is cost-benefit analysis. Conceptually, it's very simple. The problem is that the costs and benefits that are hardest to measure are very often NOT measured. And they're very often the sorts of things that people find the most important. So, you end up with an answer that encodes a ratio of easily measured things rather than important things.
So too with data science. Easier to check whether someone can remember basic probability rules and carry out a linear regression than it is to diagnose whether someone can reason carefully about an amorphous business problem.