A new paper uses flawed methods to predict likely criminals based on their facial features.
This is a clear-cut case of failing to recognize the limitations of the data. The same problem affects the interpretation of data from any survey or experiment and with any statistical method, from machine learning to linear regression or t-tests. The design of the survey or experiment limits both the range of validity and the nature of the conclusions that can be drawn. No analysis, however sophisticated, can correct for biased data unless the bias itself is measured. We cannot extract information that is not already in the data. We need to make assumptions, but each assumption limits what can be concluded. Less dangerous but equally wrong interpretations of data analyses are far more frequent in the scientific literature than we usually realize. Look for parallels between this example and your own research or what you read. What are the data representative of, and what are they meaningful for?
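
The claim that biased data cannot be corrected unless the bias is measured can be illustrated with a small simulation. The sketch below is not taken from the paper under discussion; it is a toy example built on assumptions of my own. It draws a made-up population with a true mean near zero, samples it through a hypothetical selection rule (`p_select`) that favors larger values, and compares the naive sample mean with an inverse-probability-weighted mean that is only computable because the selection probabilities are assumed to be known.

```python
import math
import random


def p_select(x):
    """Hypothetical selection rule: larger x is more likely to be sampled."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_selection_bias(n_population=100_000, seed=0):
    rng = random.Random(seed)

    # Made-up population: standard normal values, so the true mean is near 0.
    population = [rng.gauss(0.0, 1.0) for _ in range(n_population)]
    true_mean = sum(population) / len(population)

    # Biased sampling: each unit enters the sample with probability p_select(x).
    sample = [x for x in population if rng.random() < p_select(x)]

    # Naive estimate: ignores how the sample was obtained.
    naive_mean = sum(sample) / len(sample)

    # Weighted estimate: possible only because the bias is assumed to be
    # measured, i.e. p_select(x) is known for every sampled unit.
    weights = [1.0 / p_select(x) for x in sample]
    weighted_mean = sum(w * x for w, x in zip(weights, sample)) / sum(weights)

    return true_mean, naive_mean, weighted_mean


true_mean, naive_mean, weighted_mean = simulate_selection_bias()
print(f"true mean:     {true_mean:.3f}")
print(f"naive mean:    {naive_mean:.3f}")
print(f"weighted mean: {weighted_mean:.3f}")
```

In this toy setting the naive mean sits well above the true mean no matter how large the sample is, while the weighted mean lands close to it. Without knowledge of `p_select`, nothing in the sample alone reveals that a correction is needed or what it should be.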