Data Cascades in Machine Learning

Google AI Research · June 4, 2021, 5:05 p.m.
Nithya Sambasivan, Research Scientist, Google Research

Data is a foundational aspect of machine learning (ML) that can impact performance, fairness, robustness, and scalability of ML systems. Paradoxically, while building ML models is often highly prioritized, the work related to data itself is often the least prioritized aspect. This data work can require multiple roles (such as data collectors, annotators, and ML developers) and often involves multiple teams (such as database, legal, or licen...