The Lean Data Scientist: Recent Advances toward Overcoming the Data Bottleneck: A taxonomy of the methods used to obtain quality datasets enhances existing resources.
In: Communications of the ACM, Vol. 66 (2023-02-01), No. 2, pp. 92-102
serialPeriodical
The article offers insights into overcoming the "data bottleneck": the difficulty of obtaining data for machine-learning (ML) applications. It centers on a comprehensive taxonomy of ways to tackle this bottleneck. Methods discussed include dataset repurposing (using a preexisting dataset for a task other than the one it was originally constructed for), data augmentation (artificially inflating the training set by applying modifications), and multimodal learning (attempting to enrich the input to the learning algorithm).
Title: | The Lean Data Scientist: Recent Advances toward Overcoming the Data Bottleneck: A taxonomy of the methods used to obtain quality datasets enhances existing resources. |
---|---|
Author / Contributor: | SHANI, CHEN ; ZARECKI, JONATHAN ; SHAHAF, DAFNA |
Journal: | Communications of the ACM, Vol. 66 (2023-02-01), No. 2, pp. 92-102 |
Published: | 2023 |
Media type: | serialPeriodical |
ISSN: | 0001-0782 (print) |
DOI: | 10.1145/3551635 |