May 22, 2018  
2017-2018 Graduate Catalog [ARCHIVED CATALOG]


INSD 5160 - Harvesting, Storing and Retrieving Data

3 hours

Provides an introduction to collecting, storing, managing, retrieving and processing datasets. Techniques for both large and small datasets are considered, since data science applications require both. Covers traditional survey and experimental design principles for data collection, as well as script-based programming techniques for large-scale data harvesting from third-party sources. Data wrangling methodologies are introduced for cleaning and merging datasets, storing data for later analysis and constructing derived datasets. Various storage and processing architectures are introduced, with a focus on how the choice of approach depends on the application, data velocity and end users. Emphasizes applications and includes many hands-on projects.

Prerequisite(s): None.
