DS4100
Teaches how to collect data from multiple sources and integrate it into consistent data sets. Explains how to use semi-automated and automated classification to integrate disparate data sets. Shows how to parse data from files, XML, JSON, APIs, and structured data stores to construct analyzable data sets that are stored in databases. Introduces key concepts of algorithms and data structures, including divide-and-conquer, sorting and selection, and graph traversal. Provides an understanding of the complexity and run-time behavior of programs. Presents approaches for data anonymization and protecting data privacy. Teaches data shaping and manipulation techniques for data analysis. Shows how to assess and ensure the quality of data. Introduces descriptive analysis of data through descriptive statistics and plotting. Teaches the R and Python programming languages.