The term ‘big data’ is often used to describe datasets that are too large to manage with standard software tools. In the context of our research, however, even a dataset on the order of billions of observations does not really qualify as ‘big data’. For example, it takes approximately 100 hours to assemble a single year of observational data from the Southern Great Plains (SGP) atmospheric observatory and its satellite sensors.
In addition to a wide array of in situ and remote sensing instruments, SGP is also home to several large-eddy simulation models that allow us to study aerosol and cloud processes under realistic conditions. This symbiotic interaction between modeling and observation is an important part of what makes the SGP observatory unique among ARM user facilities.
A student growth percentile is a measure of a student’s academic progress relative to their academic peers within the same grade. It allows us to fairly compare students who enter school with different levels of achievement and to demonstrate that a student can make high academic progress even when starting out below the state average.
The SGP package includes a set of classes, functions, and data that allow educators to calculate student growth percentiles and percentile growth projections and/or trajectories from large-scale, longitudinal education assessment data. These analyses use quantile regression to estimate the conditional density associated with a student’s achievement history, which yields derived coefficient matrices. These matrices are then used to generate percentile growth projections and/or trajectories.
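To make the idea of growth relative to academic peers concrete, the following is a minimal Python sketch. It is not the regression technique the SGP package uses and does not call the package. It swaps in a much simpler empirical ranking: a student's current score is ranked among students whose prior score falls within a fixed band of the student's own. The function name growth_percentile, the band parameter, and all scores are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def growth_percentile(prior, current, peers, band=10):
    """Rank `current` among peers with a similar prior score.

    `peers` is a list of (prior_score, current_score) pairs.
    Simplified stand-in for illustration only; the band-based peer
    definition is an assumption, not the SGP package's method.
    """
    # Academic peers: students whose prior score is within `band` points.
    peer_current = sorted(c for p, c in peers if abs(p - prior) <= band)
    if not peer_current:
        raise ValueError("no academic peers within the band")
    # Count of peers who scored strictly below this student.
    below = bisect_left(peer_current, current)
    pct = round(100 * below / len(peer_current))
    # Clamp to the conventional 1-99 percentile range.
    return min(max(pct, 1), 99)


# Invented scores: four low-scoring peers and one high-scoring student.
example_peers = [(400, 410), (405, 420), (395, 430), (402, 440), (600, 700)]
print(growth_percentile(400, 435, example_peers))  # 75
```

In this toy data the student starts well below the highest-scoring student yet lands at the 75th percentile of growth, because the comparison is made only against peers with similar prior achievement.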
Assuming that the state-specific data set sgpData has been prepared properly, conducting SGP analyses is very straightforward. In fact, most of the errors that we assist with trace back to improper data preparation. Thus the bulk of the time and effort that we devote to SGP analyses is spent on data preparation rather than on the analysis itself.
sgpData is a table of student assessments (i.e., test records), in which each row holds the records for a given student. A row contains the student’s test record, including a scale score for each content area that was assessed. The first column, ID, provides the student’s unique identifier. The next columns, GRADE_2013, GRADE_2014, GRADE_2015, and GRADE_2016, provide the grade level of each of the assessments taken by that student.
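The layout described above can be sketched as a single wide-format record in Python. This is an illustration under stated assumptions, not an extract of the real data: the ID and GRADE_2013 through GRADE_2016 column names come from the text, while the SS_ column names used here for scale scores, the helper assessment_history, and every value are hypothetical.

```python
YEARS = (2013, 2014, 2015, 2016)

# One wide-format row: one student, one column per year and measure.
# The SS_<year> scale-score columns are an assumed naming convention,
# and all values are invented for illustration.
example_record = {
    "ID": "A0001",
    "GRADE_2013": 3, "GRADE_2014": 4, "GRADE_2015": 5, "GRADE_2016": 6,
    "SS_2013": 412, "SS_2014": 436, "SS_2015": None, "SS_2016": 471,
}


def assessment_history(record, years=YEARS):
    """Return (year, grade, scale_score) tuples for years with complete data."""
    history = []
    for year in years:
        grade = record.get(f"GRADE_{year}")
        score = record.get(f"SS_{year}")
        # Skip years in which the grade or the score is missing.
        if grade is not None and score is not None:
            history.append((year, grade, score))
    return history


print(assessment_history(example_record))
```

Checking each record for missing grades or scores in this way is the kind of data preparation step that tends to take more effort than the analysis itself.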
The sgpData table also contains information about the teacher for each test record. The column labeled sgpData_INSTRUCTOR_NUMBER identifies the instructor associated with each of that student’s test records. This column then enables the calculation of student growth percentiles based on student assessment scores from that instructor’s classroom.
The sgpData data sets contain 1-minute-resolution surface meteorological observations from the Department of Energy’s Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) Surface Meteorological Observation System (SMOS) stations located at many of the SGP Extended Facilities in southern Kansas and northern Oklahoma. These data are transmitted to the ARM Data Center and are available via Data Discovery.