Data Management
Proper data management is an essential element of the data-driven decision-making process. Regardless of how elaborate and well-designed your research is, poor data management can lead to invalid and misleading interpretations of results. We can assist with the development of data management protocols, codebook construction, merging of data sets, programming/coding for index or scale construction, missing data analysis, and more. Each of these is described in more detail below.
Data Management Protocol Development. To ensure that reliable, high-quality data are collected, we can assist with written guidelines for data entry procedures, the delineation of data security protocols, data coding schemes, coding reliability assessments, and the editing and cleaning of data. As part of a well-designed data management system, this protocol should help with the training of staff and the management of data quality audits.
Codebook Construction. We can assist with the development of data codebooks, which are essential should questions arise regarding variable coding, as well as for analysts who need a concise listing and description of the variables in a dataset. Codebooks often also describe the study, the sampling, technical details (number of observations, number of records per observation, etc.), the structure of the file, and the data themselves (e.g., where specific variables can be found), and they frequently include the text of the items/questions and response categories.
Importing/Exporting Data Files. Assistance is available for the safe and efficient transfer and conversion of data files. Files often need to be imported and converted (e.g., from Excel files, text files, or complex files of mixed, grouped, or nested data) for use with various relational databases or statistical programs. Because some statistical programs are better suited to particular tasks or techniques, we can also assist with converting data from the format of one statistical program to that of another.
File Operations. We can work with multiple data sources by merging data files (i.e., files with the same cases but different variables, or the same variables but different cases), aggregating and weighting data, and transposing cases and variables (i.e., changing the file structure, which is often needed when creating longitudinal data sets).
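The two kinds of merge described above can be sketched in a few lines. This is only an illustration using the Python standard library; the function names (merge_variables, append_cases), the key name "id", and the sample records are all hypothetical, and real projects would normally do this inside a statistical package or database.

```python
# Hypothetical example data: two files describing the same cases (keyed by "id").
demographics = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 51},
]
survey = [
    {"id": 1, "score": 4},
    {"id": 2, "score": 2},
]
# A third hypothetical file with the same variables as demographics but a new case.
wave_2 = [{"id": 3, "age": 29}]


def merge_variables(left, right, key):
    """Same cases, different variables: join records that share a key value."""
    right_by_key = {row[key]: row for row in right}
    merged = []
    for row in left:
        combined = dict(row)
        # Cases missing from the right-hand file simply keep their own variables.
        combined.update(right_by_key.get(row[key], {}))
        merged.append(combined)
    return merged


def append_cases(first, second):
    """Same variables, different cases: stack the records of two files."""
    return [dict(row) for row in first] + [dict(row) for row in second]


merged = merge_variables(demographics, survey, key="id")
stacked = append_cases(demographics, wave_2)
print(merged)
print(stacked)
```

Matching on an explicit key rather than on row position is the safer choice here, because the two files are rarely guaranteed to be sorted the same way.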
Metadata Programming. With proper documentation, we can assist with the programming of variable properties (e.g., variable labels, value labels for discrete variables, and missing-value definitions), as well as with the documentation of file properties.
Data Transformations. We can perform various common data transformation procedures, including the recoding of categorical variables, “binning” of scale variables, simple numeric transformations, arithmetic and statistical functions, coding and manipulation of string variables, and date and time functions.
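As a small illustration of two of these transformations, the sketch below recodes a categorical variable and bins a scale variable. The helper names (recode, bin_value), the education codes, and the age cut points are assumptions made up for the example, not a prescribed coding scheme.

```python
from bisect import bisect_right


def recode(value, mapping, default=None):
    """Recode a categorical value; unmapped values become `default`."""
    return mapping.get(value, default)


def bin_value(value, cut_points, labels):
    """Assign a scale value to a bin.

    `cut_points` must be sorted, and `labels` needs one more entry than
    `cut_points`. A value equal to a cut point falls into the higher bin.
    """
    return labels[bisect_right(cut_points, value)]


# Hypothetical coding scheme and cut points.
education_map = {"HS": 1, "BA": 2, "MA": 3}
age_cuts = [30, 50]
age_labels = ["under 30", "30-49", "50 and over"]

education = ["BA", "HS", "PhD"]
ages = [18, 30, 49, 50, 72]

recoded = [recode(e, education_map) for e in education]
binned = [bin_value(a, age_cuts, age_labels) for a in ages]
print(recoded)
print(binned)
```

Returning a default for unmapped values (here None) keeps unexpected categories visible instead of silently folding them into an existing code.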
Cleaning and Validating Data. We can prepare data validation reports, which help identify invalid values and duplicate entries. We can check invalid values to identify possible keystroke errors or other errors in data entry. If no keystroke/coding errors are found, we can advise on whether to exclude the invalid data from the analysis.
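A validation report of this kind can be sketched as follows. The function name validation_report, the variable names, and the allowed range of 1 to 5 are hypothetical; an actual report would use the valid values documented in the project codebook.

```python
from collections import Counter


def validation_report(records, key, valid_values):
    """List duplicated keys and out-of-range values.

    `valid_values` maps each variable name to the set of values allowed for it.
    """
    counts = Counter(row[key] for row in records)
    duplicates = sorted(k for k, n in counts.items() if n > 1)

    invalid = []
    for row in records:
        for variable, allowed in valid_values.items():
            value = row.get(variable)
            if value not in allowed:
                invalid.append((row[key], variable, value))

    return {"duplicate_keys": duplicates, "invalid_values": invalid}


# Hypothetical data: a 1-5 satisfaction item with one likely keystroke error
# (55 instead of 5) and one duplicated case id.
records = [
    {"id": 1, "satisfaction": 4},
    {"id": 2, "satisfaction": 55},
    {"id": 2, "satisfaction": 3},
]
report = validation_report(records, "id", {"satisfaction": {1, 2, 3, 4, 5}})
print(report)
```

The report only flags candidates; deciding whether a flagged value is a keystroke error or a genuine but unusual response still requires checking the source documents.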
Special Data Processing. We can assist with special data processing needs, e.g., conditional processing, looping, and repeating functions.
Scale/Index Construction. As also noted under our research design services, we can often identify and recommend scales with well-established and documented psychometric properties for your research project. We can also adapt or slightly revise some of those scales, if needed, to meet your particular needs. We are also experienced with conceptualizing and operationalizing attitudinal and behavioral measures (i.e., scales and indices), as well as with assessments of measure reliability and validity.
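One widely used internal-consistency reliability statistic is Cronbach's alpha, shown here purely as an example of a reliability assessment; the text above does not say which statistics are used in practice. The sketch assumes complete data (no missing responses), and the function name and the responses are made up.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    Assumes at least two items, at least two respondents, and no missing values.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical responses: four respondents answering a three-item scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Alpha reflects only internal consistency; it says nothing about whether the scale measures the intended construct, which is a separate validity question.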
Missing Data Analysis. If not dealt with properly, missing data can lead to reduced statistical power and/or biased and misleading results. We can help with the assessment and treatment of missing data by assessing the randomness of the missing data, and we can recommend solutions, e.g., “available case” methods, likelihood-based ignorable analyses, and various multivariate techniques for imputing missing values or statistical estimates.
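To make the "available case" idea concrete, the sketch below summarizes how much of each variable is missing and contrasts available-case estimates (each statistic uses every case observed on that variable) with complete-case analysis (only cases observed on all variables). It does not test the randomness of missingness or impute values. The function names, the use of None as the missing-value marker, and the data are assumptions for illustration.

```python
from statistics import mean


def missing_summary(records, variables):
    """Proportion of cases with a missing (None) value, per variable."""
    n = len(records)
    return {v: sum(1 for r in records if r.get(v) is None) / n for v in variables}


def available_case_means(records, variables):
    """Mean of each variable using every case observed on that variable."""
    result = {}
    for v in variables:
        observed = [r[v] for r in records if r.get(v) is not None]
        result[v] = mean(observed) if observed else None
    return result


def complete_cases(records, variables):
    """Keep only cases observed on all listed variables (listwise deletion)."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


# Hypothetical data; None marks a missing value.
records = [
    {"income": 40, "age": 30},
    {"income": None, "age": 45},
    {"income": 60, "age": None},
    {"income": 50, "age": 35},
]
variables = ["income", "age"]

summary = missing_summary(records, variables)
ac_means = available_case_means(records, variables)
complete = complete_cases(records, variables)
print(summary, ac_means, len(complete))
```

In this toy data the available-case means each use three of the four cases, while complete-case analysis retains only two, which illustrates how quickly listwise deletion can erode statistical power.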