Over the last 18 years,
Sociometrics has developed a comprehensive set of procedures for the preparation
and archiving of large social and behavioral science datasets that provide
users of secondary data with substantial added value and ensure the highest
data quality. Our data preparation and archiving systems use database
applications, statistical programs, and proprietary executable files that
automate most archiving processes and establish data quality standards
that very few organizations have achieved.
Sociometrics is now
making available custom dataset preparation services to researchers and
data holders who wish to produce a fully documented version of their dataset.
Sociometrics' data preparation procedures are designed
to produce datasets that are suitable for public sharing and distribution.
Data prepared with Sociometrics' standardized procedures are easy to understand
and use, and are accessible to researchers at all levels of experience.
We offer two levels of dataset preparation: 1) basic, which results in
a fully documented and easy-to-use dataset (see description of services
below); and 2) comprehensive, which adds variable search and retrieval
and data extract capabilities to a basic dataset. The costs of dataset
preparation can be estimated from the chart we provide below, which is
based on the number of variables and the level of preparation.
Dataset Submissions
Submissions should
conform to the following criteria:
- Dataset file(s)
should be available in one of the following formats:
  - SPSS Portable
  - SAS Unix Dataset and format library
  - Raw data file and SPSS or SAS syntax statements that define the raw data file
  - Raw data file and a machine-readable codebook
- The dataset file
should include nominal variable labels and value labels for each variable,
especially where constructed or calculated variables have been added
to the dataset. Datasets with partially documented variables but complete
codebooks that define all variables will be evaluated for potential
submission.
- Paper or machine-readable
documentation should accompany the dataset, including: codebook, data
collection instrument, and reports detailing the sampling rationale
and study description. Additional documents are welcomed (e.g., publications).
Basic Data Preparation Services
- Data consistency
and relational data checks, investigator-specified recodes and other
data corrections
- Evaluation and/or
protection of data confidentiality
- Application of
Sociometrics substantive Topics and Types to each variable in a dataset.
Topic and Type codes define key substantive areas addressed in a dataset
(e.g., Race/Ethnicity, Childbearing, Contraception, Marriage and Cohabitation).
Substantive coding is used to index variables coded under the same topic
and facilitates searching and retrieving variables in large datasets.
A sample Topic and Type Code list is available from Sociometrics' Data
Archive on Adolescent Pregnancy and Pregnancy Prevention.
- Adding or editing
variable and value labels for consistency and accuracy
- Creation of new
raw data file in standard 80 character record format
- Creation of SPSS
and SAS Syntax Programs that define the raw data file with variable
names, variable labels, value labels, missing data declarations, recodes,
and formats.
- Creation of SPSS
Portable, SAS Dataset, or STATA Dataset
- Creation of SPSS
Univariate Frequency Output and Data Dictionary
- Creation of User's
Guide to the Machine-Readable Files
- Creation of PDF
and MSWord codebook, data collection instrument and other documentation
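As an illustration of the 80 character record format mentioned in the list above, the following is a minimal Python sketch of how fixed-width fields might be packed into 80-column records. It is not Sociometrics' proprietary tooling: the function names `layout` and `write_case`, and the rule that a field is never split across two records, are assumptions made for this example.

```python
RECORD_WIDTH = 80  # the "standard 80 character record" width


def layout(widths, record_width=RECORD_WIDTH):
    """Assign each field a (record, start_col, end_col) position.

    Columns are 1-based. Assumption: a field that does not fit in the
    remaining columns of the current record starts a new record.
    """
    positions = []
    record, col = 1, 1
    for w in widths:
        if w < 1 or w > record_width:
            raise ValueError(f"field width {w} does not fit in a record")
        if col + w - 1 > record_width:
            record += 1
            col = 1
        positions.append((record, col, col + w - 1))
        col += w
    return positions


def write_case(values, widths, record_width=RECORD_WIDTH):
    """Return the space-padded records that hold one case's values."""
    records = {}
    for value, w, (rec, _, _) in zip(values, widths, layout(widths, record_width)):
        text = str(value).rjust(w)  # right-justify within the field
        if len(text) > w:
            raise ValueError(f"value {value!r} is wider than {w} columns")
        # Fields are packed contiguously from column 1, so appending works.
        records[rec] = records.get(rec, "") + text
    return [records[r].ljust(record_width) for r in sorted(records)]


if __name__ == "__main__":
    # Hypothetical case: a 50-column field, a 40-column field, a 10-column field.
    for line in write_case(["a", "b", "c"], [50, 40, 10]):
        print(repr(line))
```

The positions returned by `layout` are the same column locations a syntax program would need in order to read the raw data file back in.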
Comprehensive Data Preparation Services
- Linking of instrument
pages or codebook pages to variables
- Application of
Sociometrics' Search and Retrieval software interface that allows data
users to search an entire dataset, select variables for analysis, and
export a list of variables for use in Sociometrics' Data Extract Program.
Variable searching may be performed by keyword or substantive topic
and type.
- Application of
Sociometrics' Data Extract Program, which uses output from the Search
& Retrieval program to produce SPSS or SAS syntax programs for a selection
of variables from a dataset. Syntax programs created by the Data Extract
program include all of the syntax required to define the variables selected
via the Search & Retrieval program, including variable
names, variable labels, value labels, missing value declarations, recodes,
and formats.
Together, the
Search & Retrieval and Data Extract programs allow users to peruse very
large datasets (or collections of datasets), select a smaller subset
of variables from a large dataset, and then create complete SPSS or
SAS syntax for the subset of selected variables. These two programs
facilitate analysis on personal computing systems by allowing the
user to create smaller extracts of very large datasets.
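The search-then-extract workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Search & Retrieval or Data Extract programs themselves: the `Variable` record, the `search` and `spss_syntax` functions, the sample catalog, the "Demographics" topic, and the file name `extract.dat` are all hypothetical. The generated commands (DATA LIST, VARIABLE LABELS, VALUE LABELS, MISSING VALUES) are standard SPSS syntax; recodes and formats are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    label: str
    topic: str
    start: int  # first column in the raw data record (1-based)
    end: int    # last column in the raw data record
    value_labels: dict = field(default_factory=dict)
    missing: tuple = ()


def quote(text):
    """Quote a string for SPSS, doubling any embedded apostrophes."""
    return "'" + text.replace("'", "''") + "'"


def search(variables, keyword=None, topic=None):
    """Return variables whose label contains keyword and/or whose topic matches."""
    hits = []
    for v in variables:
        if keyword is not None and keyword.lower() not in v.label.lower():
            continue
        if topic is not None and v.topic != topic:
            continue
        hits.append(v)
    return hits


def spss_syntax(selected, data_file):
    """Build SPSS syntax that defines only the selected variables."""
    if not selected:
        raise ValueError("no variables selected")
    lines = [f"DATA LIST FILE={quote(data_file)} FIXED /"]
    for v in selected:
        cols = str(v.start) if v.start == v.end else f"{v.start}-{v.end}"
        lines.append(f"  {v.name} {cols}")
    lines[-1] += "."  # terminate the DATA LIST command
    for v in selected:
        lines.append(f"VARIABLE LABELS {v.name} {quote(v.label)}.")
        if v.value_labels:
            pairs = " ".join(f"{code} {quote(text)}"
                             for code, text in v.value_labels.items())
            lines.append(f"VALUE LABELS {v.name} {pairs}.")
        if v.missing:
            codes = ", ".join(str(c) for c in v.missing)
            lines.append(f"MISSING VALUES {v.name} ({codes}).")
    return "\n".join(lines)


# Hypothetical three-variable catalog used only for this example.
catalog = [
    Variable("AGE", "Age at interview", "Demographics", 1, 2, missing=(99,)),
    Variable("RACE", "Respondent's race/ethnicity", "Race/Ethnicity", 3, 3,
             {1: "White", 2: "Black", 3: "Other"}, (9,)),
    Variable("MARSTAT", "Marital status", "Marriage and Cohabitation", 4, 4,
             {1: "Married", 2: "Not married"}),
]

if __name__ == "__main__":
    subset = search(catalog, topic="Race/Ethnicity")
    print(spss_syntax(subset, "extract.dat"))
```

Keeping the search step separate from the syntax step mirrors the description above: the list of selected variables is the only thing passed between the two.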
Cost Estimates
The cost of data
preparation is determined when services are discussed.
The cost of file preparation is calculated by considering the following
elements: the number of variables in the dataset; the file complexity;
the level of archiving desired (basic or comprehensive); whether existing
documentation such as codebooks, instruments, and methodological descriptions
is sufficient; and whether supporting documentation must be digitized
or scanned.
Contact Information
For further information
or to confirm or request an estimate, contact:
Sociometrics
650.949.3282
650.949.3299 (fax)
socio@socio.com
Sociometrics Corporation
170 State Street, Suite 260
Los Altos, California 94022