2/6/06
Review Questions [based on Lab, Lecture, & Reader]
(A) Differentiate between measurement error and processing error.
(B) List four types of data entry errors.
(C) List four methods that can be used to mitigate data entry
problems.
(D) What is the function of an EpiData QES file?
(E) What (filename) extension is used to identify an EpiData data file?
(F) What type of information is contained in code books? (Be specific.)
(G) Why should you keep backup copies of data files off site?
(H) List elements of data management.
(I) What extension is used to identify a permanent SPSS data file?
(J) Describe the nature of flat text (.txt) data files.
(K) What extension is used to identify SPSS syntax (command) files?
(L) What is "controlled data entry"?
(M) What is the most fool-proof method of creating a variable name in an
EpiData QES file?
(N) What is the maximum number of characters for an EpiData variable name?
. . . for an SPSS variable name?
(O) Suppose you need to store data with values that range from -9 to 9. What
EpiData variable code would you use to create this variable?
(P) Identify two types of data controls created with CHK files.
(Q) When doing double entry and validation, why is it best to use
separate data entry people for your two files?
(R) What does it mean when a line in an SPSS syntax file begins with an "*"?
Print a hard copy of your validation report, and keep backup copies of your files.
19.1 Western Collaborative Group Study (WCGS). Data are a subset of a dataset from the WCGS on cardiovascular risk factors as reported by Selvin (1991, p. 4). The data are available by clicking here. Use the file naming convention wcgs*.* to name files. Save files to your hard drive with backup to a floppy. Use the following variable names and codes for your data:
VarName | Type | Length | Code | Description (Use labels for pre-coded data) |
ID | numeric | 2 | ## | identification number (as specified) |
CHOL | numeric | 3 | ### | serum cholesterol |
BEHAV | text | 1 | <A> | behavior type |
19.2 Hospital duration data (HDUR). Data are from a study by Townsend et al. (1979) that looked at antibiotic utilization in general hospitals in Pennsylvania. A sample of these data is reported in Rosner (1990, p. 36) and is available by clicking here. Print the data table in the link and create data files for these data. Use the naming convention lastname_hdur*.* for each file (e.g., gerstman_hdur.qes). The files should create the following variables and labels.
VarName | Type | Length | Code | Description |
ID | numeric | 3 | ### | identification number (as specified) |
DUR | numeric | 2 | ## | duration of hospitalization |
AGE | numeric | 2 | ## | age |
SEX | numeric | 1 | # | sex. Value labels: 1 = male, 2 = female |
TEMP | numeric | 5 | ###.# | maximum body temp (degrees F) |
WBC | numeric | 2 | ## | white blood cell count (x100 per dL) |
AB | numeric | 1 | # | in-hospital antibiotic use. Value labels: 1 = yes, 2 = no |
CULT | numeric | 1 | # | whether a blood culture was taken. Value labels: 1 = yes, 2 = no |
SERV | numeric | 1 | # | admitting service. Value labels: 1 = medical, 2 = surgical |
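A code book like the one above can also be checked mechanically. The following is a minimal Python sketch, not EpiData's own CHK mechanism: it assumes that each `#` in a field code allows one digit position and that `.` is a literal decimal point. The names `HDUR_SPEC`, `code_to_regex`, and `check_value` are hypothetical, and the values passed in are made up for illustration rather than taken from the Rosner data.

```python
import re

def code_to_regex(code):
    """Translate a numeric field code such as '###.#' into a regex.

    Assumption: each '#' permits one digit position, and a '.' in the
    code requires a literal decimal point in the entered value.
    """
    int_part, _, frac_part = code.partition(".")
    pattern = r"\d{1,%d}" % len(int_part)
    if frac_part:
        pattern += r"\.\d{1,%d}" % len(frac_part)
    return re.compile(pattern)

# Excerpt of the HDUR code book: name -> (field code, allowed values or None)
HDUR_SPEC = {
    "DUR": ("##", None),
    "SEX": ("#", {1, 2}),
    "TEMP": ("###.#", None),
    "AB": ("#", {1, 2}),
}

def check_value(name, raw):
    """Return True if the raw entry fits the field code and value labels."""
    code, allowed = HDUR_SPEC[name]
    if not code_to_regex(code).fullmatch(raw):
        return False  # too wide, non-numeric, or missing decimal point
    if allowed is not None and int(raw) not in allowed:
        return False  # numeric, but not one of the labelled codes
    return True

# Illustrative (made-up) entries
for name, raw in [("TEMP", "98.6"), ("SEX", "3"), ("DUR", "123")]:
    print(name, raw, check_value(name, raw))
```

Checks of this kind catch out-of-range and over-length entries, but they cannot catch a legal value that was simply typed wrong.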
19.3 Cerebellar toxicity data, sample (TOX-SAMP). Data are the first 20 records from the toxicity study by Jolson et al. (1992). Click here for the data listing. See the HS267 Lab Manual for detailed instructions on how to create, check, validate, document, export, and import these data. Use the following variable names and codes for your data:
VarName | Type | Length | Code | Units and Codes |
ID | numeric | 5 | <IDNUM> | identification number (applied automatically) |
AGE | numeric | 2 | ## | age |
SEX | numeric | 1 | # | Sex |
MANUF | text | 1 | <A> | Drug manufacturer |
DIAG | numeric | 1 | # | Diagnosis (type of cancer) |
STAGE | numeric | 1 | # | Clinical stage: |
TOX | logical | 1 | # | cerebellar toxicity |
DOSE | numeric | 4 | ##.# | drug dosage |
SCR | numeric | 3 | #.# | serum creatinine |
WEIGHT | numeric | 3 | ### | body weight |
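The validation step mentioned in this exercise compares two independently entered copies of the same records. The Python sketch below illustrates the idea only; it is not EpiData's validation routine, the function name `compare_entries` is hypothetical, and the records shown are invented rather than taken from the Jolson et al. data.

```python
def compare_entries(first, second, key="ID"):
    """List discrepancies between two data-entry passes.

    Each pass is a list of dicts keyed by variable name. Returns tuples of
    (id, field, value_in_first, value_in_second); field is None when a
    whole record is missing from one pass.
    """
    second_by_id = {rec[key]: rec for rec in second}
    first_ids = {rec[key] for rec in first}
    diffs = []
    for rec in first:
        other = second_by_id.get(rec[key])
        if other is None:
            diffs.append((rec[key], None, "present", "missing"))
            continue
        for field, value in rec.items():
            if other.get(field) != value:
                diffs.append((rec[key], field, value, other.get(field)))
    # Records entered only in the second pass
    for rec in second:
        if rec[key] not in first_ids:
            diffs.append((rec[key], None, "missing", "present"))
    return diffs

# Made-up records for illustration
pass_one = [{"ID": 1, "AGE": 45, "SEX": 1}, {"ID": 2, "AGE": 60, "SEX": 2}]
pass_two = [{"ID": 1, "AGE": 54, "SEX": 1}, {"ID": 3, "AGE": 30, "SEX": 2}]
for diff in compare_entries(pass_one, pass_two):
    print(diff)
```

Each reported discrepancy still has to be resolved against the original source document; the comparison only shows where the two passes disagree.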
Key to Odd Numbered Problems Key to Even Numbered Problems (may not be posted)