Test 16 - Sample Dates
Purpose
Quickly summarize the distribution of sample dates in the test cohort. Useful to check temporal coverage and spot data collection gaps.
Required Inputs
WORK_DIR: working directory containing ${WORK_DIR}/Samples/3.test_cohort.samples
How to Run
What This Test Does
- Reads ${WORK_DIR}/Samples/3.test_cohort.samples and extracts the sample date (the script converts the sample time to a year/month integer by int(time/100)).
- Produces a count of samples per date and writes ${WORK_DIR}/samples_stats/by_date.tsv.
- Generates an HTML plot, ${WORK_DIR}/samples_stats/by_date.html, using plot.py and a local HTML template.
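The counting step above can be approximated in a few lines of Python. This is only a sketch, not the test's actual script: it assumes the samples file is tab-separated, has a header row containing a column named time, and stores time as a YYYYMMDD integer. The function name count_by_month is hypothetical.

```python
from collections import Counter


def count_by_month(lines):
    """Count samples per YYYYMM bucket.

    Assumptions (not confirmed by the test script): tab-separated input,
    a header row with a column named "time", and YYYYMMDD integer values.
    """
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    time_idx = header.index("time")  # assumed column name
    counts = Counter()
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= time_idx or not fields[time_idx]:
            continue  # skip blank or truncated lines
        # Integer division by 100 drops the day part, matching
        # int(time/100) for positive YYYYMMDD values.
        counts[int(fields[time_idx]) // 100] += 1
    return sorted(counts.items())


# Illustrative rows only; real files may carry more columns.
sample = [
    "EVENT_FIELDS\tid\ttime\toutcome",
    "SAMPLE\t1\t20200115\t0",
    "SAMPLE\t2\t20200120\t1",
    "SAMPLE\t3\t20200301\t0",
]
for month, n in count_by_month(sample):
    print(f"{month}\t{n}")
```

Passing an open file handle instead of the sample list works the same way, since the function only iterates over lines.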
Output Location
- ${WORK_DIR}/samples_stats/by_date.tsv
- ${WORK_DIR}/samples_stats/by_date.html
How to Interpret Results
- Use the TSV to see the number of samples collected per month (or per date unit). Look for irregularities such as sudden drops or spikes indicating data collection problems or cohort selection artifacts.
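As a rough aid for this check, the sketch below flags months with no samples and month-to-month changes larger than a chosen ratio. It is an illustration under assumptions: the counts are supplied as a dict keyed by YYYYMM integers (how to load them depends on the by_date.tsv layout, which is not specified here), the names month_range and flag_irregular_months are hypothetical, and the 0.5 ratio is an arbitrary starting threshold rather than a recommended value.

```python
def month_range(first, last):
    """Yield every YYYYMM integer from first to last, inclusive."""
    year, month = divmod(first, 100)
    while year * 100 + month <= last:
        yield year * 100 + month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def flag_irregular_months(counts, ratio=0.5):
    """Return (missing, jumps) for a dict of YYYYMM -> sample count.

    missing: months inside the covered range that have no samples.
    jumps:   (prev, cur, prev_count, cur_count) for adjacent months whose
             count fell below ratio * prev or rose above prev / ratio.
    The ratio is an assumed threshold; tune it for the cohort at hand.
    """
    if not counts:
        return [], []
    months = list(month_range(min(counts), max(counts)))
    missing = [m for m in months if m not in counts]
    jumps = []
    for prev, cur in zip(months, months[1:]):
        a, b = counts.get(prev, 0), counts.get(cur, 0)
        # Gaps are already reported in `missing`, so only compare
        # pairs where both months have samples.
        if a > 0 and b > 0 and (b < a * ratio or b > a / ratio):
            jumps.append((prev, cur, a, b))
    return missing, jumps


# Made-up counts for illustration only.
example_counts = {202011: 100, 202012: 30, 202102: 35}
missing, jumps = flag_irregular_months(example_counts)
print("missing months:", missing)
print("large changes:", jumps)
```

Flagged months are only candidates for review; a drop can reflect a genuine change in the cohort rather than a data collection problem.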
Troubleshooting
- Missing ${WORK_DIR}/Samples/3.test_cohort.samples: ensure Test 03 ran successfully.
- plot.py failures: ensure the plot template ${WORK_DIR}/tmp/plotly_graph.html exists or update the command to point to another template.
Files to inspect
- ${WORK_DIR}/samples_stats/by_date.tsv
- ${WORK_DIR}/samples_stats/by_date.html