5. Add, View, and Delete Data Files via UI
5.1. Add, View, and Delete Data Files
A C(DCC) Administrator can add data files to a study in any status. First, log in to the Data Hub and, from the table, select the name of the study to which you wish to upload data files. The subsequent steps differ slightly depending on the study status.
5.1.1. Adding Files to a Draft Study
Scroll to the bottom of the Study Details page and select the “Save & Continue” button. If the Study Details page has incomplete information and the “Proceed with Editing?” pop-up window appears, select the “Yes, Continue” button.
On the Add Study Data Files page, select the “Upload Study Data Files” button.
Browse for any study data files.
5.1.2. Adding Files to a Submitted or Approved Study
On the Study Overview tab, select the Data Files tab.
Click on the “Add Data Files” button on the right side of the page.
Browse for any study data files.
5.1.3. Add Data Files to a Rejected Study
C(DCC) Administrators must first convert a Rejected study to a Draft study to add data files. On the Study Overview tab, select the “Edit & Resubmit Study” button.
A pop-up window will appear warning the user that the study’s status will update to “Draft”, requiring resubmission. Select the “Yes, Continue” button.
Add files to the Draft study following the steps previously outlined.
5.1.4. Delete Data Files from a Study
Data files can be deleted from a study if they have not yet been shared with a Researcher. This means that data files in the Draft, Pending Approval, Rejected, Rejected Pending Confirmation, or Approved Pending Confirmation status can be deleted at any time.
For the data file that you wish to delete, click the trash icon under the “Action” column.
A confirmation pop-up window will appear prompting the C(DCC) Administrator to confirm their action. The pop-up will also indicate that any metadata file associated with this data file will be deleted as well. Click the “Delete” button.
The data file and metadata file (if it exists) are now deleted.
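The deletion rule in this section can be sketched as a simple status check. This is only an illustration of the rule as stated above, not the Data Hub's implementation; the names `DELETABLE_STATUSES` and `can_delete` are hypothetical.

```python
# Statuses listed in this section as deletable, i.e. the file has not
# yet been shared with a Researcher. Names here are hypothetical.
DELETABLE_STATUSES = {
    "Draft",
    "Pending Approval",
    "Rejected",
    "Rejected Pending Confirmation",
    "Approved Pending Confirmation",
}


def can_delete(status: str) -> bool:
    """Return True if a data file in this status may be deleted."""
    return status in DELETABLE_STATUSES
```

For example, `can_delete("Draft")` is true, while `can_delete("Approved")` is false because an Approved file has already been shared.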
5.2. Submit Data Files
When new data files are added to a study, they will be in the “Draft” status. These files will not be visible to the Data Hub Data Administrators for review until they are submitted.
To submit files, select the “Submit Files” button.
A list of Draft files that will be submitted to the Data Hub Data Administrator will appear in a table. This list includes both data and metadata files.
Scroll down to the confirmation statement to agree and sign. Select the “Submit Files” button.
The “Submit Files” button is enabled only if there are files in the “Draft” status; otherwise, it is disabled and grayed out.
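The submission behavior described above can be sketched as follows. This is a minimal illustration under assumptions: the `StudyFile` record and both function names are hypothetical and do not describe the Data Hub's internals.

```python
from dataclasses import dataclass


@dataclass
class StudyFile:
    # Hypothetical record for illustration only.
    name: str
    kind: str    # "data" or "metadata"
    status: str  # e.g. "Draft", "Pending Approval", "Approved"


def files_to_submit(files):
    """List the Draft files (data and metadata alike) that would be submitted."""
    return [f for f in files if f.status == "Draft"]


def submit_enabled(files):
    """The "Submit Files" button is enabled only when Draft files exist."""
    return bool(files_to_submit(files))
```

With one Draft data file, one Draft metadata file, and one Approved file, `files_to_submit` returns the two Draft files and `submit_enabled` is true; with no Draft files it is false.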
Note: The data and metadata files go through their own separate file validation and PHI/PII validation processes. If the file passes all validations, the file is automatically moved to the Approved tab.
5.3. How to Handle Duplicate Files
When a Data Coordinating Center C(DCC) Administrator uploads a file with the name of an existing file, a pop-up message will appear asking whether they wish to overwrite the existing file or create a new version of it. If the C(DCC) Administrator clicks “Overwrite”, the system will replace the existing file with the new file. The new file will move back to the “Draft” status so that it can be submitted to the Data Hub Data Administrator for review. If the C(DCC) Administrator clicks “Create New Version”, the system will create a new version of the file while keeping the existing version. There will also be an option to include version comments noting the differences between the older and newer versions. These comments will also appear on the Data Files tab and can be edited later by a C(DCC) Administrator in the same program as the C(DCC) Administrator who uploaded the file.
The available “Overwrite” and “Create New Version” options depend on the status of the current file, as shown in the table below:
Current File Status | Overwrite | New Version |
---|---|---|
Draft | Yes | |
Pending Approval | Yes | |
Approved - Pending Confirmation | Yes | Yes |
Approved | | Yes |
Rejected - Pending Confirmation | Yes | |
Rejected | Yes | |
Note: When a data file is uploaded to the Data Hub, the system automatically renames the file to include the version number in the data file name. For example, if “COVID_transformcopy.csv” is uploaded to the Data Hub for the first time or has been overwritten, the Data Hub will update the file name to “COVID_transformcopy_v1.csv”.
To avoid confusion, C(DCC) users should not include their own version nomenclature in file names. The Data Hub will automatically handle data file versioning.
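The renaming convention in the note above can be illustrated with a short sketch. It assumes the version suffix is inserted between the base name and the extension, as in the example given; the function name `versioned_name` is hypothetical.

```python
from pathlib import Path


def versioned_name(file_name: str, version: int) -> str:
    """Insert "_v<version>" before the file extension (illustrative only)."""
    path = Path(file_name)
    return f"{path.stem}_v{version}{path.suffix}"
```

Under this assumption, `versioned_name("COVID_transformcopy.csv", 1)` yields `COVID_transformcopy_v1.csv`, matching the example in the note.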
Data files in the “Approved” status cannot be overwritten since these files have already been shared with the Researchers.
Instead, when a C(DCC) Administrator uploads a duplicate file for a data file in the “Approved - Pending Confirmation” or “Approved” status, the system will allow the user to create a new version of the file. The system will append the version number to the end of the file name, and the new file will be in the “Draft” status for the C(DCC) Administrator to submit.
Another example below shows a file in the “Pending Approval” status. When a duplicate file is uploaded, the C(DCC) Administrator has the option to “Overwrite” the existing file and include version comments.
The screenshots below show that the version comments are visible on the Data Files tab and can be edited later by a C(DCC) Administrator in the same program as the C(DCC) Administrator who uploaded the respective file.
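The duplicate-handling rules in this section can be summarized as a lookup from the current file's status to the options offered. The sketch below is an illustration under assumptions, not the Data Hub's code: the names are hypothetical, and the “Approved” entry follows the statement above that Approved files cannot be overwritten and instead receive a new version.

```python
# Options offered per current file status (hypothetical structure).
ALLOWED_ACTIONS = {
    "Draft": {"overwrite"},
    "Pending Approval": {"overwrite"},
    "Approved - Pending Confirmation": {"overwrite", "new_version"},
    "Approved": {"new_version"},  # already shared, so no overwrite
    "Rejected - Pending Confirmation": {"overwrite"},
    "Rejected": {"overwrite"},
}


def resolve_duplicate(current_status: str, choice: str):
    """Return (existing_file_kept, status_of_new_file) for a duplicate upload.

    Raises ValueError if the choice is not offered for this status.
    """
    if choice not in ALLOWED_ACTIONS.get(current_status, set()):
        raise ValueError(f"{choice!r} is not offered for status {current_status!r}")
    # Overwrite replaces the old file; a new version keeps it.
    existing_file_kept = choice == "new_version"
    # In both cases the uploaded file starts in Draft and must be submitted.
    return existing_file_kept, "Draft"
```

For instance, `resolve_duplicate("Pending Approval", "overwrite")` returns `(False, "Draft")`, while requesting `"overwrite"` for an Approved file raises an error.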
5.4. CDE and PII/PHI File Validation
The Data Hub will run two types of validation when data files are uploaded to the system: CDE validation and PII/PHI validation. CDE validation will be performed on harmonized files only (file names ending in …transformcopy.csv). PII/PHI validation will be performed on all data files uploaded to the Data Hub.
The CDE validation will check that the harmonized data file contains all the CDE column headers as defined in the global codebook for each program. The CDE validation will also check that the data values for each of the Tier 1 CDEs match the predefined values in the global codebook. If the data file is missing any CDE column headers or if any of the data values do not match the expected values, the file will be flagged as non-CDE-compliant.
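The two CDE checks described above (header presence and permitted values) can be sketched as follows. This is a minimal illustration, not the Data Hub's validator: the codebook is assumed to be a dictionary mapping each CDE column header to its set of permitted values (or `None` where values are unrestricted), data rows are assumed to be numbered from 1, and the column names used in the example are hypothetical.

```python
import csv
import io


def validate_cde(csv_text, codebook):
    """Return (missing_headers, value_errors) for a harmonized CSV file.

    codebook: dict of CDE column header -> set of permitted values,
              or None if any value is accepted (assumed format).
    value_errors: list of (row_number, column, input_value).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    # Header check: every CDE column in the codebook must be present.
    missing = [column for column in codebook if column not in headers]
    errors = []
    # Data check: values must match the permitted values in the codebook.
    for row_number, row in enumerate(reader, start=1):
        for column, permitted in codebook.items():
            if column in missing or permitted is None:
                continue
            if row[column] not in permitted:
                errors.append((row_number, column, row[column]))
    return missing, errors


def is_cde_compliant(csv_text, codebook):
    """A file is compliant only if both checks find nothing."""
    missing, errors = validate_cde(csv_text, codebook)
    return not missing and not errors
```

A file that lacks a codebook column, or that contains a value outside the permitted set, would be reported by `validate_cde` and treated as non-CDE-compliant by `is_cde_compliant`.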
The Data Hub will also check that all files uploaded do not contain PII/PHI. If the system detects that the file contains PII/PHI, the data file will be flagged as non-PII/PHI-compliant.
Note: An hourglass icon will appear while the system is running the validation. For files where CDE validation is not applicable, a gray null icon will appear.
If data files are flagged, users can view the CDE and PII/PHI validation analysis by selecting the corresponding “View” button under the CDE Header, CDE Data, or PII/PHI columns.
CDE Header validation analysis:
CDE Data validation analysis:
In the CDE Data errors view, only 300 errors are shown by default. The top table lists the columns containing errors along with the error count for each column. The bottom table shows the row number, column name, input value, and valid input options for each error in the data file.
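The two tables in the CDE Data errors view can be sketched as a per-column count plus a capped detail list. This is an illustration under assumptions: the function and field names are hypothetical, the sketch assumes the per-column counts cover all errors while the detail list is capped at the first 300, and the actual Data Hub behavior may differ.

```python
from collections import Counter

MAX_ERRORS_SHOWN = 300  # default cap described above


def summarize_errors(errors, valid_options, limit=MAX_ERRORS_SHOWN):
    """Build the two tables of the errors view (illustrative only).

    errors: list of (row_number, column, input_value).
    valid_options: dict of column -> set of permitted values.
    """
    # Top table: error count per column.
    counts = Counter(column for _, column, _ in errors)
    # Bottom table: one entry per error, capped at `limit`.
    detail = [
        {
            "row": row_number,
            "column": column,
            "input": value,
            "valid": sorted(valid_options[column]),
        }
        for row_number, column, value in errors[:limit]
    ]
    return dict(counts), detail
```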
PII/PHI validation analysis:
Note: Failed CDE or PII/PHI validations do not prevent the C(DCC) Administrator from submitting a study. They serve as warnings for the C(DCC) Administrator to review the outputs and decide whether to submit the existing files or replace them with updated files. The Data Hub Data Administrator will see these same validation outputs and may reject the submission based on what these validations find.
5.5. View Data Files
Select the study name that you wish to view from the My Studies page.
If you have selected a Draft study, you will land on the Study Details page. Continue to the Add Study Data Files page by selecting the “Save & Continue” button at the bottom of the page.
If you have selected a Submitted, Approved, or Rejected study, you will land on the Study Overview tab. Select the Data Files tab to see a list of data files that have been uploaded for the study.
Note: If a metadata file exists, you may also view it by clicking the link under the “Metadata File” column. Remember that metadata files are optional.
To view a specific data file, select the data file name under the “File Name” column, which will open the data file viewer window.
The selected file's details will open in a pop-up viewer window. For convenience, the viewer window includes an option to adjust the number of rows of file content displayed.
Note: By default, the file viewer window will display 20 rows of data.
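The row limit in the viewer can be pictured as reading only the first rows of a file. The sketch below is an illustration under assumptions, not the viewer's implementation: it assumes a CSV file whose first line is a header, and the names `DEFAULT_ROWS` and `preview_rows` are hypothetical.

```python
import csv
import io
from itertools import islice

DEFAULT_ROWS = 20  # default number of rows shown, per the note above


def preview_rows(csv_text, n=DEFAULT_ROWS):
    """Return the header and at most the first n data rows of a CSV file."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return header, list(islice(reader, n))
```

Calling `preview_rows(text)` returns up to 20 data rows; passing a different `n` corresponds to adjusting the number of rows in the viewer.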
To navigate through the file list, click the “Next File” or “Previous File” button in the footer.