7. Add Data Files via sFTP
When data files become available for a study, the C(DCC) Administrator must attach those files to the specific study. If a data file exceeds 2 GB, you must upload it using sFTP. For this approach, you will need to prepare a package of files that includes the study’s sFTP key, the study data file(s), and a .yaml file. This package must be compressed into a single .zip file so that it can be uploaded to the Data Hub with an FTP application that supports sFTP, such as FileZilla.
Note: Only files compressed with ZIP compression will be processed. Other compression technologies such as RAR, 7zip, etc. are not supported at this time.
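If you prefer to check these two conditions with a script before you begin, the following is a minimal sketch in Python. The function names are hypothetical, and the threshold assumes that 2 GB means 2 x 1024^3 bytes, which this guide does not specify. The second check only confirms that a file is a ZIP archive (as opposed to RAR or 7zip); it does not confirm that the Data Hub will accept its contents.

```python
import zipfile
from pathlib import Path

# Assumption: "2 GB" is read as 2 * 1024**3 bytes. The guide does not say
# whether the limit is binary or decimal, so adjust this if needed.
SFTP_THRESHOLD_BYTES = 2 * 1024**3


def needs_sftp(path, threshold=SFTP_THRESHOLD_BYTES):
    """Return True if the file at `path` is larger than the threshold."""
    return Path(path).stat().st_size > threshold


def is_zip_archive(path):
    """Return True if the file at `path` is a ZIP archive.

    RAR and 7zip archives return False here.
    """
    return zipfile.is_zipfile(path)
```

For example, needs_sftp("my_data_file.csv") returns True only when that (hypothetical) file is larger than the threshold.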
Step 1. Create a new folder on your desktop to store the sFTP key, data file(s), and .yaml file.
Step 2. Log into the Data Hub as a C(DCC) Administrator and, on the My Studies page, find the study you wish to upload data files to.
Step 3. Scroll right and click the key icon to begin downloading the sFTP key. The sFTP key will contain the study UUID in the file name.
Step 4. Store the sFTP key in the new folder you have created.
Step 5. Click this link to download the .yaml file, and store it in the same location as the sFTP key. This file is a text document template in which you must provide the study UUID, the C(DCC) Administrator email, and the data file name(s).
Step 6. Open the .yaml file and fill in the missing information. For [study_id], copy and paste your study UUID. The UUID appears in the file name of the sFTP key, so you DO NOT need to open the sFTP key. For [files], copy and paste the name(s) of your data file(s). If you have more than one data file, list each file name on a separate row. For [dcc_admin_email], provide the C(DCC) Administrator email address you use to log in to the Data Hub. Note: DO NOT change the formatting of the text document (e.g. do not remove spaces).
To overwrite a file via sFTP, set the YAML “versioning_option” to “overwrite”.
To create a new file version via sFTP, set the YAML “versioning_option” to “version”.
To add versioning comments, enter the comments in the [comment] section of the YAML.
There are also options for study-level versioning and study-level comments: [study_level_versioning_option] and [study_level_comment]. These are applied if the options are not provided for a specific file.
Step 7. Save and close the .yaml file in the same location as your sFTP key and data file(s).
Step 8. Select the sFTP key file, the .yaml file, and the data file(s), and compress them into a single .zip file. You can rename the .zip file as you wish. Note: DO NOT zip the folder containing the files, because the FTP application will not be able to read the content inside the folder. You must zip the documents by selecting the files themselves.
Step 9. Open your FTP client application and enter the following information in the fields:
Host: [sftp.radx-hub.nih.gov]
Port: [22]
Username: [Provisioned by the Data Hub Team]
Password: [Provisioned by the Data Hub Team]
Step 10. Select Quickconnect to connect to the Data Hub server. Your status will update to indicate a successful connection to the server. If your access is denied, contact the Data Hub Team.
Step 11. Find and select the .zip file on your local drive and drag it to the COVID RADx Data Hub folder. An email reporting whether the file upload succeeded or failed will be sent to your inbox.
Step 12. Return to the C(DCC) Administrator Home page (My Studies page) and select the study you’ve added the files to by clicking on the study title. If you’ve selected an Approved, Rejected or Pending Approval study, select the Data Files tab to view the data files. If you’ve selected a Draft study, continue to the Study Data Files step to view the data files (as seen below). The newly added data files will appear as Draft.
Note: Newly added files will be in Draft status and will not be visible to the Data Hub Data Administrator for review until they are submitted.
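If you would rather prepare the package with a script than by hand, the following is a minimal sketch in Python of two points from the steps above: the fallback from file-level to study-level versioning options, and building a .zip file that contains the files themselves rather than their parent folder (Step 8). The function names are hypothetical, and the way the versioning values are passed in is an assumption for illustration; it does not reflect the actual layout of the .yaml template, which you should still edit as described in Step 6.

```python
import zipfile
from pathlib import Path

# The two values named in this guide for "versioning_option".
ALLOWED_VERSIONING = {"overwrite", "version"}


def resolve_versioning(file_option, study_level_option):
    """Return the file-level option if provided, else the study-level one.

    Returns None when neither is provided. This guide does not state
    what happens in that case.
    """
    option = file_option if file_option else study_level_option
    if option is not None and option not in ALLOWED_VERSIONING:
        raise ValueError(f"Unsupported versioning option: {option!r}")
    return option


def build_flat_zip(paths, zip_path):
    """Zip the given files at the top level of the archive (no folder)."""
    zip_path = Path(zip_path)
    seen = set()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            if not path.is_file():
                raise FileNotFoundError(f"Not a file: {path}")
            if path.name in seen:
                raise ValueError(f"Duplicate file name: {path.name}")
            seen.add(path.name)
            # arcname=path.name drops the folder part, so the archive
            # holds only the files, as Step 8 requires.
            archive.write(path, arcname=path.name)
    return zip_path
```

Pass build_flat_zip the paths of the sFTP key, the .yaml file, and each data file, plus the name you want for the .zip file. Because every entry is stored under its bare file name, the resulting archive has no enclosing folder.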
7.1. Add Data Files to Multiple Studies
To add data files to multiple studies, follow the same steps up to Step 8 for each study, keeping each study's files in its own folder. Then, instead of compressing the files inside each study's folder, compress all of the study folders together into a single .zip file.
Files inside the folder PHS000247
Zipped file showing folders
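The multi-study package can also be sketched in Python. The function name below is hypothetical, and the sketch assumes each study folder holds its files directly (no subfolders), as in the PHS000247 example above. Unlike the single-study case, each file is stored under its study folder's name.

```python
import zipfile
from pathlib import Path


def build_multi_study_zip(study_dirs, zip_path):
    """Zip several study folders into one archive, keeping folder names."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for study_dir in map(Path, study_dirs):
            if not study_dir.is_dir():
                raise NotADirectoryError(f"Not a folder: {study_dir}")
            for item in sorted(study_dir.iterdir()):
                if item.is_file():
                    # Store each file as "<folder name>/<file name>".
                    archive.write(item, arcname=f"{study_dir.name}/{item.name}")
    return zip_path
```

Calling build_multi_study_zip with a list of study folders produces one .zip file in which every study's sFTP key, .yaml file, and data file(s) sit inside that study's folder.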