Pacuta SRA uploads to NCBI
Genome submission to NCBI
This post details the NCBI Genome Submission upload for the assembled Acropora pulchra genome. The github for that project is here. More information on genome submission on NCBI can be found here. Below is the following information for this submission.
Overview
- Genome submission: SUB14718394
- Submitting single genome
Submitter
- Jill Ashey; 120 Flagg Road, Kingston RI 02881
General info
- Not associated with existing bioproject or biosamples
- Release date: 2024-09-30
- Assembly date: 2024-07
- Assembly method: Hifiasm, run in JULY 2024
- Assembly name: Apul_v1.1
- Genome coverage: 100
- PacBio sequencing technology
- Only submitting pacbio reads
- Sample is the full genome
- It is the final version - will not be doing re-assembly
- De novo assembly
- Do not automatically trim or remove sequences identified as contamination
- Submission category: original
- Submission title: Acropora pulchra NCBI genome submission
Bioproject general info
Public description (4000 characters): This submission provides the genome assembly of Acropora pulchra, a scleractinian coral, from Moorea, French Polynesia.
Links
- OSF https://osf.io/y8963/
- Github https://github.com/hputnam/Apulchra_genome
Biosample type
- Invertebrate
- Sample name: Acropora_pulchra_JA
- Organism: Acropora pulchra
- Isolate: Coral host
- Isolation source: Coral reef
- collection date: 2022-10-23
- Location: French Polynesia: Moorea
- Tissue: sperm
- Developmental stage: adult
- Broad scale env context: coral reef [ENVO:00000150]
- Sex: hermaphrodite
Files
One or more chromosomes are still in multiple pieces and/or some sequences are not assembled into chromosomes
Command line upload
Make new file with only sequence data
cd /data/putnamlab/jillashey/Apul_Genome/ncbi
ln -s /data/putnamlab/tconn/repeats/apul_softmasked/apul.hifiasm.s55_pa.p_ctg.fa.k32.w100.z1000.ntLink.5rounds.fa.masked
Use FTP command line file upload to provide files. Activate FTP on the command line. Command prompt now changes to ftp>. Immediately login using the username and password from the NCBI submission portal.
ftp -i
open ftp-private.ncbi.nlm.nih.gov
USERNAME
PASSWORD
Go into the folder that NCBI provided from the NCBI submission portal. Make a new directory for the files and go into that folder. Put all the sequences from Andromeda into this new folder (they will still be on Andromeda after transfer is complete).
cd uploads/jillashey_uri.edu_XXXXX
mkdir apul_2024
cd apul_2024
mput *
local: apul.hifiasm.s55_pa.p_ctg.fa.k32.w100.z1000.ntLink.5rounds.fa.masked remote: apul.hifiasm.s55_pa.p_ctg.fa.k32.w100.z1000.ntLink.5rounds.fa.masked
227 Entering Passive Mode (130,14,250,5,196,151).
150 Opening BINARY mode data connection for apul.hifiasm.s55_pa.p_ctg.fa.k32.w100.z1000.ntLink.5rounds.fa.masked
226 Transfer complete
528682350 bytes sent in 7.06 secs (74889.03 Kbytes/sec)
Note: it takes at least 10 minutes for uploaded files to become available for selection within a submission.
Once transfers are complete, click select preload folder in the submission portal. Wait until all files have been uploaded and select the pacuta_2022 folder. Click submit!
Fasta contigs: apul.hifiasm.s55_pa.p_ctg.fa.k32.w100.z1000.ntLink.5rounds.fa.masked
Assignment
- Do any sequences belong to a chromosome? No
- Do any sequences belong to an organelle, eg mitochondrion or chloroplast? No
- Does any sequence belong to a plasmid? No
References
- Sequence Authors: Jill Ashey
- Reference
- Unpublished
- Title: Genome assembly and annotation of Acropora pulchra from Moorea, French Polynesia
- Reference authors: Jill Ashey, Trinity Conn, Ross Cunning, Hollie M. Putnam
Submission
BioProject: Processed
- PRJNA1162071 : Acropora pulchra genome sequencing (TaxID: 140239)
- Locus Tag Prefixes: ACE5DV
BioSample: Processed
- Successfully loaded
- SAMN43800006: Acropora_pulchra_JA (TaxID: 140239)
On 10/25/24, I received an email from NCBI saying that they have received my submission and assigned it an accession number. The submission has passed initial QC checks and will now be manually reviewed by the indexing staff. Once that is complete, the genome will be released immediately.
From NCBI:
We have assigned the following accession number to your submission:
SUBID BioProject BioSample Accession Organism
SUB14718394 PRJNA1162071 SAMN43800006 JBIQNU000000000 Acropora pulchra JA
Please cite the accession number JBIQNU000000000 like this:
This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank
under the accession JBIQNU000000000. The version described
in this paper is version JBIQNU010000000.
While you can cite the BioProject, we recommend you include the WGS
accession(s) to refer to the specific WGS assembly, especially for
BioProjects that include more than one genome assembly. Note that we
prefer not to change the BioProject/BioSample links after the WGS assembly
is released.
Raw PacBio sequencing data submission to NCBI
Gigabyte requires that our raw PacBio data also be stored on NCBI. I’m going to start a new SRA submission and link it to the existing BioProject (PRJNA1162071).
Overview
- Submission: SUB15071615
- Submitting raw PacBio Hifi reads
Submitter
- Jill Ashey; 120 Flagg Road, Kingston RI 02881
General info
- Associated with existing bioproject: PRJNA1162071
- Release date: Immediately
- Not associated with existing biosample
Biosample type
- Invertebrate
- Sample name: Acropora_pulchra_raw_reads_JA
- BioProject accession: PRJNA1162071
- Organism: Acropora pulchra
- Isolate: Coral host
- Isolation source: Coral reef
- collection date: 2022-10-23
- Location: French Polynesia: Moorea
- Tissue: sperm
- Developmental stage: adult
- Broad scale env context: coral reef [ENVO:00000150]
- Sex: hermaphrodite
SRA metadata
- Sample name: Acropora_pulchra_raw_reads_JA
- Library ID: m84100_240128_024355_s2
- Title: PacBio raw HiFi seqences of Acropora pulchra
- Library strategy (drop down options): WGS
- Library source (drop down options): Genomic
- Library selection (drop down options): size fractionation
- Library layout (drop down options): single
- Platform (drop down options): PACBIO_SMRT
- Instrument model (drop down options): Revio
- Design description: DNA was extracted by DNA Sequencing Center at Brigham Young University using the Qiagen Genomic Tip protocol and buffers (Qiagen Cat # 10223). The samples were ethanol (2x) precipitated post column elution, put in the -20°C freezer overnight and then spun for 30 minutes at 14K rcf the following day. Ethanol was removed and the DNA pellets were suspended in low TE buffer. The resulting DNA was cleaned prior to library prep with the PacBio SRE Kit (PacBio Cat # 102-208-300) to remove fragments under 25kb. Following extraction, DNA was sheared to ~17kb using a Diagenode Megaruptor (Diagenode Cat # B06010003) and checked on an Agilent Femto Pulse system (Agilent Part # M5330AA) to assess size. The DNA was then cleaned and concentrated post-shearing using a 1x AMPure bead cleaning (AMPure Cat #20805800). The DNA was then put into a library using the PacBio SMRTbell prep kit 3.0 (PacBio Cat # 102-141-700), following the instructions provided with the kit. The final sizing of the library was performed using the 35\% v/v dilution of AMPure PB beads (AMPure Part # 100-265-900). The single SMRTbell library was then sequenced using one 8M SMRT Revio Cell, and run for 29 hours on a PacBio Revio sequencer. Consensus accuracy circular consensus sequencing (CCS) processing was used to generate HiFi reads. – from methods section of paper
- File type: bam
- Reference assembly: unaligned
- File name: m84100_240128_024355_s2.hifi_reads.bc1029.bam
Files
Command line upload
Make new file with only sequence data
cd /data/putnamlab/jillashey/Apul_Genome/ncbi
ln -s /data/putnamlab/KITT/hputnam/20240129_Apulchra_Genome_LongRead/m84100_240128_024355_s2.hifi_reads.bc1029.bam
Use FTP command line file upload to provide files. Activate FTP on the command line. Command prompt now changes to ftp>. Immediately login using the username and password from the NCBI submission portal.
ftp -i
open ftp-private.ncbi.nlm.nih.gov
USERNAME
PASSWORD
Go into the folder that NCBI provided from the NCBI submission portal. Make a new directory for the files and go into that folder. Put all the sequences from Andromeda into this new folder (they will still be on Andromeda after transfer is complete).
cd uploads/jillashey_uri.edu_XXXXX
mkdir apul_raw_bam_2024
cd apul_raw_bam_2024
mput m84100_240128_024355_s2.hifi_reads.bc1029.bam
local: m84100_240128_024355_s2.hifi_reads.bc1029.bam remote: m84100_240128_024355_s2.hifi_reads.bc1029.bam
227 Entering Passive Mode (130,14,250,6,195,171).
150 Opening BINARY mode data connection for m84100_240128_024355_s2.hifi_reads.bc1029.bam
226 Transfer complete
38666252812 bytes sent in 536 secs (72146.58 Kbytes/sec)
Submit!
SAMN46708315 - assigned biosample number. SRA currently processing submission as of 2/6/24. Downloaded attributes file and will put on github repo.