<?xml version="1.0" encoding="UTF-8"?>
<eml:eml packageId="pmeirelles.11.1" system="knb" xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 eml.xsd"> <access authSystem="knb" order="allowFirst"> <allow><principal>public</principal>
<permission>read</permission>
</allow>
</access>
 <dataset> <title>Microbial community diversity and physical–chemical features of the Southwestern Atlantic Ocean</title>
 <creator id="1414428706618"> <individualName><salutation>Prof.</salutation>
<givenName>Fabiano</givenName>
<surName>Thompson</surName>
</individualName>
<organizationName>Federal University of Rio de Janeiro</organizationName>
<positionName>Assistant Professor</positionName>
<address><deliveryPoint>Av. Carlos Chagas Fo. S/N</deliveryPoint>
<deliveryPoint>CCS - IB -  Laboratory of Microbiology e SAGE-COPPE - BLOCO A (Anexo) A3 - sl 102</deliveryPoint>
<city>Rio de Janeiro</city>
<administrativeArea>RJ</administrativeArea>
<postalCode>21941-599</postalCode>
<country>Brazil</country>
</address>
<phone phonetype="voice">+55 21 3938-6567</phone>
<phone phonetype="fax">+55 21 3938-6567</phone>
<onlineUrl>http://www.microbiologia.biologia.ufrj.br</onlineUrl>
</creator>
 <abstract><para>Microbial oceanography studies have demonstrated the central role of microbes in the functioning and nutrient cycling of the global ocean. Most previous studies, including those in the Southwestern Atlantic Ocean (SAO), focused on surface seawater and benthic organisms (e.g., coral reefs and sponges). This is the first metagenomic study of the SAO. The SAO harbors great microbial diversity and marine life (e.g., coral reefs and rhodolith beds). The aim of this study was to characterize the microbial community diversity of the SAO along the depth continuum and across different water masses by means of metagenomic, physical–chemical and biological analyses. Microbial community abundance and diversity appear to be strongly influenced by temperature, dissolved organic carbon and depth, and three groups were defined according to the microbial composition: (1) surface waters, (2) sub-superficial chlorophyll maximum (SCM) (48–82 m) and (3) deep waters (236–1,200 m). The microbial communities of the deep water masses (South Atlantic Central Water, Antarctic Intermediate Water and Upper Circumpolar Deep Water) are highly similar. Of the 421,418 predicted genes for the SAO metagenomes, 36.7 % had no homologous hits against 17,451,486 sequences from the North Atlantic, South Atlantic, North Pacific, South Pacific and Indian Oceans. Of these genes unique to the SAO, only 6.64 % had hits against the NCBI non-redundant protein database. SAO microbial communities share genes with the global ocean in at least 70 cellular functions; however, more than a third of the predicted SAO genes represent a unique gene pool in the global ocean. This study was the first attempt to characterize the taxonomic and functional community diversity of different water masses in the SAO and to compare it with the microbial community diversity of the global ocean; the SAO had a significant portion of endemic gene diversity.
Microbial communities of deep water masses (236–1,200 m) are highly similar, suggesting that these water masses have very similar microbiological attributes, despite the common knowledge that water masses determine the prokaryotic community and are barriers to microbial dispersal. The present study also shows that the SCM is a clearly differentiated layer within Tropical waters, with a higher abundance of phototrophic microbes and higher microbial diversity.</para>
</abstract>
<keywordSet><keyword>Metagenomics</keyword>
<keyword>South Atlantic Ocean</keyword>
<keyword>Microbial diversity</keyword>
<keyword>Water mass</keyword>
</keywordSet>
<coverage><geographicCoverage><geographicDescription>Southwestern Atlantic Ocean</geographicDescription>
<boundingCoordinates><westBoundingCoordinate>-43.375</westBoundingCoordinate>
<eastBoundingCoordinate>-22.5</eastBoundingCoordinate>
<northBoundingCoordinate>-1.375</northBoundingCoordinate>
<southBoundingCoordinate>-32.25</southBoundingCoordinate>
</boundingCoordinates>
</geographicCoverage>
</coverage>
<contact id="1414428722469"><individualName><salutation>MSc</salutation>
<givenName>Pedro</givenName>
<surName>Meirelles</surName>
</individualName>
<organizationName>Federal University of Rio de Janeiro</organizationName>
<positionName>PhD Student</positionName>
<address><deliveryPoint>Av. Carlos Chagas Fo. S/N</deliveryPoint>
<deliveryPoint>CCS - IB -  Laboratory of Microbiology e SAGE-COPPE - BLOCO A (Anexo) A3 - sl 102, Cidade Universitária</deliveryPoint>
<city>Rio de Janeiro</city>
<administrativeArea>RJ</administrativeArea>
<postalCode>21941-599</postalCode>
<country>Brazil</country>
</address>
<phone phonetype="voice">+55 21 2562-6567</phone>
<phone phonetype="fax">+55 21 2562-6567</phone>
<electronicMailAddress>pedrommeirelles@gmail.com</electronicMailAddress>
<onlineUrl>http://pedromeirelles.com.br</onlineUrl>
</contact>
<methods><methodStep><description><section><title>Samples description</title>
<para>The 28 water samples were collected using Niskin bottles on a rosette with a coupled CTD at 12 different sites in the SAO between 26 August 2010 and 5 February 2011 (Table 1; Fig. 1). We obtained eight samples at the Abrolhos platform break (2 replicates of samples OS8, OS10, OS11) on board the RV Seward Johnson, four samples at the Campos Basin (2 replicates of each one) on board the RV Gyre and seven samples along the Northeast of the Brazilian oceanic realm on board the RV Antares of the Brazilian Navy. The map in Fig. 1 was drawn with the Matplotlib Basemap Toolkit (Hunter 2007; The Matplotlib Basemap Toolkit user’s guide). Some of the samples used in this study were collected within the framework of the “Prediction and Research Moored Array in the Tropical Atlantic” project (PIRATA) (Servain et al. 1998) and some within the “Habitats Project—Campos Basin Environmental Heterogeneity.”</para>
<para>Two water samples were collected at each site according to the major peak of fluorescence (CTD: SBE 9, Sea-Bird Electronics Inc.) and from the bottom at the Abrolhos platform break. The superficial chlorophyll maximum (SCM) was determined by CTD before sampling. Five different depths were sampled at the Campos Basin, at the core of the water masses Tropical Water (TW), South Atlantic Central Water (SACW), Antarctic Intermediate Water (AAIW) and Upper Circumpolar Deep Water (UCDW). The core of each water mass was determined using Optimum Multiparameter Analysis (OMP) (Tomczak 1981). The other samples were assigned to water masses on the basis of temperature and salinity. Five samples were collected from superficial water, and two samples were collected from deep water, at two selected sites in the Northeast of Brazil.</para>
<para>In this portion of the Atlantic Ocean, the physical structure of the water column is dominated at the surface by the warm, saline and nutrient-depleted Tropical Water (TW) carried by the Brazil Current (Stramma and England 1999), also described as a maximum salinity water by Mémery et al. (2000). Below that, the cold and nutrient-rich South Atlantic Central Water (SACW) flows in the pycnocline region (Stramma and England 1999), with its core at an average depth of 300 m and a large variation in temperature and salinity. The Antarctic Intermediate Water (AAIW) has its core at an average depth of around 800 m and is relatively cold, less salty and more oxygenated than the other water masses (Stramma and England 1999). The less oxygenated and nutrient-rich Upper Circumpolar Deep Water (UCDW) flows below, with its core at an average depth of 1,250 m (Stramma and England 1999).</para>
</section>
</description>
</methodStep>
<methodStep><description><section><title>Physical–chemical and biological analyses</title>
<para>All environmental parameters were analyzed by standard oceanographic methods. At least three replicates were analyzed for each parameter. Temperature and salinity were measured with the CTD. Chlorophyll a analyses were performed after vacuum filtration (max 0.20 cm of Hg) of 2 L of water. The filters (cellulose Millipore HAWP) were extracted overnight in 90 % acetone at 4 °C and analyzed by fluorimetry. Inorganic nutrients were also analyzed (Grasshoff et al. 1999): (1) ammonia by indophenol, (2) nitrite by diazotization, (3) nitrate by reduction in a Cd–Cu column followed by diazotization, (4) total nitrogen by digestion with potassium persulfate followed by nitrate determination, (5) orthophosphate by reaction with ascorbic acid, (6) total phosphorus by acid digestion to phosphate and (7) silicate by reaction with molybdate. Dissolved (DOC) and particulate (POC) organic carbon were analyzed as described previously (Rezende et al. 2010).</para>
</section>
</description>
</methodStep>
<methodStep><description><section><title>Microbial abundance</title>
<para>Microbial abundance was determined in three replicates of seawater by flow cytometry with SYBR Green (Life Technologies), following Andrade et al. (2003) with minor modifications.</para>
</section>
</description>
</methodStep>
<methodStep><description><section><title>Seawater metagenomic DNA extraction</title>
<para>Collected seawater was first filtered by gravity through nets of 100 and 20 μm. The pre-filtered water was then filtered through Sterivex filters (0.22 μm) by positive pressure. In total, 4 L of seawater was filtered through each Sterivex filter. The microbes collected on the Sterivex filters were preserved with SET buffer (20 % sucrose, 50 mM EDTA and 0.5 mM Tris–HCl). Metagenomic DNA extraction was performed using lysozyme (1 mg/ml final concentration) for 1 h at 37 °C as previously described (Bruce et al. 2012). Then, proteinase K (final concentration 0.2 mg/ml) and sodium dodecyl sulfate (SDS) (final concentration 1 %) were added, and the mixture was incubated at 55 °C with gentle agitation for 60 min. The lysate was rinsed into a new tube with 1 ml of SET buffer. Organic extraction was performed with one volume of phenol:chloroform:isoamyl alcohol (25:24:1) to clean up the metagenomic DNA. The DNA was precipitated with ethanol and 3 M sodium acetate (0.3 M final) at −20 °C overnight. One metagenome library was prepared for each Sterivex filter and subsequently pyrosequenced.</para>
</section>
</description>
</methodStep>
<methodStep><description><section><title>Pyrosequencing</title>
<para>Metagenome sequencing was performed by 454 pyrosequencing on a 454 GS Junior instrument (Margulies et al. 2006). Shotgun libraries were generated with 500 ng of the whole metagenome samples sheared into fragments by nebulization. End-repair and adaptor ligation were performed using the GS FLX Titanium kit (Roche). Quality control and quantification were done with an Agilent 2100 Bioanalyzer (Agilent Technologies) and a TBS 380 Fluorometer (Turner Biosystems), respectively. After library construction, approximately 10⁶ molecules of each metagenome were denatured and amplified by emulsion PCR.</para>
</section>
</description>
</methodStep>
<methodStep><description><section><title>Metagenomic data analysis</title>
<para>Raw data were processed to exclude duplicate, low-quality and short sequences (&lt;100 bp) using PrinSeq and possible contaminations (e.g., human sequences) using DeconSeq (Schmieder and Edwards 2011a, b). Sequence assignment was conducted on the MG-RAST server (Meyer et al. 2008) with the following cutoff parameters: expect value less than 1 × 10⁻⁵ and 60 % minimum identity. Taxonomic annotation was done using the GenBank database, and functional annotation was done using the SEED database, which includes sequences of all annotated genomes. All abundance plots were drawn using the ggplot2 and reshape R packages (Wickham 2007, 2009; R: a language and environment for statistical computing). Diversity indexes and the panel of pairwise correlations between the environmental parameters and microbial abundance were calculated and drawn using the vegan R package (Oksanen et al. 2005) and customized functions (Borcard et al. 2011). The cluster analysis was performed using the APE R package (Paradis et al. 2004). The cluster analysis and the panel correlations used the Pearson correlation. Differences in abundance were assessed with the Kruskal–Wallis test and Dunn’s multiple comparisons test in GraphPad.</para>
<para>The network was based on the homologous protein-coding genes. In order to visualize the possible gene flow between the metagenomes from the different depths/environments and from the different samples, we drew a network in which the nodes represent the samples and the edges the percentage of shared protein-coding genes. First, the genes were predicted using FragGeneScan with an error rate of 0.1 % (Rho et al. 2010). After joining the different replicates of the same sample, the sizes of all metagenomes (i.e., the number of amino acid sequences resulting from gene prediction) were standardized to the size of the smallest metagenome: sequences were randomly removed until all metagenomes had the same size. We determined the percentage of homologous proteins by means of BLASTP+ (version 2.2.27) (Altschul et al. 1990), using all possible combinations of all metagenomes as queries against all metagenomes as subjects (best hits with e-value lower than 1 × 10⁻⁵). The link threshold between two samples (e.g., A and B) is the average of the hits of sample A against B and of B against A. The network was built using NetworkX (Hagberg et al. 2008) and drawn with matplotlib (Hunter 2007), both Python packages.</para>
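<para>The size standardization and the averaging of reciprocal hit percentages described above can be sketched as follows. This is only an illustrative sketch, not the code used in the study: the function names, the fixed random seed and the input dictionary of directed hit percentages are assumptions, and the toy values are not results.</para>
<para><literalLayout>
```python
import random


def subsample(sequences, target_size, seed=0):
    """Randomly keep target_size sequences (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(list(sequences), target_size)


def link_values(pct_hits):
    """Average the two directed hit percentages for every pair of samples.

    pct_hits maps (query, subject) to the percentage of query genes with a
    best hit in subject; this input format is an assumption of the sketch.
    """
    values = {}
    for (a, b), a_vs_b in pct_hits.items():
        pair = tuple(sorted((a, b)))
        # Skip self-comparisons, pairs already done, and one-way entries.
        if a == b or pair in values or (b, a) not in pct_hits:
            continue
        values[pair] = (a_vs_b + pct_hits[(b, a)]) / 2
    return values


# Toy values for illustration only.
toy_hits = {("A", "B"): 40.0, ("B", "A"): 60.0}
print(link_values(toy_hits))
```
</literalLayout></para>
<para>Each averaged value could then be attached as an edge attribute when the network is assembled.</para>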
<para>In order to define a core metagenome for the SAO, we compared all samples to one another. Consecutive BLAST searches were run using the surface metagenome as the primary database, a choice based on the network result that the surface samples are more closely related. Genes with hits at e-values lower than 10⁻⁵ remained in the database for the next BLAST against the other metagenomes. KEGG Orthology (KO) identifiers were also used to study the functions shared across depths. We also compared the metagenomes of the SAO to public metagenomes from other oceans available on the MG-RAST server (Table S1). To identify the genes unique to the SAO, we ran BLASTP+ (version 2.2.27) using the predicted genes from the metagenomes generated in the present study as queries and the predicted genes of the public metagenomes as subjects.</para>
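<para>One possible reading of the consecutive filtering step is sketched below, assuming the BLAST hits that pass the e-value cutoff have already been collected as sets of database gene identifiers. The function name and the input format are hypothetical and are not taken from the study.</para>
<para><literalLayout>
```python
def core_genes(surface_genes, hits_per_metagenome):
    """Keep the surface genes that have a hit in every other metagenome.

    hits_per_metagenome is a list of sets, one per metagenome, each holding
    the identifiers of database genes with a hit passing the e-value cutoff
    (an assumed input format for this sketch).
    """
    database = set(surface_genes)
    for hits in hits_per_metagenome:
        # Only genes with a hit stay in the database for the next round.
        database = database.intersection(hits)
    return database


# Toy identifiers for illustration only.
print(core_genes(["g1", "g2", "g3"], [{"g1", "g2"}, {"g2", "g3"}]))
```
</literalLayout></para>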
</section>
</description>
</methodStep>
</methods>
</dataset>
 </eml:eml>
