Wednesday, October 2, 2013

Alex and Konrad's SRP 6 Rough Draft

FILTERING WATER OFF THE STREETS
Can water that has been contaminated with oils and dirt be filtered and used for drinking or watering plants?

Konrad Brayer and Alex Goodwin
Mr. Oz’s Physical Science

Sonoran Science Academy Davis Monthan

September 25, 2013

Southern Arizona is a desert known for its sparse rainfall. Of the meager ten inches of rain Arizonans get each year, nearly half is wasted in urban areas, where it runs down streets into gutters and cannot be salvaged. Because water is so valuable, finding a way to harvest it from the streets would give residents all across southern Arizona more water for important uses like watering plants and basic consumption.
The water that runs down streets is very toxic. Asphalt is used as a binder or sealant for roads. It is what gives streets their color, and it is extremely sticky on its own. Asphalt contains high amounts of petroleum, sulfur, arsenic, and mercury, and it is the biggest reason water off the street cannot simply be filtered and used (Environmental Protection Agency Nonpoint Source Control Branch, 2010). The other reason is that cars driving on the roads cover our streets in leaking oil, gasoline, and exhaust residue. In addition, our roadways are littered with trash and other waste.
Rain itself is what carries most of these toxins off the roads. When it rains, water lifts the oils off the road surface and washes them away. Changing the source would require every road to be dug up and replaced with a new substance; instead, the water can be filtered after it is collected from the streets.
Many attempts have been made to curb the damage done by urban runoff. In 1972, the Clean Water Act was passed, starting a national campaign to prevent water pollution, including pollution from urban runoff. As a result, the Arizona Department of Environmental Quality was formed, and today it oversees activities under the Clean Water Act (Waters, et al., 2011). Along with Cochise and Maricopa, Pima County has taken the initiative and started public anti-water-pollution campaigns like Clean Water Starts with Me and other water management movements (Waters, et al., 2011).
Beyond city water management, everyday citizens can help as well. Small things like not over-irrigating, not littering, and washing cars at car washes rather than at home can make a big difference in the cleanliness of urban runoff. However, while the efforts of citizens and city governments are beneficial, they fall far short of solving the problem. The most effective way to solve the problem is to filter the urban runoff at the source.
Although water is being wasted by flowing down the streets, in many other places water is harvested whenever it rains. Rain gutters, most commonly found in urban areas and suburban neighborhoods, channel water into cisterns and water barrels. This water is usable because these “water catchers” have built-in filters. A filter is a device that holds back impurities and lets only one substance through. Most water filters are made of charcoal or a sponge-like material called sand blocks, an extremely rough fabric that only water can penetrate. These are normal filters, but it is unlikely that they will work on oily street runoff.
Water is an extremely valuable resource in Tucson. It is vital to the native plants, animals, and people. Much of this water is wasted in the city through urban runoff, where water is lost by flowing down streets. On streets containing chemicals (everywhere in the city with a paved street), along with heavier dirt and oils, it will most likely require an entirely new type of filter to render this water usable for watering plants, or even for drinking.

CITATIONS

Waters, S. et al. (2011). When it Rains it Runs Off: Runoff and Urbanized Areas in Arizona. Arizona Cooperative Extension. Retrieved from http://cals.arizona.edu/pubs/water/az1542.pdf

Arizona Department of Environmental Quality (2009). Arizona’s 2009 Annual Nonpoint Source Annual Report: Nonpoint Source Program July 1, 2008 – June 30, 2009. Phoenix, AZ: State of Arizona. Retrieved from Arizona Department of Environmental Quality website: http://www.azdeq.gov/environ/water/watershed/download/NSP_Annual_Report09-PA.pdf

Arizona Department of Environmental Quality (2009). Fact Sheet: Fish Consumption Advisories – April 2009. Phoenix, AZ: State of Arizona. Retrieved from Arizona Department of Environmental Quality website: http://www.azdeq.gov/environ/water/assessment/download/fish-0409.pdf

City of Clarksville, IN (2009). What is Stormwater? Clarksville, IN: City of Clarksville. Retrieved from www.clarksvillesw.com/residents.html

DeFrancesco, Donna and Robyn Baker (2008). Landscape Watering by the Numbers. N.p.: Park & Co.

Environmental Protection Agency Nonpoint Source Control Branch (2010). USEPA Nonpoint Source Fact Sheets. Washington, DC: Government Printing Office. Retrieved from Environmental Protection Agency website: http://www.epa.gov/owow/nps/facts/

Friday, September 20, 2013

Road Hazards

Every day, Tucsonans encounter dangerous road hazards all across the city. These hazards may seem like minor road flaws, but they can cause permanent damage in more ways than you think.
One common road hazard is the infamous pothole. Potholes are large crater-like holes in older streets that can cause cars to stall out or experience tire issues. Potholes are most commonly formed by road fatigue or by ice wedging in the winter months. I have run into potholes in areas around the University of Arizona.
Another common road hazard in Tucson and the rest of Arizona is roadkill. Roadkill is animals that have been run over and killed by vehicles. These unsightly dangers can cause cars to swerve off the road and crash. I have come across roadkill in Tucson, but usually farther out from the center of town.
Finally, one hazard that is especially common in the Tucson area is the dust storm. Dust storms are large masses of airborne dust stirred up by strong winds. Dust storms cut driver visibility to only several feet and can make driving difficult. Again, being a native Tucsonan, I have experienced several intense dust storms.

Friday, September 13, 2013

SRP-4 Background Research Sources

Hey everybody, it's Alex. If you didn't know yet, my friend Konrad and I are working on our science fair project together. If you're interested in his side of the project, visit him at ///////

Here are the sources for our project. The one at the top is the most vital to our project and has given us lots of useful information. All of the sources below it were found through that first source.

Waters, S. et al. (2011). When it Rains it Runs Off: Runoff and Urbanized Areas in Arizona. Arizona Cooperative Extension. Retrieved from http://cals.arizona.edu/pubs/water/az1542.pdf

Arizona Department of Environmental Quality (2009). Arizona’s 2009 Annual Nonpoint Source Annual Report: Nonpoint Source Program July 1, 2008 – June 30, 2009. Phoenix, AZ: State of Arizona. Retrieved from Arizona Department of Environmental Quality website: http://www.azdeq.gov/environ/water/watershed/download/NSP_Annual_Report09-PA.pdf

Arizona Department of Environmental Quality (2009). Fact Sheet: Fish Consumption Advisories – April 2009. Phoenix, AZ: State of Arizona. Retrieved from Arizona Department of Environmental Quality website: http://www.azdeq.gov/environ/water/assessment/download/fish-0409.pdf

City of Clarksville, IN (2009). What is Stormwater? Clarksville, IN: City of Clarksville. Retrieved from www.clarksvillesw.com/residents.html

DeFrancesco, Donna and Robyn Baker (2008). Landscape Watering by the Numbers. N.p.: Park & Co.

Environmental Protection Agency Nonpoint Source Control Branch (2010). USEPA Nonpoint Source Fact Sheets. Washington, DC: Government Printing Office. Retrieved from Environmental Protection Agency website: http://www.epa.gov/owow/nps/facts/


Here are our vocabulary words, cited below:

Runoff. 2013. In Dictionary.com Retrieved September 16, 2013 from http://dictionary.reference.com/browse/runoff

Non Point Source Pollution. 2013. In United States Environmental Protection Agency. Retrieved September 16, 2013 from http://www.epa.gov/owow/nps/qa.html

Pesticide. 2013. In Dictionary.com. Retrieved September 16, 2013, from http://dictionary.reference.com/browse/pesticide?&o=100074&s=t

Sediment. 2013. In Dictionary.com. Retrieved September 16, 2013 from http://dictionary.reference.com/browse/sediment?s=t

Wetland. 2013. In SLAC. Retrieved September 16, 2013, from http://www-group.slac.stanford.edu/esh/environment/stormwater/p_definitions.htm

Thursday, September 5, 2013

SRP Possible Research Question 3

Is there a way to make a fail-proof algorithm to solve a Rubik's Cube? The purpose is to see if there is a way to solve a Rubik's Cube using a formula rather than solving it randomly. This question is testable because the hypothesis can be easily proved or disproved by whether or not the algorithm works. There are also instructions that describe how to make these types of algorithms, so there will be no indefinite answers. This is repeatable because others can follow the steps or use the algorithm to see if the hypothesis was proved or disproved. The question is specific because it completely describes the purpose of the idea, but it is concise because it does not go in depth about the creation of the algorithm. This article led me to choose this: http://lar5.com/cube/downloads.html. This question should be tested because some areas of math are universal, and the algorithm for the Rubik's Cube could be used in other math-related areas like engineering.

SRP Possible Research Question 2

Does engaging in cell phone conversations affect reaction time? The purpose is to see if cell phone usage affects reaction time. This project is easy to test and repeat because testing the subjects for reaction time is straightforward and requires only a cell phone and test subjects. This question is specific because it focuses only on testing reaction time with cell phones. It is concise because it is under 25 words long. This article led me to this question: http://www.usc.edu/CSSF/History/2004/Projects/J0312.pdf. This question should be tested because it could provide substantial evidence to help stop texting and driving.

SRP Possible Research Question 1

Does a high-stress situation affect test scores on high-intensity exams that require remembering and recalling figures from before? The purpose is to see if stress can affect performance on difficult tests. This question is testable because the subjects can be given a test, put under high stress, and then given a similar test. The results of the two tests would be compared. This is repeatable because the same tests can be given again and test subjects are relatively easy to find. This question is specific because it includes all parts of a science question and puts details into the variables. The question is concise because it is under 25 words. This question was chosen based on this article: Princeton University (2013, August 29). Poor concentration: Poverty reduces brainpower needed for navigating other areas of life. This question should be worked on because it could show if stress actually does affect test scores negatively.

Friday, August 30, 2013

Poverty Lowers Brainpower?


Believe it or not, scientists at Princeton University have discovered that being in poverty can actually lower brainpower. Recent studies have shown that the intense mental strain of having to deal with surviving can distract people from thinking more creatively and even prevent them from using more inventive means to help them escape poverty. This finding was shown during an experiment involving two groups of people solving hypothetical financial problems that got progressively harder. The first group (the "poor" group) performed as well as the "rich" group on the easy problems, but when the problems became severe, the poor group's results dipped. In India, a second study supported the results of the first experiment. A group of 464 sugarcane farmers took tests before and after harvest, since farmers are poor before the harvest but rich after it. When given tests similar to the rich/poor group tests, the farmers performed much better post-harvest than pre-harvest. Based on the results, don't become poor! It could start something much worse than just a lower income.

Princeton University (2013, August 29). Poor concentration: Poverty reduces brainpower needed for navigating other areas of life.

  1. How could these findings be applied to help people dealing with financial problems?
  2. Is there any way to escape poverty once it has hit? According to the article, it sounds like there isn't much to do once it sets.
  3. Is there any way to prevent slipping into the poverty mindset? If so, how?


Here is the voki presentation of this: http://www.voki.com/pickup.php?scid=8507831&height=267&width=200

Poor Concentration: Poverty Reduces Brainpower Needed for Navigating Other Areas of Life

Aug. 29, 2013 — Poverty and all its related concerns require so much mental energy that the poor have less remaining brainpower to devote to other areas of life, according to research based at Princeton University. As a result, people of limited means are more likely to make mistakes and bad decisions that may be amplified by -- and perpetuate -- their financial woes.

Published in the journal Science, the study presents a unique perspective regarding the causes of persistent poverty. The researchers suggest that being poor may keep a person from concentrating on the very avenues that would lead them out of poverty. A person's cognitive function is diminished by the constant and all-consuming effort of coping with the immediate effects of having little money, such as scrounging to pay bills and cut costs. Thusly, a person is left with fewer "mental resources" to focus on complicated, indirectly related matters such as education, job training and even managing their time.
In a series of experiments, the researchers found that pressing financial concerns had an immediate impact on the ability of low-income individuals to perform on common cognitive and logic tests. On average, a person preoccupied with money problems exhibited a drop in cognitive function similar to a 13-point dip in IQ, or the loss of an entire night's sleep.
But when their concerns were benign, low-income individuals performed competently, at a similar level to people who were well off, said corresponding author Jiaying Zhao, who conducted the study as a doctoral student in the lab of co-author Eldar Shafir, Princeton's William Stewart Tod Professor of Psychology and Public Affairs. Zhao and Shafir worked with Anandi Mani, an associate professor of economics at the University of Warwick in Britain, and Sendhil Mullainathan, a Harvard University economics professor.
"These pressures create a salient concern in the mind and draw mental resources to the problem itself. That means we are unable to focus on other things in life that need our attention," said Zhao, who is now an assistant professor of psychology at the University of British Columbia.
"Previous views of poverty have blamed poverty on personal failings, or an environment that is not conducive to success," she said. "We're arguing that the lack of financial resources itself can lead to impaired cognitive function. The very condition of not having enough can actually be a cause of poverty."
The mental tax that poverty can put on the brain is distinct from stress, Shafir explained. Stress is a person's response to various outside pressures that -- according to studies of arousal and performance -- can actually enhance a person's functioning, he said. In the Science study, Shafir and his colleagues instead describe an immediate rather than chronic preoccupation with limited resources that can be a detriment to unrelated yet still important tasks.
"Stress itself doesn't predict that people can't perform well -- they may do better up to a point," Shafir said. "A person in poverty might be at the high part of the performance curve when it comes to a specific task and, in fact, we show that they do well on the problem at hand. But they don't have leftover bandwidth to devote to other tasks. The poor are often highly effective at focusing on and dealing with pressing problems. It's the other tasks where they perform poorly."
The fallout of neglecting other areas of life may loom larger for a person just scraping by, Shafir said. Late fees tacked on to a forgotten rent payment, a job lost because of poor time-management -- these make a tight money situation worse. And as people get poorer, they tend to make difficult and often costly decisions that further perpetuate their hardship, Shafir said. He and Mullainathan were co-authors on a 2012 Science paper that reported a higher likelihood of poor people to engage in behaviors that reinforce the conditions of poverty, such as excessive borrowing.
"They can make the same mistakes, but the outcomes of errors are more dear," Shafir said. "So, if you live in poverty, you're more error prone and errors cost you more dearly -- it's hard to find a way out."
The first set of experiments took place in a New Jersey mall between 2010 and 2011 with roughly 400 subjects chosen at random. Their median annual income was around $70,000 and the lowest income was around $20,000. The researchers created scenarios wherein subjects had to ponder how they would solve financial problems, for example, whether they would handle a sudden car repair by paying in full, borrowing money or putting the repairs off. Participants were assigned either an "easy" or "hard" scenario in which the cost was low or high -- such as $150 or $1,500 for the car repair. While participants pondered these scenarios, they performed common fluid-intelligence and cognition tests.
Subjects were divided into a "poor" group and a "rich" group based on their income. The study showed that when the scenarios were easy -- the financial problems not too severe -- the poor and rich performed equally well on the cognitive tests. But when they thought about the hard scenarios, people at the lower end of the income scale performed significantly worse on both cognitive tests, while the rich participants were unfazed.
To better gauge the influence of poverty in natural contexts, between 2010 and 2011 the researchers also tested 464 sugarcane farmers in India who rely on the annual harvest for at least 60 percent of their income. Because sugarcane harvests occur once a year, these are farmers who find themselves rich after harvest and poor before it. Each farmer was given the same tests before and after the harvest, and performed better on both tests post-harvest compared to pre-harvest.
The cognitive effect of poverty the researchers found relates to the more general influence of "scarcity" on cognition, which is the larger focus of Shafir's research group. Scarcity in this case relates to any deficit -- be it in money, time, social ties or even calories -- that people experience in trying to meet their needs. Scarcity consumes "mental bandwidth" that would otherwise go to other concerns in life, Zhao said.
"These findings fit in with our story of how scarcity captures attention. It consumes your mental bandwidth," Zhao said. "Just asking a poor person to think about hypothetical financial problems reduces mental bandwidth. This is an acute, immediate impact, and has implications for scarcity of resources of any kind."
"We documented similar effects among people who are not otherwise poor, but on whom we imposed scarce resources," Shafir added. "It's not about being a poor person -- it's about living in poverty."
Many types of scarcity are temporary and often discretionary, said Shafir, who is co-author with Mullainathan of the book, "Scarcity: Why Having Too Little Means So Much," to be published in September. For instance, a person pressed for time can reschedule appointments, cancel something or even decide to take on less.
"When you're poor you can't say, 'I've had enough, I'm not going to be poor anymore.' Or, 'Forget it, I just won't give my kids dinner, or pay rent this month.' Poverty imposes a much stronger load that's not optional and in very many cases is long lasting," Shafir said. "It's not a choice you're making -- you're just reduced to few options. This is not something you see with many other types of scarcity."
The researchers suggest that services for the poor should accommodate the dominance that poverty has on a person's time and thinking. Such steps would include simpler aid forms and more guidance in receiving assistance, or training and educational programs structured to be more forgiving of unexpected absences, so that a person who has stumbled can more easily try again.
"You want to design a context that is more scarcity proof," said Shafir, noting that better-off people have access to regular support in their daily lives, be it a computer reminder, a personal assistant, a housecleaner or a babysitter.
"There's very little you can do with time to get more money, but a lot you can do with money to get more time," Shafir said. "The poor, who our research suggests are bound to make more mistakes and pay more dearly for errors, inhabit contexts often not designed to help."

Story Source:
The above story is based on materials provided by Princeton University. The original article was written by Morgan Kelly.
Note: Materials may be edited for content and length. For further information, please contact the source cited above.

Journal Reference:
  1. A. Mani, S. Mullainathan, E. Shafir, J. Zhao. Poverty Impedes Cognitive Function. Science, 2013; 341 (6149): 976 DOI: 10.1126/science.1238041

Princeton University (2013, August 29). Poor concentration: Poverty reduces brainpower needed for navigating other areas of life. ScienceDaily. Retrieved August 30, 2013, from http://www.sciencedaily.com/releases/2013/08/130829145125.htm

http://www.sciencedaily.com/releases/2013/08/130829145125.htm

Sunday, August 25, 2013

Genetic Alteration of Papaya Genomes (Highlighted Article)



The papaya plant has many nutritional and medicinal uses, including the production of papain, an enzyme that breaks down proteins such as those in meat. Scientists have sequenced the genome of a genetically altered papaya named SunUp. In addition, the papaya genome has been compared with the genome of Arabidopsis, a plant related to cabbage. Because papaya and Arabidopsis shared a common ancestor about 72 million years ago, comparing the two genomes could give clues about what fruit and tropical tree genomes were like millions of years ago. The SunUp papaya was engineered to be virus-resistant. The study also found groups of genes in the papaya genome linked to its tree-like form and to attracting the animals that disperse its seeds. The changes made to the papaya have begun to arouse suspicion about its safety.
While the SunUp papaya has many benefits, countries are reluctant to import it for fear that it might be unsuitable for human consumption or that it could become an invasive species. The results of this study, however, support the safety of the SunUp papaya and suggest that it is safe for import and export. In just several years, the SunUp papaya could become the next big fruit.


Ming, R. "The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus)." Nature 452, 991-996 (24 April 2008) | doi:10.1038/nature06856; Received 6 September 2007; Accepted 22 February 2008

  • Are there any other plants that can be traced back to other genomes? If so, which ones?

  • When will the SunUp papaya be mass produced?

  • Are there any major sponsors funding this?


Here is the voki presentation of this:  http://www.voki.com/pickup.php?scid=8478023&height=267&width=200













http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2836516/












Nature
Author Manuscript
NIH Public Access

The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus)

Ray Ming, Shaobin Hou, [...], and Maqsudul Alam

Abstract


Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3× draft genome sequence of ‘SunUp’ papaya, the first commercial virus-resistant transgenic fruit tree [1] to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far [2–5], may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties.
Papaya is an exceptionally promising system for the exploration of tropical-tree genomes and fruit-tree genomics. It has a relatively small genome of 372 megabases (Mb) [6], diploid inheritance with nine pairs of chromosomes, a well-established transformation system [7], a short generation time (9–15 months), continuous flowering throughout the year and a primitive sex-chromosome system [8]. It is a member of the Brassicales, sharing a common ancestor with Arabidopsis about 72 million years ago [9]. Papaya is ranked first on nutritional scores among 38 common fruits, based on the percentage of the United States Recommended Daily Allowance for vitamin A, vitamin C, potassium, folate, niacin, thiamine, riboflavin, iron and calcium, plus fibre. Consumption of its fruit is recommended for preventing vitamin A deficiency, a cause of childhood blindness in tropical and subtropical developing countries. The fruit, stems, leaves and roots of papaya are used in a wide range of medical applications, including production of papain, a valuable proteolytic enzyme.
A total of 2.8 million whole-genome shotgun (WGS) sequencing reads were generated from a female plant of transgenic cultivar SunUp, which was developed through transformation of Sunset that had undergone more than 25 generations of inbreeding [10]. The estimated residual heterozygosity of SunUp is 0.06% (Supplementary Note 1). After excluding low-quality and organellar reads, 1.6 million high-quality reads were assembled into contigs containing 271 Mb and scaffolds spanning 370 Mb including embedded gaps (Supplementary Tables 1 and 2). Of 16,362 unigenes derived from expressed sequence tags (ESTs), 15,064 (92.1%) matched this assembly. Paired-end reads from 34,065 bacterial artificial chromosome (BAC) clones provided alignment to a fingerprinted contig (FPC)-based physical map (Supplementary Note 2). Among 706 BAC end and WGS sequence-derived simple sequence repeats on the genetic map, 652 (92.4%) could be used to anchor 167 Mb of contigs or 235 Mb of scaffolds, to the 12 papaya linkage groups in the current genetic map (Supplementary Fig. 1).
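The two percentages quoted in this paragraph can be rechecked from the raw counts. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the paragraph above; names are illustrative only.
unigenes_total = 16_362
unigenes_matched = 15_064
ssr_markers_total = 706
ssr_markers_anchoring = 652


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


unigene_match_pct = percent(unigenes_matched, unigenes_total)
marker_anchor_pct = percent(ssr_markers_anchoring, ssr_markers_total)

print(f"Unigenes matching the assembly: {unigene_match_pct}%")
print(f"Markers usable for anchoring: {marker_anchor_pct}%")
```

Both values round to the figures given in the text (92.1% and 92.4%).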
Papaya chromosomes at the pachytene stage of meiosis are generally stained lightly by 4′,6-diamidino-2-phenylindole (DAPI), revealing that the papaya genome is largely euchromatic. However, highly condensed heterochromatin knobs were observed on most chromosomes (Supplementary Fig. 2), concentrated in the centromeric and pericentromeric regions. The lengths of the pachytene bivalents that are heavily stained only account for approximately 17% of the genome. However, these cytologically distinct and highly condensed heterochromatic regions could represent 30–35% of the genomic DNA [11]. A large portion of the heterochromatic DNA was probably not covered by the WGS sequence. The 271 Mb of contig sequence should represent about 75% of the papaya genome and more than 90% of the euchromatic regions, which is similar to the 92.1% of the EST and 92.4% of genetic markers covered by the assembled genome and the theoretical 95% coverage by 3× WGS sequence [12].
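The "theoretical 95% coverage by 3× WGS sequence" agrees with a simple Poisson model of random shotgun reads, in which the expected fraction of bases covered at least once at average depth c is 1 − e^(−c). Whether the paper's cited source uses exactly this model is an assumption here, so the sketch below is only a consistency check. It also compares the 271 Mb of contigs with the 372 Mb genome size quoted earlier in the paper.

```python
import math


def expected_coverage(depth):
    """Expected fraction of a genome covered at least once by random reads
    at the given average depth (Poisson model; an assumption here)."""
    return 1.0 - math.exp(-depth)


contig_mb = 271   # assembled contig sequence, from the text
genome_mb = 372   # papaya genome size, from the text

theoretical = expected_coverage(3)
contig_fraction = contig_mb / genome_mb

print(f"Expected coverage at 3x depth: {theoretical:.1%}")
print(f"Contigs as a share of the genome: {contig_fraction:.1%}")
```

The model gives roughly 95% at 3× depth, and the contigs come to roughly 73% of the genome, in line with the "about 75%" stated in the paragraph.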
Gene annotation was carried out using the TIGR Eukaryotic Annotation Pipeline. The assembled genome was masked based on similarity to known repeat elements in RepBase and the TIGR Plant Repeat Database, plus a de novo papaya repeat database (see Methods). Ab initio gene predictions were combined with spliced alignments of proteins and transcripts to produce a reference gene set of 28,629 gene models (Supplementary Table 3). A total of 21,784 (76.1%) of the predicted papaya genes with average length of 1,057 base pairs (bp) have similarity to proteins in the non-redundant database from the National Center for Biotechnology Information, with 9,760 (44.8%) of these supported by papaya unigenes. Among 6,845 genes with average length 309 bp that had no hits to the non-redundant proteins, only 515 (7.5%) were supported by papaya unigenes, implying that the number of predicted papaya-specific genes was inflated. If the 515 genes with unigene support represent 44.8% of the total, then 1,150 predicted papaya-specific genes may be real, and the number of predicted genes in the assembled papaya genome would be 22,934. Considering the assembled genome covers 92.1% of the unigenes and 92.4% of the mapped genetic markers, the number of predicted genes in the papaya genome could be 7.9% higher, or 24,746, about 11–20% less than Arabidopsis (based on either the 27,873 protein coding and RNA genes, or including the 3,241 novel genes) [2,13], 34% less than rice [3], 46% less than poplar [4] and 19% less than grape [5] (Table 1).
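The gene-count estimate in this paragraph is a short chain of arithmetic, and it can be followed step by step. The Python sketch below reproduces it using only the numbers quoted in the text; the variable names are illustrative, and rounding to whole genes at each step is an assumption about how the authors arrived at their figures.

```python
# Figures quoted in the paragraph above; names are illustrative only.
genes_with_protein_hits = 21_784
supported_genes_without_hits = 515
unigene_support_rate = 0.448      # 9,760 of 21,784 genes with hits
uncovered_fraction = 0.079        # assembly covers about 92.1% of unigenes
arabidopsis_genes = 27_873
arabidopsis_novel_genes = 3_241

# Step 1: scale the 515 supported genes up by the support rate.
papaya_specific = round(supported_genes_without_hits / unigene_support_rate)

# Step 2: add them to the genes that matched known proteins.
assembled_total = genes_with_protein_hits + papaya_specific

# Step 3: correct for the part of the genome missing from the assembly.
genome_total = round(assembled_total * (1 + uncovered_fraction))

# Step 4: compare with the two Arabidopsis gene counts.
fewer_than_low = 1 - genome_total / arabidopsis_genes
fewer_than_high = 1 - genome_total / (arabidopsis_genes + arabidopsis_novel_genes)

print(papaya_specific, assembled_total, genome_total)
print(f"{fewer_than_low:.0%} to {fewer_than_high:.0%} fewer genes than Arabidopsis")
```

Under these assumptions the steps give 1,150, then 22,934, then 24,746 genes, and a range of about 11% to 20% fewer genes than Arabidopsis, matching the paragraph.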

Table 1

Statistics of sequenced plant genomes
Comparison of the papaya genome with that of Arabidopsis sheds new light on angiosperm evolutionary history in several ways. Considering only the 200 longest papaya scaffolds, we found 121 co-linear blocks. The papaya blocks range in size from 1.36 Mb containing 181 genes to 0.16 Mb containing 19 genes (a statistical, rather than a biological, lower limit); the corresponding Arabidopsis regions range from 0.69 Mb containing 163 genes to 60 kilobases (kb) containing 18 genes. Across the 121 papaya segments for which co-linearity can be detected, 26 show primary correspondence (that is, excluding the effects of ancient triplication detailed below) to only one Arabidopsis segment, 41 to two, 21 to three, 30 to four, and only 3 to more than four.
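As a quick consistency check, the per-category segment counts in this paragraph should add up to the 121 co-linear blocks. The Python sketch below tallies them and also works out what share of papaya segments match two to four Arabidopsis segments; the dictionary layout and names are illustrative only.

```python
# Number of papaya segments by how many Arabidopsis segments they match,
# as quoted in the paragraph above. Names are illustrative only.
segments_by_match_count = {
    "one": 26,
    "two": 41,
    "three": 21,
    "four": 30,
    "more than four": 3,
}

total_segments = sum(segments_by_match_count.values())
two_to_four = sum(segments_by_match_count[k] for k in ("two", "three", "four"))
two_to_four_share = two_to_four / total_segments

print(f"Total co-linear segments: {total_segments}")
print(f"Matching two to four Arabidopsis segments: {two_to_four} "
      f"({two_to_four_share:.0%})")
```

The counts sum to 121, and about three quarters of the segments fall in the two-to-four range.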
The fact that many papaya segments show co-linearity with two to four Arabidopsis segments (Fig. 1, and Supplementary Figs 3 and 4) is most parsimoniously explained if either one or two genome duplications have affected the Arabidopsis lineage since its divergence from papaya. Although it was suspected that the most recent Arabidopsis genome duplication, α14, might affect only a subset of the Brassicales15, previous phylogenetic dating of these events15 had suggested that the more ancient β-duplication occurred early in the eudicot radiation, well before the Arabidopsis–Carica divergence. This incongruity is under investigation.

Figure 1

Alignment of co-linear regions from Arabidopsis (green), papaya (magenta), poplar (blue) and grape (red)
In contrast, individual Arabidopsis genome segments correspond to only one papaya segment, indicating that no genome duplication has occurred in the papaya lineage since its divergence from Arabidopsis about 72 million years ago5. The lack of relatively recent papaya genome doubling is further supported by an L-shaped distribution of intra-EST correspondence for papaya (not shown). However, multiple genome/subgenome alignments (see Supplementary Methods) reveal evidence in papaya of the ancient ‘γ’ genome duplication shared with Arabidopsis and poplar that is postulated to have occurred near the origin of angiosperms14. Indeed, both papaya (with no subsequent duplication) and poplar (with a relatively low rate of duplicate gene loss) suggest that γ was not a duplication but a triplication (Fig. 1), with triplicated patterns evident for about 25% of the 247 Mb comprising the 200 largest papaya scaffolds.
This is most probably an underestimate that will increase as papaya contiguity is improved. Triplication in papaya and poplar corresponds closely to the triplication suggested by an independent analysis of the grape genome5.
A few hundred papaya chromosomal segments were aligned using BLASTZ to their one to four syntenic regions in Arabidopsis, and the results examined visually using the Genome Evolution (GEvo) viewer16. The orthologous region of grape was also included5, making the alignment a six-way comparison. One example is given in Supplementary Fig. 5: a 500 kb segment of papaya, its four 60 kb syntenic, orthologous Arabidopsis segments and the 400 kb orthologous segment of grape.
For the homologous Arabidopsis segments that are discernibly co-linear (by MC-SCANNER) to the 200 longest papaya scaffolds, 34.8% of Arabidopsis genes in any one segment correspond to a papaya gene, whereas only 24.8% of papaya genes in any one segment correspond to an Arabidopsis gene. Moreover, the Arabidopsis homologous segments contain fewer genes, on average only about 57.9% of the number in their papaya counterparts.
Papaya provides a useful outgroup necessary to detect subfunctionalization. Supplementary Fig. 6 is a GEvo screenshot of a blastn alignment illustrating subfunctionalization of conserved non-coding sequences (CNSs)17 upstream of two syntenic, duplicate Arabidopsis genes and their single papaya orthologous gene. The α-duplicated genomes within Arabidopsis are perfect for CNS discovery18.
Comparative analysis of the papaya and Arabidopsis 5′ untranslated regions showed that only 14% of orthologous promoter pairs exhibit significantly higher levels of sequence identity than random comparisons (Supplementary Figs 7 and 8). Although some highly conserved promoters show substantial conservation across much of their length, sequence similarity for most orthologous papaya promoters is indistinguishable from background.
Global analysis of all inferred protein models from papaya, Arabidopsis, poplar, grape and rice clusters the 208,901 non-redundant protein sequences into 39,706 similarity groups, or ‘tribes’19, 11,851 of which contain two or more genes (see Supplementary Methods). Tribes with multiple genes in a species typically correspond to families or subfamilies of genes; however, tribes may also contain just one gene (‘singleton tribes’). In papaya, 25,312 gene models were classified into 12,958 tribes, 5,669 of which were specific to papaya (Supplementary Table 4). Of the papaya-specific tribes, 5,314 were singleton tribes. EST support was markedly lower for genes in papaya-specific tribes (below 14%) than in tribes that included genes from at least one other taxon (72.4%).
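Once genes have been assigned to tribes, the per-species counts reported here (tribes containing a species, species-specific tribes and singleton tribes) follow from simple bookkeeping. The sketch below assumes a flat list of (gene, species, tribe) assignments; the identifiers are invented and the clustering step itself is not reproduced.

```python
from collections import defaultdict

def tribe_summary(assignments, species):
    """Return (tribes containing species, species-specific tribes, singleton tribes).

    assignments: iterable of (gene_id, species_name, tribe_id) tuples.
    """
    members = defaultdict(list)
    for _gene, sp, tribe in assignments:
        members[tribe].append(sp)
    with_species = {t: sps for t, sps in members.items() if species in sps}
    specific = {t: sps for t, sps in with_species.items() if set(sps) == {species}}
    singletons = [t for t, sps in specific.items() if len(sps) == 1]
    return len(with_species), len(specific), len(singletons)

# Hypothetical toy assignments.
toy = [("cp1", "papaya", "T1"), ("at1", "arabidopsis", "T1"),
       ("cp2", "papaya", "T2"), ("cp3", "papaya", "T2"),
       ("cp4", "papaya", "T3"), ("os1", "rice", "T4")]
summary = tribe_summary(toy, "papaya")

# Share of papaya-specific tribes that are singletons, from the quoted counts.
singleton_share = 5_314 / 5_669
```

With the quoted counts, roughly 94% of the papaya-specific tribes are singletons.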
To investigate the smaller number of genes in papaya, we compared tribe membership from each of the five sequenced angiosperm species (Supplementary Table 5). Among the 6,726 tribes that contain genes from both Arabidopsis and papaya, 3,595 contain equal numbers of genes from both species. However, tribes with more Arabidopsis genes outnumber those with more papaya genes by more than 2:1 (2,153:979). The trend of smaller number of papaya genes is widespread across tribes of all sizes and major functional categories (Supplementary Table 6 and Supplementary Fig. 9).
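The comparison of shared tribes reduces to classifying each tribe by which species contributes more genes. A minimal sketch, assuming per-species dictionaries of tribe sizes (the tribe identifiers and counts in the toy input are hypothetical):

```python
def compare_tribes(counts_a, counts_b):
    """For tribes present in both species, count those larger in A, larger in B, or equal."""
    shared = counts_a.keys() & counts_b.keys()
    more_a = sum(counts_a[t] > counts_b[t] for t in shared)
    more_b = sum(counts_a[t] < counts_b[t] for t in shared)
    equal = len(shared) - more_a - more_b
    return more_a, more_b, equal

# Hypothetical toy counts: T4 and T5 are not shared and are ignored.
arabidopsis = {"T1": 3, "T2": 1, "T3": 2, "T5": 4}
papaya = {"T1": 1, "T2": 1, "T3": 5, "T4": 2}
result = compare_tribes(arabidopsis, papaya)

# Ratio of Arabidopsis-larger to papaya-larger tribes, from the quoted counts.
quoted_ratio = 2_153 / 979
```

The quoted counts give a ratio of about 2.2, in line with the "more than 2:1" stated above.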
We then examined membership in the 815 tribes with members identified as being likely transcription factors in the Arabidopsis transcription factor database (http://arabidopsis.med.ohio-state.edu/AtTFDB/). This set includes 2,897 genes in Arabidopsis and 2,438 in papaya (a ratio of 1.19:1). The details of tribe membership are illustrated for 25 exemplar families and superfamilies (Fig. 2), where most transcription-factor tribes have fewer genes in papaya than Arabidopsis. Some transcription-factor tribes had more genes in papaya, specifically RWP-RK, MADS-box, Scarecrow, TCP and Jumonji gene families. Interestingly, the difference in MADS protein family size appears to be due to expanded numbers for half of the 36 MADS tribes. The other 18 MADS tribes had fewer papaya genes, including 14 that were not found in papaya.

Figure 2

Comparison of gene numbers in transcription-factor tribe or related tribes from Arabidopsis and papaya
Assuming that a generalized angiosperm could potentially require only the types and minimal numbers of genes that are shared among divergent plant species, we examined each of the tribes shared among the five angiosperms with sequenced genomes. The number of genes required in a minimal flowering plant is based on the observed minimum number of genes across each of the shared tribes (Table 2). When the smallest observed number is taken for each evolutionarily conserved tribe, a minimal angiosperm genome of 13,311 genes is estimated. Papaya has the smallest number of genes for more tribes than any other sequenced taxon (4,515, or 76% of 5,925 shared tribes), reinforcing the notion that papaya has fewer genes than any angiosperm sequenced so far.
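The minimal-genome estimate described above is a sum of per-tribe minima. The sketch below shows that calculation on invented data; crediting a tie to every species that attains the minimum is our assumption, and the tribe identifiers and counts are hypothetical.

```python
def minimal_gene_number(tribe_counts):
    """Sum the smallest gene count per shared tribe and tally which species hold it.

    tribe_counts: {tribe_id: {species: gene_count}}, restricted to shared tribes.
    """
    total = 0
    holders = {}
    for per_species in tribe_counts.values():
        smallest = min(per_species.values())
        total += smallest
        for species, n in per_species.items():
            if n == smallest:            # ties are credited to every such species
                holders[species] = holders.get(species, 0) + 1
    return total, holders

# Hypothetical toy data for three shared tribes.
toy = {
    "T1": {"papaya": 1, "arabidopsis": 3, "rice": 2},
    "T2": {"papaya": 2, "arabidopsis": 2, "rice": 5},
    "T3": {"papaya": 4, "arabidopsis": 1, "rice": 1},
}
total, holders = minimal_gene_number(toy)

# Share of shared tribes in which papaya holds the minimum, from the quoted counts.
papaya_share = 4_515 / 5_925
```

For the toy data the minimal gene number is 1 + 2 + 1 = 4; the quoted counts give a papaya share of about 76%.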

Table 2

Deduced potential minimal angiosperm gene number based on species with smallest number of genes for each tribe
Only 55 nucleotide-binding site (NBS)-containing R genes were identified in papaya; about 28% of the 200 NBS genes in Arabidopsis20 and less than 10% of the 600 NBS genes in rice21. Resistance proteins also have a carboxy-terminal leucine-rich repeat (LRR) domain. These NBS-containing R-gene families can be subdivided into three classes: NBS–LRR, toll interleukin receptor (TIR)–NBS–LRR, and coiled-coil (CC)–NBS–LRR on the basis of their amino-terminal region. Papaya NBS–LRR outnumbered both TIR–NBS–LRR and CC–NBS–LRR genes, in contrast to both poplar (with more CC–NBS–LRR genes4) and Arabidopsis (with more TIR–NBS–LRR). More than 50% of the NBS-type R genes were clustered in about eight scaffolds, indicating that resistance gene evolution may involve duplication and divergence of linked gene families.
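The three-way subdivision by amino-terminal region can be expressed as a small decision rule. The sketch below assumes each gene has already been annotated with an ordered list of domain labels; the labels and the function are hypothetical and do not reproduce the domain searches used for the annotation.

```python
def classify_nbs_gene(domains):
    """Classify an R-gene candidate from ordered domain labels (N- to C-terminus).

    The labels "TIR", "CC", "NBS" and "LRR" are hypothetical annotations
    supplied by the caller; this function does not detect domains itself.
    """
    if "NBS" not in domains:
        return None                      # not an NBS-containing gene
    nbs_at = domains.index("NBS")
    n_terminal = domains[:nbs_at]
    c_terminal = domains[nbs_at + 1:]
    if "LRR" not in c_terminal:
        return "NBS only"                # NBS present, no C-terminal LRR
    if "TIR" in n_terminal:
        return "TIR-NBS-LRR"
    if "CC" in n_terminal:
        return "CC-NBS-LRR"
    return "NBS-LRR"

examples = [["TIR", "NBS", "LRR"], ["CC", "NBS", "LRR"], ["NBS", "LRR"], ["NBS"]]
labels = [classify_nbs_gene(d) for d in examples]

# Papaya NBS gene count relative to the quoted Arabidopsis and rice totals.
vs_arabidopsis = 55 / 200
vs_rice = 55 / 600
```

The two ratios come to 27.5% and about 9.2%, consistent with "about 28%" and "less than 10%" in the text.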
Homologues for genes involved in cellulose biosynthesis are present in papaya and Arabidopsis, with more cellulose synthase genes in poplar, perhaps associated with wood formation. Papaya has at least 32 putative β-glucosyl transferase (GT1) genes compared with 121 in Arabidopsis identified using sequence alignment. A total of 38 and 40 cellulose synthase-related genes (GT2) were identified in papaya using the 48 poplar and 31 Arabidopsis genes as queries, respectively. These genes include 11 cellulose synthase (CesA) genes, the same number as in Arabidopsis but 7 fewer than in poplar. Putative cellulose orientation genes (COBRA) were more abundant in Arabidopsis (12) than in papaya (8).
Papaya also has a similar complement of cell-wall synthesis genes to Arabidopsis, though fewer of them overall. Papaya and Arabidopsis, respectively, have 6 and 12 callose synthase genes (GT2); 15 and 15 xyloglucan α-1,2-fucosyl transferases (GT37); 5 and 7 β-glucuronic acid transferases in families GT43 and GT47; and 27 and 42 genes in GT8, which includes galacturonosyl transferases associated with pectin synthesis.
The cell wall of plants is capable of both plastic and elastic extension, and controls the rate and direction of cell expansion22. Despite fewer whole-genome duplications, papaya has a similar number of putative expansin A genes (24) as Arabidopsis (26) and poplar (27), and more expansin B genes (10) than Arabidopsis (6) and poplar (3).
In contrast to expansion-related genes, papaya has on average about 25% fewer cell-wall degradation genes than Arabidopsis, in some cases far fewer. For example, papaya and Arabidopsis, respectively, have 4 and 12 endoxylanase-like genes in glycoside hydrolase family 10 (GH10); 29 and 67 pectin methyl esterases (carbohydrate esterase family 8); 28 and 69 polygalacturonases (GH28); 15 and 49 xyloglucan endotransglycosylase/hydrolases (GH16); 18 and 25 β-1,4-endoglucanases (GH9); 42 and 91 β-1,3-glucanases (GH17); and 15 and 27 pectin lyases (PL1).
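The per-family ratios implied by these examples can be tabulated directly from the quoted counts. The sketch below uses only the seven families listed; because they are given as examples that include the most strongly reduced cases, their pooled ratio need not match the average across all cell-wall degradation families.

```python
# (papaya, Arabidopsis) gene counts for the families quoted in the text.
families = {
    "GH10 endoxylanase-like": (4, 12),
    "CE8 pectin methyl esterase": (29, 67),
    "GH28 polygalacturonase": (28, 69),
    "GH16 xyloglucan endotransglycosylase/hydrolase": (15, 49),
    "GH9 beta-1,4-endoglucanase": (18, 25),
    "GH17 beta-1,3-glucanase": (42, 91),
    "PL1 pectin lyase": (15, 27),
}

# Papaya count as a fraction of the Arabidopsis count, per family.
ratios = {name: p / a for name, (p, a) in families.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")

papaya_total = sum(p for p, _ in families.values())
arabidopsis_total = sum(a for _, a in families.values())
pooled_ratio = papaya_total / arabidopsis_total
```

Across these seven families papaya has 151 genes against 340 in Arabidopsis.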
A semi-woody giant herb that accumulates lignin in the cell wall at an intermediate level between Arabidopsis and poplar, papaya generally has intermediate numbers of lignin synthetic genes, fewer than poplar but more than Arabidopsis despite fewer opportunities for duplication in papaya. Poplar, papaya and Arabidopsis have 37, 30 and 18 candidate genes for the lignin synthesis pathway, respectively4,23, with papaya having an intermediate number of genes for the PAL, C4H, 4CL and HCT gene families, and only one COMT and two C3H genes. In contrast, poplar has three C3H genes, which are presumed to convert p-coumaroyl quinic acid to caffeoyl shikimic acid, whereas there are two in papaya and one in Arabidopsis. Papaya, Arabidopsis and poplar each have two genes in the family CCoAOMT, which are presumed to convert caffeic acid to ferulic acid4. Compared with these other plants, papaya has the fewest genes in the CCR gene family (1 gene) and the most in the F5H (4 genes) and CAD gene families (18 genes), which all mediate later steps of the lignin biosynthesis pathway.
More starch-associated genes in papaya, a perennial, may be due to a greater need for storage in leaves, stem and developing fruit than in Arabidopsis, an ephemeral that stores oil in the seed. Papaya and Arabidopsis, respectively, have 13 and 6 putative starch synthase (GT5) genes; 8 and 3 starch branching genes; 6 and 3 isoamylases (GH13); and 12 and 9 β-amylases (GH14). Early unloading of fruit sugar in papaya is probably symplastic24, with five genes for sucrose synthase/sucrose phosphate synthase (GT4); seven are reported for Arabidopsis. Five acid invertase (GH32) sequences were found in papaya whereas 11 have been reported in Arabidopsis. Papaya has at least seven putative neutral invertase (GH32) genes; Arabidopsis has six. Wall-associated kinases (WAK) are thought to be involved in the regulation of vacuolar invertases, with 17 in Arabidopsis and only 10 in papaya. Arabidopsis and papaya have 14 and 7 hexose transporters, respectively. The greater number of genes for sugar accumulation in Arabidopsis may reflect recent genome duplications.
Papaya has undergone particularly striking amplification of genes involved in volatile development. Papaya and Arabidopsis, respectively, have 18 and 8 genes for cinnamyl alcohol dehydrogenase; 2 and 1 genes for cinnamate-4-hydroxylase; 9 and 3 genes for phenylalanine ammonia lyase; and 24 and 3 limonene cyclase genes.
Papaya ripening is climacteric, with the rise in ethylene production occurring at the same time as the respiratory increase25. Papaya and Arabidopsis, respectively, have similar numbers of genes involved in ethylene synthesis, with four each for S-adenosyl methionine synthase (SAM synthase); 8 and 13 for aminocyclopropane carboxylic acid (ACC) synthase (ACS); 8 and 12 for ACC oxidase (ACO); and 42 and 64 for ethylene-responsive binding factors (AP2/ERF).
Because papaya grows in tropical climates where daily light/dark cycles do not change much over the year, we can ask if more or fewer light/circadian genes are required to synchronize with the environment. In fact, there are fewer light/clock genes in the papaya genome (49% and 34% of poplar and Arabidopsis, respectively; Supplementary Table 7). However, among the core circadian clock genes, the pseudo-response regulators (PRRs; Supplementary Fig. 10) have expanded in poplar compared with Arabidopsis, and the papaya PRR7 cluster has seemingly duplicated with the recent poplar salicoid-specific genome duplication4 (Supplementary Fig. 11). Against the backdrop of fewer overall genes, the parallel expansion of the PRRs is consistent with circadian timing being important in papaya.
The PAS–FBOX–KELCH genes control light signalling and flowering time; however, the only papaya orthologue (ZTL) lacks an obvious KELCH domain compared with Arabidopsis and poplar, which have five and one KELCH domains, respectively (Supplementary Fig. 10). In fact, the papaya genome contains fewer KELCH domains (37 compared with 130 and 74 in Arabidopsis and poplar, respectively). In contrast, there are three constitutive photomorphogenic 1 (COP1) paralogues in the papaya genome compared with only one in Arabidopsis (Supplementary Tables 7 and 8). A similar expansion has been noted in moss (Physcomitrella patens), which has nine COP1 paralogues that are hypothesized to aid in tolerance to ultraviolet light (Supplementary Fig. 12)26. Both KELCH domains and the WD-40 of the COP1 family form β-propellers and play a role in light-mediated ubiquitination. There is not a general expansion of WD-40 genes in papaya (173 compared with 227 in Arabidopsis). Perhaps papaya has developed an alternative way of integrating light or timing information specific to day-neutral plants, such as a strict adherence to the diel light/dark cycle that is better served by the COP-mediated system.
Sex determination in papaya is controlled by a pair of primitive sex chromosomes, with a small male-specific region of the Y chromosome (MSY)8. The physical map of the MSY is currently estimated by chromosome walking to span about 8 Mb (ref. 27). Two scaffolds in the current female-genome sequence align to the X chromosome physical map based on BAC end sequences, spanning 4.5 Mb and including 254 predicted protein-encoding genes, of which 75 (29.5%) have EST support (Supplementary Table 9 and Supplementary Fig. 13). If adjusted for the percentage of unigene validation for other genes (48.0%), the estimated number of genes in the X-specific region would be 156. The average gene density would be one gene per 19.5 kb, lower than the estimated genome average of one gene per 14.3 kb. By contrast, among seven completely sequenced MSY BACs totalling 1.2 Mb, a total of four expressed genes were found on two of the BACs14,28. The somewhat lower-than-average gene density in the X-specific scaffolds is accompanied by more repetitive DNA (58.3%) than the genome-wide average, perhaps because this region is near the centromere28. Re-analysis of the repetitive DNA content of the MSY BACs, to include the new papaya-specific repeat families identified herein, increased the average repeat sequence to 85.6%, with 54.1% Gypsy and 1.9% Copia retro-elements (Supplementary Table 10). This compares with an earlier estimate of 17.9% using the Arabidopsis repeat database alone28.
The SunUp genome has presented an opportunity to analyse transgene insertion sites critically. Southern blot analysis was key in the initial identification of transgenic insertion fragments and was performed with probes spanning the entire 19,567-bp transformation vector used for bombardment (Supplementary Fig. 14). Among the identified inserts were the functional coat-protein transgene conferring resistance to papaya ringspot virus, which was found in an intact 9,789-bp fragment of the transformation plasmid, and a 1,533-bp fragment composed of a truncated, non-functional tetA gene and flanking vector backbone sequence. The structures of the coat-protein transgene and tetA region insertion sites were determined from cloned sequences. Southern analysis also confirmed a 290-bp non-functional fragment of the nptII gene originally identified by WGS sequence analysis (Supplementary Fig. 15). Five of the six flanking sequences of the three insertions are nuclear DNA copies of papaya chloroplast DNA fragments. The integration of the transgenes into chloroplast DNA-like sequences may be related to the observation that transgenes produced either by Agrobacterium-mediated or biolistic transformation are often inserted in AT-rich DNA29, as is the chloroplast DNA of papaya and other land plants. Four of the six insert junctions have sequences that match topoisomerase I recognition sites, which are associated with breakpoints in genomic DNA transgene insertion sites and transgene rearrangements29. The presence of these inserts was confirmed by high-throughput MUMmer30 analysis for each region of the transformation vector. Evidence for the presence of other transgene inserts is not conclusive (Supplementary Note 3).
Although papaya has fewer genes overall, striking variations in gene number within particular functional groups, superimposed on the average reduction of approximately 20% in papaya gene number relative to Arabidopsis, may be related to key features of papaya morphological evolution. Despite a closer evolutionary relationship to Arabidopsis, papaya shares with poplar an increased number of genes associated with cell expansion, consistent with larger plant size; and lignin biosynthesis, consistent with the convergent evolution of tree-like habit. Amplification of starch-synthesis genes in papaya relative to Arabidopsis is consistent with a greater need for storage in leaves, stem and developing fruit of this perennial. Tremendous amplification in papaya of genes related to volatile development implies strong natural selection for enhanced attractants that may be key to fruit (seed) dispersal by animals and which may also have attracted the attention of aboriginal peoples. This also foreshadows what we might expect to discover in the genomes of other fragrant-fruited trees, as well as plants with striking fragrance of leaves (herbs), flowers or other organs.
Arguably, the sequencing of the genome of SunUp papaya makes it the best-characterized commercial transgenic crop. Because papaya ringspot virus is widespread in nearly all papaya-growing regions, SunUp could serve as a transgenic germplasm source that could be used to breed suitable cultivars resistant to the virus in various parts of the world. The characterization of the precise transgenic modifications in SunUp papaya should also serve to lower regulatory barriers currently in place in some countries.

Methods Summary

Gene annotation

Papaya unigenes from complementary DNA were aligned to the unmasked genome assembly, which was then used in training ab initio gene prediction software. Spliced alignments of proteins from the plant division of GenBank, and transcripts from related angiosperms, were generated. Gene predictions were combined with spliced alignments of proteins and transcripts to produce a reference gene set. Detailed descriptions are given in Methods.

Supplementary Material


Acknowledgments

We thank X. Wan, J. Saito and A. Young at the University of Hawaii for technical assistance; C. Detter at the DOE Joint Genome Institute; F. MacKenzie, O. Veatch and T. Uhm at the Hawaii Agriculture Research Center; L. Li, W. Teng, Y. Wu, Y. Yang, C. Zhou, N. Wang, P. Wang and D. Fei at the Tianjin Biochip Corporation, Tianjin Economic-Technological Development Area, Tianjin; and R. Herdes, L. Diebold, R. Kim, A. Hernandez, S. Ali and L. Bynum at the University of Illinois at Urbana-Champaign. This papaya genome-sequencing project was given support by the University of Hawaii and the US Department of Defense grant number W81XWH0520013 to M.A., the Maui High Performance Computing Center to M.A., the Hawaii Agriculture Research Center to R.M. and Q.Y., and Nankai University, China, to L.W. Other support to the papaya genome project included the United States Department of Agriculture T-STAR program; a United States Department of Agriculture–Agricultural Research Service cooperative agreement (CA 58-3020-8-134) with the Hawaii Agriculture Research Center; the University of Illinois; the National Science Foundation Plant Genome Research Program; and Tianjin Municipal Special Fund for Science and Technology Innovation Grant 05FZZDSH00800. We thank P. Englert, former chancellor of the University of Hawaii, for initial infrastructure support of the research.

Footnotes


Author Information The papaya WGS sequence is deposited at DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank under accession number ABIM00000000. The version described in this paper is the first version, ABIM01000000. The GenBank accession numbers of the papaya ESTs are EX227656–EX303501. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature.

Reprints and permissions information is available at www.nature.com/reprints.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Article information

Nature. Author manuscript; available in PMC 2010 March 11.
Published in final edited form as:
PMCID: PMC2836516
NIHMSID: NIHMS48187
1 Hawaii Agriculture Research Center, Aiea, Hawaii 96701, USA
2 Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
3 Advanced Studies in Genomics, Proteomics and Bioinformatics, University of Hawaii, Honolulu, Hawaii 96822, USA
4 TEDA School of Biological Sciences and Biotechnology, Nankai University, Tianjin Economic-Technological Development Area, Tianjin 300457, China
5 Tianjin Research Center for Functional Genomics and Biochip, Tianjin Economic-Technological Development Area, Tianjin 300457, China
6 Key Laboratory of Molecular Microbiology and Technology of the Ministry of Education, College of Life Sciences, Nankai University, Tianjin 300071, China
7 Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA
8 Department of Molecular Bioscience and Bioengineering, University of Hawaii, Honolulu, Hawaii 96822, USA
9 Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia 30602, USA
10 Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
11 Department of Tropical Plant and Soil Sciences, University of Hawaii, Honolulu, Hawaii 96822, USA
12 Waksman Institute of Microbiology and Department of Plant Biology and Pathology, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA
13 Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
14 Department of Biology, Indiana University, Bloomington, Indiana 47405, USA
15 USDA-ARS, Pacific Basin Agricultural Research Center, Hilo, Hawaii 96720, USA
16 Department of Biochemistry and Biophysics, 2128 TAMU, Texas A&M University, College Station, Texas 77843, USA
17 The Institute for Genomic Research, Rockville, Maryland 20850, USA
18 W.M. Keck Center for Comparative and Functional Genomics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
19 Department of Molecular Sciences, University of Tennessee, Memphis, Tennessee 38163, USA
20 Leeward Community College, University of Hawaii, Pearl City, Hawaii 96782, USA
21 Wicell Research Institute, Madison, Wisconsin 53707, USA
22 Department of Horticulture, Michigan State University, East Lansing, Michigan 48824, USA
23 Department of Horticulture, University of Wisconsin, Madison, Wisconsin 53706, USA
24 Department of Biology, Duke University, Durham, North Carolina 27708, USA
25 Department of Plant Sciences, University of California, Davis, California 95616, USA
26 Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA
27 Maui High Performance Computing Center, Kihei, Hawaii 96753, USA
28 Departments of Cell and Developmental Biology, Biochemistry and Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
29 Applied Biosystems, 850 Lincoln Centre Drive, Foster City, California 94404, USA
30 Department of Microbiology, University of Hawaii, Honolulu, Hawaii 96822, USA
Correspondence and requests for materials should be addressed to M.A. (alam/at/hawaii.edu) or L.W. (wanglei/at/nankai.edu.cn).
*These authors contributed equally to this work.

References

1. Gonsalves D. Control of papaya ringspot virus in papaya: a case study. Annu Rev Phytopathol. 1998;36:415–437.
2. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815.
3. International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature. 2005;436:793–800.
4. Tuskan GA, et al. The genome of black cottonwood, Populus trichocarpa (Torr & Gray). Science. 2006;313:1596–1604.
5. Jaillon CO, et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449:463–467.
6. Arumuganathan K, Earle ED. Nuclear DNA content of some important plant species. Plant Mol Biol Rep. 1991;9:208–218.
7. Fitch MMM, Manshardt RM, Gonsalves D, Slightom JL, Sanford JC. Virus resistant papaya plants derived from tissues bombarded with the coat protein gene of papaya ringspot virus. Bio/technology. 1992;10:1466–1472.
8. Liu Z, et al. A primitive Y chromosome in papaya marks incipient sex chromosome evolution. Nature. 2004;427:348–352.
9. Wikström N, Savolainen V, Chase MW. Evolution of the angiosperms: calibrating the family tree. Proc R Soc Lond B. 2001;268:2211–2220.
10. Storey WB. Papaya. In: Ferwerda FP, Wit F, editors. Outlines of Perennial Crop Breeding in the Tropics. H Veenman & Zonen; Wageningen: 1969. pp. 389–408.
11. Li L, et al. Genome-wide transcription analyses in rice using tiling microarrays. Nature Genet. 2006;38:124–129.
12. Lander ES, Waterman MS. Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics. 1988;2:231–239.
13. Hanada K, Zhang X, Borevitz JO, Li WH, Shiu SH. A large number of novel coding small open reading frames in the intergenic regions of the Arabidopsis thaliana genome are transcribed and/or under purifying selection. Genome Res. 2007;17:632–640.
14. Bowers JE, Chapman BA, Rong J, Paterson AH. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003;422:433–438.
15. Schranz ME, Mitchell-Olds T. Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae. Plant Cell. 2006;18:1152–1165.
16. Lyons E, Freeling M. How to usefully compare homologous plant genes and chromosomes as DNA sequence. Plant J. 2008;53:661–673.
17. Inada DC, et al. Conserved noncoding sequences in the grasses. Genome Res. 2003;13:2030–2041.
18. Thomas BC, Rapaka L, Lyons E, Pedersen B, Freeling M. Arabidopsis intragenomic conserved noncoding sequence. Proc Natl Acad Sci USA. 2007;104:3348–3353.
19. Wall PK, et al. PlantTribes: a gene and gene family resource for comparative genomics in plants. Nucleic Acids Res. 2008;36:D970–D976.
20. Meyers BC, Morgante M, Michelmore RW. TIR-X and TIR-NBS proteins: two new families related to disease resistance TIR-NBS-LRR proteins encoded in Arabidopsis and other plant genomes. Plant J. 2002;32:77–92.
21. Zhou T, et al. Genome-wide identification of NBS genes in japonica rice reveals significant expansion of divergent non-TIR NBS-LRR genes. Mol Genet Genomics. 2004;271:402–415.
22. Fry SC. Primary cell wall metabolism: tracking the careers of wall polymers in living plant cells. New Phytol. 2004;161:641–675.
23. Ehlting J, et al. Global transcript profiling of primary stems from Arabidopsis thaliana identifies candidate genes for missing links in lignin biosynthesis and transcriptional regulators of fiber differentiation. Plant J. 2005;42:618–640.
24. Zhou LL, Paull RE. Sucrose metabolism during papaya (Carica papaya) fruit growth and ripening. J Am Soc Hortic Sci. 2001;126:351–357.
25. Paull RE, Chen NJ. Postharvest variation in cell wall-degrading enzymes of papaya (Carica papaya L.) during fruit ripening. Plant Physiol. 1983;72:382–385.
26. Richardt S, Lang D, Reski R, Frank W, Rensing SA. PlanTAPDB, a phylogeny-based resource of plant transcription-associated proteins. Plant Physiol. 2007;143:1452–1466.
27. Yu Q, et al. Low X/Y divergence of four pairs of papaya sex-linked genes. Plant J. 2008;53:124–132.
28. Yu Q, et al. Chromosomal location and gene paucity of the male specific region on papaya Y chromosome. Mol Genet Genomics. 2007;278:177–185.
29. Sawasaki T, Takahashi M, Goshima N, Morikawa H. Structures of transgene loci in transgenic Arabidopsis plants obtained by particle bombardment: junction regions can bind to nuclear matrices. Gene. 1998;218:27–35.
30. Kurtz S, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.

Figure 1

Alignment of co-linear regions from Arabidopsis (green), papaya (magenta), poplar (blue) and grape (red)
‘Vv chr16r’ is an unordered ultracontig that has been assigned to grape chromosome 16. Triangles represent individual genes with transcriptional orientations. Several Arabidopsis regions belong to previously identified duplication segments (α3, α11, α20, β6, γ7, shown to the right)23. The whole syntenic alignment supports four distinct whole-genome duplication events: α, β within the Arabidopsis lineage, an independent duplication in poplar, and γ which is shared by all four eudicot genomes. Co-linear regions can be grouped into three γ sub-genomes based on Camin–Sokal parsimony criteria.

Figure 2

Comparison of gene numbers in transcription-factor tribe or related tribes from Arabidopsis and papaya
Most transcription factors are represented by fewer genes in papaya than Arabidopsis. Transcription-factor names are given, with values after the names corresponding to: number of tribes with genes assigned to transcription factor group, number of tribes with smaller counts in papaya than Arabidopsis, number of tribes with equal counts in papaya and Arabidopsis, number of tribes with larger counts in papaya, and number of tribes with zero members in papaya. Supporting data are provided in Supplementary Table 8.

Table 1


Statistics of sequenced plant genomes
| | Carica papaya | Arabidopsis thaliana | Populus trichocarpa | Oryza sativa (japonica) | Vitis vinifera |
| Size (Mbp) | 372 | 125 | 485 | 389 | 487 |
| Number of chromosomes | 9 | 5 | 19 | 12 | 19 |
| G + C content total (%) | 35.3 | 35.0 | 33.3 | 43.0 | 36.2 |
| Gene number | 24,746 | 31,114* | 45,555 | 37,544 | 30,434 |
| Average gene length (bp per gene) | 2,373 | 2,232 | 2,300 | 2,821 | 3,399 |
| Average intron length (bp) | 479 | 165 | 379 | 412 | 213 |
| Transposons (%) | 51.9 | 14 | 42 | 34.8 | 41.4 |
*The gene number of Arabidopsis is based on the 27,873 protein-coding and RNA genes from The Arabidopsis Information Resource website (http://www.arabidopsis.org/portals/genAnnotation/genome_snapshot.jsp) and recently published 3,241 novel genes6.

Table 2


Deduced potential minimal angiosperm gene number based on species with smallest number of genes for each tribe
| | Carica papaya | Arabidopsis thaliana | Populus trichocarpa | Oryza sativa (japonica) | Vitis vinifera | Shared tribes | Minimal gene number |
| Shared tribes with minimum | 4,515 | 3,597 | 1,548 | 3,657 | 3,597 | 5,925 | 13,331 |
| Number of unique tribes | 5,708 | 2,950 | 6,338 | 13,003 | 3,567 | | |
| Number of conserved tribes lost or missing from each species | 40511328429175 |