Small open reading frames associated with morphogenesis are hidden in plant genomes
Many small ORFs (sORFs; 30–100 amino acids) are likely missed when genomes are annotated. To overcome this limitation, we identified ∼8,000 sORFs with high coding potential in intergenic regions of the Arabidopsis thaliana genome. However, the question remains as to whether these coding sORFs play functional roles. Using a custom-designed array, we generated an expression atlas of the 7,901 identified coding sORFs across 16 organs and 17 environmental conditions. A total of 2,099 coding sORFs were highly expressed under at least one experimental condition, and 571 were significantly conserved in other land plants. We overexpressed 473 coding sORFs, of which ∼10% (49/473) induced visible phenotypic effects, a proportion approximately seven times higher than that of randomly chosen known genes. These results indicate that many coding sORFs hidden in plant genomes are associated with morphogenesis. We believe that the expression atlas will contribute to further study of the roles of sORFs in plants.