Large libraries of base-modified RNA for Nanopore sequencing
Disciplines
Chemistry (70%); Computer Sciences (10%); Nanotechnology (20%)
Keywords
Microarray Photolithography, Nucleic Acid Chemistry, RNA synthesis, RNA base modifications, Nanopore sequencing, Neural Network Training
RNA is the intermediate link between genetic information (stored in DNA) and the products of gene expression (proteins). Like DNA, it is a nucleic acid: a biopolymer composed of four different building blocks, or bases (A, C, G and U), joined together in a specific order called a sequence. It has become increasingly evident, however, that RNA also contains a rich diversity of other bases, which play very different roles in the cellular environment. Identifying new bases beyond the standard genetic alphabet is the only way for scientists to explore the RNA world, but it is a difficult endeavor, labor-intensive and very low throughput. A recently emerged technique, Nanopore sequencing, can read RNA directly and has the potential to identify a very wide range of biologically relevant base modifications. For this to become a reality, the software used to identify RNA bases must be trained to recognize the new elements. For this training to be thorough and efficient, complex libraries of RNA sequences containing base modifications need to be prepared, and the only synthetic approach that can yield large numbers of unique RNA sequences at very high throughput is microarray synthesis. Microarray synthesis produces hundreds of thousands of nucleic acid sequences in parallel on the same platform, each sequence being strictly confined to a specific spot on a single glass surface. In our lab, we use UV light to control the synthesis of DNA and RNA sequences in a process called photolithography. In this project, we plan to introduce four RNA bases commonly found in nature (m6A, m5C, inosine and pseudouridine) into RNA sequences using microarray photolithography. In particular, we plan to introduce the new bases in all possible sequence contexts, so as to help computer algorithms accurately call a new base regardless of its neighboring chemical environment.
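The idea of placing a modified base in all possible sequence contexts can be illustrated with a minimal sketch. It assumes a window of two canonical bases on each side of the modification, which is an illustrative choice and not a design parameter stated in the project. The one-letter symbols for the modified bases are hypothetical placeholders, not a standard notation.

```python
from itertools import product

CANONICAL = "ACGU"

# Hypothetical one-letter placeholders for the four modified bases
# (illustration only; not a standard notation).
MODIFIED = {"m6A": "6", "m5C": "5", "inosine": "I", "pseudouridine": "P"}


def contexts(mod_symbol, flank=2):
    """Yield every sequence with `mod_symbol` at the centre and `flank`
    canonical bases on each side (flank=2 is an assumed window size)."""
    for left in product(CANONICAL, repeat=flank):
        for right in product(CANONICAL, repeat=flank):
            yield "".join(left) + mod_symbol + "".join(right)


# One list of context sequences per modified base.
library = {name: list(contexts(symbol)) for name, symbol in MODIFIED.items()}

for name, seqs in library.items():
    print(name, len(seqs), seqs[:3])
```

With two flanking positions on each side there are 4^4 = 256 contexts per modified base, so the number of sequences grows quickly with the window size. That growth is one reason a highly parallel synthesis method suits this kind of library.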
To introduce these bases, we will employ special RNA building blocks (phosphoramidites) corresponding to each of the four new RNA bases. We will then sequence the RNA libraries, starting with unmodified RNA and gradually increasing the sequence complexity from one up to four base modifications mixed with the canonical A, C, G and U. Sequencing and data analysis will be performed by collaborators at the Center for Genomic Science of the Italian Institute of Technology in Milan. Once they can identify eight different RNA bases simultaneously, the trained algorithms and programs used to analyze Nanopore sequencing runs will be made freely available to the bioinformatics community and will be broadly applicable to any Nanopore RNA sequencing experiment. In return, sequencing will report on the accuracy and fidelity of the chemical synthesis of RNA by microarray photolithography, in particular the insertion, deletion and substitution rates. These rates will be crucial for improving and optimizing our fabrication protocols, because they reveal the largest sources of error.
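As a rough illustration of how insertion, deletion and substitution rates can be derived from sequencing data, the sketch below counts each error type in a single gapped pairwise alignment between a designed sequence and a read. The function name, the use of `-` as the gap character and the choice to normalize by the designed sequence length are all assumptions made for this example; the analysis pipeline used in the project is not described in the source.

```python
def error_rates(ref_aln: str, read_aln: str) -> dict[str, float]:
    """Per-base error rates from one gapped pairwise alignment.

    `ref_aln` is the designed sequence and `read_aln` the sequenced read,
    both padded with '-' so that they have equal length (assumed format).
    Rates are normalized by the number of designed (reference) bases.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have the same length")

    counts = {"match": 0, "substitution": 0, "insertion": 0, "deletion": 0}
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue  # empty column, carries no information
        if ref_base == "-":
            counts["insertion"] += 1  # extra base in the read
        elif read_base == "-":
            counts["deletion"] += 1  # designed base missing from the read
        elif ref_base == read_base:
            counts["match"] += 1
        else:
            counts["substitution"] += 1

    ref_len = counts["match"] + counts["substitution"] + counts["deletion"]
    if ref_len == 0:
        raise ValueError("alignment contains no reference bases")
    return {k: counts[k] / ref_len
            for k in ("substitution", "insertion", "deletion")}


# Toy alignment: one substitution, one insertion, one deletion
# over eight designed bases.
rates = error_rates("ACGU-ACGU", "ACCUGAC-U")
print(rates)
```

In practice such counts would be accumulated over many reads and many designed sequences. Errors observed this way combine synthesis errors with sequencing errors, so separating the two needs additional controls.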
The research project has shed important light on the chemistry of DNA photolithography, with significant improvements in the quality of complex DNA sequence libraries. More importantly, it has enabled the development of complex RNA sequence libraries, a unique capability, with sequence lengths that surpass our previous limits. Where we were previously limited to 30 nt, we can now make high-quality RNA oligonucleotides 100 nt long using two new sets of RNA nucleotides. As part of the project, we have also developed the chemistry to synthesize modified RNA nucleotides. Such modifications are found throughout nature on DNA and RNA, and we are only starting to uncover their role and function. Our synthesis approach can now make RNA libraries that incorporate these biologically relevant modifications. A significant milestone in the project has been establishing a fruitful collaboration with Oxford Nanopore Technologies (ONT), the manufacturer behind Nanopore sequencing, a powerful third-generation sequencing technique. The R&D team at ONT has been able to sequence our RNA libraries, which provides direct evidence of the quality and fidelity of our synthetic RNA. ONT has also been able to sequence base modifications integrated within the RNA libraries. With this proof of principle now established, the collaboration will continue and will focus on more complex, higher-throughput sequencing of synthetic RNA. This project has been fundamental in developing the chemistry of RNA photolithography. While not all objectives were reached within the duration of the project, further work continuing directly from it is ongoing.
- Universität Wien - 100%
- Tommaso Leonardi, Italian Institute of Technology - Italy
- Adrien Léger, EMBL Outstation Hinxton
Research Output
- 43 Citations
- 9 Publications
- 10 Datasets & models
Publications
- 2024. Title: Accelerated, high-quality photolithographic synthesis of RNA microarrays in situ; DOI: 10.1126/sciadv.ado6762; Type: Journal Article; Author: Kekić T; Journal: Science Advances
- 2024. Title: Nonaqueous Oxidation in DNA Microarray Synthesis Improves the Oligonucleotide Quality and Preserves Surface Integrity on Gold and Indium Tin Oxide Substrates; DOI: 10.1021/acs.analchem.3c04166; Type: Journal Article; Author: Ibañez-Redín G; Journal: Analytical Chemistry; Pages: 2378-2386
- 2022. Title: Sequence-dependent quenching of fluorescein fluorescence on single-stranded and double-stranded DNA; DOI: 10.1039/d2ra00534d; Type: Journal Article; Author: Lietard J; Journal: RSC Advances; Pages: 5629-5637
- 2023. Title: Enzymatic Synthesis of High-Density RNA Microarrays; DOI: 10.1002/cpz1.667; Type: Journal Article; Author: Lietard J; Journal: Current Protocols
- 2022. Title: Simple synthesis of massively parallel RNA microarrays via enzymatic conversion from DNA microarrays; DOI: 10.1038/s41467-022-31370-9; Type: Journal Article; Author: Schaudy E; Journal: Nature Communications; Pages: 3772
- 2022. Title: Sequence-dependence of Cy3 and Cy5 dyes in 3' terminally-labeled single-stranded DNA; DOI: 10.1038/s41598-022-19069-9; Type: Journal Article; Author: Kekic T; Journal: Scientific Reports; Pages: 14803
- 2023. Title: Dipodal Silanes Greatly Stabilize Glass Surface Functionalization for DNA Microarray Synthesis and High-Throughput Biological Assays; DOI: 10.1021/acs.analchem.3c03399; Type: Journal Article; Author: Das A; Journal: Analytical Chemistry; Pages: 15384-15393
- 2022. Title: An 8-bit monochrome palette of fluorescent nucleic acid sequences for DNA-based painting; DOI: 10.1039/d2nr05269e; Type: Journal Article; Author: Kekic T; Journal: Nanoscale; Pages: 17528-17533
- 2023. Title: A Canvas of Spatially Arranged DNA Strands that Can Produce 24-bit Color Depth; DOI: 10.1021/jacs.3c06500; Type: Journal Article; Author: Kekić T; Journal: Journal of the American Chemical Society; Pages: 22293-22297
Datasets & models
- 2026. Title: Cy3 sequence depende; DOI: 10.5281/zenodo.18305034; Type: Database/Collection of data; Access: Public
- 2026. Title: CSO and gold slides; DOI: 10.5281/zenodo.18310386; Type: Database/Collection of data; Access: Public
- 2026. Title: Cy3 sequence depende; DOI: 10.5281/zenodo.18305035; Type: Database/Collection of data; Access: Public
- 2026. Title: Enzymatic DNA synthesis; DOI: 10.5281/zenodo.18312475; Type: Database/Collection of data; Access: Public
- 2026. Title: QR scans; DOI: 10.5281/zenodo.18310212; Type: Database/Collection of data; Access: Public
- 2026. Title: PrOM RNA; DOI: 10.5281/zenodo.18304823; Type: Database/Collection of data; Access: Public
- 2026. Title: PrOM RNA; DOI: 10.5281/zenodo.18304822; Type: Database/Collection of data; Access: Public
- 2026. Title: Greenscale; DOI: 10.5281/zenodo.18304639; Type: Database/Collection of data; Access: Public
- 2023. Title: RGB output; DOI: 10.5281/zenodo.8395197; Type: Database/Collection of data; Access: Public
- 2022. Title: Fluorescein sequence dependence; DOI: 10.5281/zenodo.18305080; Type: Database/Collection of data; Access: Public