TGD diary: Structure and function of tRNA in braid picture

The recent beautiful results (for a popular summary see [pwpop]) about the programming of bio-molecular self-assembly, combined with the earlier model for pre-biotic evolution, inspire interesting insights about the role of braiding in translation. According to the TGD based model of pre-biotic evolution [prebio], the 3-code should have resulted from a fusion of the 1- and 2-codes, involving the fusion of tRNA₁ and tRNA₂ to tRNA. The second hypothesis is that during the RNA era the function of tRNA₂ was to generate an RNA₂ double strand from a single RNA strand and that amino-acids catalyzed this process. The considerations that follow strongly suggest that tRNA₁ was involved with a non-deterministic generation of new RNA sequences essential for the evolution. After the establishment of the 3-code these two processes fused to a deterministic process generating amino-acid sequences. The RNA era could still continue inside the cell and play an important role in evolution.

A. Structure of tRNA molecule

The structure of tRNA, although more complex than that of a hairpin, has much in common with that of hairpins. Therefore it is interesting to look at this structure from the point of view of TGD. For instance, one can ask whether the notions of braiding, anomalous em charge and quark color could provide additional insights about the structure and function of tRNA. The shape of the tRNA molecule [tRNA] in 2-D representation is that of a cruciform.

The tRNA molecule can be seen as a single RNA strand, just as a hairpin. The five stems are double strands analogous to the necks of the hairpin. The strand begins at the 5' end of the acceptor stem directed upwards. The second strand of the acceptor stem continues as a toehold ending at the 3' end of tRNA. The toehold has at its end ACC to which the amino-acid (rather than conjugate DNA) attaches.
The tRNA molecule contains three arms with hairpin structure. The A arm containing the anticodon is directed downwards. The D and T arms are horizontal and directed to left and right. Between the T arm and the A arm there is an additional variable hairpin-like structure, but with a highly degenerate loop. It has emerged during evolution.
The structure of tRNA minus the anticodon depends on the anticodon. This conforms with the fact that the T and D arms are related to the binding of the amino-acid, so that their nucleotide composition correlates with that of the anticodon.

B. Wobble base pairing

The phenomenon of wobble base pairing [wobble] is very important. There are only about 40 tRNA molecules instead of 61, which means that a one-to-one map between mRNA codons and conjugate tRNA anticodons is not possible. Crick suggested that so-called wobble base pairing resolves the problem. What happens is that the first nucleotide of the anticodon is either A, G, U, or I(nosine) [inosine]. The base pairings with the third nucleotide of the codon are {A-U, G-C, U-{A,G}, I-{U,A,C}}. The explanation for the non-unique base pairing in the case of U is that its geometric configuration is not quite the same as in an ordinary RNA strand. I is known to have 3-fold base pairing.
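
The pairing rules just listed can be written down as a small lookup table. The following is only an illustrative sketch: the names WOBBLE_PAIRING and readers_of are hypothetical, and the table contains nothing beyond the four pairings quoted above.

```python
# Wobble pairings as listed in the text: first anticodon nucleotide ->
# set of third codon nucleotides it can pair with.
WOBBLE_PAIRING = {
    "A": {"U"},
    "G": {"C"},
    "U": {"A", "G"},
    "I": {"U", "A", "C"},
}


def readers_of(codon_third):
    """Return the first anticodon nucleotides able to read a given third codon nucleotide."""
    return {first for first, targets in WOBBLE_PAIRING.items() if codon_third in targets}


if __name__ == "__main__":
    for nucleotide in "UCAG":
        print(nucleotide, "is read by", sorted(readers_of(nucleotide)))
```

In this table G is the only codon nucleotide with a single reader (U), whereas U, C and A can each be read by I as well.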

Minimization of the number of tRNAs, subject to the requirement that only three mRNA codons act as stop signals, predicts that the number of tRNAs is 40.

It is convenient to classify the 4-columns of the code table according to whether all four codons code for the same amino-acid ((T,C,A,G)→ X), whether the 4-column decomposes into two doublets: [(T,C),(A,G)]→ [X,Y], or whether it decomposes to a triplet and a singlet ([(T,C,A),G]→ [ile,met]). There are also the 4-columns containing stop codons: [(U,C),(A,G)]→ [(tyr,tyr),(stop,stop)] and [(U,C),A,G]→ [(cys,cys),stop,trp]. The mitochondrial code has full A-G and T-C symmetries, whereas for the vertebrate nuclear code three 4-columns break this symmetry.
Consider first the 4-columns for which the doublet symmetry is broken. The [tyr,tyr,stop,stop] column must correspond to a first tRNA nucleotide which is A or G (tyr). The absence of anticodons containing U implies the stop codon property. For [cys,cys,stop,trp] one must have A, G and C, but U is not allowed. The ile-met column can correspond to tRNAs with I and C as the first nucleotide.
For 4-columns coding for two doublet amino-acids the minimal set of first tRNA codons is {A,G,U}. For completely symmetric 4-columns the minimal set of tRNA codons is {I,U}. Thus {A,G,U,I} would replace {A,G,U,C}.
There are 9 completely symmetric 4-columns making 18 tRNAs, 5 doublet pairs making 15 tRNAs, ile-met giving 2 tRNAs, and the columns containing stop codons giving 5 tRNAs. Altogether this gives 18+15+2+5= 40. Also the deviations from the standard code can be understood in terms of the properties of tRNA.
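
The counting can be checked with a few lines of code. This is a minimal sketch: the column counts and the minimal sets of first anticodon nucleotides are taken from the text as stated, not re-derived from the code table, and the names COLUMN_TYPES and trna_count are hypothetical.

```python
# Each 4-column type: number of columns (as stated in the text) and the
# minimal set of first anticodon nucleotides needed to read one column.
COLUMN_TYPES = {
    "completely symmetric": {"columns": 9, "first_nucleotides": {"I", "U"}},
    "doublet pair": {"columns": 5, "first_nucleotides": {"A", "G", "U"}},
    "ile-met": {"columns": 1, "first_nucleotides": {"I", "C"}},
    "tyr-stop": {"columns": 1, "first_nucleotides": {"A", "G"}},
    "cys-stop-trp": {"columns": 1, "first_nucleotides": {"A", "G", "C"}},
}


def trna_count(column_types=COLUMN_TYPES):
    """Total number of tRNAs: columns times minimal anticodon set size, summed over types."""
    return sum(
        entry["columns"] * len(entry["first_nucleotides"])
        for entry in column_types.values()
    )


if __name__ == "__main__":
    for name, entry in COLUMN_TYPES.items():
        print(name, entry["columns"] * len(entry["first_nucleotides"]))
    print("total", trna_count())
```

The two columns containing stop codons contribute 2 and 3 tRNAs respectively, which is the 5 quoted above.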

C. Wobble base pairing in TGD framework

Consider first the interpretation of wobble base pairing in the TGD framework, assuming the braiding picture and the mapping of nucleotides to quarks. The completely symmetric 4-columns correspond to unbroken isospin and matter-antimatter symmetries. 4-columns decomposing into doublets result from the breaking of matter-antimatter symmetry at quark level. The ile-met column corresponds to the breaking of both symmetries. The base pairings of I obviously break both symmetries.

The non-unique base pairing of U and I means that they cannot correspond to a unique quark or anti-quark in braiding. U pairs with both A and G, so that the braid strands starting from these RNA nucleotides must both be able to end at tRNA U. Hence tRNA U is not sensitive to the isospin of the quark. This non-uniqueness could relate to the assumed anomalous geometric character of the binding of the U codon to the tRNA sequence. The braid strands beginning from U, A, and C must be able to end up at I, so that I can discriminate only between {U,C,A} and G.

D. Anomalous em charge and color singletness hypothesis for tRNA

One can also test whether the vanishing of the anomalous em charge of tRNA leads to testable predictions. One can also try to understand the translation process in terms of the braiding dynamics. One must distinguish between the states of tRNA alone and tRNA + amino-acid, for which the braidings are expected to be different.

Before continuing it must be made clear that the braiding hypothesis is far from being precisely formulated. One question is whether the presence of the braiding could distinguish between matter in vivo and in vitro. For instance, the condition that anomalous em charge is integer valued or vanishing for DNA hairpins in vivo gives a strong condition on the loop of the hairpin, but for hairpins in vitro there would be no such conditions. The second point is that amino-acids and I and U in tRNA₁ could carry variable anomalous em charge, allowing a rather general compensation mechanism.

D.1 tRNA without amino-acid

The minimal assumption is that the braiding hypothesis applies only to the stem regions of tRNA in this case. The strands can then indeed begin from one strand and end up at the conjugate strand. Color singletness and the vanishing of total anomalous em charge are automatically satisfied for the stem regions as a whole in the absence of non-standard base pairings. In general the acceptor stem contains, however, a G*U base pair, which is matter-antimatter symmetric but breaks isospin symmetry and gives unit anomalous charge for the acceptor stem. Also other stems can contain G*U and U*G pairings, as well as P*G and L*U pairings (P and L denote amino-acids Pro and Leu). The study of some concrete examples [tRNAseqs] shows that a single G*U bond is possible, so that the anomalous em charge can be non-vanishing but integer valued for the double strand part of tRNA. Suppose that a given amino-acid can have the anomalous em charge of any codon coding for it. If P in the G*P pair has the anomalous em charge of the codon CCG, the G*P pair has vanishing anomalous em charge. If L corresponds to CUA, the value of the anomalous em charge is an integer.
The anomalous em charge in general fails to vanish for the loops of hairpins. For the braids possibly associated with the loops of tRNA, the strands can only end up at tRNA itself or at the nuclear membrane. If there are no braid strands associated with these regions, there is no color or anomalous em charge to be canceled, so that the situation trivializes. On the other hand, in the case of tRNA the I and U associated with the first nucleotide of the anticodon can have a varying value of anomalous em charge. Therefore integer valued em charge and color singletness become possible for tRNA. tRNA can also contain amino-acids. If the amino-acids can carry a varying anomalous em charge, with a spectrum corresponding to its values for the DNA codons coding for them, they too could help to stabilize tRNA by canceling the anomalous em charge.

D.2 tRNA plus amino-acid

Amino-acyl tRNA synthetase, which is the catalyst inducing the fusion of amino-acid with ACC stem [tRNA], could have braid strands to both amino-acid and tRNA and have regions with opposite anomalous em charges compensating separately that of amino-acid and of the active part of tRNA. The required correlation of amino-acid with anticodon would suggest that both D and T loops and A-loop are included. The simplest option is however that the anticodon is connected by braid to amino-acid so that braiding would define the genetic code at the fundamental level and the many-to-one character of genetic code would reflect the 1-to-many character of amino-acid-quark triplet correspondence. This hypothesis is easy to kill: for the portion of catalyst attaching to a given portion of DNA strand amino-acids and codons should have opposite anomalous em charges: Q_a(amino)=-Q_a(codon).
After the catalysis, which involves a reduction of hbar, amino-acid and tRNA would form a system with a vanishing net anomalous em charge but with a braiding structure more complex than that before the fusion.
In the translation process the braiding structure of the tRNA-amino-acid system should re-organize: the braid strands connecting the anticodon with the amino-acid are transformed to braid strands connecting it to the mRNA codon, with a subsequent reduction of hbar of the braid strands bringing tRNA into the vicinity of mRNA. In the next step of translation the anticodon-codon braiding would be replaced with amino-acid-mRNA braiding, forcing the formation of the amino-acid sequence. It will be found later that the simpler option without this step corresponds to the earlier hypothesis according to which amino-acids acted originally as catalysts for the formation of the RNA double strand.
tRNA is basically coded by genes, which suggests that the general symmetries of the genetic code apply to the variants of tRNA associated with the same anticodon. Hence the variants should result from each other by isospin splits and modifications such as permutations of subsequent nucleotides and addition of AT and CG pairs not changing overall color and isospin properties. Also anomalous base pairs X*Y can be added, provided their net anomalous em charge vanishes.
tRNA has a complex tertiary (3-D) structure [tertiary] involving base pairing of distant nucleotides associated with the roots of the stem regions where tRNA twists sharply. This pairing could involve formation of braid strands connecting the nucleotides involved. The reduction of Planck constant for these strands could be an essential element of the formation of the tertiary structure.

E. Triplet code as a fusion of singlet and doublet codes?

In [prebio] I have discussed the hypothesis that the standard 3-code has emerged as a fusion of a 1-code with 4 1-codons and a 2-code with 16 2-codons. It is interesting to see whether this model is consistent with the braid picture.

E.1 tRNA as fusion of tRNA₁ and tRNA₂

The earlier proposal was that the fusion of the 1- and 2-codes to the 3-code meant (at least) the fusion of tRNA₁ and tRNA₂ to form a more complex tRNA of the 3-code. This process would have involved the fusion of the 1- and 2-anticodons of tRNA. Visual inspection of tRNA shows that tRNA₁ and tRNA₂ could have been simple RNA hairpins during pre-biotic evolution. The variable loop associated with the T arm has indeed emerged during evolution and its function is believed to relate to the stability of tRNA [tRNA]. For instance, the anomalous em charge of the variable loop could compensate for the net anomalous em charge of the amino-acid-tRNA system.

tRNA₁ is identifiable as a piece of tRNA extending from the 5' end to the first nucleotide (wobble nucleotide) of the anticodon. tRNA₂ would contain the 2-codon at its 5' end, plus the T arm and the second half of the acceptor stem. The simpler structure of the D arm (in particular, the stem involves only 3 codon pairs) conforms with this view.

The emergence of the tRNA anticodon as a fusion of a 1-anticodon and a 2-anticodon could explain the wobble base pairing. The inverse assignment {U→ A, C→ G, {A,G}→ U, {U,A,C}→ I}, deduced from the number 40 of tRNAs and assigning a unique 1-codon only to G, could be interpreted as a non-deterministic correspondence generating new RNA sequences from existing ones.
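
The non-deterministic character of this correspondence can be illustrated by enumerating all sequences that a given RNA sequence could induce. This is only a sketch under the assumption that each nucleotide is replaced independently by any tRNA₁ nucleotide pairing with it according to the wobble rules {A-U, G-C, U-{A,G}, I-{U,A,C}}; the names PAIRING, partners and induced_sequences are hypothetical.

```python
from itertools import product

# First anticodon nucleotide -> RNA nucleotides it pairs with (wobble rules of the text).
PAIRING = {"A": {"U"}, "G": {"C"}, "U": {"A", "G"}, "I": {"U", "A", "C"}}


def partners(nucleotide):
    """tRNA_1 nucleotides that can pair with a given RNA nucleotide (sorted for reproducibility)."""
    return sorted(first for first, targets in PAIRING.items() if nucleotide in targets)


def induced_sequences(rna):
    """All sequences over {A, G, U, I} obtainable by pairing each nucleotide independently."""
    return ["".join(choice) for choice in product(*(partners(n) for n in rna))]


if __name__ == "__main__":
    print(partners("G"))             # G has a single partner
    print(induced_sequences("GAU"))  # several images for one input sequence
```

Only G is mapped uniquely; every other nucleotide has two possible images, so the number of induced sequences grows exponentially with the number of non-G nucleotides.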

E.2 The change of the role of amino-acids in the transition from pre-biotic to biotic evolution

In [prebio] it was proposed that during the RNA era amino-acids catalyzed the replication of 2-RNA to its conjugate, and that at some stage the role of amino-acids and 2-anticodons changed so that, instead of the conjugate of the 2-RNA strand, an amino-acid sequence was generated. In the braiding picture this transition could be understood as a phase transition changing the dynamics of braiding.

Before the transition the amino-acid-2-anticodon braid generated in the formation of tRNA₂- amino-acid complex was replaced with 2-anticodon-RNA braid and amino-acid catalyzing the formation of RNA-conjugate strand pair.
In the transition a new step emerged: amino-acid began to form a braid with RNA codon and amino-acid sequence instead of conjugate RNA strand was generated in the process. Note that the number of amino-acids could have been larger than 16 before the transition since several amino-acids could have catalyzed same pairing of 2-codon with its 2-anticodon.

Contrary to the assumption of the original more complex model [prebio], tRNA₁ and tRNA₂ would have acted on same RNA sequences. Before the transition to 3-code tRNA₂ and amino-acids would have been responsible for the formation of double strands of RNA (tqc at RNA level requires the presence of double strands). tRNA₁ would have taken care of non-deterministic generation of new RNA sequences driving the evolution during RNA era. There is evidence that centrosomes have their RNA based code and this code might correspond to 2-codon code and involve also the non-deterministic 1-code.

The objection is that the resulting RNA sequences contain A, G, U, and I and are analogous to conjugates of RNA sequences rather than being proper RNA sequences. A possible way out of the problem is to build a conjugate of this sequence using tRNA₂. The problem is that if I base pairs with A, T, or C, one obtains only the codons T, C, A. If U pairs with A and G as in the case of the 1-code, also G is obtained. The presence of G*U pairs in tRNA₂ suggests that these pairings were indeed present. The presence of I in the tRNA₁ induced RNA sequences might prevent their interpretation as genuine RNA sequences, which would imply conjugation symmetry of RNA.

An alternative way out is to build a conjugate of this sequence using tRNA₁ again. Since I pairs with A, T, or C, U with A and G, G with C, and A with U, all nucleotides appear in the resulting sequence. The anomalous G*U base pairs in tRNA could be seen as remnants of the RNA era.
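
Which nucleotides can appear after such a conjugation step is easy to check. The sketch below assumes the wobble pairings {A-U, G-C, U-{A,G}, I-{U,A,C}} quoted earlier (with U written for T) and uses the hypothetical names PAIRING and reachable.

```python
# Nucleotide of the tRNA_1 induced sequence -> nucleotides it can pair with.
PAIRING = {"A": {"U"}, "G": {"C"}, "U": {"A", "G"}, "I": {"U", "A", "C"}}


def reachable(alphabet):
    """Union of pairing partners over the nucleotides present in the sequence."""
    result = set()
    for nucleotide in alphabet:
        result |= PAIRING[nucleotide]
    return result


if __name__ == "__main__":
    print(sorted(reachable("AGUI")))  # full alphabet of the induced sequences
    print(sorted(reachable("AGI")))   # same alphabet without U
```

With the non-unique pairing of U included, all four ordinary nucleotides are reachable; without it G is missing, which is the point made about U pairing with both A and G.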

There is an additional argument supporting the idea that the coding of amino-acids emerges only after the formation of 3-code. If the 2-code would have coded for amino-acids before the fusion of the codes, the fusion should have involved also the fusion of corresponding RNA sequences in order to guarantee that the resulting 3-RNA sequence still codes for the amino-acids coded by 2-RNA sequences plus some new ones. This kind of fusion is not too plausible although I have considered this possibility in the earlier model [prebio].

F. Was the counterpart of cell membrane present during RNA era?

Topological quantum computation should have taken place already during the RNA era. This suggests that the counterpart of the cell membrane was present already at that time. Quite recently it was reported that DNA duplexes with a length of 6 to 20 base pairs can join to form longer cylinders, which in turn form liquid crystals, and that the liquid crystal phase separates from the phase formed by single DNA strands. Long strands had already been known to form liquid crystals. This encourages one to think that also RNA duplexes are able to self-organize in this manner, so that the analog of a cell nucleus containing RNA double helices as genetic material could have existed already during the RNA era.

The nuclear membranes could have consisted of either ordinary RNA or its variant consisting of A,T,G,I produced by tRNA₁. The latter option would allow to distinguish between coding RNA and RNA used as building block of various structures. The sequences consisting of 30 RNA base pairs would correspond to the thickness of cell membrane and to the codon of M₆₁ code. Lipid layer of thickness 5 nm would correspond to roughly 16 base pairs and to the codon assignable to M₁₇.

For a more detailed exposition and background see the chapter DNA as Topological Quantum Computer of "TGD as a Generalized Number Theory".

Friday, January 25, 2008
