Organic Codes: metaphors or realities?


Marcello Barbieri
Dipartimento di Morfologia ed Embriologia
Via Fossato di Mortara 64, 44100 Ferrara, Italy


ABSTRACT

Coding characteristics have been discovered not only in protein synthesis but in various other natural processes, thus showing that the Genetic Code is not an isolated case in the organic world. Other examples are the Sequence Codes, the Adhesion Code, the Signal Transduction Codes, the Splicing Codes, the Sugar Code, the Histone Code, and probably more. These discoveries however have not had a significant impact because of the widespread belief that organic codes are not real but metaphorical entities. They are supposed to lack arbitrariness and codemakers, the two qualifying features of real codes. Here it is shown that the arbitrariness issue can be solved on an experimental basis, while the codemaker issue is dependent on our theoretical description of the cell and can only be solved by a new concept. In order to appreciate the reality of the organic codes, in short, it is necessary to have not only a more critical evaluation of the experimental data but also a new theory of the living system.


Introduction

From time immemorial it has been thought that codes, or conventions, exist only in the world of culture. The discovery of the genetic code, in the 1960s, came therefore as a bolt from the blue, but the reaction was rather strange. The discovery of one organic code should have suggested that there could be more in nature, but what happened was the exact opposite. The genetic code was immediately declared a frozen accident, and any mention of other organic codes was ignored. Edward Trifonov, for example, has shown since the 1980s that there are at least three sequence codes in addition to the classic triplet code, but in vain. The situation started to change only in the late 1990s. In 1996, Redies and Takeichi described an Adhesive Code in the development of the nervous system, and in the year 2000, Gabius provided evidence for a Sugar Code, while Strahl, Allis, Turner and colleagues discovered a Histone Code.
These announcements, however, have aroused little interest. Today, the existence of other organic codes is no longer ignored as it was in the past, but it is not seen as anything special. This response may appear surprising, but is not unfounded. It is the natural consequence of a widespread and deep-seated belief that all organic codes, including the genetic code, are only useful metaphors, not real entities. Molecular biology has borrowed many words from ordinary language, because they have an intuitive appeal and avoid long periphrases, but they are not meant to be literally true. The genetic code itself is given the name "code" only because this term is metaphorically appropriate, but deep down most biologists are convinced that it is nothing more than a good metaphor. And this for two basic reasons. The real codes that we are familiar with have two outstanding features: they are arbitrary rules, and they are made by a codemaker. These are the key entities: arbitrariness and codemaking. No code can be a real code without these qualifying features, and most biologists are convinced that organic codes simply do not have them. This is the crucial point: why do people believe that organic codes do not have those two qualifying features?


The codes' fingerprints

A code is a set of rules which establish a correspondence between two independent worlds. The Morse code, for example, is a correspondence between combinations of dots and dashes and the letters of the alphabet, and - in the same way - the genetic code is a correspondence between combinations of nucleotides and amino acids. From the point of view of the definition, there is no difference between them. Why then do people believe that the Morse code is real and the genetic code is not? One reason, as we have seen, is arbitrariness. We know that the Morse code is arbitrary because we have built it ourselves, and we are certain that there is no necessary link between dots and dashes and the letters of the alphabet. But ask a biologist if the same arbitrariness exists between nucleotides and amino acids, and you are likely to get a very different response. Many would deny it out of hand, others would say that the two codes are not comparable, and some would reply that we still need more data.
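The point that the two codes share one definition can be made concrete with a small sketch. It is only an illustration: the tables below are tiny subsets of the Morse code and of the standard genetic code, and the names MORSE, GENETIC and decode are assumptions introduced here, not part of the original argument.

```python
# Illustrative subsets only; both codes are written as the same kind of object:
# a table of rules linking signs of one world to objects of another.
MORSE = {".": "E", "-": "T", ".-": "A", "-.": "N"}
GENETIC = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UGG": "Trp"}


def decode(signs, code):
    """Apply the rules of a code to a sequence of signs (hypothetical helper)."""
    return [code[sign] for sign in signs]


# The same procedure implements either code; only the table of rules differs.
morse_letters = decode([".-", "-.", "-"], MORSE)
peptide = decode(["AUG", "UUU", "GGC"], GENETIC)
```

Nothing in decode distinguishes a cultural code from an organic one, which is the sense in which the definition makes no difference between them.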
One of the most common arguments against the arbitrariness of the genetic code is the determinism of protein synthesis. Every single step of the translation process is perfectly deterministic, in the sense that a chain of nucleotides is translated into a chain of amino acids with a precise sequence of reactions. This is the most popular argument, probably because it has a strong intuitive appeal, and yet it is not a valid one. The same determinism, in fact, is present even when cultural codes are implemented. When the mental image of an apple is formed in the visual cortex and we pronounce the word "apple", there is a precise chain of neurological reactions between the two mental images. A neurologist would say with no hesitation that the neural connection between the visual area and the speech area of the brain is perfectly deterministic, and yet the connection was established by a linguistic code which is perfectly arbitrary. The implementation of the rules of a code, in short, is deterministic in all codes, even in the cultural ones. The arbitrariness comes in only when a code is created or modified, not when it is implemented.
We need therefore positive evidence in order to obtain reliable conclusions, and it is the very definition of the codes that tells us what to look for. Since a code is a bridge between two independent worlds, an organic code necessarily requires organic molecules that perform two independent recognition steps. These are the "adaptors", the name that Francis Crick proposed for the molecules that today we call transfer RNAs. All codes need molecules that perform equivalent functions, and so all these molecules can be called adaptors. The adaptors are catalysts which have two different recognition sites, and what qualifies them as adaptors is the fact that there is no necessary connection between the two sites. The site which recognises the objects of one world can be associated with any of the sites that recognise the objects of the other world, and this means that a connection can only be established by an arbitrary choice, by a "natural convention". The adaptors, in short, are the "fingerprints" that reveal the presence of an organic code.
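The independence of the two recognition sites can be pictured with a minimal sketch. It rests on loud simplifications: each adaptor is reduced to a pair of labels, the codon itself stands in for the site that recognises it, and the "altered" set is a hypothetical rule change invented for illustration. The names Adaptor and translate are assumptions, not established terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adaptor:
    codon_site: str       # recognises an object of the nucleotide world
    amino_acid_site: str  # recognises an object of the amino acid world


def translate(codons, adaptors):
    """Translate codons using whatever pairings the adaptors embody."""
    rules = {a.codon_site: a.amino_acid_site for a in adaptors}
    return [rules[codon] for codon in codons]


# Pairings taken from the standard genetic code.
standard = [Adaptor("AUG", "Met"), Adaptor("UGU", "Cys"), Adaptor("GCU", "Ala")]

# Hypothetical altered set: same codon site, different amino acid site.
altered = [Adaptor("AUG", "Met"), Adaptor("UGU", "Ala"), Adaptor("GCU", "Ala")]

message = ["AUG", "UGU", "GCU"]
standard_product = translate(message, standard)
altered_product = translate(message, altered)
```

The translate procedure runs just as deterministically with either set of adaptors. The rule itself resides only in which two sites happen to be joined in each adaptor, and nothing in the procedure forces one pairing rather than another.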
In the case of the genetic code, it has been possible to prove that the nucleotide site is independent from the amino acid site by actually changing the rules of the code in vitro, and a similar experiment has been performed in vivo by some micro-organisms. This should have settled the arbitrariness issue for good, but ingrained opinions die hard, and so we still hear the claim that the association between nucleotides and amino acids is not arbitrary, because some regularities have been discovered in the genetic code. This is true, but it has nothing to do with arbitrariness, and in fact regularities also exist in cultural codes. In the Morse code, for example, the most frequent letters of the alphabet are associated with the simplest combinations of dots and dashes, but nobody would dream of concluding that the Morse code is not a true code for that reason.
In the case of the genetic code, furthermore, there are also other factors in favour of its arbitrariness. The number and the types of the amino acids, for example, could have been different, because many other amino acids exist in nature, and the same is true for the nucleotides. In the genetic code, in short, we find arbitrariness not only in the rules of the code, but also in the choice of the objects which are coded by those rules. And this is perfectly equivalent to what happens in the linguistic codes, where arbitrariness exists not only in the rules of grammar, but also in the number and in the type of letters which are chosen to make up an alphabet.
The arbitrariness of the organic codes, in conclusion, can be demonstrated by a variety of experimental facts, and above all by the existence of adaptors (it was the presence of adaptors in signal transduction and in splicing that allowed me to conclude, in 1998, that these processes are based on organic codes). Arbitrariness alone, however, is not enough, because it could be the result of an extraordinary number of coincidences. A real code requires arbitrariness and codemakers, and the existence of a codemaker is an issue where theory plays an even greater role than experiments. It is also to theory, therefore, that we need to turn our attention.


The Third Party

The extraordinary thing about codes is that they require a new entity. In addition to energy and information they require meaning. For centuries, meaning has been regarded as a spiritual or a transcendental entity, but in reality it is a perfectly natural entity because we can define it with an operative procedure just as we do with all physical quantities. Meaning is an object which is related to another object by a code. The meaning of the word apple, for example, is the mental object of the fruit which is associated with the mental object of that word by the code of the English language. More generally, a cultural meaning is always a mental object which is associated with another mental object by a convention. But the operative definition of meaning need not be restricted to the mental world because it applies equally well to the organic world. The meaning of a combination of dots and dashes is a letter of the alphabet, in the Morse code. And in the same way, the meaning of a combination of three nucleotides is usually an amino acid, in the genetic code (from which it follows that the meaning of a gene is usually a protein).
We are well aware that it is man who gives meaning to mental objects - in the realm of the mind he is the codemaker - but this does not mean that a code of correspondence between two independent worlds must be produced by a conscious activity. The only logical necessity is that the codemaker is an agent which is ontologically different from those worlds, because if it belonged to one of them the two worlds would no longer be independent. A code, in other words, requires three entities: two independent worlds and a codemaker which belongs to a third world (from a philosophical point of view this is equivalent to the triadic system proposed in semiotics by Charles Peirce).
The problem is that the cell is described as a dualistic system of genes and proteins, genotype and phenotype, software and hardware, and in a dualistic system there is no third party that can act as a codemaker. This is why I proposed, in 1981, that the cell is not a duality of genotype and phenotype but a trinity made of genotype, phenotype and ribotype. The ribotype was defined as the ribonucleoprotein system of the cell, and it was underlined that it represents a new cell category. As phenotype is the seat of metabolism and genotype the seat of heredity, so ribotype is the seat of genetic coding.
It is an experimental fact that the genetic code is implemented by ribonucleoproteins, and this strongly suggests that the ribotype is the codemaker of the genetic code, but it does not prove it. Only a theory can establish the ontological status of the ribotype as an independent cell category. We have therefore before us two very different concepts: the cell as a duality (the genotype-phenotype theory) or the cell as a trinity (the ribotype theory). The problem is how to choose between them.


The origin-of-life metaphors

The evaluation of theories is a complex affair, in general, but there are theories which can be illustrated by metaphors, and in these cases the metaphors should be discussed first, because their intuitive appeal often takes priority over rational thinking. In our case, a theory of the cell can be illustrated by a metaphor on the origin of life, because the nature and the origin of a system are two faces of the same problem. If the cell is a duality of genotype and phenotype, for example, the problem of the origins is understanding whether it was the genes or the proteins which came first. The genotype-phenotype theory, in other words, corresponds to the-chicken-and-the-egg metaphor on the origin of life. In this framework, it doesn't even make sense to speak of three categories, and so the ribotype theory had to be illustrated by a totally different metaphor. More precisely, by the-cell-as-a-city metaphor, where the proteins of the cell are compared to the houses of a city, and the genes to their blueprints (Barbieri, 1981, 1985). In this framework, it is the chicken-and-the-egg problem that makes no sense, because it would be equivalent to asking if it was the houses or the blueprints which came first, and either answer would be wrong. What came first was a third party, the inhabitants, i.e. the intermediaries between houses and blueprints in a city which correspond to the intermediaries between proteins and genes in a cell.
Our theories of the cell are illustrated therefore by different metaphors on the origin of life, and it may be worthwhile to examine them in some detail. As a matter of fact, as soon as we take a closer look at the-chicken-and-the-egg metaphor, we realise that there is something wrong with it. The egg and the chicken are not the two faces of one duality. They are two dualistic systems in different stages of development. Each one of them is a complete genotype-phenotype entity, and it is pure fiction to say that one represents the genotype and the other stands for the phenotype.
We do indeed need a better metaphor, and the-cell-as-a-city does have a certain intuitive appeal. But this metaphor has not become anything like as popular as the-chicken-and-the-egg, and it is highly instructive to understand why. The crucial point is that in a city only the inhabitants are alive, whereas houses and blueprints are not. The city metaphor, in other words, implies that genes and proteins are molecular artifacts, just as blueprints and houses are human artifacts. And this seems a preposterous idea. How can one accept that genes and proteins, the very molecules of life, are inanimate manufactured objects? That probably explains why the ribotype theory has not attracted the attention of the origin-of-life people. And yet it has never been proved that the preposterous idea is false. It may be interesting therefore to take a look at it.

Copymakers and codemakers

There was a time when atoms did not exist. They came into being within giant stars, and were scattered all over the place when those stars exploded. There was a time when molecules did not exist. They originated from the combination of atoms in a variety of places such as comets and planets. There was a time when polymers did not exist. They were produced when molecules joined together at random and formed chains of subunits. There was a time when all the polymers of our planet were random molecules, but that period did not last forever. At a certain point, new types of polymers appeared. Some molecules started making copies of polymers, and for this reason I call them copymakers. Other molecules made coded versions of the copies, and I refer to them as codemakers. On the primitive Earth, the copymakers could have been RNA-replicases and the codemakers could have been transfer-RNAs, but other possibilities exist, and so here we will use the generic terms of copymakers and codemakers. All that matters, for our purposes, is the historical fact that copymakers and codemakers came into being and started producing copied molecules and coded molecules.
Now let us take a look at these new polymers. The formation of a random chain of subunits is accounted for by the laws of thermodynamics and does not require any new physical quantity. But when a copymaker makes a copy of that chain, something new appears: the sequence of subunits becomes information for the copymaker. In a similar way, when a codemaker takes a chain of monomers of one kind to produce a chain of monomers of a different kind, something new appears: the second chain becomes the meaning of the first one. It is only the act of copying that creates information, and it is only the act of coding which creates meaning. Information and meaning, in other words, appeared in the world when copymakers and codemakers came into existence and started functioning.
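The difference between the two acts can be sketched as two small procedures. This is a toy model under stated assumptions: the functions copymaker and codemaker, the RULES table (a three-entry subset of the standard genetic code) and the example chain are all hypothetical names and values chosen for illustration, and copying is reduced to a plain subunit-by-subunit duplication.

```python
# Illustrative subset of the standard genetic code (assumption: three rules suffice here).
RULES = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly"}


def copymaker(template):
    """Copying: the product has the same kind of subunits in the same order."""
    return "".join(unit for unit in template)


def codemaker(chain, rules):
    """Coding: the product is made of subunits of a different kind,
    chosen by applying the rules of a code to successive triplets."""
    usable = len(chain) - len(chain) % 3  # ignore an incomplete final triplet
    triplets = [chain[i:i + 3] for i in range(0, usable, 3)]
    return [rules[triplet] for triplet in triplets]


gene = "AUGUUUGGC"
copy = copymaker(gene)            # the sequence acts as information
protein = codemaker(copy, RULES)  # the second chain is the meaning of the first
```

The copymaker needs only the template, whereas the codemaker needs the template and a table of rules. That extra argument is where meaning enters the sketch.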
The appearance of copied polymers and coded polymers was a major event also for another reason. Up to that point, all molecules formed on the primitive Earth had one thing in common: their structure was entirely determined by the assembly properties of their atoms, i.e. from within. In the case of copied and coded polymers, in contrast, the order of the subunits was determined by external templates, i.e. from without. In everyday language, we distinguish between natural and artificial products in a straightforward way: the objects which are formed spontaneously are natural, while those which are shaped by external agents are artificial. And that is precisely the distinction that exists between random polymers on one hand and copied or coded polymers on the other. I conclude therefore that copied molecules (genes) and coded molecules (proteins) are indeed, in a very deep sense, artificial molecules. They are artificial because they are produced by external agents, because their primary structure is determined from without and not from within, because their production involves outside processes based on information and meaning.
There was a time when the world was inhabited only by natural molecules, but that period did not last forever. At a certain point copied and coded molecules appeared, and the world became also inhabited by artificial molecules. By artifacts made by nature. And that was not just another step toward life. It was the appearance of the very logic of life because, from copymakers and codemakers onward, all living creatures have been artifact-makers. In a very fundamental sense, we can define life itself as artifact-making.


The handicapped replicator

The cell-as-a-city metaphor suggests that proteins and genes are artificial molecules, and we have just seen that, deep down, that is precisely what they are. The metaphor also suggests that modern cells are to primitive cells what large cities are to small villages, and this is not an unreasonable analogy. Modern eukaryotic cells, for example, contain millions of ribosomes, like the inhabitants of large cities, while prokaryotic cells have only hundreds or thousands of ribosomes, like the inhabitants of villages.
The metaphor can also be extended to earlier stages of evolution. If the origin of the first cells is likened to the origin of the first villages, we can compare the age of precellular evolution to the period of history in which villages did not exist. The interesting point is that this metaphor allows us to take a closer look at today's most popular model on precellular evolution: the model of the naked gene as the first replicator (Dawkins, 1976).
Dawkins has readily admitted that genes are not doing any replication, but since they code for the molecules that replicate them, he finds it legitimate to call them "replicators" in order to avoid long periphrases. Michael Ghiselin (1997) has pointed out that this is confusing the "object" with the "agent" of replication, but Dawkins' use of the word has stuck, and today most biologists seem to take for granted that genes are replicators. This is why I have avoided that word altogether and I have used the term copymakers. The distinction between copymakers and copies is still alive and well, and so there is no danger of confusing what is copied with what does the copying. Whatever one's choice of words, however, the real point is the substance, not the terminology.
The substance of the replicator model is that all that matters in life is information, and all that matters in evolution is the replication of information with occasional mistakes. But at the heart of life there are two fundamental entities, not one. Information and meaning are two independent entities, copying and coding are two independent processes, and the codemaker between genes and proteins must be a third party because otherwise there would be no real code. The replicator model is not wrong, but incomplete (or handicapped), because what matters in life is replication and coding, not replication alone (I prefer to speak of copying and coding, but the message is the same). The replicator model would be right if the cell were a von Neumann automaton where the hardware is completely described by the software, and information is really everything, but nature has not taken that path. And probably for very good reasons, because that path was seriously undermined by the error catastrophes.
One could still argue, however, that a "naked gene" phase should have preceded a phase of "copying-and-coding", and this is where the cell-as-a-city metaphor can help us. The metaphor suggests that before cities there were villages, that before villages there were humans living in the open, that before humans there were ancestral hominids, and so on. The point is that in all stages there were "agents", not just "objects". There has never been a time in precellular evolution in which copied molecules (genes) could exist without copymakers, or coded molecules (proteins) without codemakers. It was copymakers and codemakers which came first, because they were the first "agents" in the history of life. The first molecules of the ribotype world were produced by random processes and the chances of getting copymakers or codemakers (for example, RNA-replicases or transfer-RNAs) were not substantially different. Either one could have appeared before the other, without making much difference. What did make a difference was the appearance of both of them because only their combination created a renewable link between genes and proteins. It was a ribotypic system containing copymakers and codemakers that started life, because that was the simplest possible lifemaker, i.e. the simplest agent. Admittedly, a naked gene would have been a simpler system but it would not have been an agent, and that makes all the difference. As Einstein once remarked, "things should be made as simple as possible, but not simpler".


Conclusion

There are experimental facts (the adaptors) and theoretical concepts (the ribotype) which show that organic codes have the two qualifying features of all real codes (arbitrariness and codemakers). But adaptors and ribotype are still largely ignored, and so it is not surprising that most biologists continue to believe - in perfect good faith - that organic codes do not really exist out there. Which is rather reassuring, in a way, because it shows that even in this age of high technology what we see in nature is what our theories allow us to see.


BIBLIOGRAPHY


Barbieri M. (1981). The Ribotype Theory on the Origin of Life. Journal of Theoretical Biology, 91, 545-601.
Barbieri M. (1985). The Semantic Theory of Evolution. Harwood Academic Publishers, London and New York.
Barbieri M. (1998). The Organic Codes. The basic mechanism of macroevolution. Rivista di Biologia-Biology Forum, 91, 481-514.
Barbieri M. (2001). The Organic Codes. The birth of semantic biology. Pequod, Ancona. (new edition to be published by Cambridge University Press).
Gabius H.J. (2000). Biological Information Transfer Beyond the Genetic Code: The Sugar Code. Naturwissenschaften, 87, 108-121.
Gamble M.J. and Freedman L.P. (2002). A coactivator code for transcription. Trends in Biochemical Sciences, 27 (4), 165-167.
Gamow G. (1954). Possible relation between deoxyribonucleic acid and protein structure. Nature, 173, 318.
Jenuwein T. and Allis C.D. (2001). Translating the Histone Code. Science, 293, 1074-1080.
Khorana H.G. et al. (1966). Polynucleotide synthesis and the genetic code. Cold Spring Harb. Symp. Quant. Biol., 31, 39-49.
Nirenberg M.W. and Matthaei J.H. (1961). The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc. Nat. Acad. Sci. USA, 47, 1588-1602.
Nirenberg M.W. et al. (1966). The RNA code and protein synthesis. Cold Spring Harb. Symp. Quant. Biol., 31, 11-24.
Redies C. and Takeichi M. (1996). Cadherins in the developing central nervous system: an adhesive code for segmental and functional subdivisions. Dev. Biology, 180, 413-423.
Strahl B.D. and Allis C.D. (2000). The language of covalent histone modifications. Nature, 403, 41-45.
Trifonov E.N. (1988). Codes of nucleotide sequences. Math. Biosciences, 90, 505-517.
Trifonov E.N. (1989). The multiple codes of nucleotide sequences. Bulletin of Mathematical Biology, 51, 417-432.
Trifonov E.N. (1999). Elucidating Sequence Codes: Three Codes for Evolution. Annals of the New York Academy of Sciences, 870, 330-338.
Turner B.M. (2000). Histone acetylation and an epigenetic code. BioEssays, 22, 836-845.

