Semantics and transcription factors
To an extremely high degree of fidelity, every cell in your body contains the same DNA. The most complicated object in the known universe (your brain) contains exactly the same DNA as your arse. This seems sort of weirdly obvious and ordinary, but you should stop and think about it until it blows your mind, as is only right and proper. The difference between brain and arse is not which DNA is available in the cells, but which chunks are being actively read. The first step in turning DNA into the actual moving parts of a cell (the proteins) is called transcription, in which a gene is copied into RNA, and it’s largely governed by a set of proteins called transcription factors. Lexicon exists because we think you can control this process (which genes get read and which don't) with small molecules.
Building the first programs at Lexicon has therefore meant a lot of thinking about transcription factors and how to target them. This is a wild world of potentially incredibly important drug targets which we (biotech, pharma, academics) have typically struggled to drug directly. Most transcription factors lack a suitable small molecule binding site, making it difficult to identify and optimise chemical matter. Many also lack any stable 3D structure in solution. These are usually referred to as intrinsically disordered proteins (IDPs), and they are the class I'll be discussing in this post.
To over-simplify the structure-function paradigm, the difference between a protein and a random string of amino acids is that a protein has a 3D structure which determines its function. When a ‘normal’ protein (used here to mean anything not intrinsically disordered) achieves any sort of global flexibility we call it ‘denatured’ and this usually implies a catastrophic loss of function. A protein which can flit across multiple 3D states in solution is legitimately different enough from how we think about most proteins to be called something else, hence IDPs as a class. Nothing controversial so far. However, nomenclature in science really matters and the phrase ‘intrinsically disordered’ captures almost nothing of what it is that makes these proteins interesting. Intrinsic disorder implies that these proteins are inherently unstructured (literally disordered) which is absolutely not the case. Instead, they occupy a range of heterogeneous states. They nearly always display secondary structure and compactness in solution, whilst retaining a high degree of conformational flexibility. This allows them to further fold into any number of more ordered tertiary structures to exert a desired function depending on the specific biological context.
A better way of capturing what we know about the behaviour of these proteins would be to describe them as transiently ordered. Transient order is (imho) a fundamentally better description of what matters, which is when and how they function. The inherent flexibility of these proteins makes them multifunctional, context-dependent sensors and regulators. Their capacity to adapt their structure (and therefore their function) makes them extremely useful as central nodes across crucial regulatory hubs.* By encoding one-to-many functionality at key nodes, you can significantly compress the size of the genome required to build an incredibly complex organism with hundreds, or perhaps thousands, of unique cell types. Depending on how you measure this sort of thing, eukaryotic genomes encode roughly twice as many intrinsically disordered protein regions (IDPRs) as prokaryotic genomes. It also explains why transiently ordered proteins in humans are overrepresented both in the nucleus generally and among transcription factors, the master regulators of transcription. This centrality also makes them some of the best validated and most frustrating targets across multiple areas of disease biology, from the prototypical MYC in cancer to NF-κB in inflammation.
The significance of all this for Lexicon is that targeting transiently ordered transcription factors is hard but potentially of enormous importance, and that two key aspects of their behaviour give us a potential advantage. First, they have a higher degree of ordered 3D structure when they are in contact with other proteins or with chromatin. Second, their conformational flexibility makes them uniquely good at forming new protein-protein interactions. Lexicon is aiming to drug these targets by working with the transient nature of their biology, not against it. More on that in future posts.
*As a side note, the fact that these proteins are so clearly distinct from the generic understanding of “DNA sequence -> encodes protein -> encodes function” poses something of a problem for any attempt to understand biology from DNA sequence alone. These proteins are social creatures, and their behaviour in a cell depends on the dense milieu in which they find themselves, not on sequence alone.
Further reading:
Babu et al, Chemical Reviews, 2014 - a brilliant review of the wider topic
Uversky, Protein Science, 2002 and 2013 - early paper + more recent update establishing terminology and discussing various conformational states
Wright & Dyson, Nat Rev Mol Cell Biol, 2015 - intrinsically disordered proteins and their functions
Bushweller, Nature Reviews Cancer, 2019 - transcription factors as drug targets in oncology