Wing Ki (Catherine) Wong

PhD in Bioinformatics, Scientist in Antibody Development

Fundamental Analysis

Fundamental analysis inspects the financial health of a company. Here I put together a notebook with web scrapers and some simple financial ratios to analyse the cash flow, income statement and balance sheet of companies within the same industry.
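As a rough illustration of the kind of ratios meant here, the sketch below computes three common ones for two peers. The `Financials` container, the choice of ratios and all the numbers are made up for the example; they are not taken from the notebook.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """A few statement line items for one company (hypothetical container)."""
    current_assets: float
    current_liabilities: float
    total_liabilities: float
    shareholders_equity: float
    revenue: float
    net_income: float


def current_ratio(f: Financials) -> float:
    """Balance sheet liquidity: current assets / current liabilities."""
    return f.current_assets / f.current_liabilities


def debt_to_equity(f: Financials) -> float:
    """Balance sheet leverage: total liabilities / shareholders' equity."""
    return f.total_liabilities / f.shareholders_equity


def net_profit_margin(f: Financials) -> float:
    """Income statement profitability: net income / revenue."""
    return f.net_income / f.revenue


# Made-up figures for two hypothetical companies in the same industry.
peers = {
    "Company A": Financials(200.0, 100.0, 300.0, 600.0, 1000.0, 150.0),
    "Company B": Financials(120.0, 150.0, 500.0, 250.0, 800.0, 40.0),
}

if __name__ == "__main__":
    for name, f in peers.items():
        print(f"{name}: current ratio {current_ratio(f):.2f}, "
              f"D/E {debt_to_equity(f):.2f}, "
              f"margin {net_profit_margin(f):.1%}")
```

Comparing the same ratio across peers is usually more informative than reading one company's value in isolation.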

 

Stock modelling basics

The stock market is undoubtedly one of the most challenging datasets to model. I had a look at some common packages, technical indicators, trading strategies and evaluation tools. This notebook contains a number of snippets that helped me get started.

A crash course in Medicinal Chemistry

From January to April 2019, I undertook an internship at a pharmaceutical company, UCB, working with the Computer-Aided Drug Discovery team. Coming from a non-chemistry background, I spent the first few weeks picking up the theories and terminology of basic medicinal chemistry. Here I put together a list of key concepts that were useful in getting me started.

Pharmacokinetics

 E+S \overset{Binding}\rightleftharpoons ES \overset{Catalysis}\rightleftharpoons E+P

 

where E = enzyme, S = substrate, P = product.

Theories

Michaelis-Menten equation:
 V = \frac{V_{max}[S]}{K_m+[S]}
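A minimal sketch of the Michaelis-Menten equation as a function, to show how the rate saturates towards V_{max} and equals half of V_{max} when [S] = K_m. The parameter values are arbitrary, chosen only for illustration.

```python
def michaelis_menten(s: float, v_max: float, k_m: float) -> float:
    """Reaction rate V at substrate concentration s."""
    return v_max * s / (k_m + s)


if __name__ == "__main__":
    v_max, k_m = 10.0, 2.0  # arbitrary illustrative values
    for s in (0.5, 2.0, 20.0, 200.0):
        # The rate approaches v_max as s grows; at s == k_m it is v_max / 2.
        print(f"[S] = {s:6.1f}  V = {michaelis_menten(s, v_max, k_m):.3f}")
```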

 

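Taking the reciprocal of both sides turns it into the equation of a straight line in \frac{1}{[S]}, with slope \frac{K_m}{V_{max}} and intercept \frac{1}{V_{max}}: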
\frac{1}{V} = \frac{K_m+[S]}{V_{max}[S]} = \frac{K_m}{V_{max}} \frac{1}{[S]} + \frac{1}{V_{max}}

which renders the Lineweaver–Burk plot:

 

Assume:  aA + bB \rightleftharpoons xAB

At equilibrium between the forward and backward reactions, the dissociation constant is: K_D = \frac{[A]^a[B]^b}{[AB]^x}

Clark’s occupancy theory suggests: \frac{E}{E_{max}} = \frac{[L]}{[L]+K_D}

Drug A: Efficacious but not potent;
Drug B: Potent but not efficacious
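To make the distinction between potency and efficacy concrete, here is a small sketch of Clark's occupancy model with two made-up drugs. Under this model the concentration giving half of E_{max} equals K_D, so a low K_D means high potency, while E_{max} sets the efficacy. The parameter values are hypothetical, not measurements.

```python
def effect(ligand: float, e_max: float, k_d: float) -> float:
    """Clark's occupancy model: E = E_max * [L] / ([L] + K_D)."""
    return e_max * ligand / (ligand + k_d)


# Hypothetical parameters (concentrations in nM, effect in arbitrary units).
drug_a = {"e_max": 100.0, "k_d": 1000.0}  # efficacious but not potent
drug_b = {"e_max": 40.0, "k_d": 10.0}     # potent but not efficacious

if __name__ == "__main__":
    for conc in (1.0, 10.0, 100.0, 1000.0, 100000.0):
        print(f"[L] = {conc:9.1f} nM  "
              f"A: {effect(conc, **drug_a):6.2f}  "
              f"B: {effect(conc, **drug_b):6.2f}")
```

At low concentrations drug B gives the larger effect, but it plateaus early; drug A needs far more ligand yet reaches a higher maximum.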

Rearranging the Scatchard equation, which relates the concentration of bound ligand to the total number of available binding sites, gives the equation for the Scatchard plot: \frac{[PL]}{[L]} = -\frac{1}{K_D}[PL]+\frac{[PL]_{max}}{K_D} where K_D = \frac{[P][L]}{[PL]}

Cheng-Prusoff Equation: K_i = \frac{IC_{50}}{1+\frac{[S]}{K_D}}
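A minimal sketch of the Cheng-Prusoff conversion from a measured IC_{50} to K_i. The constant in the denominator is named `k_d` to follow the equation as written here; for enzyme inhibition assays the substrate's K_m is commonly used in that position. All concentrations must share the same units, and the example values are arbitrary.

```python
def cheng_prusoff(ic50: float, substrate: float, k_d: float) -> float:
    """K_i = IC50 / (1 + [S] / K_D); all arguments in the same units."""
    return ic50 / (1.0 + substrate / k_d)


if __name__ == "__main__":
    # Arbitrary example: IC50 = 100 nM measured at [S] = K_D = 10 nM.
    print(cheng_prusoff(ic50=100.0, substrate=10.0, k_d=10.0))
```

When [S] is much smaller than the constant in the denominator, K_i approaches the measured IC_{50}.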

Lipinski’s rule of five (RO5)

  • Molecular Mass \leq 500 Da
  • Lipophilicity (logP) \leq 5
  • Number of Hydrogen Bond Acceptors \leq 10
  • Number of Hydrogen Bond Donors \leq 5

Rule of three (RO3)

  • Molecular Mass \leq 300 Da
  • log P \leq 3
  • Number of Hydrogen Bond Acceptors \leq  3
  • Number of Hydrogen Bond Donors \leq  3
  • Number of rotatable bonds \leq  3
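Both rule sets above are simple threshold checks, which the sketch below encodes as dictionaries. It assumes the descriptors have already been computed elsewhere (in practice a cheminformatics toolkit would supply them); the dictionary keys, the `violations` helper and the example compound are all hypothetical.

```python
# Upper limits taken from the two lists above.
RULE_OF_FIVE = {"mass": 500.0, "log_p": 5.0, "h_acceptors": 10, "h_donors": 5}
RULE_OF_THREE = {
    "mass": 300.0,
    "log_p": 3.0,
    "h_acceptors": 3,
    "h_donors": 3,
    "rotatable_bonds": 3,
}


def violations(descriptors: dict, rule: dict) -> list:
    """Return the names of descriptors that exceed the rule's limits."""
    return [name for name, limit in rule.items() if descriptors[name] > limit]


# Made-up descriptor values for a hypothetical small compound.
compound = {
    "mass": 180.2,
    "log_p": 1.2,
    "h_acceptors": 4,
    "h_donors": 1,
    "rotatable_bonds": 3,
}

if __name__ == "__main__":
    print("RO5 violations:", violations(compound, RULE_OF_FIVE))
    print("RO3 violations:", violations(compound, RULE_OF_THREE))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easier to see which property to optimise.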

 

Lipophilicity is primarily associated with solubility, absorption, membrane penetration and distribution, i.e. the ADME and PK/PD properties of the drug. It is usually measured as the logarithm of the compound's partition coefficient between octan-1-ol and water: log P_{octanol/water}.

 

Lipophilic efficiency: LiPE = pIC_{50}-log P

Quality drug candidates usually have a high LiPE (> 6).
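As a worked example of lipophilic efficiency, the sketch below converts an IC_{50} in mol/L to pIC_{50} = -log_{10}(IC_{50}) and subtracts log P. The compound values are made up.

```python
import math


def p_ic50(ic50_molar: float) -> float:
    """pIC50 = -log10(IC50), with IC50 expressed in mol/L."""
    return -math.log10(ic50_molar)


def lipe(ic50_molar: float, log_p: float) -> float:
    """Lipophilic efficiency: LiPE = pIC50 - log P."""
    return p_ic50(ic50_molar) - log_p


if __name__ == "__main__":
    # Hypothetical compound: IC50 = 10 nM (1e-8 M), log P = 2.
    print(f"LiPE = {lipe(1e-8, 2.0):.2f}")
```

A 10 nM compound with log P = 2 sits right at the threshold of 6; the same potency with log P = 4 would only reach 4.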

Terminology

Better if higher

E_{max}

Better if lower

EC_{50}, IC_{50}, K_D, K_I

Nomenclature

Knowing how your co-workers speak speeds up your daily work. Here is the image from Wikipedia’s page on heterocycles – a structure that is often seen in medicinal chemistry:

Optimising Pandas usage

Take-home message: if you have many numerical calculations, run them on NumPy arrays rather than on pandas objects.

 

Source: https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6
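A small sketch of what this looks like in practice: the same calculation written as a row-wise `apply`, as a vectorised pandas expression, and on the underlying NumPy arrays. The column names and the formula are arbitrary; timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Arbitrary synthetic data: two numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.random(10_000), "b": rng.random(10_000)})


def row_wise(frame: pd.DataFrame) -> pd.Series:
    """Slowest option: a Python function called once per row."""
    return frame.apply(lambda row: row["a"] * row["b"] + 1.0, axis=1)


def vectorised_pandas(frame: pd.DataFrame) -> pd.Series:
    """Vectorised arithmetic on whole pandas Series."""
    return frame["a"] * frame["b"] + 1.0


def vectorised_numpy(frame: pd.DataFrame) -> np.ndarray:
    """Same arithmetic on the underlying NumPy arrays, skipping index handling."""
    a = frame["a"].to_numpy()
    b = frame["b"].to_numpy()
    return a * b + 1.0


if __name__ == "__main__":
    for fn in (row_wise, vectorised_pandas, vectorised_numpy):
        seconds = timeit.timeit(lambda: fn(df), number=3)
        print(f"{fn.__name__:18s} {seconds:.4f} s for 3 runs")
```

All three return the same values; only the amount of per-element Python and pandas overhead differs.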

A few notes on protein sequence analysis

A branch of protein bioinformatics looks at the relationship between protein structures and the sequences that encode them. This is most prominent in homology modelling, where we align a query sequence to a database of sequences with known structures, pick the hit with the highest similarity and map its structure onto the query sequence. Each of these steps can be a field in its own right: sequence alignment, similarity measurement and structural remodelling. Here I mention a number of tools used in my current research group:

(more…)