![molprop](https://storage.googleapis.com/polaris-public/icons/icons8-fox-60-kinases.png)



## Background
**EGFR (Epidermal Growth Factor Receptor) kinase** is a type of receptor tyrosine kinase that plays a significant role in cell growth, proliferation, and survival. Mutations or overexpression of EGFR have been associated with various diseases, particularly cancer.

## Benchmarking
**EGFR Wild type**:  Targeting wild-type EGFR with small-molecule inhibitors, such as erlotinib, is an ongoing area of research in the treatment of glioblastoma. While early findings are promising, the complexity of glioblastoma biology presents challenges that require further investigation to improve treatment outcomes for patients.

**The goal** of this benchmark is to perform a single task: finding the best predictive model for
- Optimization of the bioactivity (% inhibition) for wild-type EGFR.
- Discovery of potential hits in new chemical space.


## Description of readout 
- **Readouts**: `CLASS_EGFR`
- **Bioassay readout**: percentage of inhibition.
- **Optimization objective**: positive label (1)
- **Number of data points**: train: 496, test: 144
- **Thresholds**: > 80 (% inhibition)
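
As a rough illustration of how a binary readout such as `CLASS_EGFR` can relate to the underlying percentage of inhibition, the sketch below labels a compound positive (1) when its value is strictly above 80. The function name `binarize_inhibition` and the example values are hypothetical and are not taken from the curation notebook; only the `> 80` cutoff and the positive label come from the list above.

```python
# Assumption: the class label is obtained by thresholding % inhibition at > 80.
THRESHOLD = 80.0  # percent inhibition


def binarize_inhibition(values, threshold=THRESHOLD):
    """Return 1 for values strictly above the threshold, otherwise 0."""
    return [1 if value > threshold else 0 for value in values]


# Made-up example values, for illustration only.
example_inhibition = [12.5, 80.0, 80.1, 95.3]
example_labels = binarize_inhibition(example_inhibition)
print(example_labels)
```

Note that a value of exactly 80 is labelled negative here, because the threshold is written as a strict inequality.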

## Data resource
- **Reference**: [PKIS2](https://www.ncbi.nlm.nih.gov/pubmed/28767711)

## Train/test split
Given the benchmarking goal, a scaffold-based splitting approach was applied to ensure training and test sets contain distinct chemical structures while maintaining the diversity of scaffolds.

**Distribution of the train/test in the chemical space**
![image](https://storage.googleapis.com/polaris-public/datasets/kinases/egfr/figures/drewry_egfr_wildtype_v1_tnse_scaffold_split.png)
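
The idea behind a scaffold-based split can be sketched as a group-wise partition: molecules sharing a scaffold are kept together, so no scaffold appears in both sets. The snippet below is a minimal sketch under stated assumptions, not the procedure used to build this benchmark. It assumes each molecule already has a scaffold identifier (for example, computed beforehand with a cheminformatics toolkit), and the function `scaffold_split`, its `test_fraction` parameter, and the greedy largest-group-first assignment are illustrative choices.

```python
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Split molecule indices so that no scaffold is shared between sets.

    `scaffolds` holds one (hypothetical, precomputed) scaffold identifier
    per molecule. Groups are assigned greedily, largest first, to the
    training set until it reaches its target size; the rest go to test.
    """
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest scaffold groups first, so rare scaffolds tend to land in test.
    ordered_groups = sorted(groups.values(), key=len, reverse=True)
    n_train_target = len(scaffolds) - int(round(test_fraction * len(scaffolds)))

    train, test = [], []
    for members in ordered_groups:
        if len(train) + len(members) <= n_train_target:
            train.extend(members)
        else:
            test.extend(members)
    return sorted(train), sorted(test)


# Made-up scaffold labels, for illustration only.
example_scaffolds = ["A", "A", "A", "B", "B", "C", "D", "A", "B", "E"]
train_idx, test_idx = scaffold_split(example_scaffolds, test_fraction=0.3)
print(train_idx, test_idx)
```

Because whole groups are assigned at once, the realized test fraction may deviate from the requested one when scaffold groups are large.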


## Related links
The full curation and creation process is documented in this [notebook](https://github.com/polaris-hub/polaris-recipes/blob/main/03_Kinases/EGFR).

## Related benchmarks
- polaris/drewry_egfr_wildtype_singletask_reg_v1
- polaris/egfr_wt_l858r_v1
> Note: It's recommended to evaluate your methods against all the benchmarks related to this dataset.
