AI Machine Learning and Data Augmentation, Senior Scientist
Company: SystImmune Inc.
Location: Redmond
Posted on: April 5, 2025
Job Description:
SystImmune Inc. is a clinical-stage bio-pharmaceutical company
dedicated to developing innovative treatments for cancer through
breakthrough-therapeutic multi-specific antibodies and
antibody-drug conjugates (ADCs). With eight assets in ongoing
clinical trials and a robust preclinical pipeline, we are seeking a
talented Senior Scientist/Programmer in our AI Drug Design (AIDD)
team to lead the extension of our in-house implementation of a large
language model (LLM), focusing on sequence-to-structure-to-activity
relationship modeling for antibody discovery, protein engineering,
and immunology oncology applications.
Job Summary:
We are looking for
an exceptional Senior Scientist/Programmer to drive innovation and
optimization in our AI capabilities, leveraging our internal data
and expertise in antibody discovery, protein engineering, and
immunology oncology. The successful candidate will design, develop,
and implement AI models, data pipelines, and parallel computing
architectures to accelerate the discovery and development of novel
therapeutics using our in-house LLM implementation.
Key Responsibilities:
- Llama 3.3 Implementation and Extension:
- Develop and fine-tune Llama 3.3 models for
sequence-to-structure-to-activity relationship prediction,
leveraging internal data from antibody discovery, protein
engineering, and immunology oncology projects.
- Integrate domain-specific knowledge and constraints into the
Llama 3.3 framework to improve model performance and accuracy.
- Data Generation and Processing:
- Design and implement data generation pipelines to produce
high-quality training datasets for Llama 3.3 models, including
sequence, structure, and activity data from internal projects.
- Develop and optimize algorithms for data processing, feature
extraction, and data augmentation to support Llama 3.3 model
development.
- Fine-Tuning of AI Model with Processed SystImmune R&D Data:
- Fine-tune the Llama 3.3 model using processed SystImmune
R&D data, including antibody discovery, protein engineering,
and immunology oncology datasets.
- Integrate additional features and constraints from SystImmune's
internal data to improve model performance and accuracy.
- Embedding Language Models for New Data Structure Design:
- Use embedding models with tools such as Milvus and LangChain to
convert data into numerical representations (embeddings) and store
them in a vector database for new data structure design, mining,
and processing.
- Utilize RAG (Retrieval-Augmented Generation) with MariaDB Vector
DB to enhance data retrieval and generation capabilities.
- Automatic Data Flow Management:
- Design and implement automatic data flow management from
current LIMS (Laboratory Information Management System) MariaDB to
AI Embedding DB, ensuring seamless data integration and
synchronization.
- Develop data pipelines to extract, transform, and load data
from various sources into the AI Embedding DB.
- Parallel Computing and Optimization:
- Implement parallel computing architectures using technologies
such as MPI, OpenMP, or distributed computing frameworks (e.g.,
Dask, Ray) to accelerate Llama 3.3 model training and
inference.
- Optimize code performance on various computing platforms,
including CPUs, GPUs, and high-performance computing clusters.
- Software Development and Integration:
- Design, develop, and maintain software applications and tools
for Llama 3.3 model training, data processing, and parallel
computing, using languages such as Python, C++, or Julia.
- Collaborate with our AIDD team to ensure seamless integration
of Llama 3.3 workflows into our production environment.
- Data Security and Backup Management:
- Develop and implement robust data security measures to protect
sensitive external data, including encryption, access controls, and
authentication protocols.
- Design and manage backup strategies for external data, ensuring
data integrity, redundancy, and recoverability in case of data loss
or corruption.
- Collaborate with IT and cybersecurity teams to ensure
compliance with organizational data security policies and
regulatory requirements (e.g., GDPR, HIPAA).
- Conduct regular security audits and risk assessments to
identify potential vulnerabilities and implement corrective
measures.
- Data Loss Prevention and Incident Response:
- Develop and implement procedures for preventing data loss and
responding to potential security incidents, including data breaches
or unauthorized access.
- Establish a disaster recovery plan to ensure business
continuity in case of data loss or system downtime.
Requirements:
- Education: Ph.D. or Master's degree in Computer Science,
Artificial Intelligence, Bioinformatics, Computational Biology, or
related field.
- Experience: 3+ years of experience in AI/ML model development,
with a focus on natural language processing and/or database
management.
- Technical Skills:
- Proficiency in Python, C++, or Julia programming
languages.
- Experience with deep learning frameworks such as PyTorch or
TensorFlow.
- Familiarity with parallel computing architectures and
distributed computing frameworks.
- Experience with data security and backup management.
- Domain Knowledge: Strong understanding of antibody discovery,
protein engineering, and immunology oncology principles, as well as
experience working with data from these fields.
- Communication Skills: Excellent communication and collaboration
skills, with the ability to work effectively with cross-functional
teams.
Nice to Have:
- Experience with Llama 3.3 or other large language models.
- Experience with RAG (Retrieval-Augmented Generation) and MariaDB
Vector DB.
- Knowledge of process development principles and their
application in biopharma manufacturing.
- Experience with Milvus or other vector databases, and with LLM
frameworks such as LangChain.
Compensation and Benefits:
The hiring pay range for this
position is $150,000 - $250,000 per year, based on skills,
education, and experience relevant to the role.
SystImmune is a
leading and well-funded clinical-stage biopharmaceutical company
located in Redmond, WA and Princeton, NJ. It specializes in
developing innovative cancer treatments using its established drug
development platforms, focusing on bi-specific and multi-specific
antibodies and antibody-drug conjugates (ADCs). SystImmune has
multiple assets in various stages of clinical trials for solid
tumor and hematologic indications. Alongside ongoing clinical
trials, SystImmune has a robust preclinical pipeline of potential
cancer therapeutics in the discovery or IND-enabling stages,
representing cutting-edge biologics development.
We offer an
opportunity for you to learn and grow while making significant
contributions to the company's success. SystImmune offers a
comprehensive benefits package.
SystImmune is an Equal Opportunity
Employer. We welcome diverse talent and encourage all qualified
applicants to apply.