Misha Rubanov

From Classification to Connectivity: Generating Liquid-Handling Workflows with GNNs

2025-09-02T00:00:00+00:00

Modern Graph Neural Networks (GNNs) excel at predicting node and edge attributes, but many practical problems require changing the graph itself. Liquid-handling protocols are a prime example: executing a protocol means constructing a sequence of transfers that incrementally grows a workflow graph while respecting hard physical and chemical constraints. This post sketches how to adapt GNNs from attribute prediction to connectivity generation for liquid handling.

Background: GNN building blocks

If you are new to GNNs, I recommend the clear and interactive overview in Distill’s “A Gentle Introduction to Graph Neural Networks”. It explains message passing, aggregation and update functions, and how information flows over graph structure.

Key takeaways for our setting:

GNNs operate over nodes, edges, and (optionally) global features via permutation-invariant aggregation.
Information is localized and propagates by hops, which is useful for enforcing local constraints (e.g., volumes in a well or sterility across edges) while letting global features (e.g., temperature, instrument state) influence decisions.

Problem framing: protocols as dynamic DAGs

We represent a liquid-handling protocol as a dynamic, directed acyclic multigraph (DAG) over time steps t = 0..T. The DAG constraint is fundamental: liquid handling operations cannot create cycles because time flows forward and reagents cannot be “un-mixed” or “un-transferred.”

Nodes: containers/wells, instrument resources (tips, reservoirs), intermediate mixtures, deck locations.
Edges: operations such as aspirate, dispense, transfer, mix; edges carry attributes (volume, liquid identity, tip id, speed, timestamp).
State: per-node attributes (current volume, composition, contamination risk), per-edge attributes (history), and global context (robot capabilities, timing, environment).
DAG Invariant: Every edge (u,v) must satisfy timestamp(u) < timestamp(v), ensuring no cycles can exist.

At each step we add one or more edges that modify node states while preserving the DAG property. Generation ends when goals are satisfied (target mixture, plate layout) or no valid actions remain.
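To make the invariant concrete, here is a minimal sketch of such a graph. It assumes each node is a versioned container state (the `well@time` naming is my own illustration), so a transfer always points from an older state to a newer one. The `WorkflowGraph` class and its methods are hypothetical, not part of any library.

```python
from collections import deque

class WorkflowGraph:
    """Toy directed multigraph whose nodes carry creation timestamps."""

    def __init__(self):
        self.timestamp = {}  # node -> creation time
        self.edges = []      # (u, v, attrs); parallel edges are allowed

    def add_node(self, node, t):
        self.timestamp[node] = t

    def add_edge(self, u, v, **attrs):
        # DAG invariant: edges only point forward in time
        if not self.timestamp[u] < self.timestamp[v]:
            raise ValueError(f"edge {u}->{v} violates timestamp ordering")
        self.edges.append((u, v, attrs))

    def is_acyclic(self):
        """Independent check with Kahn's topological sort."""
        indegree = {n: 0 for n in self.timestamp}
        successors = {n: [] for n in self.timestamp}
        for u, v, _ in self.edges:
            indegree[v] += 1
            successors[u].append(v)
        queue = deque(n for n, d in indegree.items() if d == 0)
        visited = 0
        while queue:
            n = queue.popleft()
            visited += 1
            for m in successors[n]:
                indegree[m] -= 1
                if indegree[m] == 0:
                    queue.append(m)
        return visited == len(self.timestamp)

g = WorkflowGraph()
g.add_node("stock@t0", 0)
g.add_node("A1@t1", 1)
g.add_node("A2@t2", 2)
g.add_edge("stock@t0", "A1@t1", op="transfer", volume=100.0)
g.add_edge("A1@t1", "A2@t2", op="transfer", volume=50.0)
```

Because every accepted edge increases the timestamp, the topological check can never fail here; it is included only as a sanity test for graphs built some other way.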

Why attribute models aren’t enough

Attribute-focused GNNs answer questions like “what volume should be in well A?” given a fixed graph. Workflow synthesis instead requires proposing valid connectivity changes. We need a model that:

Proposes the next operation (an edge or set of edges), including its endpoints and attributes.
Respects constraints (conservation of volume, sterility, capacity, tool availability).
Plans long-horizon sequences to reach targets.

Modeling approaches for connectivity generation

There are several viable families, which can be combined:

1) Autoregressive edge generation

Factorize p(protocol) into a sequence of edge additions. At each step, encode the current graph with a message-passing GNN; a policy head scores candidate (source node, op type, target node, attributes).
Sampling: top-k or beam search with constraint masking.
Benefits: precise control and easy constraint integration. Drawbacks: generation is sequential, and errors compound over long horizons.
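One way to write this factorization down, as a sketch in my own notation rather than a fixed convention: let G_t be the workflow graph after t steps, g the goal, and each new edge e_t consist of an operation type o_t, source u_t, target v_t, and attributes a_t.

```latex
p(G_T \mid g) = \prod_{t=1}^{T} p(e_t \mid G_{t-1}, g)

p(e_t \mid G_{t-1}, g) =
  p(o_t \mid G_{t-1}, g)\,
  p(u_t \mid o_t, G_{t-1}, g)\,
  p(v_t \mid o_t, u_t, G_{t-1}, g)\,
  p(a_t \mid o_t, u_t, v_t, G_{t-1}, g)
```

Each factor corresponds to one output head of the policy, and constraint masking amounts to setting a factor to zero for infeasible choices and renormalizing.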

2) Diffusion or denoising over graphs

Start from a noisy action plan and denoise into a valid workflow using a GNN denoiser conditioned on task goals and instrument state.
Useful for exploring diverse plans; requires careful constraint handling during sampling.

3) Constraint-satisfying planning with neural guidance

Use a symbolic planner or MILP/CP-SAT to enforce hard physical constraints; use a GNN to learn heuristics (cost-to-go, action priors) that guide the search.
Strong guarantees with improved speed/quality from learning.

4) Imitation + RL hybrid

Train the policy with behavior cloning on historical protocols; fine-tune with RL using a simulator that implements lab physics and penalties for invalid or unsafe actions.

Let me elaborate on each approach with technical details and examples:

1. Autoregressive Edge Generation

This approach treats protocol generation as a sequence modeling problem where each step adds one or more edges to the growing workflow DAG.

Architecture Details:

Encoder: A k-layer message-passing GNN (e.g., GraphSAGE, GAT) processes the current graph state
Policy Head: Multi-output network that predicts:
- Operation type (categorical: transfer, mix, aspirate, dispense, etc.)
- Source node selection (pointer network over available nodes)
- Target node selection (pointer network with feasibility masking)
- Continuous attributes (volume, speed, temperature) with bounded distributions
- Timestamp assignment: Critical for maintaining DAG property

Training Strategy:

Teacher forcing: use ground truth previous actions during training
Scheduled sampling: gradually transition from teacher forcing to autoregressive generation
Constraint masking: zero out probabilities for invalid actions (e.g., transferring from empty wells, creating cycles)
DAG enforcement: Ensure timestamp(u) < timestamp(v) for all new edges (u,v)

Example Implementation:

# Simplified pseudocode
def generate_step(current_graph, goal_embedding):
    # Encode current state
    node_embeddings = gnn_encoder(current_graph)
    
    # Predict next operation
    op_type = op_classifier(node_embeddings, goal_embedding)
    
    # Select source and target with pointer networks
    source_logits = source_pointer(node_embeddings, op_type)
    target_logits = target_pointer(node_embeddings, op_type, source_logits)
    
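    # Apply feasibility masks (boolean tensors, True = feasible). Masked logits
    # are set to -inf so softmax gives them zero probability; multiplying a
    # logit by 0 would leave it at 0, which can still be sampled
    source_logits = source_logits.masked_fill(~source_feasibility_mask, float("-inf"))
    target_logits = target_logits.masked_fill(~target_feasibility_mask, float("-inf"))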
    
    # Sample and return action
    return sample_action(op_type, source_logits, target_logits)
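The masking step is easy to get subtly wrong, so here is a small standalone sketch in plain Python (no framework assumed; `masked_softmax` is a name I made up). It shows that setting infeasible logits to negative infinity gives exactly zero probability, while multiplying logits by a 0/1 mask can make an infeasible action the top choice.

```python
import math

def masked_softmax(logits, feasible):
    """Softmax over feasible actions only; infeasible ones get probability 0."""
    if not any(feasible):
        raise ValueError("no feasible action")
    masked = [x if ok else float("-inf") for x, ok in zip(logits, feasible)]
    peak = max(masked)
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) is exactly 0.0
    total = sum(exps)
    return [e / total for e in exps]

logits = [-2.0, -1.0, -3.0]
feasible = [True, False, True]

probs = masked_softmax(logits, feasible)
best = max(range(len(probs)), key=lambda i: probs[i])

# Multiplicative masking turns the infeasible logit into 0, the largest value
wrong = [x * ok for x, ok in zip(logits, feasible)]
wrong_best = max(range(len(wrong)), key=lambda i: wrong[i])
```

Here `best` is a feasible action, while `wrong_best` points at the masked-out one, because all the feasible logits happen to be negative.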

Advantages:

Direct control over generation process
Easy to integrate hard constraints via masking
Interpretable: each action is explicit and traceable
Can use beam search for better planning

Challenges:

Sequential nature limits parallelization
Error accumulation over long sequences
Requires careful curriculum learning for complex protocols

2. Diffusion/Denoising over Graphs

This approach starts from a noisy, potentially invalid workflow and progressively denoises it into a valid protocol DAG.

Architecture Details:

Noise Schedule: Gradually add noise to a target protocol over T timesteps
Denoiser: GNN that predicts the clean protocol given noisy input and timestep
Conditioning: Task goals, instrument constraints, and current lab state
DAG Structure: Denoiser must learn to respect temporal ordering constraints

Training Process:

Start with clean protocols from dataset
Add Gaussian noise over T timesteps
Train denoiser to predict original protocol given noisy version and timestep
Use classifier-free guidance for better control
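As a sketch of what these steps look like in equations, assume the protocol is encoded as a continuous tensor x_0 and the noise follows a standard DDPM-style schedule with cumulative coefficient \bar{\alpha}_t. Both are my assumptions; discrete graph structure would need a relaxation or a discrete diffusion variant instead.

```latex
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, t,\, \epsilon}
  \left\| x_0 - f_\theta(x_t, t, g) \right\|^2
```

Here f_\theta is the GNN denoiser and g is the conditioning (task goals, instrument constraints, lab state). For classifier-free guidance, g is randomly dropped during training so the same network can also produce unconditional predictions.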

Example Implementation:

import torch

# denoiser, project_to_constraints, and denoise_step are placeholders
def diffusion_generate(goal_embedding, protocol_shape, num_steps=1000):
    # Start with pure noise
    noisy_protocol = torch.randn(protocol_shape)
    
    for t in reversed(range(num_steps)):
        # Predict clean protocol
        predicted_clean = denoiser(noisy_protocol, t, goal_embedding)
        
        # Apply constraint projection
        predicted_clean = project_to_constraints(predicted_clean)
        
        # Denoise step
        noisy_protocol = denoise_step(noisy_protocol, predicted_clean, t)
    
    return noisy_protocol

Advantages:

Can generate diverse, high-quality protocols
Natural handling of global structure
Good at exploring solution space

Challenges:

Requires many denoising steps
Constraint satisfaction during sampling is tricky
Less interpretable than autoregressive methods

3. Constraint-Satisfying Planning with Neural Guidance

This hybrid approach combines symbolic planning with learned heuristics from GNNs.

Architecture Details:

Symbolic Planner: MILP/CP-SAT solver that enforces hard constraints including DAG structure
Neural Heuristic: GNN that learns to guide the search efficiently
Integration: Use GNN predictions to order search branches or estimate costs
Temporal Constraints: Solver ensures timestamp ordering and prevents cycles

Training Strategy:

Collect planning traces from solver
Train GNN to predict:
- Action priors (which operations are likely useful)
- Cost-to-go estimates (how expensive remaining steps will be)
- Constraint violation likelihood

Example Implementation:

def guided_planning(initial_state, goal):
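    # Encode state with GNN
    state_embedding = gnn_encoder(initial_state)
    
    # Learned heads turn the embedding into search guidance
    # (prior_head and value_head are placeholder names)
    action_priors = prior_head(state_embedding)
    cost_estimate = value_head(state_embedding)
    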
    # Use in symbolic planner
    plan = symbolic_planner(
        initial_state, 
        goal,
        action_heuristics=action_priors,
        cost_heuristics=cost_estimate
    )
    
    return plan

Advantages:

Guaranteed constraint satisfaction
Can leverage decades of optimization research
Neural guidance improves search efficiency

Challenges:

Requires symbolic constraint modeling
Integration complexity
May be slower than pure neural approaches
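To illustrate how a cost-to-go estimate guides a search that still enforces hard constraints, here is a toy best-first (A*) sketch. It is not a MILP or CP-SAT model: the state is just the volume in one target well, the actions are three assumed transfer sizes, and a hand-written `cost_to_go` stands in for the GNN's learned estimate.

```python
import heapq

def guided_search(start, goal, actions, heuristic):
    """A*: expand states in order of (steps so far + estimated steps to go)."""
    frontier = [(heuristic(start, goal), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, state, plan = heapq.heappop(frontier)
        if state == goal:
            return plan
        for step in actions:
            nxt = state + step
            if nxt > goal:  # hard constraint: never overfill the well
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt, plan + [step]))
    return None  # no valid plan exists

def cost_to_go(state, goal):
    # Stand-in for a learned estimate: optimistic, assumes the largest transfer
    return (goal - state) / 50

plan = guided_search(0, 80, actions=(10, 20, 50), heuristic=cost_to_go)
unreachable = guided_search(0, 5, actions=(10, 20, 50), heuristic=cost_to_go)
```

The constraint check lives in the search loop, so a poor heuristic can only slow the search down; it cannot produce an invalid plan.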

4. Imitation + RL Hybrid

This approach starts with supervised learning on historical data and refines with reinforcement learning.

Architecture Details:

Behavior Cloning: Initial training on expert demonstrations
RL Fine-tuning: Use simulator rewards to improve policy
Hybrid Loss: Combine imitation and RL objectives
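One common way to write such a hybrid loss, given here as a sketch with my own symbols: \mathcal{D} is the set of expert (state, action) pairs, \pi_\theta is the policy, and \lambda weights how strongly the policy stays anchored to the demonstrations.

```latex
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{RL}}(\theta)
  + \lambda\, \mathbb{E}_{(s,\, a^\ast) \sim \mathcal{D}}
    \left[ -\log \pi_\theta(a^\ast \mid s) \right]
```

Starting with a large \lambda and decaying it over training is one way to move gradually from imitation toward simulator-driven optimization.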

Training Phases:

Phase 1: Train policy to mimic expert protocols
Phase 2: Use RL to optimize for efficiency, robustness, and safety
Phase 3: Iterative improvement with human feedback

Example Implementation:

# train_imitation and the policy/simulator interfaces are placeholders
def hybrid_training(expert_data, simulator, num_episodes=1000):
    # Phase 1: Behavior cloning
    policy = train_imitation(expert_data)
    
    # Phase 2: RL fine-tuning
    for episode in range(num_episodes):
        state = simulator.reset()
        done = False
        
        while not done:
            action = policy(state)
            next_state, reward, done = simulator.step(action)
            
            # Update policy with RL algorithm (e.g., PPO)
            policy.update(state, action, reward, next_state)
            state = next_state
    
    return policy

Advantages:

Starts with reasonable behavior
Can optimize for complex objectives
Combines best of supervised and RL

Challenges:

Requires high-quality simulator
RL training can be unstable
Need to balance imitation vs. exploration

Combining Approaches

The most effective systems often combine multiple approaches:

Use autoregressive generation for high-level structure
Apply diffusion for local refinements
Use symbolic planning for critical safety constraints
Fine-tune with RL for efficiency optimization

Action parameterization and constraint masking

To keep the action space tractable:

Predict operation type first (transfer/mix/thermo step), then endpoints via pointer networks over node embeddings, then continuous attributes (e.g., volume) with bounded distributions.
Timestamp assignment: Each new operation must have a timestamp greater than all previous operations to maintain DAG structure.
Apply masks derived from current state: available tips, sufficient volume at source, capacity at destination, deck reachability, sterility compatibility.
DAG constraint masking: Prevent edges that would create cycles or violate temporal ordering.
Enforce invariants by projection (e.g., clip volumes to feasible ranges) and by rejecting invalid samples.
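The projection step can be as simple as clipping a predicted volume into the range the current state allows. A minimal sketch, where `project_volume` and its parameters (including the tip limit) are illustrative names of my own:

```python
def project_volume(requested, source_volume, target_volume, target_capacity, tip_max):
    """Clip a requested transfer volume (uL) to what the state allows.

    Returns None when no positive volume is feasible, so the caller can
    reject the action instead of executing a zero-volume transfer.
    """
    upper = min(source_volume, target_capacity - target_volume, tip_max)
    if upper <= 0:
        return None
    return min(max(requested, 0.0), upper)

# Target well has 20 uL of headroom, so a 75 uL request is clipped to 20 uL
clipped = project_volume(75.0, source_volume=200.0, target_volume=180.0,
                         target_capacity=200.0, tip_max=50.0)
# Source only holds 5 uL
limited = project_volume(10.0, source_volume=5.0, target_volume=0.0,
                         target_capacity=200.0, tip_max=50.0)
# Empty source: nothing feasible
rejected = project_volume(10.0, source_volume=0.0, target_volume=0.0,
                          target_capacity=200.0, tip_max=50.0)
```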

State representation details

Node features: current volume, composition embedding (e.g., learned from reagent ontology), temperature, contamination flags, container geometry.
Edge features: operation type, executed volume, time since last action, tip id.
Global features: assay goal embedding, allowed instruments, remaining time budget.
Temporal encoding: append step index or use recurrent GNN layers to retain history.

Training signals and datasets

Imitation data: parse existing protocols (e.g., from OT-2, Hamilton scripts) into action graphs.
Supervision: next-edge classification, endpoint selection, and attribute regression; auxiliary losses for state prediction (e.g., next-node volume) improve stability.
Negative sampling: generate near-miss actions (slightly over volume, wrong tip) to sharpen constraint awareness.
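As a sketch of the negative-sampling idea, the function below takes one valid transfer and produces two near-miss variants: one that slightly exceeds the source volume and one that swaps in a different tip. The dictionary layout, the 5% overshoot, and the function name are all assumptions for illustration.

```python
import random

def near_miss_negatives(action, source_volume, tips, rng, overshoot=0.05):
    """Return slightly-invalid copies of a valid transfer action."""
    negatives = []
    # Over-volume: ask for a bit more than the source well holds
    negatives.append(dict(action, volume=source_volume * (1 + overshoot)))
    # Wrong tip: any tip other than the one the valid action used
    other_tips = [t for t in tips if t != action["tip_id"]]
    if other_tips:
        negatives.append(dict(action, tip_id=rng.choice(other_tips)))
    return negatives

valid = {"source": "A1", "target": "B1", "volume": 20.0, "tip_id": "tip_0"}
negatives = near_miss_negatives(valid, source_volume=100.0,
                                tips=["tip_0", "tip_1"], rng=random.Random(0))
```

Each negative differs from the valid action in exactly one field, which is what makes it a useful contrast during training.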

Evaluation metrics

Validity: fraction of generated steps passing all constraints; zero spills/overflows; no cross-contamination.
Goal satisfaction: assay success rate, target composition accuracy.
Efficiency: action count, total time, tip consumption, deck moves.
Diversity: unique valid workflows per goal.
Sim-to-real: execution success on hardware with minimal edits.
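Several of these metrics can be computed from generated workflows alone. Here is a rough sketch over toy data, where a step is a `(source, target, volume)` tuple and the validity and goal checks are passed in as functions. The 50 µL limit and the "ends in A3" goal are made up for the example.

```python
def evaluate(workflows, is_valid_step, goal_met):
    """Summarize validity, goal satisfaction, efficiency, and diversity."""
    steps = [s for w in workflows for s in w]
    fully_valid = [w for w in workflows if all(is_valid_step(s) for s in w)]
    return {
        "step_validity": sum(map(is_valid_step, steps)) / len(steps),
        "goal_rate": sum(map(goal_met, workflows)) / len(workflows),
        "mean_actions": len(steps) / len(workflows),
        "unique_valid": len({tuple(w) for w in fully_valid if goal_met(w)}),
    }

workflows = [
    [("A1", "A2", 20.0), ("A2", "A3", 20.0)],
    [("A1", "A3", 80.0)],                      # exceeds the assumed 50 uL limit
    [("A1", "A2", 20.0), ("A2", "A3", 20.0)],  # duplicate of the first plan
]
metrics = evaluate(
    workflows,
    is_valid_step=lambda step: step[2] <= 50.0,
    goal_met=lambda w: w[-1][1] == "A3",
)
```

Sim-to-real success is the one metric that cannot be computed this way; it needs runs on actual hardware.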

Minimal prototype sketch

Outline of an autoregressive generator with constraint masking:

Encode current graph with a k-layer message-passing GNN.
Predict operation type with a masked classifier.
Select source and target nodes using pointer heads over node embeddings with feasibility masks.
Assign timestamp: Ensure new operation timestamp > all previous timestamps to maintain DAG.
Regress attributes (volume, speed) with bounded outputs; project to valid ranges.
Validate DAG: Check that no cycles would be created by the new edge.
Update node states and append the new edge; repeat until done.
Use beam search for better plans; score beams by learned value function + hard constraint checks.
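The beam-search step can be sketched independently of the model. In the toy below, the state is a concentration, each action is a dilution factor, `expand` plays the role of the hard constraint check (never dilute past the target), and `score` is a stand-in for a learned value function. All names and numbers are illustrative.

```python
import math

def beam_search(initial, expand, score, is_goal, beam_width=3, max_steps=10):
    """Keep the `beam_width` highest-scoring partial plans at each step."""
    beam = [(initial, [])]
    for _ in range(max_steps):
        for state, plan in beam:
            if is_goal(state):
                return plan
        candidates = [
            (nxt, plan + [action])
            for state, plan in beam
            for action, nxt in expand(state)  # only constraint-valid actions
        ]
        if not candidates:
            return None
        candidates.sort(key=lambda c: score(c[0]), reverse=True)
        beam = candidates[:beam_width]
    return None

TARGET = 25.0  # desired concentration

def expand(conc):
    for factor in (2, 5, 10):
        if conc / factor >= TARGET:  # hard constraint: do not overshoot
            yield factor, conc / factor

def score(conc):
    return -abs(math.log(conc / TARGET))  # closer to target is better

plan = beam_search(1000.0, expand, score, is_goal=lambda c: c == TARGET)
```

The returned plan is a list of dilution factors whose product is 40, taking the 1000 starting concentration down to 25. Note that the beam briefly prefers a dead-end state (40, which cannot be diluted further without overshooting), which is why a width greater than one matters here.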

Concrete Example: Variable Serial Dilution Network Discovery

Let’s implement a simplified version of the autoregressive approach for discovering the network required for a variable serial dilution on a 96-well plate. This example shows how DAG constraints and connectivity generation work in practice.

import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from typing import List, Tuple, Dict, Optional
from dataclasses import dataclass
from enum import Enum

# Define operation types
class OpType(Enum):
    ASPIRATE = "aspirate"
    DISPENSE = "dispense"
    TRANSFER = "transfer"
    MIX = "mix"

@dataclass
class LiquidState:
    """Represents the state of liquid in a well"""
    volume: float  # Current volume in μL
    concentration: float  # Concentration of target compound
    contamination_risk: float  # Risk of cross-contamination (0-1)
    timestamp: int  # When this state was created

@dataclass
class Operation:
    """Represents a liquid handling operation"""
    op_type: OpType
    source_well: Optional[str]  # None for aspirate from reservoir
    target_well: str
    volume: float
    timestamp: int
    tip_id: str

class DilutionWorkflow:
    """Represents the current state of a dilution workflow"""
    def __init__(self, plate_rows: int = 8, plate_cols: int = 12):
        self.plate_rows = plate_rows
        self.plate_cols = plate_cols
        self.wells = {}  # well_id -> LiquidState
        self.operations = []  # List of Operation objects
        self.available_tips = [f"tip_{i}" for i in range(8)]  # 8-channel pipette
        self.timestamp = 0
        
        # Initialize source wells (e.g., A1 has stock solution)
        self.wells["A1"] = LiquidState(volume=200.0, concentration=1000.0, 
                                      contamination_risk=0.0, timestamp=0)
    
    def get_well_id(self, row: int, col: int) -> str:
        """Convert row/col to well ID (e.g., A1, B2)"""
        return f"{chr(65 + row)}{col + 1}"
    
    def can_transfer(self, source: str, target: str, volume: float) -> bool:
        """Check if a transfer operation is valid"""
        # The source must already hold state; a target without state is an
        # empty well, which add_operation creates on first use
        if source not in self.wells or source == target:
            return False
        
        source_state = self.wells[source]
        
        # Check volume constraints
        if source_state.volume < volume:
            return False
        
        target_state = self.wells.get(target)
        if target_state is None:
            return True
        
        # Check contamination risk (can't transfer to contaminated wells)
        if target_state.contamination_risk > 0.5:
            return False
        
        # Check DAG constraint: source must be created before target
        if source_state.timestamp >= target_state.timestamp:
            return False
        
        return True
    
    def add_operation(self, op: Operation):
        """Add an operation and update well states"""
        self.operations.append(op)
        self.timestamp = max(self.timestamp, op.timestamp) + 1
        
        if op.op_type == OpType.TRANSFER:
            # Update source well
            if op.source_well:
                source_state = self.wells[op.source_well]
                source_state.volume -= op.volume
                source_state.timestamp = self.timestamp
            
            # Update target well
            if op.target_well not in self.wells:
                self.wells[op.target_well] = LiquidState(
                    volume=0.0, concentration=0.0, 
                    contamination_risk=0.0, timestamp=self.timestamp
                )
            
            target_state = self.wells[op.target_well]
            target_state.volume += op.volume
            
            # Calculate new concentration (weighted average)
            if target_state.volume > 0:
                if op.source_well:
                    source_conc = self.wells[op.source_well].concentration
                    target_state.concentration = (
                        (target_state.volume - op.volume) * target_state.concentration +
                        op.volume * source_conc
                    ) / target_state.volume
                
                # Update contamination risk
                if op.source_well:
                    source_risk = self.wells[op.source_well].contamination_risk
                    target_state.contamination_risk = max(
                        target_state.contamination_risk, source_risk
                    )

class DilutionNetworkGenerator:
    """Generates dilution networks using a simplified GNN-like approach"""
    
    def __init__(self, hidden_dim: int = 64):
        self.hidden_dim = hidden_dim
        
        # Simple MLPs for different prediction tasks
        self.op_type_predictor = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, len(OpType))
        )
        
        # Pointer heads see each node embedding concatenated with the
        # 3 global features built in encode_workflow_state
        self.source_predictor = nn.Sequential(
            nn.Linear(hidden_dim + 3, hidden_dim),  # node + global context
            nn.ReLU(),
            nn.Linear(hidden_dim, 1)
        )
        
        self.target_predictor = nn.Sequential(
            nn.Linear(hidden_dim + 3, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1)
        )
        
        self.volume_predictor = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid()  # Output 0-1, scale to actual volume
        )
    
    def encode_workflow_state(self, workflow: DilutionWorkflow) -> Dict[str, torch.Tensor]:
        """Encode the current workflow state into node and global embeddings"""
        # Simple encoding: one feature row per plate well, so that empty
        # wells are also available as transfer targets
        well_features = []
        well_ids = []
        
        for row in range(workflow.plate_rows):
            for col in range(workflow.plate_cols):
                well_id = workflow.get_well_id(row, col)
                state = workflow.wells.get(well_id)
                if state is None:
                    features = [0.0, 0.0, 0.0, 0.0]  # untouched, empty well
                else:
                    features = [
                        state.volume / 200.0,  # Normalize volume
                        state.concentration / 1000.0,  # Normalize concentration
                        state.contamination_risk,
                        state.timestamp / 100.0  # Normalize timestamp
                    ]
                well_features.append(features)
                well_ids.append(well_id)
        
        max_wells = workflow.plate_rows * workflow.plate_cols
        
        # Global context: goal concentration, remaining wells to fill
        target_concentration = 100.0  # Example target
        remaining_wells = max_wells - len([w for w in workflow.wells.values() if w.volume > 0])
        
        global_features = [
            target_concentration / 1000.0,
            remaining_wells / max_wells,
            workflow.timestamp / 100.0
        ]
        
        return {
            'well_features': torch.tensor(well_features, dtype=torch.float32),
            'well_ids': well_ids,
            'global_features': torch.tensor(global_features, dtype=torch.float32)
        }
    
    def predict_next_operation(self, workflow: DilutionWorkflow) -> Operation:
        """Predict the next operation using the current workflow state"""
        # Encode current state
        encoded = self.encode_workflow_state(workflow)
        well_features = encoded['well_features']
        global_features = encoded['global_features']
        
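        # Simple "GNN-like" processing: a random, untrained projection stands in
        # for message passing, so predictions are arbitrary until a trained
        # encoder replaces it
        node_embeddings = well_features @ torch.randn(4, self.hidden_dim)  # Simplified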
        
        # Predict operation type
        global_context = global_features.unsqueeze(0).expand(node_embeddings.shape[0], -1)
        combined_features = torch.cat([node_embeddings, global_context], dim=1)
        
        op_type_logits = self.op_type_predictor(node_embeddings.mean(dim=0))
        op_type = OpType(list(OpType)[op_type_logits.argmax().item()])
        
        # Predict source well (with masking)
        source_scores = self.source_predictor(combined_features).squeeze(-1)
        source_mask = torch.zeros_like(source_scores, dtype=torch.bool)
        
        # Mask: only wells with liquid can be sources
        for i, well_id in enumerate(encoded['well_ids']):
            if well_id in workflow.wells and workflow.wells[well_id].volume > 0:
                source_mask[i] = True
        
        if not source_mask.any():
            raise ValueError("no feasible source well")
        # Set infeasible scores to -inf; multiplying by a 0/1 mask would let a
        # masked well (score 0) beat feasible wells with negative scores
        source_scores = source_scores.masked_fill(~source_mask, float("-inf"))
        source_well = encoded['well_ids'][source_scores.argmax().item()]
        
        # Predict target well (with masking)
        target_scores = self.target_predictor(combined_features).squeeze(-1)
        target_mask = torch.zeros_like(target_scores, dtype=torch.bool)
        
        # Mask: prefer empty wells or wells that need dilution; skip padding
        # slots and anything the workflow's own validity check would reject
        for i, well_id in enumerate(encoded['well_ids']):
            if not well_id:
                continue
            needs_liquid = well_id not in workflow.wells or workflow.wells[well_id].volume < 50
            if needs_liquid and workflow.can_transfer(source_well, well_id, 0.0):
                target_mask[i] = True
        
        if not target_mask.any():
            raise ValueError("no feasible target well")
        target_scores = target_scores.masked_fill(~target_mask, float("-inf"))
        target_well = encoded['well_ids'][target_scores.argmax().item()]
        
        # Predict volume
        volume_fraction = self.volume_predictor(node_embeddings.mean(dim=0))
        volume = volume_fraction.item() * 50.0  # Scale to 0-50 μL range
        volume = min(volume, workflow.wells[source_well].volume)  # never exceed the source
        
        # Ensure DAG constraint: timestamp must be greater than all previous
        timestamp = workflow.timestamp + 1
        
        # Select available tip
        tip_id = workflow.available_tips[0]  # Simplified
        
        return Operation(
            op_type=op_type,
            source_well=source_well,
            target_well=target_well,
            volume=volume,
            timestamp=timestamp,
            tip_id=tip_id
        )

def generate_dilution_workflow(target_concentrations: List[float], 
                             max_operations: int = 50) -> DilutionWorkflow:
    """Generate a complete dilution workflow"""
    workflow = DilutionWorkflow()
    generator = DilutionNetworkGenerator()
    
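    operations_count = 0
    attempts = 0  # rejected proposals count too, so the loop always terminates
    
    while operations_count < max_operations and attempts < 2 * max_operations:
        attempts += 1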
        # Check if we've achieved our goals
        filled_wells = [w for w in workflow.wells.values() if w.volume > 0]
        if len(filled_wells) >= len(target_concentrations):
            # Check if concentrations are close enough
            achieved_concentrations = [w.concentration for w in filled_wells[:len(target_concentrations)]]
            if all(abs(ac - tc) < 50 for ac, tc in zip(achieved_concentrations, target_concentrations)):
                break
        
        # Predict next operation
        try:
            next_op = generator.predict_next_operation(workflow)
            
            # Validate operation
            if next_op.source_well and next_op.target_well:
                if workflow.can_transfer(next_op.source_well, next_op.target_well, next_op.volume):
                    workflow.add_operation(next_op)
                    operations_count += 1
                    print(f"Added operation: {next_op.op_type.value} {next_op.volume:.1f}μL "
                          f"from {next_op.source_well} to {next_op.target_well}")
                else:
                    print(f"Invalid operation: {next_op.op_type.value} {next_op.volume:.1f}μL "
                          f"from {next_op.source_well} to {next_op.target_well}")
            else:
                # Handle aspirate/dispense operations
                workflow.add_operation(next_op)
                operations_count += 1
                
        except Exception as e:
            print(f"Error generating operation: {e}")
            break
    
    return workflow

# Example usage
if __name__ == "__main__":
    # Generate a workflow for 8 different concentrations
    target_concentrations = [800, 600, 400, 200, 100, 50, 25, 12.5]
    
    print("Generating dilution workflow...")
    workflow = generate_dilution_workflow(target_concentrations)
    
    print(f"\nGenerated {len(workflow.operations)} operations")
    print(f"Final workflow has {len(workflow.wells)} wells with liquid")
    
    # Show final concentrations
    print("\nFinal well states:")
    for well_id, state in sorted(workflow.wells.items()):
        if state.volume > 0:
            print(f"{well_id}: {state.volume:.1f}μL, {state.concentration:.1f} ng/μL")
    
    # Verify DAG property
    timestamps = [op.timestamp for op in workflow.operations]
    if timestamps == sorted(timestamps):
        print("\n✓ DAG constraint satisfied: all operations are temporally ordered")
    else:
        print("\n✗ DAG constraint violated: operations are not temporally ordered")

This example demonstrates:

DAG Enforcement: Each operation gets a timestamp greater than all previous operations
Constraint Masking: Source wells must have liquid, target wells should be empty or need dilution
State Updates: Well volumes and concentrations are updated after each operation
Validation: Operations are checked for feasibility before execution
Goal-Oriented Generation: The workflow continues until target concentrations are achieved

The generator uses a simplified “GNN-like” approach with:

Node embeddings based on well features (volume, concentration, contamination, timestamp)
Global context (target concentration, remaining wells, current timestamp)
Masked prediction for source/target selection
Constraint validation to maintain physical and temporal consistency
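The concentration update in add_operation is a volume-weighted average. As a worked example, assume a well that already holds 180 μL of diluent at concentration 0 (the code above does not pre-fill wells, so this setup is hypothetical) and receives 20 μL of the 1000 ng/μL stock:

```latex
C_{\text{new}} = \frac{V_{\text{old}}\, C_{\text{old}} + V_{\text{in}}\, C_{\text{src}}}{V_{\text{old}} + V_{\text{in}}}
= \frac{180 \cdot 0 + 20 \cdot 1000}{180 + 20}
= 100\ \text{ng}/\mu\text{L}
```

This is a 1:10 dilution. Note that transferring into a completely empty well leaves the concentration unchanged, so reaching lower concentrations requires diluent to be present in the target.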

Why GNNs fit this problem

Message passing aligns with local physical constraints while still capturing long-range goals through multiple hops and global features, as articulated in the Distill overview. The core difference here is that we use the GNN not to label a fixed graph but to drive the creation of new connectivity under constraints.

Outlook

Bringing workflow generation to practice requires: a realistic simulator with rich constraints, curated protocol datasets, and careful interfaces to planners and robots. The architectural pieces above provide a path to move from classification to connectivity.

References:

Sanchez-Lengeling, B., Reif, E., Pearce, A., Wiltschko, A. “A Gentle Introduction to Graph Neural Networks,” Distill (2021).

Getting a reaction-diffusion tattoo

2025-07-19T18:00:45+00:00

Reasoning

Part of the reason for building and improving the reaction-diffusion simulator was to have the code I need to explore the next tattoo I plan on getting. I thought of this tattoo after a challenging vipassana course (although it was many years in the making). The tattoo represents my evolving relationship with engineering and science. It will be a Turing pattern, generated via the FitzHugh-Nagumo equations, which is, by design, different in length scale from the patterns I generated in my final paper during my PhD. The tattoo will use pointillism as a way of deconstructing the patterns into discrete units, like protocells in nature or transistors in computation. The pointillism will describe concentration - a denser cluster of points in the parts of high concentration, and fewer points in darker parts of the pattern.

A key macroscopic feature will be that the design will have a sharp break in the middle that represents embryogenesis, the phenomenon I find most fascinating in biology, while also representing a break in my views/hopes of science. This break will take the form of two cells splitting, a symbolic representation for me of moving from viewing science as almost a spiritual endeavor to a more practical view of finding challenging problems without compromising on my lifestyle and goals.


Everything is infrastructure

2025-06-16T18:00:45+00:00

I was recently on Reddit, trying to figure out why monorepos just feel right, and I stumbled upon this article discussing the advantages of monorepos.

It’s been a wild ride - from being a strict molecular programmer, to a hardware engineer, to a begrudging software engineer, to slowly but surely seeing an infrastructure engineer peek out at me…

Now that I’m in the middle of it all, I realize something fundamental about my journey: I’ve always been drawn to building tools that make life easier for others. The only difference now is who I’m building for.

When I started as a molecular programmer, I was building tools for scientists and researchers - people who needed computational and/or electromechanical solutions but weren’t necessarily coders themselves. I was the bridge between complex algorithms and practical scientific problems. Every tool I built was designed to make someone else’s work more efficient, more reliable, more accessible.

Now, as I find myself drawn deeper into infrastructure engineering, I see the same pattern emerging. I’m still building tools that make life easier - but now my users are other developers. Instead of creating applications that scientists can use, I’m creating the platforms, systems, and processes that enable other developers to build their own tools more effectively.

The monorepo discussion that sparked this reflection is a perfect example. It’s not about the code itself, but about the infrastructure that makes development teams more productive. It’s about building the tools that build the tools.

This progression feels natural because it’s the same mission, just with a different audience. I’m still the bridge, still the enabler - just operating at a different layer of the technology stack. And now that I’m here, I can see this infrastructure work for what it truly is: another tool-building exercise.

Generating Reaction-Diffusion Tattoos

2025-06-14T18:00:45+00:00

With the infrastructure in place, it’s time to actually generate a tattoo to commemorate the last ~7 years of my life.

This is the symbolism I want layered into this tattoo:

The Turing pattern will be a uniquely-generated tattoo to represent my detachment from doing truly basic science, for systemic academic reasons. The pattern will be a dot pattern (similar to these tattoos, specifically the pointillism-style ones) where the dots each represent a single cell/transistor - the basis for computing that I built my PhD around, and the basis on which I keep chugging along.
The design will be encompassed in oval-like cellular structures, with a sharp break in the middle to represent some of the disillusionment I felt with science/engineering in general.

Building a reaction-diffusion simulator

2025-06-06T18:00:45+00:00

Introduction

Turing patterns (and more broadly, reaction-diffusion phenomena) have had a large impact on my life over the last few years - from the idea that simple mathematical models can lead to these complex, seen-in-nature patterns, to the idea that phenomena such as morphogenesis and embryogenesis can be modeled using these same principles. One of my favorite papers, Growing Neural Cellular Automata, is based on this principle. Rather than modeling reaction-diffusion with continuous equations, the paper approximates partial differential equations as discrete blocks (cellular automata) - not that different from standard finite element analysis methods, which create non-uniform discrete meshes.

This was inspiration for a lot of the work I did during my PhD - I was fascinated with the ability to recreate a lot of these reaction-diffusion patterns seen throughout biology. In particular, I fell in love with the idea that complex, emergent systems can be simulated using first-principles or data-driven approaches, and even recreated in the lab if the simulated principles were cleverly designed.

I wanted to at least design a simple CA-based approach for solving some of the most famous Turing patterns using a readable, developer-friendly Python package that I could then use to design a tattoo that memorializes this chapter in my life.

The code for developing this tattoo can be found in the repository. A lot of this effort was inspired by this repo.

A Jupyter notebook with the reaction-diffusion simulator environment (and a few examples) can be found at https://tattoonotebook.misharubanov.com and a Streamlit app for no-code simulator exploration can be found at https://rdapp.misharubanov.com/.

Setting up the code

This codebase can be divided into three main components: simulation, visualization, and default generation.

Simulation

The simulator was developed with scalability and modularity in mind. The overarching goal was not only to simulate any two-species reaction-diffusion system, but also to make it easy to add new reaction systems, default values, and initial conditions for the two species. The ReactionFunction protocol defines the general structure that each reaction-diffusion equation should take: as input it takes two arrays (the a and b concentrations) and two constants (the reaction rates).

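from typing import Protocol, runtime_checkable

# FloatArrayType is the package's alias for a floating-point NumPy array.
@runtime_checkable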
class ReactionFunction(Protocol):
    """Protocol defining the interface for reaction functions.

    A reaction function calculates the rate of change for a chemical species
    based on the current concentrations of both species and reaction parameters.

    Methods:
        __call__: Calculate the reaction rate for a species.
    """

    def __call__(
        self, a: FloatArrayType, b: FloatArrayType, alpha: float, beta: float
    ) -> FloatArrayType: ...
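As an illustration of what satisfies this interface, here is a minimal sketch of a pair of Gray-Scott rate functions. The function names are hypothetical, and mapping alpha to the feed rate and beta to the kill rate is an assumption made for this example, not necessarily how the package assigns its parameters. Any callable with this signature matches the protocol structurally.

```python
import numpy as np


def gray_scott_a(a: np.ndarray, b: np.ndarray, alpha: float, beta: float) -> np.ndarray:
    """Rate of change of species a (assumed here: alpha is the feed rate).

    a is consumed by the autocatalytic step a + 2b -> 3b and replenished
    towards a concentration of 1 at rate alpha. beta is unused for this species.
    """
    return -a * b**2 + alpha * (1.0 - a)


def gray_scott_b(a: np.ndarray, b: np.ndarray, alpha: float, beta: float) -> np.ndarray:
    """Rate of change of species b (assumed here: beta is the kill rate).

    b is produced by the autocatalytic step and removed at rate alpha + beta.
    """
    return a * b**2 - (alpha + beta) * b


# Evaluate both rates on a small uniform grid.
a0 = np.full((4, 4), 0.5)
b0 = np.full((4, 4), 0.25)
rate_a = gray_scott_a(a0, b0, alpha=0.04, beta=0.06)
rate_b = gray_scott_b(a0, b0, alpha=0.04, beta=0.06)
```

Keeping both species on the same four-argument signature means the simulator can call either function without knowing which reaction system it belongs to.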

The simulator can then be instantiated with diffusion coefficients, rate constants, simulation parameters (height/width/time or space resolution) and the ReactionType (an Enum that specifies which set of reactions to use):

from enum import Enum

class ReactionType(Enum):
    """Enumeration of available reaction-diffusion system types.

    Each reaction type represents a different chemical reaction system with its own
    mathematical equations and behavior patterns.

    Values:
        BRUSSELATOR: The Brusselator model, a theoretical model for a type of
            autocatalytic reaction.
        FITZHUGH_NAGUMO: The FitzHugh-Nagumo model, a simplified model of
            neuron behavior.
        GRAY_SCOTT: The Gray-Scott model, a reaction-diffusion system that can
            produce various patterns.
    """

    BRUSSELATOR = 1
    FITZHUGH_NAGUMO = 2
    GRAY_SCOTT = 3

The simulator is built on Pydantic's BaseModel, which provides a lot of powerful tools to automatically validate the model before running it. This actually blocked development for a bit, as I had difficulty figuring out the best way to define ReactionType without going through the effort of defining my own custom types. The solution I ended up going with was foreshadowed above - by storing all relevant information in an Enum, I can just save the Enum field within my simulator (for JSON dumping/loading and validation) and use that information to load the actual reaction functions into private fields that are not serialized/saved. Using this approach, a simulation run can be reliably recreated using the validated simulator parameters and the initial conditions for both species.
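The Enum-plus-private-fields pattern described above can be sketched roughly as follows, assuming Pydantic v2. SimulatorSketch, REACTIONS, and the two scalar rate functions are hypothetical names made up for this example; the real simulator has more fields and works on arrays.

```python
from enum import Enum
from typing import Any, Callable, Optional

from pydantic import BaseModel, PrivateAttr

RateFn = Callable[[float, float, float, float], float]


class ReactionType(Enum):
    BRUSSELATOR = 1
    FITZHUGH_NAGUMO = 2
    GRAY_SCOTT = 3


def gray_scott_a(a: float, b: float, alpha: float, beta: float) -> float:
    return -a * b * b + alpha * (1.0 - a)


def gray_scott_b(a: float, b: float, alpha: float, beta: float) -> float:
    return a * b * b - (alpha + beta) * b


# Hypothetical registry mapping each Enum member to its pair of rate functions.
# Only Gray-Scott is registered in this sketch; other members raise KeyError.
REACTIONS: dict[ReactionType, tuple[RateFn, RateFn]] = {
    ReactionType.GRAY_SCOTT: (gray_scott_a, gray_scott_b),
}


class SimulatorSketch(BaseModel):
    # Public fields are validated and serialized.
    reaction_type: ReactionType
    alpha: float
    beta: float

    # Private attributes are neither validated nor included in dumps.
    _reaction_a: Optional[RateFn] = PrivateAttr(default=None)
    _reaction_b: Optional[RateFn] = PrivateAttr(default=None)

    def model_post_init(self, context: Any, /) -> None:
        # Resolve the callables from the validated Enum field after validation.
        self._reaction_a, self._reaction_b = REACTIONS[self.reaction_type]


sim = SimulatorSketch(reaction_type=ReactionType.GRAY_SCOTT, alpha=0.04, beta=0.06)
payload = sim.model_dump_json()  # contains only reaction_type, alpha, beta
restored = SimulatorSketch.model_validate_json(payload)
```

Because the callables are re-derived from the Enum on load, the JSON payload stays small and a restored simulator behaves like the original.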

Visualization

Once the simulation was completed, I needed some way to visualize the evolution of both species without storing every 2D frame. For this reason, I added the ability to specify the total number of frames to keep for visualization when running the simulation.

The backbone for visualizing these simulations is video generation with Plotly. When a simulation completes, the run() method outputs the a/b 3D arrays (the first dimension being frames over time) as well as the total number of time steps calculated.

These values can then be used as input to create animations within the tattoo_plotter.
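To make the frame bookkeeping concrete, here is a minimal sketch of a run loop that keeps only n_frames evenly spaced snapshots. It assumes an explicit Euler update and a five-point Laplacian with periodic boundaries; laplacian and run_sketch are hypothetical names, and the package's actual run() may differ in both numerics and signature.

```python
import numpy as np


def laplacian(field: np.ndarray, dx: float) -> np.ndarray:
    """Five-point Laplacian with periodic (wrap-around) boundaries."""
    return (
        np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
        + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
        - 4.0 * field
    ) / dx**2


def run_sketch(a, b, reaction_a, reaction_b, d_a, d_b, alpha, beta,
               dt, dx, n_steps, n_frames):
    """Integrate for n_steps and return (frames_a, frames_b, n_steps).

    Only n_frames snapshots are stored (assumes n_frames <= n_steps + 1).
    """
    # Step indices at which a snapshot is kept, including the first and last.
    save_at = set(np.linspace(0, n_steps, n_frames, dtype=int).tolist())
    frames_a, frames_b = [], []
    for step in range(n_steps + 1):
        if step in save_at:
            frames_a.append(a.copy())
            frames_b.append(b.copy())
        if step == n_steps:
            break
        # Explicit Euler: diffusion term plus reaction term for each species.
        da = d_a * laplacian(a, dx) + reaction_a(a, b, alpha, beta)
        db = d_b * laplacian(b, dx) + reaction_b(a, b, alpha, beta)
        a = a + dt * da
        b = b + dt * db
    # First dimension of each output array is the frame index.
    return np.stack(frames_a), np.stack(frames_b), n_steps


# Usage: pure diffusion (zero reaction) of a random field on a 16x16 grid.
rng = np.random.default_rng(0)
a0, b0 = rng.random((16, 16)), rng.random((16, 16))
no_reaction = lambda a, b, alpha, beta: np.zeros_like(a)
frames_a, frames_b, steps = run_sketch(
    a0, b0, no_reaction, no_reaction, d_a=0.1, d_b=0.05,
    alpha=0.0, beta=0.0, dt=0.1, dx=1.0, n_steps=100, n_frames=5,
)
```

Memory then scales with n_frames rather than with the number of time steps, which is what makes long runs practical to animate.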

One limitation of this visualization method, however, is that rerunning these functions requires instantiating a new simulator and initial conditions and running each configuration individually. I think that exploring this phase space would be much more interesting if the simulations had an easy-to-use GUI - enter Streamlit. Building a Streamlit app is incredibly easy, and thanks to the well-typed simulator, visualizing the parameters (with hints) became trivial. I developed a Streamlit app that populates a set of parameters based on the field inputs and lets a user run the simulator without writing code, both for the pre-populated defaults and for any parameters the user is interested in. Switching between different reaction types lets the user easily explore parameter spaces. Additionally, brief descriptions of how the different reaction types were set up (and their physical interpretations) were added to help remind the user (including myself) what each parameter means. The app can be found at https://rdapp.misharubanov.com/. If for some reason my server goes down, Streamlit allows hosting of a few apps - you can find this app at https://rdtattoos.streamlit.app.

Details on implementation of the app backend and self-hosting the app can be found in the infrastructure section.

Default Generation

Defaults were scraped from various parts of the web as well as some interesting parameters I found when exploring the simulation myself. These defaults were stored as instances of the general RDSimulator.

Future development (hopefully)

I would love to set up a SQL database that automatically logs all runs, so that a user exploring new parameter spaces doesn't have to keep track of the parameters used. A method the user could call to get a nice representation of the parameter and output space would be nice too!

For visualization, a lot could be improved in the GUI - from optimizing simulation times to removing dead space around the animations. Hopefully I can do this at some point!
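One possible shape for such a run log, sketched with the standard-library sqlite3 module. Nothing here exists in the package yet: the table layout and the open_run_log, log_run, and list_runs helpers are all hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_run_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the run log; the default is an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS runs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            created_at TEXT NOT NULL,
            reaction_type TEXT NOT NULL,
            params_json TEXT NOT NULL
        )
        """
    )
    conn.commit()
    return conn


def log_run(conn: sqlite3.Connection, reaction_type: str, params: dict) -> int:
    """Store one run's parameters as JSON and return its row id."""
    cur = conn.execute(
        "INSERT INTO runs (created_at, reaction_type, params_json) VALUES (?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            reaction_type,
            json.dumps(params, sort_keys=True),
        ),
    )
    conn.commit()
    return cur.lastrowid


def list_runs(conn: sqlite3.Connection, reaction_type: str) -> list:
    """Return (id, params) pairs for every logged run of one reaction type."""
    rows = conn.execute(
        "SELECT id, params_json FROM runs WHERE reaction_type = ? ORDER BY id",
        (reaction_type,),
    ).fetchall()
    return [(run_id, json.loads(params)) for run_id, params in rows]


conn = open_run_log()
first_id = log_run(conn, "GRAY_SCOTT", {"alpha": 0.04, "beta": 0.06})
log_run(conn, "BRUSSELATOR", {"alpha": 1.0, "beta": 3.0})
gray_scott_runs = list_runs(conn, "GRAY_SCOTT")
```

Since the simulator already serializes to JSON, the params column could hold the simulator's own dump, so any logged run could be reloaded directly.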

Infrastructure

To develop a reliable and clean environment for running this code, I chose to use Docker to deploy both a Jupyter notebook and a Streamlit app. This had the added benefit of working easily with my self-hosted stack. To install the same environment within my Jupyter notebook, I reorganized the code to be pip-installable as a local package:

FROM quay.io/jupyter/base-notebook
WORKDIR /home/jovyan/work
COPY . .
RUN pip install --no-cache-dir -r requirements.txt && \
    pip install -e .
EXPOSE 8888
ENV JUPYTER_ENABLE_LAB=yes
# Single-quote the shell string so that sh does not expand the "$..." segments of the hash.
RUN jupyter notebook --generate-config && \
    echo 'c.NotebookApp.password = "argon2:$argon2id$v=19$m=10240,t=10,p=8$W/YoaK1HmUWy4ITRrMArwg$3s7sDEPluB2Cp97GURa1+cs0L4/uNruSYE9uXjjYxCA"' >> /home/jovyan/.jupyter/jupyter_notebook_config.py
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

For added security, I created a hashed password that prevents anyone on the internet from running rogue Python within these notebooks (send me a message if you want to try it out!).

For Streamlit, the environment looks similar, except that instead of opening a notebook, the container runs streamlit run... to launch the app.

For deploying to a DNS subdomain, I used Coolify with a GitHub webhook to automatically redeploy this public repository (more details here).

Next steps

Now that I have the infrastructure in place to really explore these patterns, I want to focus on using them as a symbol for my relationship to science and engineering. My next post will be exploring the personal significance that these patterns have had over the last ~7 years of my life.

My Self Hosted Setup

2025-05-18T18:00:45+00:00

Hetzner

I use Hetzner for hosting. Their CAX21 ARM server (4 vCPUs, 8GB RAM, 80GB NVMe) runs €6.49/month—far cheaper than AWS or DigitalOcean. Server setup and Coolify installation were straightforward.

The web interface handles provisioning, daily backups, and monitoring. Scaling is fast—I doubled my storage and RAM in minutes for an extra $3/month. DDoS protection and firewall management come included.

Coolify

Coolify is a self-hostable Heroku/Vercel alternative. I use it to deploy from GitHub, manage Docker containers, and spin up databases. It’s a thin wrapper around Docker and Traefik with a UI—sometimes opaque, and wiring apps through Cloudflare has been painful—but it works.

Networking

Traefik: Coolify’s default proxy. Handles SSL via Let’s Encrypt and routes traffic.

Cloudflare: DNS, DDoS protection, caching. Cloudflare tunnels let me expose services without a public IP (setup guide).

Monitoring

Uptime Kuma: Lightweight uptime monitoring with alerts.

Glance: Mobile-friendly dashboard for RSS, weather, and container status.

Duplicati: Backups to cloud services, WebDAV, and local storage.

Dozzle: Container logs and resource usage.

ntfy: Push notifications when services go down.

Beszel: Server monitoring with alerts for CPU/memory spikes. Combined with Dozzle, debugging is quick.

The stack

Code in GitHub → Coolify pulls and deploys containers → Traefik routes traffic with auto-SSL → Uptime Kuma and Glance monitor everything.

Apps

Audiobookshelf: My ebook/audiobook library. I set this up after learning Amazon was removing Kindle download functionality. Good web reader and Android app.

Immich: Self-hosted Google Photos alternative with facial recognition. Still in progress—TB+ of photos means I need a NAS before this makes financial sense.

Gramps Web: A family tree app I set up after visiting ancestry in Uzbekistan. Multi-user with editor/guest roles, hot and cold backups.

Vikunja: Simple todo app. Not as polished as commercial options, but I own my data.