<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-02-01T17:17:20+00:00</updated><id>/feed.xml</id><title type="html">Misha Rubanov</title><subtitle>Molecular, software, and robotics programmer interested in building novel tools to enable scaled, robust, and modelable experiments. I enjoy being on the interface between science and engineering, be it via experimental/SOP design, building custom UIs for other scientists, or improving developer interfaces for custom SDKs. My experience in molecular programming has given me a unique lens in which to apply a programming mindset towards problems in a wide variety of fields.</subtitle><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><entry><title type="html">From Classification to Connectivity: Generating Liquid-Handling Workflows with GNNs</title><link href="/lab-automation/ml/graphs/2025/09/02/graph-generation.html" rel="alternate" type="text/html" title="From Classification to Connectivity: Generating Liquid-Handling Workflows with GNNs" /><published>2025-09-02T00:00:00+00:00</published><updated>2025-09-02T00:00:00+00:00</updated><id>/lab-automation/ml/graphs/2025/09/02/graph-generation</id><content type="html" xml:base="/lab-automation/ml/graphs/2025/09/02/graph-generation.html"><![CDATA[<p>Modern Graph Neural Networks (GNNs) excel at predicting node and edge attributes, but many practical problems require changing the graph itself. Liquid-handling protocols are a prime example: executing a protocol means constructing a sequence of transfers that incrementally grows a workflow graph while respecting hard physical and chemical constraints. This post sketches how to adapt GNNs from attribute prediction to connectivity generation for liquid handling.</p>

<h3 id="background-gnn-building-blocks">Background: GNN building blocks</h3>
<p>If you are new to GNNs, I recommend Distill’s clear, interactive overview, <a href="https://distill.pub/2021/gnn-intro/#table">“A Gentle Introduction to Graph Neural Networks”</a>. It explains message passing, aggregation and update functions, and how information flows over graph structure.</p>

<p>Key takeaways for our setting:</p>
<ul>
  <li>GNNs operate over nodes, edges, and (optionally) global features via permutation-invariant aggregation.</li>
  <li>Information is localized and propagates by hops, which is useful for enforcing local constraints (e.g., volumes in a well or sterility across edges) while letting global features (e.g., temperature, instrument state) influence decisions.</li>
</ul>
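<p>To make the two takeaways concrete, here is a minimal sketch of one message-passing round with sum aggregation in plain Python. The function name, the toy update rule (averaging a node's own features with its aggregated messages), and the three-node graph are assumptions for illustration; a real GNN layer would use learned weights.</p>

```python
from collections import defaultdict

def message_passing_step(node_feats, edges):
    """One round of sum-aggregation message passing.

    node_feats: dict mapping node id to a list of floats.
    edges: list of (source, target) pairs; messages flow source to target.
    """
    dim = len(next(iter(node_feats.values())))
    inbox = defaultdict(lambda: [0.0] * dim)
    for src, dst in edges:
        # Sum is permutation-invariant: edge order cannot change the result.
        inbox[dst] = [a + b for a, b in zip(inbox[dst], node_feats[src])]
    # Toy update rule: average own features with the aggregated message.
    return {
        node: [0.5 * (x + m) for x, m in zip(feats, inbox[node])]
        for node, feats in node_feats.items()
    }

feats = {"reservoir": [1.0, 0.0], "A1": [0.0, 1.0], "A2": [0.0, 0.0]}
edges = [("reservoir", "A1"), ("A1", "A2"), ("reservoir", "A2")]
one_hop = message_passing_step(feats, edges)
two_hop = message_passing_step(one_hop, edges)  # information now travels two hops
```

<p>Reversing the edge list gives the same result, which is the permutation invariance mentioned above. Stacking rounds widens each node's receptive field by one hop per round.</p>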

<h3 id="problem-framing-protocols-as-dynamic-dags">Problem framing: protocols as dynamic DAGs</h3>
<p>We represent a liquid-handling protocol as a dynamic, directed acyclic multigraph (DAG) over time steps t = 0..T. The DAG constraint is fundamental: liquid handling operations cannot create cycles because time flows forward and reagents cannot be “un-mixed” or “un-transferred.”</p>

<ul>
  <li><strong>Nodes</strong>: containers/wells, instrument resources (tips, reservoirs), intermediate mixtures, deck locations.</li>
  <li><strong>Edges</strong>: operations such as aspirate, dispense, transfer, mix; edges carry attributes (volume, liquid identity, tip id, speed, timestamp).</li>
  <li><strong>State</strong>: per-node attributes (current volume, composition, contamination risk), per-edge attributes (history), and global context (robot capabilities, timing, environment).</li>
  <li><strong>DAG Invariant</strong>: Every edge (u,v) must satisfy timestamp(u) &lt; timestamp(v), ensuring no cycles can exist.</li>
</ul>

<p>At each step we add one or more edges that modify node states while preserving the DAG property. Generation ends when goals are satisfied (target mixture, plate layout) or no valid actions remain.</p>
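<p>One way to hold this structure in code is sketched below. It assumes nodes are versioned container states (for example <code>"A1@t1"</code>), each with a timestamp, and that edges are stored in a list so that parallel edges are allowed. The class and field names are hypothetical.</p>

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowGraph:
    node_time: dict = field(default_factory=dict)  # node id -> timestamp
    edges: list = field(default_factory=list)      # (u, v, attrs) triples

    def add_node(self, node_id, timestamp):
        self.node_time[node_id] = timestamp

    def add_edge(self, u, v, **attrs):
        # DAG invariant from the text: timestamp(u) must precede timestamp(v).
        if not self.node_time[u] < self.node_time[v]:
            raise ValueError(f"edge {u}->{v} violates temporal ordering")
        self.edges.append((u, v, attrs))

g = WorkflowGraph()
g.add_node("stock@t0", 0)
g.add_node("A1@t1", 1)
g.add_edge("stock@t0", "A1@t1", op="transfer", volume_ul=50.0)
```

<p>Because every edge points strictly forward in time, any path through the graph has strictly increasing timestamps, so no cycle can form and no separate cycle search is needed under this representation.</p>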

<h3 id="why-attribute-models-arent-enough">Why attribute models aren’t enough</h3>
<p>Attribute-focused GNNs answer questions like “what volume should be in well A?” given a fixed graph. Workflow synthesis instead requires proposing valid connectivity changes. We need a model that:</p>
<ul>
  <li>Proposes the next operation (an edge or set of edges), including its endpoints and attributes.</li>
  <li>Respects constraints (conservation of volume, sterility, capacity, tool availability).</li>
  <li>Plans long-horizon sequences to reach targets.</li>
</ul>

<h3 id="modeling-approaches-for-connectivity-generation">Modeling approaches for connectivity generation</h3>
<p>There are several viable families, which can be combined:</p>

<p>1) <strong>Autoregressive edge generation</strong></p>
<ul>
  <li>Factorize p(protocol) into a sequence of edge additions. At each step, encode the current graph with a message-passing GNN; a policy head scores candidate (source node, op type, target node, attributes).</li>
  <li>Sampling: top-k or beam search with constraint masking.</li>
  <li>Benefits: precise control and easy constraint integration. Drawback: errors compound over long horizons.</li>
</ul>

<p>2) <strong>Diffusion or denoising over graphs</strong></p>
<ul>
  <li>Start from a noisy action plan and denoise into a valid workflow using a GNN denoiser conditioned on task goals and instrument state.</li>
  <li>Useful for exploring diverse plans; requires careful constraint handling during sampling.</li>
</ul>

<p>3) <strong>Constraint-satisfying planning with neural guidance</strong></p>
<ul>
  <li>Use a symbolic planner or MILP/CP-SAT to enforce hard physical constraints; use a GNN to learn heuristics (cost-to-go, action priors) that guide the search.</li>
  <li>Strong guarantees with improved speed/quality from learning.</li>
</ul>

<p>4) <strong>Imitation + RL hybrid</strong></p>
<ul>
  <li>Train the policy with behavior cloning on historical protocols; fine-tune with RL using a simulator that implements lab physics and penalties for invalid or unsafe actions.</li>
</ul>

<p>Let me elaborate on each approach with technical details and examples:</p>

<h4 id="1-autoregressive-edge-generation">1. Autoregressive Edge Generation</h4>
<p>This approach treats protocol generation as a sequence modeling problem where each step adds one or more edges to the growing workflow DAG.</p>

<p><strong>Architecture Details:</strong></p>
<ul>
  <li><strong>Encoder</strong>: A k-layer message-passing GNN (e.g., GraphSAGE, GAT) processes the current graph state</li>
  <li><strong>Policy Head</strong>: Multi-output network that predicts:
    <ul>
      <li>Operation type (categorical: transfer, mix, aspirate, dispense, etc.)</li>
      <li>Source node selection (pointer network over available nodes)</li>
      <li>Target node selection (pointer network with feasibility masking)</li>
      <li>Continuous attributes (volume, speed, temperature) with bounded distributions</li>
      <li><strong>Timestamp assignment</strong>: Critical for maintaining DAG property</li>
    </ul>
  </li>
</ul>

<p><strong>Training Strategy:</strong></p>
<ul>
  <li>Teacher forcing: use ground truth previous actions during training</li>
  <li>Scheduled sampling: gradually transition from teacher forcing to autoregressive generation</li>
  <li>Constraint masking: zero out probabilities for invalid actions (e.g., transferring from empty wells, creating cycles)</li>
  <li><strong>DAG enforcement</strong>: Ensure timestamp(u) &lt; timestamp(v) for all new edges (u,v)</li>
</ul>

<p><strong>Example Implementation:</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Simplified pseudocode
</span><span class="k">def</span> <span class="nf">generate_step</span><span class="p">(</span><span class="n">current_graph</span><span class="p">,</span> <span class="n">goal_embedding</span><span class="p">):</span>
    <span class="c1"># Encode current state
</span>    <span class="n">node_embeddings</span> <span class="o">=</span> <span class="n">gnn_encoder</span><span class="p">(</span><span class="n">current_graph</span><span class="p">)</span>
    
    <span class="c1"># Predict next operation
</span>    <span class="n">op_type</span> <span class="o">=</span> <span class="n">op_classifier</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">,</span> <span class="n">goal_embedding</span><span class="p">)</span>
    
    <span class="c1"># Select source and target with pointer networks
</span>    <span class="n">source_logits</span> <span class="o">=</span> <span class="n">source_pointer</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">,</span> <span class="n">op_type</span><span class="p">)</span>
    <span class="n">target_logits</span> <span class="o">=</span> <span class="n">target_pointer</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">,</span> <span class="n">op_type</span><span class="p">,</span> <span class="n">source_logits</span><span class="p">)</span>
    
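    <span class="c1"># Apply feasibility masks (boolean tensors, True = feasible); -inf logits get zero probability after softmax
</span>    <span class="n">source_logits</span> <span class="o">=</span> <span class="n">source_logits</span><span class="p">.</span><span class="n">masked_fill</span><span class="p">(</span><span class="o">~</span><span class="n">source_feasibility_mask</span><span class="p">,</span> <span class="nb">float</span><span class="p">(</span><span class="s">"-inf"</span><span class="p">))</span>
    <span class="n">target_logits</span> <span class="o">=</span> <span class="n">target_logits</span><span class="p">.</span><span class="n">masked_fill</span><span class="p">(</span><span class="o">~</span><span class="n">target_feasibility_mask</span><span class="p">,</span> <span class="nb">float</span><span class="p">(</span><span class="s">"-inf"</span><span class="p">))</span>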
    
    <span class="c1"># Sample and return action
</span>    <span class="k">return</span> <span class="n">sample_action</span><span class="p">(</span><span class="n">op_type</span><span class="p">,</span> <span class="n">source_logits</span><span class="p">,</span> <span class="n">target_logits</span><span class="p">)</span>
</code></pre></div></div>
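<p>The masking step deserves a closer look in isolation. Masks act on logits before the softmax: an infeasible logit set to negative infinity receives probability exactly zero, whereas a logit of zero would still receive probability mass. The sketch below uses only the standard library; the function names and the example mask are assumptions.</p>

```python
import math
import random

def masked_softmax(logits, feasible):
    """Softmax over logits where infeasible entries get probability exactly 0."""
    masked = [l if ok else -math.inf for l, ok in zip(logits, feasible)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("no feasible action")
    exps = [math.exp(l - peak) for l in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 0.5, -1.0, 3.0]
feasible = [True, True, False, False]  # e.g. the last two wells are empty
probs = masked_softmax(logits, feasible)
choice = sample_index(probs, random.Random(0))
```

<p>Note that the highest raw logit belongs to an infeasible action here, yet it can never be sampled. Raising an error when nothing is feasible gives the generator a clean signal to stop.</p>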

<p><strong>Advantages:</strong></p>
<ul>
  <li>Direct control over generation process</li>
  <li>Easy to integrate hard constraints via masking</li>
  <li>Interpretable: each action is explicit and traceable</li>
  <li>Can use beam search for better planning</li>
</ul>

<p><strong>Challenges:</strong></p>
<ul>
  <li>Sequential nature limits parallelization</li>
  <li>Error accumulation over long sequences</li>
  <li>Requires careful curriculum learning for complex protocols</li>
</ul>

<h4 id="2-diffusiondenoising-over-graphs">2. Diffusion/Denoising over Graphs</h4>
<p>This approach starts from a noisy, potentially invalid workflow and progressively denoises it into a valid protocol DAG.</p>

<p><strong>Architecture Details:</strong></p>
<ul>
  <li><strong>Noise Schedule</strong>: Gradually add noise to a target protocol over T timesteps</li>
  <li><strong>Denoiser</strong>: GNN that predicts the clean protocol given noisy input and timestep</li>
  <li><strong>Conditioning</strong>: Task goals, instrument constraints, and current lab state</li>
  <li><strong>DAG Structure</strong>: Denoiser must learn to respect temporal ordering constraints</li>
</ul>

<p><strong>Training Process:</strong></p>
<ul>
  <li>Start with clean protocols from dataset</li>
  <li>Add Gaussian noise over T timesteps</li>
  <li>Train denoiser to predict original protocol given noisy version and timestep</li>
  <li>Use classifier-free guidance for better control</li>
</ul>
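<p>The noising half of this process can be sketched as follows. The sketch assumes the protocol has been relaxed to a vector of continuous values (for example, edge indicators as floats) and uses a linear beta schedule with the closed-form forward step from standard Gaussian diffusion. The function names and the four-entry example are hypothetical.</p>

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(clean, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
        for x in clean
    ]

alpha_bars = make_alpha_bars(1000)
clean_edges = [1.0, 0.0, 0.0, 1.0]  # relaxed edge indicators (assumption)
t = 500
noisy_edges = add_noise(clean_edges, alpha_bars[t], random.Random(0))
# A training example is then (noisy_edges, t) as input and clean_edges as target.
```

<p>Since workflow graphs are discrete, a categorical noise process over edge types is a natural alternative to this continuous relaxation.</p>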

<p><strong>Example Implementation:</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">diffusion_generate</span><span class="p">(</span><span class="n">goal_embedding</span><span class="p">,</span> <span class="n">protocol_shape</span><span class="p">,</span> <span class="n">num_steps</span><span class="o">=</span><span class="mi">1000</span><span class="p">):</span>
    <span class="c1"># Start with pure noise
</span>    <span class="n">noisy_protocol</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">randn</span><span class="p">(</span><span class="n">protocol_shape</span><span class="p">)</span>
    
    <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">reversed</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">num_steps</span><span class="p">)):</span>
        <span class="c1"># Predict clean protocol
</span>        <span class="n">predicted_clean</span> <span class="o">=</span> <span class="n">denoiser</span><span class="p">(</span><span class="n">noisy_protocol</span><span class="p">,</span> <span class="n">t</span><span class="p">,</span> <span class="n">goal_embedding</span><span class="p">)</span>
        
        <span class="c1"># Apply constraint projection
</span>        <span class="n">predicted_clean</span> <span class="o">=</span> <span class="n">project_to_constraints</span><span class="p">(</span><span class="n">predicted_clean</span><span class="p">)</span>
        
        <span class="c1"># Denoise step
</span>        <span class="n">noisy_protocol</span> <span class="o">=</span> <span class="n">denoise_step</span><span class="p">(</span><span class="n">noisy_protocol</span><span class="p">,</span> <span class="n">predicted_clean</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span>
    
    <span class="k">return</span> <span class="n">noisy_protocol</span>
</code></pre></div></div>

<p><strong>Advantages:</strong></p>
<ul>
  <li>Can generate diverse, high-quality protocols</li>
  <li>Natural handling of global structure</li>
  <li>Good at exploring solution space</li>
</ul>

<p><strong>Challenges:</strong></p>
<ul>
  <li>Requires many denoising steps</li>
  <li>Constraint satisfaction during sampling is tricky</li>
  <li>Less interpretable than autoregressive methods</li>
</ul>

<h4 id="3-constraint-satisfying-planning-with-neural-guidance">3. Constraint-Satisfying Planning with Neural Guidance</h4>
<p>This hybrid approach combines symbolic planning with learned heuristics from GNNs.</p>

<p><strong>Architecture Details:</strong></p>
<ul>
  <li><strong>Symbolic Planner</strong>: MILP/CP-SAT solver that enforces hard constraints including DAG structure</li>
  <li><strong>Neural Heuristic</strong>: GNN that learns to guide the search efficiently</li>
  <li><strong>Integration</strong>: Use GNN predictions to order search branches or estimate costs</li>
  <li><strong>Temporal Constraints</strong>: Solver ensures timestamp ordering and prevents cycles</li>
</ul>

<p><strong>Training Strategy:</strong></p>
<ul>
  <li>Collect planning traces from solver</li>
  <li>Train GNN to predict:
    <ul>
      <li>Action priors (which operations are likely useful)</li>
      <li>Cost-to-go estimates (how expensive remaining steps will be)</li>
      <li>Constraint violation likelihood</li>
    </ul>
  </li>
</ul>
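<p>To show where a cost-to-go estimate plugs into search, here is a small best-first (A*-style) sketch. The <code>cost_to_go</code> callable stands in for a trained GNN; in this toy it is a hand-written lower bound. The problem itself (filling a well to 120 µL with 10 or 50 µL transfers) and all names are assumptions for illustration.</p>

```python
import heapq

def guided_search(start, is_goal, successors, cost_to_go):
    """A*-style best-first search; cost_to_go stands in for a learned estimate."""
    frontier = [(cost_to_go(start), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        for action, nxt, step_cost in successors(state):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + cost_to_go(nxt), new_cost, nxt, plan + [action]),
                )
    return None

# Toy problem (hypothetical): fill a well to 120 uL using 10 or 50 uL transfers.
TARGET_UL = 120

def successors(volume):
    for step in (10, 50):
        if volume + step <= TARGET_UL:    # hard constraint: never overshoot
            yield step, volume + step, 1  # every transfer costs one action

def cost_to_go(volume):
    # Stand-in for a GNN estimate: a lower bound on the remaining transfers.
    return (TARGET_UL - volume) / 50

plan = guided_search(0, lambda v: v == TARGET_UL, successors, cost_to_go)
```

<p>The hard constraint lives in <code>successors</code>, so the heuristic can only change the order in which valid states are explored. A poor estimate costs time, not validity.</p>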

<p><strong>Example Implementation:</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">guided_planning</span><span class="p">(</span><span class="n">initial_state</span><span class="p">,</span> <span class="n">goal</span><span class="p">):</span>
    <span class="c1"># Encode state with GNN
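</span>    <span class="n">state_embedding</span> <span class="o">=</span> <span class="n">gnn_encoder</span><span class="p">(</span><span class="n">initial_state</span><span class="p">)</span>
    
    <span class="c1"># Derive search heuristics from the embedding (prior_head and value_head are hypothetical learned heads)
</span>    <span class="n">action_priors</span> <span class="o">=</span> <span class="n">prior_head</span><span class="p">(</span><span class="n">state_embedding</span><span class="p">)</span>
    <span class="n">cost_estimate</span> <span class="o">=</span> <span class="n">value_head</span><span class="p">(</span><span class="n">state_embedding</span><span class="p">)</span>
    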
    <span class="c1"># Use in symbolic planner
</span>    <span class="n">plan</span> <span class="o">=</span> <span class="n">symbolic_planner</span><span class="p">(</span>
        <span class="n">initial_state</span><span class="p">,</span> 
        <span class="n">goal</span><span class="p">,</span>
        <span class="n">action_heuristics</span><span class="o">=</span><span class="n">action_priors</span><span class="p">,</span>
        <span class="n">cost_heuristics</span><span class="o">=</span><span class="n">cost_estimate</span>
    <span class="p">)</span>
    
    <span class="k">return</span> <span class="n">plan</span>
</code></pre></div></div>

<p><strong>Advantages:</strong></p>
<ul>
  <li>Guaranteed constraint satisfaction</li>
  <li>Can leverage decades of optimization research</li>
  <li>Neural guidance improves search efficiency</li>
</ul>

<p><strong>Challenges:</strong></p>
<ul>
  <li>Requires symbolic constraint modeling</li>
  <li>Integration complexity</li>
  <li>May be slower than pure neural approaches</li>
</ul>

<h4 id="4-imitation--rl-hybrid">4. Imitation + RL Hybrid</h4>
<p>This approach starts with supervised learning on historical data and refines with reinforcement learning.</p>

<p><strong>Architecture Details:</strong></p>
<ul>
  <li><strong>Behavior Cloning</strong>: Initial training on expert demonstrations</li>
  <li><strong>RL Fine-tuning</strong>: Use simulator rewards to improve policy</li>
  <li><strong>Hybrid Loss</strong>: Combine imitation and RL objectives</li>
</ul>
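<p>A hybrid loss can be as simple as a weighted sum of the two objectives. The sketch below works on scalar log-probabilities to keep it dependency-free; in practice these would be tensors with gradients. The function name, the REINFORCE-style surrogate for the RL term, and the <code>bc_weight</code> parameter are assumptions.</p>

```python
import math

def hybrid_loss(log_prob_expert, log_prob_sampled, advantage, bc_weight=0.5):
    """Weighted sum of a behaviour-cloning term and a policy-gradient term.

    log_prob_expert: log pi(a_expert | s) for a demonstrated action.
    log_prob_sampled: log pi(a | s) for an action sampled in the simulator.
    advantage: return minus baseline for the sampled action.
    """
    bc_term = -log_prob_expert               # negative log-likelihood
    rl_term = -advantage * log_prob_sampled  # REINFORCE-style surrogate
    return bc_weight * bc_term + (1.0 - bc_weight) * rl_term

loss = hybrid_loss(math.log(0.5), math.log(0.25), advantage=2.0, bc_weight=0.7)
```

<p>Annealing <code>bc_weight</code> from 1 toward 0 over training is one way to move from pure imitation to reward-driven fine-tuning.</p>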

<p><strong>Training Phases:</strong></p>
<ol>
  <li><strong>Phase 1</strong>: Train policy to mimic expert protocols</li>
  <li><strong>Phase 2</strong>: Use RL to optimize for efficiency, robustness, and safety</li>
  <li><strong>Phase 3</strong>: Iterative improvement with human feedback</li>
</ol>

<p><strong>Example Implementation:</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">hybrid_training</span><span class="p">(</span><span class="n">expert_data</span><span class="p">,</span> <span class="n">simulator</span><span class="p">,</span> <span class="n">num_episodes</span><span class="o">=</span><span class="mi">1000</span><span class="p">):</span>
    <span class="c1"># Phase 1: Behavior cloning
</span>    <span class="n">policy</span> <span class="o">=</span> <span class="n">train_imitation</span><span class="p">(</span><span class="n">expert_data</span><span class="p">)</span>
    
    <span class="c1"># Phase 2: RL fine-tuning
</span>    <span class="k">for</span> <span class="n">episode</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_episodes</span><span class="p">):</span>
        <span class="n">state</span> <span class="o">=</span> <span class="n">simulator</span><span class="p">.</span><span class="n">reset</span><span class="p">()</span>
        <span class="n">done</span> <span class="o">=</span> <span class="bp">False</span>
        
        <span class="k">while</span> <span class="ow">not</span> <span class="n">done</span><span class="p">:</span>
            <span class="n">action</span> <span class="o">=</span> <span class="n">policy</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
            <span class="n">next_state</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span> <span class="o">=</span> <span class="n">simulator</span><span class="p">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>
            
            <span class="c1"># Update policy with RL algorithm (e.g., PPO)
</span>            <span class="n">policy</span><span class="p">.</span><span class="n">update</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">action</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">next_state</span><span class="p">)</span>
            <span class="n">state</span> <span class="o">=</span> <span class="n">next_state</span>
    
    <span class="k">return</span> <span class="n">policy</span>
</code></pre></div></div>

<p><strong>Advantages:</strong></p>
<ul>
  <li>Starts with reasonable behavior</li>
  <li>Can optimize for complex objectives</li>
  <li>Combines the strengths of supervised learning and RL</li>
</ul>

<p><strong>Challenges:</strong></p>
<ul>
  <li>Requires high-quality simulator</li>
  <li>RL training can be unstable</li>
  <li>Need to balance imitation vs. exploration</li>
</ul>

<h4 id="combining-approaches">Combining Approaches</h4>
<p>These approaches are complementary, and a practical system can combine them:</p>
<ul>
  <li>Use autoregressive generation for high-level structure</li>
  <li>Apply diffusion for local refinements</li>
  <li>Use symbolic planning for critical safety constraints</li>
  <li>Fine-tune with RL for efficiency optimization</li>
</ul>

<h3 id="action-parameterization-and-constraint-masking">Action parameterization and constraint masking</h3>
<p>To keep the action space tractable:</p>
<ul>
  <li>Predict operation type first (transfer/mix/thermo step), then endpoints via pointer networks over node embeddings, then continuous attributes (e.g., volume) with bounded distributions.</li>
  <li><strong>Timestamp assignment</strong>: Each new operation must have a timestamp greater than all previous operations to maintain DAG structure.</li>
  <li>Apply masks derived from current state: available tips, sufficient volume at source, capacity at destination, deck reachability, sterility compatibility.</li>
  <li><strong>DAG constraint masking</strong>: Prevent edges that would create cycles or violate temporal ordering.</li>
  <li>Enforce invariants by projection (e.g., clip volumes to feasible ranges) and by rejecting invalid samples.</li>
</ul>
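<p>For the continuous attributes, bounding and projection can be sketched as two small functions: one squashes an unconstrained network output into a fixed range, the other clips a proposed volume to what the current state allows. The names and the specific limits (source volume, remaining target capacity, tip capacity) are assumptions.</p>

```python
import math

def bounded_volume(raw, low, high):
    """Squash an unconstrained network output into [low, high] with a sigmoid."""
    return low + (high - low) / (1.0 + math.exp(-raw))

def project_volume(volume, source_volume, target_volume, target_capacity, tip_capacity):
    """Clip a proposed transfer volume to what the current state allows."""
    feasible_max = min(source_volume, target_capacity - target_volume, tip_capacity)
    return max(0.0, min(volume, feasible_max))

proposed = bounded_volume(raw=0.0, low=1.0, high=300.0)  # midpoint of the range
executed = project_volume(
    proposed,
    source_volume=200.0,
    target_volume=250.0,
    target_capacity=300.0,
    tip_capacity=200.0,
)
```

<p>Here the target well has only 50 µL of headroom, so the projection reduces the proposed volume to fit. If the projected volume is too small to be useful, rejecting the sample is the safer choice.</p>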

<h3 id="state-representation-details">State representation details</h3>
<ul>
  <li>Node features: current volume, composition embedding (e.g., learned from reagent ontology), temperature, contamination flags, container geometry.</li>
  <li>Edge features: operation type, executed volume, time since last action, tip id.</li>
  <li>Global features: assay goal embedding, allowed instruments, remaining time budget.</li>
  <li>Temporal encoding: append step index or use recurrent GNN layers to retain history.</li>
</ul>

<h3 id="training-signals-and-datasets">Training signals and datasets</h3>
<ul>
  <li>Imitation data: parse existing protocols (e.g., from OT-2, Hamilton scripts) into action graphs.</li>
  <li>Supervision: next-edge classification, endpoint selection, and attribute regression; auxiliary losses for state prediction (e.g., next-node volume) improve stability.</li>
  <li>Negative sampling: generate near-miss actions (slightly over volume, wrong tip) to sharpen constraint awareness.</li>
</ul>
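<p>Near-miss negatives can be produced by perturbing a valid action just past a constraint boundary. The sketch below assumes actions are plain dictionaries and covers the two cases named above; the function name, the <code>label</code> field, and the 5% overshoot are hypothetical.</p>

```python
import random

def near_miss_negatives(action, source_volume, tips, rng, overshoot=0.05):
    """Perturb one valid transfer into invalid variants that are close to valid."""
    negatives = []
    # Slightly more volume than the source well holds.
    negatives.append(
        dict(action, volume=source_volume * (1.0 + overshoot), label="over_volume")
    )
    # Same transfer with a different tip than the one assigned.
    wrong_tips = [t for t in tips if t != action["tip_id"]]
    if wrong_tips:
        negatives.append(dict(action, tip_id=rng.choice(wrong_tips), label="wrong_tip"))
    return negatives

valid = {"source": "A1", "target": "A2", "volume": 50.0, "tip_id": "tip_0"}
negs = near_miss_negatives(
    valid, source_volume=200.0, tips=["tip_0", "tip_1"], rng=random.Random(0)
)
```

<p>Each negative is a copy, so the valid action is left untouched and can be paired with its negatives in a contrastive or classification loss.</p>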

<h3 id="evaluation-metrics">Evaluation metrics</h3>
<ul>
  <li>Validity: fraction of generated steps passing all constraints; zero spills/overflows; no cross-contamination.</li>
  <li>Goal satisfaction: assay success rate, target composition accuracy.</li>
  <li>Efficiency: action count, total time, tip consumption, deck moves.</li>
  <li>Diversity: unique valid workflows per goal.</li>
  <li>Sim-to-real: execution success on hardware with minimal edits.</li>
</ul>
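<p>Two of these metrics are simple enough to sketch directly. The example assumes a caller-supplied <code>is_valid</code> predicate and treats a workflow as an ordered list of hashable actions; both function names are hypothetical.</p>

```python
def validity_rate(steps, is_valid):
    """Fraction of generated steps that pass every constraint check."""
    if not steps:
        return 0.0
    return sum(1 for s in steps if is_valid(s)) / len(steps)

def diversity(workflows):
    """Number of distinct workflows, comparing them as ordered action tuples."""
    return len({tuple(w) for w in workflows})

steps = [{"volume": 50.0}, {"volume": 250.0}, {"volume": 10.0}, {"volume": 20.0}]
rate = validity_rate(steps, lambda s: s["volume"] <= 200.0)
unique = diversity([["t50", "t50"], ["t50", "t50"], ["t10", "t50"]])
```

<p>Comparing workflows as ordered tuples counts reorderings as distinct. Whether two orderings of the same transfers should count as one workflow depends on the assay.</p>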

<h3 id="minimal-prototype-sketch">Minimal prototype sketch</h3>
<p>Outline of an autoregressive generator with constraint masking:</p>

<ol>
  <li>Encode current graph with a k-layer message-passing GNN.</li>
  <li>Predict operation type with a masked classifier.</li>
  <li>Select source and target nodes using pointer heads over node embeddings with feasibility masks.</li>
  <li><strong>Assign timestamp</strong>: Ensure new operation timestamp &gt; all previous timestamps to maintain DAG.</li>
  <li>Regress attributes (volume, speed) with bounded outputs; project to valid ranges.</li>
  <li><strong>Validate DAG</strong>: Check that no cycles would be created by the new edge.</li>
  <li>Update node states and append the new edge; repeat until done.</li>
  <li>Use beam search for better plans; score beams by learned value function + hard constraint checks.</li>
</ol>
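<p>The beam-search step at the end of this outline can be sketched on its own. In the sketch, <code>expand</code> is assumed to yield only actions that already passed the hard constraint checks, and <code>score</code> stands in for the learned value function. The toy task (reach 120 µL with 10 or 50 µL transfers) and all names are hypothetical.</p>

```python
def beam_search(initial_state, expand, score, is_done, beam_width=3, max_steps=20):
    """Keep the beam_width highest-scoring partial plans at every step."""
    beam = [(initial_state, [])]
    for _ in range(max_steps):
        candidates = []
        for state, plan in beam:
            if is_done(state):
                candidates.append((state, plan))  # carry finished plans forward
                continue
            for action, nxt in expand(state):
                candidates.append((nxt, plan + [action]))
        if not candidates:
            break
        candidates.sort(key=lambda item: score(item[0]), reverse=True)
        beam = candidates[:beam_width]
        if all(is_done(state) for state, _ in beam):
            break
    return beam[0]

TARGET_UL = 120

def expand(volume):
    for step in (10, 50):
        if volume + step <= TARGET_UL:  # hard constraint: never overshoot
            yield step, volume + step

final_volume, plan = beam_search(
    0,
    expand,
    score=lambda v: -abs(TARGET_UL - v),
    is_done=lambda v: v == TARGET_UL,
)
```

<p>A wider beam trades compute for a lower chance of pruning the plan that would have turned out best. With <code>beam_width=1</code> this reduces to greedy decoding.</p>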

<h3 id="concrete-example-variable-serial-dilution-network-discovery">Concrete Example: Variable Serial Dilution Network Discovery</h3>

<p>Let’s implement a simplified version of the autoregressive approach for discovering the network required for a variable serial dilution on a 96-well plate. This example shows how DAG constraints and connectivity generation work in practice.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">torch</span>
<span class="kn">import</span> <span class="nn">torch.nn</span> <span class="k">as</span> <span class="n">nn</span>
<span class="kn">import</span> <span class="nn">torch.nn.functional</span> <span class="k">as</span> <span class="n">F</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Tuple</span><span class="p">,</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<span class="kn">from</span> <span class="nn">enum</span> <span class="kn">import</span> <span class="n">Enum</span>

<span class="c1"># Define operation types
</span><span class="k">class</span> <span class="nc">OpType</span><span class="p">(</span><span class="n">Enum</span><span class="p">):</span>
    <span class="n">ASPIRATE</span> <span class="o">=</span> <span class="s">"aspirate"</span>
    <span class="n">DISPENSE</span> <span class="o">=</span> <span class="s">"dispense"</span>
    <span class="n">TRANSFER</span> <span class="o">=</span> <span class="s">"transfer"</span>
    <span class="n">MIX</span> <span class="o">=</span> <span class="s">"mix"</span>

<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">LiquidState</span><span class="p">:</span>
    <span class="s">"""Represents the state of liquid in a well"""</span>
    <span class="n">volume</span><span class="p">:</span> <span class="nb">float</span>  <span class="c1"># Current volume in μL
</span>    <span class="n">concentration</span><span class="p">:</span> <span class="nb">float</span>  <span class="c1"># Concentration of target compound
</span>    <span class="n">contamination_risk</span><span class="p">:</span> <span class="nb">float</span>  <span class="c1"># Risk of cross-contamination (0-1)
</span>    <span class="n">timestamp</span><span class="p">:</span> <span class="nb">int</span>  <span class="c1"># When this state was created
</span>
<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Operation</span><span class="p">:</span>
    <span class="s">"""Represents a liquid handling operation"""</span>
    <span class="n">op_type</span><span class="p">:</span> <span class="n">OpType</span>
    <span class="n">source_well</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span>  <span class="c1"># None for aspirate from reservoir
</span>    <span class="n">target_well</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">volume</span><span class="p">:</span> <span class="nb">float</span>
    <span class="n">timestamp</span><span class="p">:</span> <span class="nb">int</span>
    <span class="n">tip_id</span><span class="p">:</span> <span class="nb">str</span>

<span class="k">class</span> <span class="nc">DilutionWorkflow</span><span class="p">:</span>
    <span class="s">"""Represents the current state of a dilution workflow"""</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">plate_rows</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">8</span><span class="p">,</span> <span class="n">plate_cols</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">12</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">plate_rows</span> <span class="o">=</span> <span class="n">plate_rows</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">plate_cols</span> <span class="o">=</span> <span class="n">plate_cols</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">wells</span> <span class="o">=</span> <span class="p">{}</span>  <span class="c1"># well_id -&gt; LiquidState
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">operations</span> <span class="o">=</span> <span class="p">[]</span>  <span class="c1"># List of Operation objects
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">available_tips</span> <span class="o">=</span> <span class="p">[</span><span class="sa">f</span><span class="s">"tip_</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">"</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">)]</span>  <span class="c1"># 8-channel pipette
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">=</span> <span class="mi">0</span>
        
        <span class="c1"># Initialize source wells (e.g., A1 has stock solution)
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="s">"A1"</span><span class="p">]</span> <span class="o">=</span> <span class="n">LiquidState</span><span class="p">(</span><span class="n">volume</span><span class="o">=</span><span class="mf">200.0</span><span class="p">,</span> <span class="n">concentration</span><span class="o">=</span><span class="mf">1000.0</span><span class="p">,</span> 
                                      <span class="n">contamination_risk</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">timestamp</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    
    <span class="k">def</span> <span class="nf">get_well_id</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">row</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">col</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">str</span><span class="p">:</span>
        <span class="s">"""Convert row/col to well ID (e.g., A1, B2)"""</span>
        <span class="k">return</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="nb">chr</span><span class="p">(</span><span class="mi">65</span> <span class="o">+</span> <span class="n">row</span><span class="p">)</span><span class="si">}{</span><span class="n">col</span> <span class="o">+</span> <span class="mi">1</span><span class="si">}</span><span class="s">"</span>
    
    <span class="k">def</span> <span class="nf">can_transfer</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">source</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">target</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">volume</span><span class="p">:</span> <span class="nb">float</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
        <span class="s">"""Check if a transfer operation is valid"""</span>
        <span class="k">if</span> <span class="n">source</span> <span class="ow">not</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span> <span class="ow">or</span> <span class="n">target</span> <span class="ow">not</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">False</span>
        
        <span class="n">source_state</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">source</span><span class="p">]</span>
        <span class="n">target_state</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">target</span><span class="p">]</span>
        
        <span class="c1"># Check volume constraints
</span>        <span class="k">if</span> <span class="n">source_state</span><span class="p">.</span><span class="n">volume</span> <span class="o">&lt;</span> <span class="n">volume</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">False</span>
        
        <span class="c1"># Check contamination risk (can't transfer to contaminated wells)
</span>        <span class="k">if</span> <span class="n">target_state</span><span class="p">.</span><span class="n">contamination_risk</span> <span class="o">&gt;</span> <span class="mf">0.5</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">False</span>
        
        <span class="c1"># Check DAG constraint: source must be created before target
</span>        <span class="k">if</span> <span class="n">source_state</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">&gt;=</span> <span class="n">target_state</span><span class="p">.</span><span class="n">timestamp</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">False</span>
        
        <span class="k">return</span> <span class="bp">True</span>
    
    <span class="k">def</span> <span class="nf">add_operation</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">op</span><span class="p">:</span> <span class="n">Operation</span><span class="p">):</span>
        <span class="s">"""Add an operation and update well states"""</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">operations</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">op</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">timestamp</span><span class="p">,</span> <span class="n">op</span><span class="p">.</span><span class="n">timestamp</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span>
        
        <span class="k">if</span> <span class="n">op</span><span class="p">.</span><span class="n">op_type</span> <span class="o">==</span> <span class="n">OpType</span><span class="p">.</span><span class="n">TRANSFER</span><span class="p">:</span>
            <span class="c1"># Update source well
</span>            <span class="k">if</span> <span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">:</span>
                <span class="n">source_state</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">]</span>
                <span class="n">source_state</span><span class="p">.</span><span class="n">volume</span> <span class="o">-=</span> <span class="n">op</span><span class="p">.</span><span class="n">volume</span>
                <span class="n">source_state</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">timestamp</span>
            
            <span class="c1"># Update target well
</span>            <span class="k">if</span> <span class="n">op</span><span class="p">.</span><span class="n">target_well</span> <span class="ow">not</span> <span class="ow">in</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">:</span>
                <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">target_well</span><span class="p">]</span> <span class="o">=</span> <span class="n">LiquidState</span><span class="p">(</span>
                    <span class="n">volume</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">concentration</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> 
                    <span class="n">contamination_risk</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">timestamp</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">timestamp</span>
                <span class="p">)</span>
            
            <span class="n">target_state</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">target_well</span><span class="p">]</span>
            <span class="n">target_state</span><span class="p">.</span><span class="n">volume</span> <span class="o">+=</span> <span class="n">op</span><span class="p">.</span><span class="n">volume</span>
            
            <span class="c1"># Calculate new concentration (weighted average)
</span>            <span class="k">if</span> <span class="n">target_state</span><span class="p">.</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
                <span class="k">if</span> <span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">:</span>
                    <span class="n">source_conc</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">].</span><span class="n">concentration</span>
                    <span class="n">target_state</span><span class="p">.</span><span class="n">concentration</span> <span class="o">=</span> <span class="p">(</span>
                        <span class="p">(</span><span class="n">target_state</span><span class="p">.</span><span class="n">volume</span> <span class="o">-</span> <span class="n">op</span><span class="p">.</span><span class="n">volume</span><span class="p">)</span> <span class="o">*</span> <span class="n">target_state</span><span class="p">.</span><span class="n">concentration</span> <span class="o">+</span>
                        <span class="n">op</span><span class="p">.</span><span class="n">volume</span> <span class="o">*</span> <span class="n">source_conc</span>
                    <span class="p">)</span> <span class="o">/</span> <span class="n">target_state</span><span class="p">.</span><span class="n">volume</span>
                
                <span class="c1"># Update contamination risk
</span>                <span class="k">if</span> <span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">:</span>
                    <span class="n">source_risk</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">source_well</span><span class="p">].</span><span class="n">contamination_risk</span>
                    <span class="n">target_state</span><span class="p">.</span><span class="n">contamination_risk</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span>
                        <span class="n">target_state</span><span class="p">.</span><span class="n">contamination_risk</span><span class="p">,</span> <span class="n">source_risk</span>
                    <span class="p">)</span>

<span class="k">class</span> <span class="nc">DilutionNetworkGenerator</span><span class="p">:</span>
    <span class="s">"""Generates dilution networks using a simplified GNN-like approach"""</span>
    
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">hidden_dim</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">64</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">hidden_dim</span> <span class="o">=</span> <span class="n">hidden_dim</span>
        
        <span class="c1"># Simple MLPs for different prediction tasks
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">op_type_predictor</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="n">Sequential</span><span class="p">(</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="n">hidden_dim</span><span class="p">),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">OpType</span><span class="p">))</span>
        <span class="p">)</span>
        
        <span class="bp">self</span><span class="p">.</span><span class="n">source_predictor</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="n">Sequential</span><span class="p">(</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span> <span class="o">+</span> <span class="mi">3</span><span class="p">,</span> <span class="n">hidden_dim</span><span class="p">),</span>  <span class="c1"># node embedding + 3 global features
</span>            <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
        <span class="p">)</span>
        
        <span class="bp">self</span><span class="p">.</span><span class="n">target_predictor</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="n">Sequential</span><span class="p">(</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span> <span class="o">+</span> <span class="mi">3</span><span class="p">,</span> <span class="n">hidden_dim</span><span class="p">),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
        <span class="p">)</span>
        
        <span class="bp">self</span><span class="p">.</span><span class="n">volume_predictor</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="n">Sequential</span><span class="p">(</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="n">hidden_dim</span><span class="p">),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">hidden_dim</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
            <span class="n">nn</span><span class="p">.</span><span class="n">Sigmoid</span><span class="p">()</span>  <span class="c1"># Output 0-1, scale to actual volume
</span>        <span class="p">)</span>
    
    <span class="k">def</span> <span class="nf">encode_workflow_state</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">workflow</span><span class="p">:</span> <span class="n">DilutionWorkflow</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">]:</span>
        <span class="s">"""Encode the current workflow state into node and global embeddings"""</span>
        <span class="c1"># Simple encoding: concatenate well features
</span>        <span class="n">well_features</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="n">well_ids</span> <span class="o">=</span> <span class="p">[]</span>
        
        <span class="k">for</span> <span class="n">well_id</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">:</span>
            <span class="n">state</span> <span class="o">=</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">well_id</span><span class="p">]</span>
            <span class="n">features</span> <span class="o">=</span> <span class="p">[</span>
                <span class="n">state</span><span class="p">.</span><span class="n">volume</span> <span class="o">/</span> <span class="mf">200.0</span><span class="p">,</span>  <span class="c1"># Normalize volume
</span>                <span class="n">state</span><span class="p">.</span><span class="n">concentration</span> <span class="o">/</span> <span class="mf">1000.0</span><span class="p">,</span>  <span class="c1"># Normalize concentration
</span>                <span class="n">state</span><span class="p">.</span><span class="n">contamination_risk</span><span class="p">,</span>
                <span class="n">state</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">/</span> <span class="mf">100.0</span>  <span class="c1"># Normalize timestamp
</span>            <span class="p">]</span>
            <span class="n">well_features</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">features</span><span class="p">)</span>
            <span class="n">well_ids</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">well_id</span><span class="p">)</span>
        
        <span class="c1"># Pad to fixed size for batch processing
</span>        <span class="n">max_wells</span> <span class="o">=</span> <span class="n">workflow</span><span class="p">.</span><span class="n">plate_rows</span> <span class="o">*</span> <span class="n">workflow</span><span class="p">.</span><span class="n">plate_cols</span>
        <span class="k">while</span> <span class="nb">len</span><span class="p">(</span><span class="n">well_features</span><span class="p">)</span> <span class="o">&lt;</span> <span class="n">max_wells</span><span class="p">:</span>
            <span class="n">well_features</span><span class="p">.</span><span class="n">append</span><span class="p">([</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">])</span>
            <span class="n">well_ids</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="s">""</span><span class="p">)</span>
        
        <span class="c1"># Global context: goal concentration, remaining wells to fill
</span>        <span class="n">target_concentration</span> <span class="o">=</span> <span class="mf">100.0</span>  <span class="c1"># Example target
</span>        <span class="n">remaining_wells</span> <span class="o">=</span> <span class="n">max_wells</span> <span class="o">-</span> <span class="nb">len</span><span class="p">([</span><span class="n">w</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">.</span><span class="n">values</span><span class="p">()</span> <span class="k">if</span> <span class="n">w</span><span class="p">.</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">])</span>
        
        <span class="n">global_features</span> <span class="o">=</span> <span class="p">[</span>
            <span class="n">target_concentration</span> <span class="o">/</span> <span class="mf">1000.0</span><span class="p">,</span>
            <span class="n">remaining_wells</span> <span class="o">/</span> <span class="n">max_wells</span><span class="p">,</span>
            <span class="n">workflow</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">/</span> <span class="mf">100.0</span>
        <span class="p">]</span>
        
        <span class="k">return</span> <span class="p">{</span>
            <span class="s">'well_features'</span><span class="p">:</span> <span class="n">torch</span><span class="p">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">well_features</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="p">.</span><span class="n">float32</span><span class="p">),</span>
            <span class="s">'well_ids'</span><span class="p">:</span> <span class="n">well_ids</span><span class="p">,</span>
            <span class="s">'global_features'</span><span class="p">:</span> <span class="n">torch</span><span class="p">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">global_features</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>
        <span class="p">}</span>
    
    <span class="k">def</span> <span class="nf">predict_next_operation</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">workflow</span><span class="p">:</span> <span class="n">DilutionWorkflow</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Operation</span><span class="p">:</span>
        <span class="s">"""Predict the next operation using the current workflow state"""</span>
        <span class="c1"># Encode current state
</span>        <span class="n">encoded</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">encode_workflow_state</span><span class="p">(</span><span class="n">workflow</span><span class="p">)</span>
        <span class="n">well_features</span> <span class="o">=</span> <span class="n">encoded</span><span class="p">[</span><span class="s">'well_features'</span><span class="p">]</span>
        <span class="n">global_features</span> <span class="o">=</span> <span class="n">encoded</span><span class="p">[</span><span class="s">'global_features'</span><span class="p">]</span>
        
        <span class="c1"># Simple "GNN-like" stand-in: project each well's features with an untrained random matrix (no message passing)
</span>        <span class="n">node_embeddings</span> <span class="o">=</span> <span class="n">well_features</span> <span class="o">@</span> <span class="n">torch</span><span class="p">.</span><span class="n">randn</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">hidden_dim</span><span class="p">)</span>  <span class="c1"># Simplified
</span>        
        <span class="c1"># Predict operation type
</span>        <span class="n">global_context</span> <span class="o">=</span> <span class="n">global_features</span><span class="p">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">expand</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">combined_features</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">cat</span><span class="p">([</span><span class="n">node_embeddings</span><span class="p">,</span> <span class="n">global_context</span><span class="p">],</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
        
        <span class="n">op_type_logits</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">op_type_predictor</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>
        <span class="n">op_type</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">OpType</span><span class="p">)[</span><span class="n">op_type_logits</span><span class="p">.</span><span class="n">argmax</span><span class="p">().</span><span class="n">item</span><span class="p">()]</span>
        
        <span class="c1"># Predict source well (with masking)
</span>        <span class="n">source_scores</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">source_predictor</span><span class="p">(</span><span class="n">combined_features</span><span class="p">).</span><span class="n">squeeze</span><span class="p">()</span>
        <span class="n">source_mask</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">source_scores</span><span class="p">)</span>
        
        <span class="c1"># Mask: only wells with liquid can be sources
</span>        <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">well_id</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">encoded</span><span class="p">[</span><span class="s">'well_ids'</span><span class="p">]):</span>
            <span class="k">if</span> <span class="n">well_id</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span> <span class="ow">and</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">well_id</span><span class="p">].</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
                <span class="n">source_mask</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mf">1.0</span>
        
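        <span class="n">source_scores</span> <span class="o">=</span> <span class="n">source_scores</span><span class="p">.</span><span class="n">masked_fill</span><span class="p">(</span><span class="n">source_mask</span> <span class="o">==</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">float</span><span class="p">(</span><span class="s">'-inf'</span><span class="p">))</span>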
        <span class="n">source_idx</span> <span class="o">=</span> <span class="n">source_scores</span><span class="p">.</span><span class="n">argmax</span><span class="p">().</span><span class="n">item</span><span class="p">()</span>
        <span class="n">source_well</span> <span class="o">=</span> <span class="n">encoded</span><span class="p">[</span><span class="s">'well_ids'</span><span class="p">][</span><span class="n">source_idx</span><span class="p">]</span> <span class="k">if</span> <span class="n">source_mask</span><span class="p">[</span><span class="n">source_idx</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="k">else</span> <span class="bp">None</span>
        
        <span class="c1"># Predict target well (with masking)
</span>        <span class="n">target_scores</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">target_predictor</span><span class="p">(</span><span class="n">combined_features</span><span class="p">).</span><span class="n">squeeze</span><span class="p">()</span>
        <span class="n">target_mask</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">target_scores</span><span class="p">)</span>
        
        <span class="c1"># Mask: prefer empty wells or wells that need dilution
</span>        <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">well_id</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">encoded</span><span class="p">[</span><span class="s">'well_ids'</span><span class="p">]):</span>
            <span class="k">if</span> <span class="n">well_id</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span> <span class="ow">and</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">[</span><span class="n">well_id</span><span class="p">].</span><span class="n">volume</span> <span class="o">&lt;</span> <span class="mi">50</span><span class="p">:</span>
                <span class="n">target_mask</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mf">1.0</span>
        
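        <span class="n">target_scores</span> <span class="o">=</span> <span class="n">target_scores</span><span class="p">.</span><span class="n">masked_fill</span><span class="p">(</span><span class="n">target_mask</span> <span class="o">==</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">float</span><span class="p">(</span><span class="s">'-inf'</span><span class="p">))</span>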
        <span class="n">target_idx</span> <span class="o">=</span> <span class="n">target_scores</span><span class="p">.</span><span class="n">argmax</span><span class="p">().</span><span class="n">item</span><span class="p">()</span>
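        <span class="n">target_well</span> <span class="o">=</span> <span class="n">encoded</span><span class="p">[</span><span class="s">'well_ids'</span><span class="p">][</span><span class="n">target_idx</span><span class="p">]</span> <span class="k">if</span> <span class="n">target_mask</span><span class="p">[</span><span class="n">target_idx</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="k">else</span> <span class="bp">None</span>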
        
        <span class="c1"># Predict volume
</span>        <span class="n">volume_fraction</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">volume_predictor</span><span class="p">(</span><span class="n">node_embeddings</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>
        <span class="n">volume</span> <span class="o">=</span> <span class="n">volume_fraction</span><span class="p">.</span><span class="n">item</span><span class="p">()</span> <span class="o">*</span> <span class="mf">50.0</span>  <span class="c1"># Sigmoid output in (0, 1), scaled to a 0-50 μL range
</span>        
        <span class="c1"># Ensure DAG constraint: timestamp must be greater than all previous
</span>        <span class="n">timestamp</span> <span class="o">=</span> <span class="n">workflow</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">+</span> <span class="mi">1</span>
        
        <span class="c1"># Select available tip
</span>        <span class="n">tip_id</span> <span class="o">=</span> <span class="n">workflow</span><span class="p">.</span><span class="n">available_tips</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>  <span class="c1"># Simplified
</span>        
        <span class="k">return</span> <span class="n">Operation</span><span class="p">(</span>
            <span class="n">op_type</span><span class="o">=</span><span class="n">op_type</span><span class="p">,</span>
            <span class="n">source_well</span><span class="o">=</span><span class="n">source_well</span><span class="p">,</span>
            <span class="n">target_well</span><span class="o">=</span><span class="n">target_well</span><span class="p">,</span>
            <span class="n">volume</span><span class="o">=</span><span class="n">volume</span><span class="p">,</span>
            <span class="n">timestamp</span><span class="o">=</span><span class="n">timestamp</span><span class="p">,</span>
            <span class="n">tip_id</span><span class="o">=</span><span class="n">tip_id</span>
        <span class="p">)</span>

<span class="k">def</span> <span class="nf">generate_dilution_workflow</span><span class="p">(</span><span class="n">target_concentrations</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">float</span><span class="p">],</span> 
                             <span class="n">max_operations</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">50</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">DilutionWorkflow</span><span class="p">:</span>
    <span class="s">"""Generate a complete dilution workflow"""</span>
    <span class="n">workflow</span> <span class="o">=</span> <span class="n">DilutionWorkflow</span><span class="p">()</span>
    <span class="n">generator</span> <span class="o">=</span> <span class="n">DilutionNetworkGenerator</span><span class="p">()</span>
    
    <span class="n">operations_count</span> <span class="o">=</span> <span class="mi">0</span>
    
    <span class="k">while</span> <span class="n">operations_count</span> <span class="o">&lt;</span> <span class="n">max_operations</span><span class="p">:</span>
        <span class="c1"># Check if we've achieved our goals
</span>        <span class="n">filled_wells</span> <span class="o">=</span> <span class="p">[</span><span class="n">w</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">.</span><span class="n">values</span><span class="p">()</span> <span class="k">if</span> <span class="n">w</span><span class="p">.</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">]</span>
        <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">filled_wells</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="nb">len</span><span class="p">(</span><span class="n">target_concentrations</span><span class="p">):</span>
            <span class="c1"># Check if concentrations are close enough
</span>            <span class="n">achieved_concentrations</span> <span class="o">=</span> <span class="p">[</span><span class="n">w</span><span class="p">.</span><span class="n">concentration</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">filled_wells</span><span class="p">[:</span><span class="nb">len</span><span class="p">(</span><span class="n">target_concentrations</span><span class="p">)]]</span>
            <span class="k">if</span> <span class="nb">all</span><span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">ac</span> <span class="o">-</span> <span class="n">tc</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">50</span> <span class="k">for</span> <span class="n">ac</span><span class="p">,</span> <span class="n">tc</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">achieved_concentrations</span><span class="p">,</span> <span class="n">target_concentrations</span><span class="p">)):</span>
                <span class="k">break</span>
        
        <span class="c1"># Predict next operation
</span>        <span class="k">try</span><span class="p">:</span>
            <span class="n">next_op</span> <span class="o">=</span> <span class="n">generator</span><span class="p">.</span><span class="n">predict_next_operation</span><span class="p">(</span><span class="n">workflow</span><span class="p">)</span>
            
            <span class="c1"># Validate operation
</span>            <span class="k">if</span> <span class="n">next_op</span><span class="p">.</span><span class="n">source_well</span> <span class="ow">and</span> <span class="n">next_op</span><span class="p">.</span><span class="n">target_well</span><span class="p">:</span>
                <span class="k">if</span> <span class="n">workflow</span><span class="p">.</span><span class="n">can_transfer</span><span class="p">(</span><span class="n">next_op</span><span class="p">.</span><span class="n">source_well</span><span class="p">,</span> <span class="n">next_op</span><span class="p">.</span><span class="n">target_well</span><span class="p">,</span> <span class="n">next_op</span><span class="p">.</span><span class="n">volume</span><span class="p">):</span>
                    <span class="n">workflow</span><span class="p">.</span><span class="n">add_operation</span><span class="p">(</span><span class="n">next_op</span><span class="p">)</span>
                    <span class="n">operations_count</span> <span class="o">+=</span> <span class="mi">1</span>
                    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Added operation: </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">op_type</span><span class="p">.</span><span class="n">value</span><span class="si">}</span><span class="s"> </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">volume</span><span class="si">:</span><span class="p">.</span><span class="mi">1</span><span class="n">f</span><span class="si">}</span><span class="s">μL "</span>
                          <span class="sa">f</span><span class="s">"from </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">source_well</span><span class="si">}</span><span class="s"> to </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">target_well</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
                <span class="k">else</span><span class="p">:</span>
                    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Invalid operation: </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">op_type</span><span class="p">.</span><span class="n">value</span><span class="si">}</span><span class="s"> </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">volume</span><span class="si">:</span><span class="p">.</span><span class="mi">1</span><span class="n">f</span><span class="si">}</span><span class="s">μL "</span>
                          <span class="sa">f</span><span class="s">"from </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">source_well</span><span class="si">}</span><span class="s"> to </span><span class="si">{</span><span class="n">next_op</span><span class="p">.</span><span class="n">target_well</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="c1"># Handle aspirate/dispense operations
</span>                <span class="n">workflow</span><span class="p">.</span><span class="n">add_operation</span><span class="p">(</span><span class="n">next_op</span><span class="p">)</span>
                <span class="n">operations_count</span> <span class="o">+=</span> <span class="mi">1</span>
                
        <span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
            <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Error generating operation: </span><span class="si">{</span><span class="n">e</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
            <span class="k">break</span>
    
    <span class="k">return</span> <span class="n">workflow</span>

<span class="c1"># Example usage
</span><span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="c1"># Generate a workflow for 8 different concentrations
</span>    <span class="n">target_concentrations</span> <span class="o">=</span> <span class="p">[</span><span class="mi">800</span><span class="p">,</span> <span class="mi">600</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mi">25</span><span class="p">,</span> <span class="mf">12.5</span><span class="p">]</span>
    
    <span class="k">print</span><span class="p">(</span><span class="s">"Generating dilution workflow..."</span><span class="p">)</span>
    <span class="n">workflow</span> <span class="o">=</span> <span class="n">generate_dilution_workflow</span><span class="p">(</span><span class="n">target_concentrations</span><span class="p">)</span>
    
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="se">\n</span><span class="s">Generated </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">workflow</span><span class="p">.</span><span class="n">operations</span><span class="p">)</span><span class="si">}</span><span class="s"> operations"</span><span class="p">)</span>
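    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Final workflow has </span><span class="si">{</span><span class="nb">sum</span><span class="p">(</span><span class="n">w</span><span class="p">.</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">.</span><span class="n">values</span><span class="p">())</span><span class="si">}</span><span class="s"> wells with liquid"</span><span class="p">)</span>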
    
    <span class="c1"># Show final concentrations
</span>    <span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">Final well states:"</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">well_id</span><span class="p">,</span> <span class="n">state</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">workflow</span><span class="p">.</span><span class="n">wells</span><span class="p">.</span><span class="n">items</span><span class="p">()):</span>
        <span class="k">if</span> <span class="n">state</span><span class="p">.</span><span class="n">volume</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">well_id</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">state</span><span class="p">.</span><span class="n">volume</span><span class="si">:</span><span class="p">.</span><span class="mi">1</span><span class="n">f</span><span class="si">}</span><span class="s">μL, </span><span class="si">{</span><span class="n">state</span><span class="p">.</span><span class="n">concentration</span><span class="si">:</span><span class="p">.</span><span class="mi">1</span><span class="n">f</span><span class="si">}</span><span class="s"> ng/μL"</span><span class="p">)</span>
    
    <span class="c1"># Verify DAG property
</span>    <span class="n">timestamps</span> <span class="o">=</span> <span class="p">[</span><span class="n">op</span><span class="p">.</span><span class="n">timestamp</span> <span class="k">for</span> <span class="n">op</span> <span class="ow">in</span> <span class="n">workflow</span><span class="p">.</span><span class="n">operations</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">timestamps</span> <span class="o">==</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">timestamps</span><span class="p">):</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">✓ DAG constraint satisfied: all operations are temporally ordered"</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">✗ DAG constraint violated: operations are not temporally ordered"</span><span class="p">)</span>
</code></pre></div></div>

<p>This example demonstrates:</p>

<ol>
  <li><strong>DAG Enforcement</strong>: Each operation gets a timestamp greater than all previous operations</li>
  <li><strong>Constraint Masking</strong>: Source wells must have liquid, target wells should be empty or need dilution</li>
  <li><strong>State Updates</strong>: Well volumes and concentrations are updated after each operation</li>
  <li><strong>Validation</strong>: Operations are checked for feasibility before execution</li>
  <li><strong>Goal-Oriented Generation</strong>: The workflow continues until target concentrations are achieved</li>
</ol>
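<p>The state-update step is essentially a mass balance. A minimal sketch, assuming a hypothetical <code class="language-plaintext highlighter-rouge">Well</code> record and <code class="language-plaintext highlighter-rouge">transfer</code> helper (neither name comes from the code above) and ideal mixing with additive volumes:</p>

```python
from dataclasses import dataclass


@dataclass
class Well:
    """Hypothetical well state: volume in μL, concentration in ng/μL."""
    volume: float = 0.0
    concentration: float = 0.0


def transfer(source: Well, target: Well, volume: float) -> None:
    """Move `volume` from source to target and update both states in place."""
    if volume <= 0 or volume > source.volume:
        raise ValueError("transfer volume must be positive and available in the source")
    # Mass balance: solute mass (ng) = concentration * volume.
    moved_mass = source.concentration * volume
    total_mass = target.concentration * target.volume + moved_mass
    source.volume -= volume
    target.volume += volume
    # Assumes ideal mixing: the new concentration is total mass over total volume.
    target.concentration = total_mass / target.volume


stock = Well(volume=200.0, concentration=1000.0)
diluted = Well(volume=80.0, concentration=0.0)  # pre-filled with diluent
transfer(stock, diluted, 20.0)
print(diluted)  # 20 μL of 1000 ng/μL into 80 μL of diluent gives 100 μL at 200 ng/μL
```

<p>The source concentration is unchanged by a transfer; only its volume drops, which is why the feasibility check needs nothing more than the source volume.</p>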

<p>The generator uses a simplified “GNN-like” approach with:</p>
<ul>
  <li>Node embeddings based on well features (volume, concentration, contamination, timestamp)</li>
  <li>Global context (target concentration, remaining wells, current timestamp)</li>
  <li>Masked prediction for source/target selection</li>
  <li>Constraint validation to maintain physical and temporal consistency</li>
</ul>
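<p>Masked prediction can be sketched without any ML framework: scores for infeasible wells are dropped before normalizing, so an invalid well can never be selected no matter how highly the model rates it. The well IDs, scores, and the <code class="language-plaintext highlighter-rouge">masked_softmax</code> helper below are illustrative assumptions, not the generator's actual internals:</p>

```python
import math


def masked_softmax(scores, mask):
    """Softmax over `scores`; entries whose mask is False get probability 0."""
    valid = [s for s, ok in zip(scores, mask) if ok]
    if not valid:
        raise ValueError("no valid candidates")
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if ok else 0.0 for s, ok in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical per-well scores from the model and current volumes (μL).
well_ids = ["A1", "A2", "A3", "A4"]
source_scores = [0.2, 1.5, 3.0, 0.7]
volumes = [100.0, 50.0, 0.0, 0.0]

# Constraint: a source well must contain liquid.
source_mask = [v > 0 for v in volumes]
probs = masked_softmax(source_scores, source_mask)
best_source = well_ids[max(range(len(probs)), key=probs.__getitem__)]
print(best_source)  # A2: A3 scores highest but is empty, so it is masked out
```

<p>The same helper would apply to target selection with a different mask, for example wells that are empty or still above their target concentration.</p>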

<h3 id="why-gnns-fit-this-problem">Why GNNs fit this problem</h3>
<p>Message passing aligns with local physical constraints while still capturing long-range goals through multiple hops and global features, as articulated in the <a href="https://distill.pub/2021/gnn-intro/#table">Distill overview</a>. The core difference here is that we use the GNN not to label a fixed graph but to drive the creation of new connectivity under constraints.</p>

<h3 id="outlook">Outlook</h3>
<p>Bringing workflow generation to practice requires: a realistic simulator with rich constraints, curated protocol datasets, and careful interfaces to planners and robots. The architectural pieces above provide a path to move from classification to connectivity.</p>

<p>References:</p>
<ul>
  <li>Sanchez-Lengeling, B., Reif, E., Pearce, A., Wiltschko, A. “A Gentle Introduction to Graph Neural Networks,” Distill (2021). <a href="https://distill.pub/2021/gnn-intro/#table">Distill article</a>.</li>
</ul>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><category term="lab-automation" /><category term="ml" /><category term="graphs" /><category term="graph neural networks" /><category term="laboratory automation" /><category term="liquid handling" /><category term="graph generation" /><summary type="html"><![CDATA[Modern Graph Neural Networks (GNNs) excel at predicting node and edge attributes, but many practical problems require changing the graph itself. Liquid-handling protocols are a prime example: executing a protocol means constructing a sequence of transfers that incrementally grows a workflow graph while respecting hard physical and chemical constraints. This post sketches how to adapt GNNs from attribute prediction to connectivity generation for liquid handling.]]></summary></entry><entry><title type="html">Getting a reaction-diffusion tattoo</title><link href="/2025/07/19/time-for-tattoos.html" rel="alternate" type="text/html" title="Getting a reaction-diffusion tattoo" /><published>2025-07-19T18:00:45+00:00</published><updated>2025-07-19T18:00:45+00:00</updated><id>/2025/07/19/time-for-tattoos</id><content type="html" xml:base="/2025/07/19/time-for-tattoos.html"><![CDATA[<h2 id="reasoning">Reasoning</h2>

<p>Part of the reason for improving/building the reaction-diffusion simulator was to have the code needed to explore the next tattoo I plan on getting. I thought of this tattoo after a challenging vipassana course (although it was many years in the making). The tattoo represents my evolving relationship with engineering and science. It will be a Turing pattern, generated via the <a href="https://en.wikipedia.org/wiki/FitzHugh%E2%80%93Nagumo_model">FitzHugh-Nagumo equations</a>, which is, by design, different in length scale from the patterns I generated in my <a href="https://www.sciencedirect.com/science/article/abs/pii/S2590238525002516">final paper</a> during my PhD. The tattoo will use pointillism as a way of deconstructing the patterns into discrete units, like protocells in nature or transistors in computation. The pointillism will describe concentration - a denser cluster of points in regions of high concentration, and fewer points in darker parts of the pattern.</p>

<p>A key macroscopic feature will be a sharp break in the middle of the design that represents embryogenesis, the phenomenon I find most fascinating in biology, while also representing a break in my views/hopes of science. This break will take the form of two cells splitting, which for me symbolizes a shift from viewing science as an almost spiritual endeavor to a more practical view of finding challenging problems without compromising on my lifestyle and goals.</p>

<h2 id="interactive-visualization">Interactive Visualization</h2>

<div style="position: relative; width: 100%; height: 600px; margin: 20px 0;">
    <iframe src="/assets/tattoo.html" width="100%" height="100%" frameborder="0" scrolling="no" style="border: 1px solid #ddd; border-radius: 8px; box-shadow: 0 2px 8px rgba(0,0,0,0.1);">
    </iframe>
</div>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><summary type="html"><![CDATA[Reasoning]]></summary></entry><entry><title type="html">Everything is infrastructure</title><link href="/2025/06/16/infrastructure.html" rel="alternate" type="text/html" title="Everything is infrastructure" /><published>2025-06-16T18:00:45+00:00</published><updated>2025-06-16T18:00:45+00:00</updated><id>/2025/06/16/infrastructure</id><content type="html" xml:base="/2025/06/16/infrastructure.html"><![CDATA[<p>I was recently on reddit, trying to figure out why monorepos just <em>feel</em> right, and I stumbled upon <a href="https://danluu.com/monorepo/">this article</a> discussing the advantages to monorepos.</p>

<p>It’s been a wild ride - from being a strict molecular programmer, to a hardware engineer, to a begrudging software engineer, to slowly but surely seeing an infrastructure engineer peek out at me…</p>

<p>Now that I’m in the middle of it all, I realize something fundamental about my journey: I’ve always been drawn to building tools that make life easier for others. The only difference now is who I’m building for.</p>

<p>When I started as a molecular programmer, I was building tools for scientists and researchers - people who needed computational and/or electromechanical solutions but weren’t necessarily coders themselves. I was the bridge between complex algorithms and practical scientific problems. Every tool I built was designed to make someone else’s work more efficient, more reliable, more accessible.</p>

<p>Now, as I find myself drawn deeper into infrastructure engineering, I see the same pattern emerging. I’m still building tools that make life easier - but now my users are other developers. Instead of creating applications that scientists can use, I’m creating the platforms, systems, and processes that enable other developers to build their own tools more effectively.</p>

<p>The monorepo discussion that sparked this reflection is a perfect example. It’s not about the code itself, but about the infrastructure that makes development teams more productive. It’s about building the tools that build the tools.</p>

<p>This progression feels natural because it’s the same mission, just with a different audience. I’m still the bridge, still the enabler - just operating at a different layer of the technology stack. And now that I’m here, I can see this infrastructure work for what it truly is: another tool-building exercise.</p>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><summary type="html"><![CDATA[I was recently on reddit, trying to figure out why monorepos just feel right, and I stumbled upon this article discussing the advantages to monorepos.]]></summary></entry><entry><title type="html">Generating Reaction-Diffusion Tattoos</title><link href="/2025/06/14/generating-rd-tattoos.html" rel="alternate" type="text/html" title="Generating Reaction-Diffusion Tattoos" /><published>2025-06-14T18:00:45+00:00</published><updated>2025-06-14T18:00:45+00:00</updated><id>/2025/06/14/generating-rd-tattoos</id><content type="html" xml:base="/2025/06/14/generating-rd-tattoos.html"><![CDATA[<p>With the infrastructure in place, it’s time to actually generate a tattoo to commemorate the last ~7 years of my life.</p>

<p>This is the symbolism I want layered into this tattoo:</p>

<ul>
  <li>
    <p>The Turing pattern will be a uniquely generated tattoo to represent my detachment from doing truly basic science, for systemic academic reasons. The pattern will be a dot pattern (similar to <a href="https://www.stylecraze.com/articles/dotwork-tattoos/">these tattoos</a>, specifically the pointillism-style ones) where each dot represents a single cell/transistor - the basis for computing that I built my PhD around, and the basis on which I keep chugging along.</p>
  </li>
  <li>
    <p>The design will be encompassed in oval-like cellular structures, with a sharp break in the middle to represent some of the disillusionment I felt with science/engineering in general.</p>
  </li>
</ul>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><summary type="html"><![CDATA[With the infrastructure in place, it’s time to actually generate a tattoo to commemorate the last ~7 years of my life.]]></summary></entry><entry><title type="html">Building a reaction-diffusion simulator</title><link href="/2025/06/06/RDSimul.html" rel="alternate" type="text/html" title="Building a reaction-diffusion simulator" /><published>2025-06-06T18:00:45+00:00</published><updated>2025-06-06T18:00:45+00:00</updated><id>/2025/06/06/RDSimul</id><content type="html" xml:base="/2025/06/06/RDSimul.html"><![CDATA[<h2 id="introduction">Introduction</h2>
<p>Turing patterns (and more broadly, reaction-diffusion phenomena) have had a large impact on my life over the last few years - from the idea that simple mathematical models can lead to these complex, seen-in-nature patterns, to the idea that phenomena such as morphogenesis and embryogenesis can be modeled using these same principles. One of my favorite papers, <a href="https://distill.pub/2020/growing-ca/">Growing Neural Cellular Automata</a>, is based on this principle. Instead of taking the approach of modeling reaction-diffusion using this method, this paper approached the problem by approximating partial differential equations as discrete blocks (cellular automata) - not that different from normal finite element analysis methods which create non-uniform discrete meshes.</p>

<p>This was inspiration for a lot of the work I did during my PhD - I was fascinated with the ability to recreate a lot of these reaction-diffusion patterns seen throughout biology. In particular, I fell in love with the idea that complex, emergent systems can be simulated using first-principles or data-driven approaches, and even recreated in the lab if the simulated principles were cleverly designed.</p>

<p>I wanted to at least design a simple CA-based approach for solving some of the most famous Turing patterns using a readable, developer-friendly Python package that I could then use to design a tattoo that memorializes this chapter in my life.</p>

<p>The code for developing this tattoo can be found in this <a href="https://github.com/MishaRubanov/RDtattoo">repository</a>. A lot of this effort was inspired by this <a href="https://github.com/ijmbarr/turing-patterns/blob/master/turing-patterns.ipynb">repo</a>.</p>

<p>A jupyter notebook with the reaction-diffusion simulator environment (and a few examples) can be found at <a href="https://tattoonotebook.misharubanov.com">https://tattoonotebook.misharubanov.com</a>  and a streamlit app for no-code simulator exploration can be found at <a href="https://rdapp.misharubanov.com/">https://rdapp.misharubanov.com/</a>.</p>

<h2 id="setting-up-the-code">Setting up the code</h2>
<p>This codebase can be divided into three main components:  simulation, visualization, and default generation.</p>

<h3 id="simulation">Simulation</h3>
<p>The <a href="https://github.com/MishaRubanov/RDtattoo/blob/main/rdtattoo/tattoo_functions.py">simulator</a> was developed with scalability and modularity in mind - the overarching goal was not only to simulate any 2-species reaction-diffusion system, but also to make it easy to add new reaction systems and default values as needed, as well as any initial conditions for the two species. The <code class="language-plaintext highlighter-rouge">ReactionFunction</code> protocol implements the general structure that each reaction-diffusion equation should take - as input it takes two arrays (each describing the a/b variables) and two constants (describing the reaction rates).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">@</span><span class="n">runtime_checkable</span>
<span class="k">class</span> <span class="nc">ReactionFunction</span><span class="p">(</span><span class="n">Protocol</span><span class="p">):</span>
    <span class="s">"""Protocol defining the interface for reaction functions.

    A reaction function calculates the rate of change for a chemical species
    based on the current concentrations of both species and reaction parameters.

    Methods:
        __call__: Calculate the reaction rate for a species.
    """</span>

    <span class="k">def</span> <span class="nf">__call__</span><span class="p">(</span>
        <span class="bp">self</span><span class="p">,</span> <span class="n">a</span><span class="p">:</span> <span class="n">FloatArrayType</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">FloatArrayType</span><span class="p">,</span> <span class="n">alpha</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">beta</span><span class="p">:</span> <span class="nb">float</span>
    <span class="p">)</span> <span class="o">-&gt;</span> <span class="n">FloatArrayType</span><span class="p">:</span> <span class="p">...</span>
</code></pre></div></div>
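<p>To make the interface concrete, here is a rough sketch of how a pair of functions with this signature could plug into one explicit Euler update. Several things here are assumptions rather than the repository's code: plain nested lists stand in for arrays to keep the example dependency-free (so the reaction functions act on single values), the Gray-Scott terms map <code class="language-plaintext highlighter-rouge">alpha</code> to the feed rate and <code class="language-plaintext highlighter-rouge">beta</code> to the kill rate, the grid is periodic with unit spacing, and the names <code class="language-plaintext highlighter-rouge">laplacian</code> and <code class="language-plaintext highlighter-rouge">euler_step</code> are hypothetical.</p>

```python
def laplacian(grid):
    """Five-point Laplacian on a 2D list with periodic boundaries (spacing 1)."""
    h, w = len(grid), len(grid[0])
    return [
        [
            grid[(i - 1) % h][j] + grid[(i + 1) % h][j]
            + grid[i][(j - 1) % w] + grid[i][(j + 1) % w]
            - 4.0 * grid[i][j]
            for j in range(w)
        ]
        for i in range(h)
    ]


def gray_scott_a(a, b, alpha, beta):
    """Reaction term for species a (assumed mapping: alpha is the feed rate)."""
    return -a * b * b + alpha * (1.0 - a)


def gray_scott_b(a, b, alpha, beta):
    """Reaction term for species b (assumed mapping: beta is the kill rate)."""
    return a * b * b - (alpha + beta) * b


def euler_step(a, b, react_a, react_b, diff_a, diff_b, alpha, beta, dt):
    """Advance both species by one explicit Euler step: diffusion plus reaction."""
    lap_a, lap_b = laplacian(a), laplacian(b)
    h, w = len(a), len(a[0])
    new_a = [[0.0] * w for _ in range(h)]
    new_b = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            ra = react_a(a[i][j], b[i][j], alpha, beta)
            rb = react_b(a[i][j], b[i][j], alpha, beta)
            new_a[i][j] = a[i][j] + dt * (diff_a * lap_a[i][j] + ra)
            new_b[i][j] = b[i][j] + dt * (diff_b * lap_b[i][j] + rb)
    return new_a, new_b


# The uniform state a=1, b=0 is a steady state of Gray-Scott, so a step leaves it unchanged.
a0 = [[1.0] * 3 for _ in range(3)]
b0 = [[0.0] * 3 for _ in range(3)]
a1, b1 = euler_step(a0, b0, gray_scott_a, gray_scott_b, 0.16, 0.08, 0.035, 0.065, 1.0)

# A single spike diffuses outward: the center loses 4 units, each neighbor gains 1.
spike = [[0.0] * 3 for _ in range(3)]
spike[1][1] = 1.0
lap = laplacian(spike)
```

<p>Because the update only ever calls <code class="language-plaintext highlighter-rouge">react_a</code> and <code class="language-plaintext highlighter-rouge">react_b</code> through the shared signature, swapping in a different reaction system would not require touching the stepping code.</p>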

<p>The simulator can then be instantiated with diffusion coefficients, rate constants, simulation parameters (height/width/time or space resolution) and the <code class="language-plaintext highlighter-rouge">ReactionType</code> (an <code class="language-plaintext highlighter-rouge">Enum</code> that specifies which set of reactions to use):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">class</span> <span class="nc">ReactionType</span><span class="p">(</span><span class="n">Enum</span><span class="p">):</span>
    <span class="s">"""Enumeration of available reaction-diffusion system types.

    Each reaction type represents a different chemical reaction system with its own
    mathematical equations and behavior patterns.

    Values:
        BRUSSELATOR: The Brusselator model, a theoretical model for a type of
            autocatalytic reaction.
        FITZHUGH_NAGUMO: The FitzHugh-Nagumo model, a simplified model of
            neuron behavior.
        GRAY_SCOTT: The Gray-Scott model, a reaction-diffusion system that can
            produce various patterns.
    """</span>

    <span class="n">BRUSSELATOR</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="n">FITZHUGH_NAGUMO</span> <span class="o">=</span> <span class="mi">2</span>
    <span class="n">GRAY_SCOTT</span> <span class="o">=</span> <span class="mi">3</span>
</code></pre></div></div>
<p>The simulator is built on Pydantic’s <a href="https://docs.pydantic.dev/latest/concepts/models/">BaseModel</a>, which provides a lot of powerful tools to automatically validate the model before running it. This actually blocked development for a bit, as I had difficulty figuring out the best way to define <code class="language-plaintext highlighter-rouge">ReactionType</code> without going through the effort of <a href="https://docs.pydantic.dev/latest/concepts/types/#custom-types">defining my own custom types</a>. The solution I ended up going with was foreshadowed above - by storing all relevant information in an <code class="language-plaintext highlighter-rouge">Enum</code>, I can just save the <code class="language-plaintext highlighter-rouge">Enum</code> field within my simulator (for JSON dumping/loading and validation) and use that information to load the actual reaction functions into private fields that are not serialized/saved. Using this approach, a simulation run can be reliably recreated using the validated simulator parameters and the initial conditions for both species.</p>
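<p>The general shape of that pattern can be sketched with the standard library alone. This is not the Pydantic implementation from the repository: it swaps in a plain <code class="language-plaintext highlighter-rouge">dataclass</code> and <code class="language-plaintext highlighter-rouge">json</code> so the example has no dependencies, and <code class="language-plaintext highlighter-rouge">SimulatorConfig</code>, <code class="language-plaintext highlighter-rouge">REACTIONS</code>, and the single registered reaction are hypothetical names chosen for illustration.</p>

```python
import json
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class ReactionType(Enum):
    BRUSSELATOR = 1
    FITZHUGH_NAGUMO = 2
    GRAY_SCOTT = 3


# Hypothetical registry mapping enum members to reaction callables.
# Only one entry is filled in to keep the sketch short.
REACTIONS: dict[ReactionType, Callable[..., float]] = {
    ReactionType.GRAY_SCOTT: lambda a, b, alpha, beta: -a * b * b + alpha * (1.0 - a),
}


@dataclass
class SimulatorConfig:
    reaction_type: ReactionType
    alpha: float
    beta: float
    # Private field: resolved from the enum after construction, never serialized.
    _reaction: Callable[..., float] = field(init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        self._reaction = REACTIONS[self.reaction_type]

    def to_json(self) -> str:
        # Only the enum name and plain numbers are written out.
        return json.dumps(
            {"reaction_type": self.reaction_type.name, "alpha": self.alpha, "beta": self.beta}
        )

    @classmethod
    def from_json(cls, payload: str) -> "SimulatorConfig":
        data = json.loads(payload)
        name = data["reaction_type"]
        return cls(ReactionType[name], data["alpha"], data["beta"])


config = SimulatorConfig(ReactionType.GRAY_SCOTT, alpha=0.035, beta=0.065)
restored = SimulatorConfig.from_json(config.to_json())
print(config.to_json())
```

<p>The point of the design is that the serialized form contains nothing callable, yet loading it back restores a fully working object because the enum member is enough to look the function up again.</p>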

<h3 id="visualization">Visualization</h3>
<p>Once the simulation was completed, I needed some way to actually visualize the evolution of both species without storing every 2D frame. For this reason, I added the ability to specify, when running the simulation, the total number of frames to keep for visualization.</p>

<p>The backbone for visualizing these simulations was to generate videos using Plotly. Once the simulation was completed, the <code class="language-plaintext highlighter-rouge">run()</code> method output both a/b 3D arrays (the first dimension being frames over time) as well as the total number of time steps calculated.</p>
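<p>One way to keep only a fixed number of frames is to precompute the step indices at which a snapshot is stored. The sketch below is an assumption about how this could work, not the simulator's actual logic; <code class="language-plaintext highlighter-rouge">frame_steps</code> and <code class="language-plaintext highlighter-rouge">run_toy</code> are hypothetical names, and a running float stands in for the 2D state.</p>

```python
def frame_steps(total_steps: int, n_frames: int) -> list[int]:
    """Evenly spaced 1-based step indices to snapshot, always ending on the last step."""
    if total_steps < 1 or n_frames < 1:
        raise ValueError("total_steps and n_frames must be positive")
    n_frames = min(n_frames, total_steps)  # cannot keep more frames than steps
    return [((k + 1) * total_steps) // n_frames for k in range(n_frames)]


def run_toy(total_steps: int, n_frames: int):
    """Toy loop: update every step, store the state only at the chosen steps."""
    keep = set(frame_steps(total_steps, n_frames))
    state = 0.0
    frames = []
    for step in range(1, total_steps + 1):
        state += 0.1  # stand-in for one simulation update
        if step in keep:
            frames.append(state)
    return frames, total_steps


print(frame_steps(1000, 4))  # [250, 500, 750, 1000]
frames, steps_run = run_toy(1000, 4)
```

<p>Memory then scales with the number of frames requested rather than the number of time steps, which matters once the grid is large and the run is long.</p>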

<p>These values can then be used as input to create animations within the <a href="https://github.com/MishaRubanov/RDtattoo/blob/main/rdtattoo/tattoo_plotter.py">tattoo_plotter</a>.</p>

<p>One limitation of this visualization method, however, is that rerunning these functions requires re-instantiating a new simulator and initial conditions, and running individually. I think that exploring this phase-space would be much more interesting if the simulations had an easy-to-use GUI - enter <a href="https://streamlit.io/">Streamlit</a>. Building a streamlit app is incredibly easy - and due to the well-typed simulator, being able to visualize (with parameter hints) became trivial. I developed a <a href="https://github.com/MishaRubanov/RDtattoo/blob/main/rdtattoo/rd_simulator_gui.py">streamlit app</a> that populates a set of parameters based on the field inputs, and allows a user to no-code run the simulator for both the pre-populated defaults as well as for any type of parameters the user is interested in. Switching between different reaction types enables the user to easily explore parameter spaces. Additionally, brief descriptions on how the different reaction types were set up (and their physical interpretations) were added to help remind the user (including myself) what each parameter means. The app can be found at https://rdapp.misharubanov.com/. If for some reason my server goes down, streamlit allows hosting of a few apps - you can find this app at https://rdtattoos.streamlit.app.</p>

<p>Details on implementation of the app backend and self-hosting the app can be found in the <a href="#infrastructure">infrastructure section</a>.</p>

<h3 id="default-generation">Default Generation</h3>
<p>Defaults were scraped from various parts of the web as well as some interesting parameters I found when exploring the simulation myself. These defaults were stored as <a href="https://github.com/MishaRubanov/RDtattoo/blob/main/rdtattoo/rd_defaults.py">instances</a> of the general <code class="language-plaintext highlighter-rouge">RDSimulator</code>.</p>

<h3 id="future-development-hopefully">Future development (hopefully)</h3>
<p>I would love to instantiate a SQL database that automatically logs all runs, so that when a user is exploring new parameter spaces they don’t have to keep track of the parameters used. A method the user could call to get a nice representation of the parameter and output space would be nice too!</p>

<p>For visualization, a lot could be improved in the GUI - from optimizing simulation times to removing deadspace around the animations. Hopefully I can do this at some point!</p>

<h2 id="infrastructure">Infrastructure</h2>
<p>To develop a reliable and clean environment for running this code, I chose to use Docker to deploy both a Jupyter <a href="https://tattoonotebook.misharubanov.com/login?next=%2Flab%3F">notebook</a> and a streamlit <a href="https://rdapp.misharubanov.com/">app</a>. This had the added benefit of easily working with my <a href="https://misharubanov.github.io/2025/05/18/self-hosted-setup.html">self-hosted stack</a>. To install the same environment within my jupyter notebook, I reorganized the notebook to be pip-installable as a local package:</p>

<div class="language-docker highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> quay.io/jupyter/base-notebook</span>
<span class="k">WORKDIR</span><span class="s"> /home/jovyan/work</span>
<span class="k">COPY</span><span class="s"> . .</span>
<span class="k">RUN </span>pip <span class="nb">install</span> <span class="nt">--no-cache-dir</span> <span class="nt">-r</span> requirements.txt <span class="o">&amp;&amp;</span> <span class="se">\
</span>    pip <span class="nb">install</span> <span class="nt">-e</span> .
<span class="k">EXPOSE</span><span class="s"> 8888</span>
<span class="k">ENV</span><span class="s"> JUPYTER_ENABLE_LAB=yes</span>
<span class="k">RUN </span>jupyter notebook <span class="nt">--generate-config</span> <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">echo</span> <span class="s2">"c.NotebookApp.password='argon2:</span><span class="nv">$argon2id$v</span><span class="s2">=19</span><span class="nv">$m</span><span class="s2">=10240,t=10,p=8</span><span class="nv">$W</span><span class="s2">/YoaK1HmUWy4ITRrMArwg</span><span class="nv">$3s7sDEPluB2Cp97GURa1</span><span class="s2">+cs0L4/uNruSYE9uXjjYxCA'"</span> <span class="o">&gt;&gt;</span> /home/jovyan/.jupyter/jupyter_notebook_config.py
<span class="k">CMD</span><span class="s"> ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"] </span>
</code></pre></div></div>
<p>For added security, I set a hashed password so that nobody on the internet can run arbitrary Python in these notebooks (send me a message if you want to try it out!).</p>

<p>For Streamlit, the environment looks similar, except that instead of launching a notebook server, the container runs <code class="language-plaintext highlighter-rouge">streamlit run...</code> to serve the app.</p>
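
<p>As a rough sketch of what such a Streamlit image might look like: the base image, the <code class="language-plaintext highlighter-rouge">app.py</code> entry point, and the server flags below are assumptions for illustration, not necessarily what the deployed image uses.</p>

<div class="language-docker highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Hypothetical sketch: the base image and app.py are assumed names</span>
<span class="k">FROM</span><span class="s"> python:3.12-slim</span>
<span class="k">WORKDIR</span><span class="s"> /app</span>
<span class="k">COPY</span><span class="s"> . .</span>
<span class="c"># Assumes requirements.txt lists streamlit and the repo is pip-installable</span>
<span class="k">RUN </span>pip <span class="nb">install</span> <span class="nt">--no-cache-dir</span> <span class="nt">-r</span> requirements.txt <span class="o">&amp;&amp;</span> <span class="se">\
</span>    pip <span class="nb">install</span> <span class="nt">-e</span> .
<span class="c"># 8501 is Streamlit's default port</span>
<span class="k">EXPOSE</span><span class="s"> 8501</span>
<span class="c"># Bind to 0.0.0.0 so the app is reachable from outside the container</span>
<span class="k">CMD</span><span class="s"> ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]</span>
</code></pre></div></div>

<p>With a setup like this, the reverse proxy only needs to route the app's subdomain to port 8501 of the container.</p>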

<p>To deploy to a subdomain, I used Coolify with a GitHub webhook that automatically redeploys this public repository (more details <a href="https://misharubanov.github.io/2025/05/18/self-hosted-setup.html">here</a>).</p>

<h2 id="next-steps">Next steps</h2>
<p>Now that I have the infrastructure in place to really explore these patterns, I want to focus on using them as a symbol for my relationship to science and engineering. My next post will be exploring the personal significance that these patterns have had over the last ~7 years of my life.</p>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><summary type="html"><![CDATA[Introduction Turing patterns (and more broadly, reaction-diffusion phenomena) have had a large impact on my life over the last few years - from the idea that simple mathematical models can lead to these complex, seen-in-nature patterns, to the idea that phenomena such as morphogenesis and embryogenesis can be modeled using these same principles. One of my favorite papers, Growing Neural Cellular Automata, is based on this principle. Instead of taking the approach of modeling reaction-diffusion using this method, this paper approached the problem by approximating partial differential equations as discrete blocks (cellular automata) - not that different from normal finite element analysis methods which create non-uniform discrete meshes.]]></summary></entry><entry><title type="html">My Self Hosted Setup</title><link href="/2025/05/18/self-hosted-setup.html" rel="alternate" type="text/html" title="My Self Hosted Setup" /><published>2025-05-18T18:00:45+00:00</published><updated>2025-05-18T18:00:45+00:00</updated><id>/2025/05/18/self-hosted-setup</id><content type="html" xml:base="/2025/05/18/self-hosted-setup.html"><![CDATA[<h2 id="hetzner">Hetzner</h2>

<p>I use Hetzner for hosting. Their CAX21 ARM server (4 vCPUs, 8GB RAM, 80GB NVMe) runs €6.49/month—far cheaper than AWS or DigitalOcean. <a href="https://docs.hetzner.com/cloud/servers/getting-started/creating-a-server/">Server setup</a> and <a href="https://community.hetzner.com/tutorials/install-and-configure-coolify-on-linux">Coolify installation</a> were straightforward.</p>

<p>The web interface handles provisioning, daily backups, and monitoring. Scaling is fast—I doubled my storage and RAM in minutes for an extra $3/month. DDoS protection and firewall management come included.</p>

<h2 id="coolify">Coolify</h2>

<p><a href="https://coolify.io/">Coolify</a> is a self-hostable Heroku/Vercel alternative. I use it to deploy from GitHub, manage Docker containers, and spin up databases. It’s a thin wrapper around Docker and Traefik with a UI—sometimes opaque, and wiring apps through Cloudflare has been painful—but it works.</p>

<h3 id="networking">Networking</h3>

<p><strong><a href="https://coolify.io/docs/knowledge-base/proxy/traefik/overview">Traefik</a></strong>: Coolify’s default proxy. Handles SSL via Let’s Encrypt and routes traffic.</p>

<p><strong>Cloudflare</strong>: DNS, DDoS protection, caching. Cloudflare tunnels let me expose services without a public IP (<a href="https://rasmusgodske.com/posts/securely-expose-your-coolify-apps-with-the-magic-of-cloudflare-tunnels/">setup guide</a>).</p>

<h3 id="monitoring">Monitoring</h3>

<p><strong>Uptime Kuma</strong>: Lightweight uptime <a href="https://uptime.misharubanov.com/">monitoring</a> with alerts.</p>

<p><strong>Glance</strong>: Mobile-friendly <a href="https://dashboard.misharubanov.com/">dashboard</a> for RSS, weather, and container status.</p>

<p><strong>Duplicati</strong>: Backups to cloud services, WebDAV, and local storage.</p>

<p><strong>Dozzle</strong>: Container logs and resource usage.</p>

<p><strong>ntfy</strong>: Push notifications when services go down.</p>

<p><strong>Beszel</strong>: Server monitoring with alerts for CPU/memory spikes. Combined with Dozzle, debugging is quick.</p>

<h3 id="the-stack">The stack</h3>

<p>Code in GitHub → Coolify pulls and deploys containers → Traefik routes traffic with auto-SSL → Uptime Kuma and Glance monitor everything.</p>

<h2 id="apps">Apps</h2>

<p><strong>Audiobookshelf</strong>: My ebook/audiobook <a href="http://bookshelf.misharubanov.com/">library</a>. I set this up after learning Amazon was <a href="https://www.theverge.com/news/612898/amazon-removing-kindle-book-download-transfer-usb">removing Kindle download functionality</a>. Good web reader and Android app.</p>

<p><strong>Immich</strong>: Self-hosted <a href="https://immich.app/">Google Photos alternative</a> with facial recognition. Still in progress—TB+ of photos means I need a NAS before this makes financial sense.</p>

<p><strong>Gramps Web</strong>: A <a href="https://family.misharubanov.com/">family tree app</a> I set up after exploring my ancestry on a visit to Uzbekistan. Multi-user with editor/guest roles, hot and cold backups.</p>

<p><strong>Vikunja</strong>: Simple <a href="https://todo.misharubanov.com/">todo app</a>. Not as polished as commercial options, but I own my data.</p>]]></content><author><name>Misha Rubanov</name><email>misha.rubanov.1@com</email></author><summary type="html"><![CDATA[Hetzner]]></summary></entry></feed>