{ "cells": [ { "cell_type": "markdown", "id": "18621e33", "metadata": {}, "source": [ "# ZEDprofiler Walkthrough\n", "\n", "ZEDprofiler is a CPU-first toolkit for extracting morphological features from 3D fluorescence microscopy images.\n", "It is designed for high-content and high-throughput workflows where images are volumetric z-stacks with multiple spectral channels and segmentation labels.\n", "\n", "## What this walkthrough covers\n", "\n", "By the end of this notebook you will have:\n", "\n", "1. **Generated synthetic 3D image data**: two channels (a compact nuclear stain and a diffuse marker with per-object colocalization) and a ground-truth label mask\n", "2. **Explored the data interactively**: browsed the z-stack volume with a slice-by-slice viewer\n", "3. **Loaded image sets and objects**: configured `ImageSetLoader`, `ObjectLoader`, and `TwoObjectLoader` to feed data into the pipeline\n", "4. **Extracted six feature classes**: colocalization, granularity, intensity, neighbors, texture, and volume/size/shape\n", "5. **Merged all features** into a single profile DataFrame with one row per object, following the ZEDprofiler naming convention\n", "6. **Normalized and feature-selected** the profile using [`Pycytominer`](https://pycytominer.readthedocs.io): z-scoring features and removing low-variance and redundant columns\n", "7. **Visualized pairwise object similarity**: computed and plotted a Pearson correlation heatmap across all objects\n", "\n", "Each section includes a brief description of what is being computed and what the key parameters control."
] }, { "cell_type": "markdown", "id": "59b30dd9", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "id": "22324587", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "annotation=NoneType required=False default_factory=list\n" ] } ], "source": "import warnings\nimport logging\nimport pathlib\n\nimport numpy as np\nimport pandas as pd\nimport tifffile\nfrom itables import show\n\nimport zedprofiler\nfrom zedprofiler.IO import loading_classes\n\nlogging.basicConfig(level=logging.WARNING)\nlogging.getLogger(\"matplotlib\").setLevel(logging.WARNING)\nwarnings.filterwarnings(\"ignore\", category=SyntaxWarning, module=\"mahotas\")" }, { "cell_type": "markdown", "id": "1aec073d", "metadata": {}, "source": [ "## Define Paths and Parameters\n", "\n", "This walkthrough uses synthetically generated 3D arrays so it runs anywhere without real microscopy data.\n", "Two intensity channels and a label mask are generated from spherical objects whose radii and positions are drawn from a seeded random number generator, so the data is reproducible.\n", "\n", "The `channel_mapping` dictionary maps logical names to substrings found in your image filenames. ZEDprofiler uses this mapping to identify which file corresponds to which channel or segmentation label.\n", "\n", "*Expand the cells below to see how paths are configured and how the synthetic data is generated.*" ] }, { "cell_type": "code", "execution_count": 2, "id": "8a34d8cf", "metadata": { "tags": [ "hide-input" ] }, "outputs": [], "source": [ "notebook_root = pathlib.Path.cwd().resolve()\n", "image_set_path = (notebook_root / \"test_data\" / \"dummy_image_set\").resolve()\n", "label_set_path = (notebook_root / \"test_data\" / \"dummy_label_set\").resolve()\n", "\n", "channel_mapping = {\n", "    \"Channel1\": \"channel1\",\n", "    \"Channel2\": \"channel2\",\n", "    \"Nuclei\": \"nuclei_labels\",\n", "}\n", "\n", "CHANNEL1 = \"Channel1\"\n", "CHANNEL2 = \"Channel2\"\n", "COMPARTMENT = \"Nuclei\"" ] }, { "cell_type": "code", 
"execution_count": 3, "id": "3aad616f", "metadata": { "tags": [ "hide-input" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generated volume: (80, 80, 30), objects: 6\n", "Channel 1: nuclear stain (compact)\n", "Channel 2: diffuse stain (per-object colocalization strength varies)\n" ] } ], "source": [ "# --- synthetic data parameters ---\n", "VOLUME_SHAPE = (80, 80, 30)  # (x, y, z)\n", "N_OBJECTS = 6\n", "MIN_RADIUS, MAX_RADIUS = 6, 12\n", "rng = np.random.default_rng(seed=42)\n", "\n", "# --- build label mask and intensity channels ---\n", "# Channel 1: nuclear stain: compact, bright signal confined to each nucleus\n", "# Channel 2: diffuse stain: partially overlaps each nucleus to a varying degree,\n", "#   simulating a cytoplasmic/organelle marker with per-object colocalization\n", "nuclei_labels = np.zeros(VOLUME_SHAPE, dtype=np.uint16)\n", "channel1_array = np.zeros(VOLUME_SHAPE, dtype=np.float32)\n", "channel2_array = np.zeros(VOLUME_SHAPE, dtype=np.float32)\n", "\n", "xs = np.arange(VOLUME_SHAPE[0])\n", "ys = np.arange(VOLUME_SHAPE[1])\n", "zs = np.arange(VOLUME_SHAPE[2])\n", "grid = np.stack(np.meshgrid(xs, ys, zs, indexing=\"ij\"), axis=-1)\n", "\n", "centers, radii = [], []\n", "for obj_id in range(1, N_OBJECTS + 1):\n", "    # retry up to 50 times to find a position that does not overlap earlier objects;\n", "    # if every attempt fails, the last candidate is kept\n", "    for _ in range(50):\n", "        # the upper bound of rng.integers is exclusive, so r is in [MIN_RADIUS, MAX_RADIUS - 1]\n", "        r = rng.integers(MIN_RADIUS, MAX_RADIUS)\n", "        cx = rng.integers(r + 2, VOLUME_SHAPE[0] - r - 2)\n", "        cy = rng.integers(r + 2, VOLUME_SHAPE[1] - r - 2)\n", "        cz = rng.integers(r + 2, VOLUME_SHAPE[2] - r - 2)\n", "        center = np.array([cx, cy, cz])\n", "        if all(\n", "            np.linalg.norm(center - c) > r + rv + 4 for c, rv in zip(centers, radii)\n", "        ):\n", "            break\n", "    centers.append(center)\n", "    radii.append(r)\n", "\n", "    dist = np.linalg.norm(grid - center, axis=-1)\n", "    nucleus_mask = dist <= r\n", "    nuclei_labels[nucleus_mask] = obj_id\n", "\n", "    # Channel 1: compact nuclear signal with fine texture\n", "    ch1_base = rng.uniform(0.7, 1.0)\n", "    channel1_array[nucleus_mask] = ch1_base + rng.normal(\n", "        0, 0.06, size=nucleus_mask.sum()\n", "    )\n", "\n", "    # Channel 2: diffuse stain with per-object colocalization strength\n", "    # coloc_strength controls how much signal sits inside vs outside the nucleus\n", "    coloc_strength = rng.uniform(0.15, 0.95)\n", "    diffuse_r = r * rng.uniform(1.3, 2.0)\n", "    diffuse_mask = (dist <= diffuse_r) & ~nucleus_mask\n", "\n", "    ch2_base = rng.uniform(0.4, 0.9)\n", "    channel2_array[nucleus_mask] += coloc_strength * ch2_base + rng.normal(\n", "        0, 0.08, size=nucleus_mask.sum()\n", "    )\n", "    if diffuse_mask.any():\n", "        channel2_array[diffuse_mask] += (\n", "            1 - coloc_strength\n", "        ) * ch2_base * 0.6 + rng.normal(0, 0.06, size=diffuse_mask.sum())\n", "\n", "# clip negatives and scale to uint8\n", "channel1_array = np.clip(channel1_array / channel1_array.max() * 255, 0, 255).astype(\n", "    np.uint8\n", ")\n", "channel2_array = np.clip(channel2_array / channel2_array.max() * 255, 0, 255).astype(\n", "    np.uint8\n", ")\n", "\n", "# save to disk\n", "image_set_path.mkdir(parents=True, exist_ok=True)\n", "label_set_path.mkdir(parents=True, exist_ok=True)\n", "tifffile.imwrite(image_set_path / \"channel1.tif\", channel1_array)\n", "tifffile.imwrite(image_set_path / \"channel2.tif\", channel2_array)\n", "tifffile.imwrite(label_set_path / \"nuclei_labels.tif\", nuclei_labels)\n", "\n", "print(f\"Generated volume: {VOLUME_SHAPE}, objects: {N_OBJECTS}\")\n", "print(\"Channel 1: nuclear stain (compact)\")\n", "print(\"Channel 2: diffuse stain (per-object colocalization strength varies)\")" ] }, 
{ "cell_type": "markdown", "id": "preview-md", "metadata": {}, "source": [ "## Preview Synthetic Data\n", "\n", "The interactive viewer below shows a z-slice browser for all three arrays. 
Use the slider to step through the volume and see how the objects and their labels change across z-planes.\n", "\n", "*Expand the cell below to see how the interactive viewer is built.*" ] }, { "cell_type": "code", "execution_count": 4, "id": "preview-code", "metadata": { "tags": [ "hide-input" ] }, "outputs": [ { "data": { "text/html": [ "