{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Importing/exporting data from/to Scanpy\n", "\n", "## Overview\n", "\n", "This notebook demonstrates how to use Monet to import data from and export data to [Scanpy](https://scanpy.readthedocs.io/en/stable/).\n", "\n", "*Note: This functionality requires Monet >= 0.2.2. Run `pip install 'monet>=0.2.2'` to upgrade if necessary.*\n", "\n", "*Note: This notebook assumes that you have [Scanpy installed](https://scanpy.readthedocs.io/en/stable/installation.html); it is not installed automatically with Monet.*\n", "\n", "Scanpy represents expression data using `AnnData` objects, which can hold the expression matrix as well as gene/cell annotation data. Please see the [Scanpy manual](https://scanpy.readthedocs.io/en/stable/usage-principles.html) for more details. In contrast, Monet represents expression data using `ExpMatrix` objects, which contain only the expression matrix (including the gene and cell names). The `ExpMatrix` class is a simple subclass of the pandas `DataFrame` and can be used in the same way. Rows of the data frame correspond to genes, and columns correspond to cells. Note that this is the transpose of the `AnnData` layout, in which rows correspond to cells (observations) and columns to genes (variables)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up notebook" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# change notebook width and font\n", "from IPython.display import HTML, display\n", "display(HTML(\"\"\"\"\"\"))\n", "\n", "from monet import util\n", "_LOGGER = util.configure_logger()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import data from Scanpy by converting `AnnData` objects to `ExpMatrix` objects\n", "\n", "Here, we use the `ExpMatrix.from_anndata()` method to convert an `AnnData` object from Scanpy into an `ExpMatrix` object from Monet." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2020-06-22 11:01:16] (numexpr.utils) INFO: Note: NumExpr detected 12 cores but \"NUMEXPR_MAX_THREADS\" not set, so enforcing safe limit of 8.\n", "[2020-06-22 11:01:16] (numexpr.utils) INFO: NumExpr defaulting to 8 threads.\n", "[2020-06-22 11:01:16] (get_version) INFO: dirname: Trying to get version of get_version from dirname /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: dirname: Failed; Does not match re.compile('get[_-]version-([\\\\d.]+?)(?:\\\\.dev(\\\\d+))?(?:[_+-]([0-9a-zA-Z.]+))?$')\n", "[2020-06-22 11:01:16] (get_version) INFO: git: Trying to get version from git in directory /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: git: Failed; directory is not managed by git\n", "[2020-06-22 11:01:16] (get_version) INFO: metadata: Trying to get version for get_version in dir /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: metadata: Succeeded\n", "[2020-06-22 11:01:16] (get_version) INFO: dirname: Trying to get version of legacy_api_wrap from dirname /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: dirname: Failed; Does not match re.compile('legacy[_-]api[_-]wrap-([\\\\d.]+?)(?:\\\\.dev(\\\\d+))?(?:[_+-]([0-9a-zA-Z.]+))?$')\n", "[2020-06-22 11:01:16] (get_version) INFO: git: Trying to get version from git in directory /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: git: Failed; directory is not managed by git\n", "[2020-06-22 11:01:16] (get_version) INFO: metadata: Trying to get version for legacy_api_wrap in dir /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages\n", "[2020-06-22 11:01:16] (get_version) INFO: metadata: 
Succeeded\n", "AnnData object with n_obs × n_vars = 2700 × 32738\n", " var: 'gene_ids'\n" ] } ], "source": [ "# first, we load a dataset with Scanpy\n", "from scanpy import datasets\n", "\n", "adata = datasets.pbmc3k()\n", "print(adata)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/plain": [ "66" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import gc\n", "from monet import ExpMatrix\n", "\n", "matrix = ExpMatrix.from_anndata(adata)\n", "print(matrix)\n", "\n", "# free up memory\n", "del adata; gc.collect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Export data to Scanpy by converting `ExpMatrix` objects to `AnnData` objects\n", "\n", "Here, we use the `ExpMatrix.to_anndata()` method to convert an `ExpMatrix` object from Monet into an `AnnData` object from Scanpy. We also show that the export/import round trip preserves the expression data, by comparing the `hash` value of the re-imported `ExpMatrix` object with that of the original `ExpMatrix` object." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AnnData object with n_obs × n_vars = 2700 × 32738\n" ] } ], "source": [ "# export data to AnnData object\n", "adata = matrix.to_anndata()\n", "print(adata)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original hash: dc9636573cc717aa76f07b07c936457d\n", "New hash: dc9636573cc717aa76f07b07c936457d\n", "Identical? 
True\n" ] }, { "data": { "text/plain": [ "0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# now check accuracy\n", "original_hash = matrix.hash\n", "del matrix; gc.collect()\n", "\n", "matrix = ExpMatrix.from_anndata(adata)\n", "new_hash = matrix.hash\n", "\n", "print('Original hash:', original_hash)\n", "print('New hash: ', new_hash)\n", "print('Identical?', original_hash == new_hash)\n", "\n", "# free up memory\n", "del matrix; gc.collect()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }