# Importing/exporting data from/to Scanpy

## Overview

This notebook demonstrates how to use Monet to import data from, and export data to, [Scanpy](https://scanpy.readthedocs.io/en/stable/).

*Note: This functionality requires Monet >= 0.2.2. Run `pip install 'monet>=0.2.2'` to upgrade if necessary.*

*Note: This notebook assumes that you have [Scanpy installed](https://scanpy.readthedocs.io/en/stable/installation.html). Scanpy is not installed automatically with Monet.*

Scanpy represents expression data using `AnnData` objects, which can hold the expression matrix as well as gene and cell annotation data. Please see the [Scanpy manual](https://scanpy.readthedocs.io/en/stable/usage-principles.html) for more details. In contrast, Monet represents expression data using `ExpMatrix` objects, which contain only the expression matrix (including the gene and cell names). The `ExpMatrix` class is a subclass of the pandas `DataFrame` and can be used like any other data frame. Rows of the data frame correspond to genes, and columns correspond to cells. This is the transpose of the `AnnData` layout, in which rows are cells (observations) and columns are genes (variables).
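Since `ExpMatrix` subclasses `DataFrame`, ordinary pandas operations should apply to it. The sketch below illustrates the genes-by-cells layout using a plain `DataFrame` as a stand-in, so that it runs without Monet installed. The gene names, cell names, and counts are invented for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for an ExpMatrix: a plain DataFrame with genes as rows and
# cells as columns. All names and values are made up for illustration.
counts = np.array([
    [3, 0, 1],
    [0, 2, 0],
    [5, 1, 4],
    [0, 0, 7],
])
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
cells = ["cell_1", "cell_2", "cell_3"]
matrix = pd.DataFrame(counts, index=genes, columns=cells)

# Expression profile of one gene across all cells (one row).
gene_profile = matrix.loc["GENE_C"]

# Total counts per cell (sum over genes, i.e. down each column).
counts_per_cell = matrix.sum(axis=0)

# Number of cells in which each gene is detected (non-zero count).
cells_per_gene = (matrix > 0).sum(axis=1)

print(matrix.shape)
print(counts_per_cell)
print(cells_per_gene)
```

With this layout, `axis=0` aggregates over genes and `axis=1` aggregates over cells, which is the opposite of what the same calls would do on a cells-by-genes table.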

### Set up notebook

In [1]:
# change notebook width and font
from IPython.display import HTML, display
display(HTML(""""""))

from monet import util
_LOGGER = util.configure_logger()

## Import data from Scanpy by converting `AnnData` objects to `ExpMatrix` objects

Here, we use the `ExpMatrix.from_anndata()` class method to convert a Scanpy `AnnData` object into a Monet `ExpMatrix` object.

In [2]:
# first, we load a dataset with Scanpy
from scanpy import datasets

adata = datasets.pbmc3k()
print(adata)

[2020-06-22 11:01:16] (numexpr.utils) INFO: Note: NumExpr detected 12 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
[2020-06-22 11:01:16] (numexpr.utils) INFO: NumExpr defaulting to 8 threads.
[2020-06-22 11:01:16] (get_version) INFO: dirname: Trying to get version of get_version from dirname /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages
[2020-06-22 11:01:16] (get_version) INFO: dirname: Failed; Does not match re.compile('get[_-]version-([\\d.]+?)(?:\\.dev(\\d+))?(?:[_+-]([0-9a-zA-Z.]+))?$')
[2020-06-22 11:01:16] (get_version) INFO: git: Trying to get version from git in directory /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages
[2020-06-22 11:01:16] (get_version) INFO: git: Failed; directory is not managed by git
[2020-06-22 11:01:16] (get_version) INFO: metadata: Trying to get version for get_version in dir /home/flo/miniconda3/envs/scanpy/lib/python3.8/site-packages
[2020-06-22 11:01:16] (get_version) INFO: metadata: Succeeded
[202

In [3]:
import gc
from monet import ExpMatrix

matrix = ExpMatrix.from_anndata(adata)
print(matrix)

# free up memory
del adata; gc.collect()




66
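Conceptually, the import step has to turn a cells-by-genes matrix into a genes-by-cells data frame. The following is a minimal sketch of that reorientation, not Monet's actual implementation: the helper names `to_genes_by_cells` and `to_cells_by_genes` are hypothetical, the data are invented, and a dense NumPy array stands in for the matrix (in practice the matrix in an `AnnData` object is often sparse, which this sketch does not handle).

```python
import numpy as np
import pandas as pd

# Toy data in the AnnData layout: rows are cells, columns are genes.
# All names and values are made up for illustration.
X = np.array([
    [3, 0, 5],
    [0, 2, 1],
])
obs_names = ["cell_1", "cell_2"]             # cell (observation) labels
var_names = ["GENE_A", "GENE_B", "GENE_C"]   # gene (variable) labels


def to_genes_by_cells(X, obs_names, var_names):
    """Hypothetical helper: transpose cells x genes into a genes x cells frame."""
    return pd.DataFrame(np.asarray(X).T, index=var_names, columns=obs_names)


def to_cells_by_genes(frame):
    """Hypothetical helper: inverse of to_genes_by_cells."""
    return frame.to_numpy().T, list(frame.columns), list(frame.index)


frame = to_genes_by_cells(X, obs_names, var_names)
X_back, obs_back, var_back = to_cells_by_genes(frame)

print(frame)
print(np.array_equal(X, X_back))
```

The transposition is the only structural change in this sketch. Gene and cell labels simply swap roles between row index and column index.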

## Export data to Scanpy by converting `ExpMatrix` objects to `AnnData` objects

Here, we use the `ExpMatrix.to_anndata()` method to convert a Monet `ExpMatrix` object into a Scanpy `AnnData` object. We also show that an export/import round trip preserves the expression data, by comparing the `hash` value of the re-imported `ExpMatrix` object with that of the original.

In [4]:
# export data to AnnData object
adata = matrix.to_anndata()
print(adata)

AnnData object with n_obs × n_vars = 2700 × 32738


In [5]:
# now check that the export/import round trip preserves the data
original_hash = matrix.hash
del matrix; gc.collect()

matrix = ExpMatrix.from_anndata(adata)
new_hash = matrix.hash

print('Original hash:', original_hash)
print('New hash: ', new_hash)
print('Identical?', original_hash == new_hash)

# free up memory
del matrix; gc.collect()

Original hash: dc9636573cc717aa76f07b07c936457d
New hash: dc9636573cc717aa76f07b07c936457d
Identical? True


0
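The idea behind the hash comparison above can be sketched without Monet. How `ExpMatrix.hash` is actually computed is not shown in this notebook, so the helper `frame_digest` below is hypothetical: it assumes an MD5 digest over the row labels, column labels, dtype, and raw values of a plain `DataFrame`. The data are invented for illustration.

```python
import hashlib

import numpy as np
import pandas as pd


def frame_digest(frame):
    """Hypothetical helper: MD5 digest over labels, dtype and values."""
    values = frame.to_numpy()
    md5 = hashlib.md5()
    md5.update("\n".join(map(str, frame.index)).encode("utf-8"))
    md5.update(b"\x00")
    md5.update("\n".join(map(str, frame.columns)).encode("utf-8"))
    md5.update(b"\x00")
    md5.update(str(values.dtype).encode("utf-8"))
    md5.update(values.tobytes())  # C-order bytes regardless of memory layout
    return md5.hexdigest()


# Toy genes x cells matrix (made-up names and counts).
original = pd.DataFrame(
    np.array([[3, 0], [0, 2], [5, 1]], dtype=np.int64),
    index=["GENE_A", "GENE_B", "GENE_C"],
    columns=["cell_1", "cell_2"],
)

# Simulated export/import cycle: transpose to cells x genes and back.
exported = original.to_numpy().T
restored = pd.DataFrame(exported.T, index=original.index, columns=original.columns)

# A copy with one altered value, to show that the digest detects changes.
altered = original.copy()
altered.loc["GENE_B", "cell_2"] = 3

print("Identical after round trip?", frame_digest(original) == frame_digest(restored))
print("Changed value detected?", frame_digest(original) != frame_digest(altered))
```

Because the dtype is part of the digest in this sketch, a conversion that silently changed integer counts to floating point would also produce a different digest, even if the numeric values were equal.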