pymportx

The main arguments of the package functions are described below.

Salmon, Sailfish or kallisto


salmon.read_salmon(), sailfish.read_sailfish(), kallisto.read_kallisto()

These main functions take the following arguments:

Argument             Description
folders              A list of strings with the folder path for each sample.
tx_out               Boolean. Default is False for gene-level output. Set to True for transcript-level output.
tx2gene              A two-column .csv file containing gene annotations: transcript ID in the first column and gene ID in the second column.
countsFromAbundance  One of "no" (default), "scaledTPM", "lengthScaledTPM" or "dtuScaledTPM". See the countsFromAbundance section below for more detail.
dropInfReps          Whether to skip reading inferential replicates (default is False).¹
varReduce            Whether to condense per-sample inferential replicates into a matrix of sample variances (default is False).¹
infRepStat           A predefined function applied over the rows of the inferential replicates (default is the median over rows).
ignoreTxVersion      Whether to ignore the transcript version by removing everything after the period '.' from the transcript ID.
ignoreAfterBar       Whether to ignore the characters of the transcript ID after the bar '|'.

¹ Salmon, Sailfish and kallisto users can include the inferential replicates of each sample in a separate matrix (dropInfReps = False). Alternatively, they can include only the variance across replicates by setting varReduce = True.
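
To make the replicate options more concrete, the sketch below shows what summarizing inferential replicates amounts to for a single sample. It is not pymportx code: the replicate values, the helper names `summarize_replicates` and `replicate_variance`, and the choice of the sample variance (n - 1 denominator) are assumptions made for illustration only.

```python
from statistics import median, variance

# Hypothetical inferential replicates for one sample:
# one row per transcript, one column per replicate.
inf_reps = [
    [10.0, 12.0, 11.0, 13.0],  # transcript A
    [0.0, 1.0, 0.0, 3.0],      # transcript B
]


def summarize_replicates(rows, stat=median):
    """Apply a statistic over each row, in the spirit of infRepStat."""
    return [stat(row) for row in rows]


def replicate_variance(rows):
    """Collapse replicates to one sample variance per transcript,
    in the spirit of varReduce = True (assumed n - 1 denominator)."""
    return [variance(row) for row in rows]


row_medians = summarize_replicates(inf_reps)
row_variances = replicate_variance(inf_reps)
print(row_medians)
print(row_variances)
```

Either summary yields one value per transcript and per sample, which is why it can be stored as an ordinary matrix next to the counts.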

RSEM


The arguments for rsem.read_rsem() are detailed in the following table:

Argument             Description
folders              A list of strings with the quantification file path for each sample.
tx_in                Boolean. Default is True for transcript-level input. Set to False for gene-level input.
tx_out               Boolean. Default is False for gene-level output. Set to True for transcript-level output.
tx2gene              A two-column .csv file containing gene annotations: transcript ID in the first column and gene ID in the second column.
countsFromAbundance  One of "no" (default), "scaledTPM", "lengthScaledTPM" or "dtuScaledTPM". See the countsFromAbundance section below for more detail.
ignoreTxVersion      Whether to ignore the transcript version by removing everything after the period '.' from the transcript ID.
ignoreAfterBar       Whether to ignore the characters of the transcript ID after the bar '|'.
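
The interplay between tx2gene, ignoreTxVersion and ignoreAfterBar can be sketched in a few lines of plain Python. This is an illustration, not the package's implementation: the helper `clean_tx_id`, the example IDs, the header row in the .csv and the counts are all hypothetical, the bar is assumed to be the vertical bar `|`, and gene-level output is assumed to be a simple sum over the transcripts of each gene.

```python
import csv
import io
from collections import defaultdict


def clean_tx_id(tx_id, ignore_tx_version=False, ignore_after_bar=False):
    """Trim a transcript ID so that it can match the IDs in tx2gene."""
    if ignore_after_bar:
        tx_id = tx_id.split("|", 1)[0]  # keep the text before the first bar
    if ignore_tx_version:
        tx_id = tx_id.split(".", 1)[0]  # drop the version suffix
    return tx_id


# Hypothetical two-column tx2gene table: transcript ID, then gene ID.
tx2gene_csv = (
    "transcript_id,gene_id\n"
    "ENST0001,GENE_A\n"
    "ENST0002,GENE_A\n"
    "ENST0003,GENE_B\n"
)
reader = csv.reader(io.StringIO(tx2gene_csv))
next(reader)  # skip the (assumed) header row
tx2gene = {tx: gene for tx, gene in reader}

# Hypothetical transcript-level counts keyed by the raw IDs of a quantification file.
tx_counts = {
    "ENST0001.4|GENE_A|": 10.0,
    "ENST0002.1|GENE_A|": 5.0,
    "ENST0003.2|GENE_B|": 7.0,
}

# Sum the transcript counts of each gene after cleaning the IDs.
gene_totals = defaultdict(float)
for raw_id, count in tx_counts.items():
    tx_id = clean_tx_id(raw_id, ignore_tx_version=True, ignore_after_bar=True)
    gene_totals[tx2gene[tx_id]] += count

gene_counts = dict(gene_totals)
print(gene_counts)
```

Without the two clean-up flags, none of the raw IDs in this example would be found in the tx2gene table, which is the situation these arguments are meant to resolve.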

countsFromAbundance

The countsFromAbundance argument can be set to one of the following:

  • "no" (default): Estimated counts are not generated from the abundance estimates; the counts reported by the quantification tool are used as they are.

  • "scaledTPM": Estimated counts are obtained from the abundance estimates, scaled up to library size.

  • "lengthScaledTPM": Estimated counts are obtained from the abundance estimates, scaled using the mean transcript length across samples and then the library size.

  • "dtuScaledTPM": Estimated counts are obtained from the abundance estimates, scaled using the median transcript length among the isoforms of each gene and then the library size.
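
The scaling described in the list above can be sketched as follows for the "no", "scaledTPM" and "lengthScaledTPM" options. This is a minimal illustration under stated assumptions, not the package's code: the function name `counts_from_abundance`, the list-of-lists layout (rows are transcripts, columns are samples) and the toy numbers are hypothetical, and "dtuScaledTPM" is left out because it additionally needs the tx2gene mapping.

```python
def counts_from_abundance(counts, abundance, lengths, method="no"):
    """Return estimated counts; rows are transcripts, columns are samples."""
    if method == "no":
        return [row[:] for row in counts]  # keep the original counts
    if method == "scaledTPM":
        new = [row[:] for row in abundance]
    elif method == "lengthScaledTPM":
        # Weight each abundance by the mean length of its transcript across samples.
        new = [
            [a * (sum(len_row) / len(len_row)) for a in ab_row]
            for ab_row, len_row in zip(abundance, lengths)
        ]
    else:
        raise ValueError(f"method not covered by this sketch: {method}")

    n_samples = len(counts[0])
    # Library size of each sample, taken from the original counts.
    lib_size = [sum(row[j] for row in counts) for j in range(n_samples)]
    new_sum = [sum(row[j] for row in new) for j in range(n_samples)]
    # Rescale every column so that it sums to the sample's library size.
    return [
        [row[j] * lib_size[j] / new_sum[j] for j in range(n_samples)]
        for row in new
    ]


# Toy data: two transcripts, two samples.
counts = [[100.0, 200.0], [300.0, 200.0]]
abundance = [[250000.0, 500000.0], [750000.0, 500000.0]]  # TPM
lengths = [[1000.0, 1000.0], [2000.0, 2000.0]]

scaled = counts_from_abundance(counts, abundance, lengths, "scaledTPM")
length_scaled = counts_from_abundance(counts, abundance, lengths, "lengthScaledTPM")
print(scaled)
print(length_scaled)
```

In both scaled variants each sample's column still sums to its original library size; only the distribution of counts across transcripts changes.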