Skip to contents

Design KASP markers based on causal variants.

Usage

kasp_marker_design(
  vcf_file = NULL,
  gt_df = NULL,
  variant_id_col = "variant_id",
  chrom_col = "chrom",
  pos_col = "pos",
  ref_al_col = "ref",
  alt_al_col = "alt",
  geno_start = 7,
  marker_ID,
  chr = NULL,
  genome_file,
  save_alignment = TRUE,
  plot_file = tempdir(),
  region_name = "loc_1",
  maf = 0.05
)

Arguments

vcf_file

Path to the vcf file containing identified variants.

gt_df

A data frame or matrix containing the meta data of identified variants and sample VCF genotype calls. The variants are rows and samples as columns.

variant_id_col, chrom_col, pos_col

A character value specifying the column names of variant IDs, chromosome, and positions in `gt_df` or `vcf_file`.

ref_al_col, alt_al_col,

A character value specifying the column names of reference and alternate alleles, respectively in `gt_df` or `vcf_file`.

geno_start

An integer value specifying the column index number of the start of the sample genotypes in `gt_df` or `vcf_file`.

marker_ID

Designated name of variant for marker design. Name must be contained in `gt_df` or `vcf_file`.

chr

A character value representation the chromosome description in the `genome_file`. Providing this helps to save memory in R.

genome_file

Path to reference genome file in fasta format, either compressed (.gz) or decompressed.

save_alignment

A logical value indicating whether to save the alignment plot to a directory specified by `plot_file`.

plot_file

Path to save the sequence alignment if `plot_draw = TRUE`.

region_name

A n optional character value for assigned region name.

maf

A numeric value between 0 and 1 representing the minor allele frequency for variant subset.

Value

A list object of two components comprising a data frame containing all information required for KASP marker design, and a plot of DNA sequence alignment to the reference genome.

Details

This function provides the intertek sequence to be used for marker development for the selected casual variants. It provides all the information of allele and location to fill Intertek form.It needs a vcf file of variants calls, and a genome sequence of the target crop in fasta format.

Examples

# \donttest{
# Example to design a KASP marker on a substitution variant
path <- tempdir()
path1 <- "https://raw.githubusercontent.com/awkena/panGB/main/Chr02.fa.gz"
path2 <-  system.file("extdata", "Sobic.002G302700_SNP_snpeff.vcf",
                     package = "panGenomeBreedr",
                     mustWork = TRUE)

ma1 <- kasp_marker_design(vcf_file = path2,
                          variant_id_col = 'ID',
                          chrom_col = 'CHROM',
                          pos_col = 'POS',
                          ref_al_col = 'REF',
                          alt_al_col = 'ALT',
                          genome_file = path1,
                          geno_start = 10,
                          marker_ID = "SNP_Chr02_69200443",
                          chr = "Chr02",
                          save_alignment = TRUE,
                          plot_file = path,
                          region_name = "ma1",
                          maf = 0.05)
#> using Gonnet

# View marker alignment output from temp folder
# path3 <- file.path(path, list.files(path = path, "alignment_"))
# system(paste0('open "', path3, '"'))

on.exit(unlink(path))
# }