admin管理员组

文章数量:1031259

CUT&Tag 数据处理和分析教程(6)

简介

CUT&Tag 技术会在靠近固定酶的染色质颗粒两侧加上接头,不过染色质颗粒内部的标签化反应也有可能发生。所以,当 CUT&Tag 针对组蛋白修饰时,得到的主要是核小体长度(大约 180 bp)或其倍数的片段。而如果目标是转录因子,就会生成核小体大小的片段,同时混杂一些较短的片段,这些短片段分别来自旁边的核小体和转录因子结合的位置。此外,核小体表面的 DNA 也会被标签化。通过绘制片段长度分布图(精确到单个碱基对),可以观察到 10 bp 的锯齿形周期变化,这是成功的 CUT&Tag 实验的一个典型标志。

评估测序片段的长度分布

代码语言:javascript代码运行次数:0运行复制
##== linux command ==##
mkdir -p $projPath/alignment/sam/fragmentLen

## Extract the 9th column from the alignment sam file which is the fragment length
samtools view -F 0x04 $projPath/alignment/sam/${histName}_bowtie2.sam | awk -F'\t' 'function abs(x){return ((x < 0.0) ? -x : x)} {print abs($9)}' | sort | uniq -c | awk -v OFS="\t" '{print $2, $1/2}' >$projPath/alignment/sam/fragmentLen/${histName}_fragmentLen.txt
代码语言:javascript代码运行次数:0运行复制
##=== R command ===## 
## Collect the fragment size information
fragLen = c()
for(hist in sampleList){

  histInfo = strsplit(hist, "_")[[1]]
  fragLen = read.table(paste0(projPath, "/alignment/sam/fragmentLen/", hist, "_fragmentLen.txt"), header = FALSE) %>% mutate(fragLen = V1 %>% as.numeric, fragCount = V2 %>% as.numeric, Weight = as.numeric(V2)/sum(as.numeric(V2)), Histone = histInfo[1], Replicate = histInfo[2], sampleInfo = hist) %>% rbind(fragLen, .) 
}
fragLen$sampleInfo = factor(fragLen$sampleInfo, levels = sampleList)
fragLen$Histone = factor(fragLen$Histone, levels = histList)
## Generate the fragment size density plot (violin plot)
fig5A = fragLen %>% ggplot(aes(x = sampleInfo, y = fragLen, weight = Weight, fill = Histone)) +
    geom_violin(bw = 5) +
    scale_y_continuous(breaks = seq(0, 800, 50)) +
    scale_fill_viridis(discrete = TRUE, begin = 0.1, end = 0.9, option = "magma", alpha = 0.8) +
    scale_color_viridis(discrete = TRUE, begin = 0.1, end = 0.9) +
    theme_bw(base_size = 20) +
    ggpubr::rotate_x_text(angle = 20) +
    ylab("Fragment Length") +
    xlab("")

fig5B = fragLen %>% ggplot(aes(x = fragLen, y = fragCount, color = Histone, group = sampleInfo, linetype = Replicate)) +
  geom_line(size = 1) +
  scale_color_viridis(discrete = TRUE, begin = 0.1, end = 0.9, option = "magma") +
  theme_bw(base_size = 20) +
  xlab("Fragment Length") +
  ylab("Count") +
  coord_cartesian(xlim = c(0, 500))

ggarrange(fig5A, fig5B, ncol = 2)
本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。原始发表:2025-04-11,如有侵权请联系 cloudcommunity@tencent 删除教程数据处理ampcutsize

CUT&amp;Tag 数据处理和分析教程(6)

简介

CUT&Tag 技术会在靠近固定酶的染色质颗粒两侧加上接头,不过染色质颗粒内部的标签化反应也有可能发生。所以,当 CUT&Tag 针对组蛋白修饰时,得到的主要是核小体长度(大约 180 bp)或其倍数的片段。而如果目标是转录因子,就会生成核小体大小的片段,同时混杂一些较短的片段,这些短片段分别来自旁边的核小体和转录因子结合的位置。此外,核小体表面的 DNA 也会被标签化。通过绘制片段长度分布图(精确到单个碱基对),可以观察到 10 bp 的锯齿形周期变化,这是成功的 CUT&Tag 实验的一个典型标志。

评估测序片段的长度分布

代码语言:javascript代码运行次数:0运行复制
##== linux command ==##
mkdir -p $projPath/alignment/sam/fragmentLen

## Extract the 9th column from the alignment sam file which is the fragment length
samtools view -F 0x04 $projPath/alignment/sam/${histName}_bowtie2.sam | awk -F'\t' 'function abs(x){return ((x < 0.0) ? -x : x)} {print abs($9)}' | sort | uniq -c | awk -v OFS="\t" '{print $2, $1/2}' >$projPath/alignment/sam/fragmentLen/${histName}_fragmentLen.txt
代码语言:javascript代码运行次数:0运行复制
##=== R command ===## 
## Collect the fragment size information
fragLen = c()
for(hist in sampleList){

  histInfo = strsplit(hist, "_")[[1]]
  fragLen = read.table(paste0(projPath, "/alignment/sam/fragmentLen/", hist, "_fragmentLen.txt"), header = FALSE) %>% mutate(fragLen = V1 %>% as.numeric, fragCount = V2 %>% as.numeric, Weight = as.numeric(V2)/sum(as.numeric(V2)), Histone = histInfo[1], Replicate = histInfo[2], sampleInfo = hist) %>% rbind(fragLen, .) 
}
fragLen$sampleInfo = factor(fragLen$sampleInfo, levels = sampleList)
fragLen$Histone = factor(fragLen$Histone, levels = histList)
## Generate the fragment size density plot (violin plot)
fig5A = fragLen %>% ggplot(aes(x = sampleInfo, y = fragLen, weight = Weight, fill = Histone)) +
    geom_violin(bw = 5) +
    scale_y_continuous(breaks = seq(0, 800, 50)) +
    scale_fill_viridis(discrete = TRUE, begin = 0.1, end = 0.9, option = "magma", alpha = 0.8) +
    scale_color_viridis(discrete = TRUE, begin = 0.1, end = 0.9) +
    theme_bw(base_size = 20) +
    ggpubr::rotate_x_text(angle = 20) +
    ylab("Fragment Length") +
    xlab("")

fig5B = fragLen %>% ggplot(aes(x = fragLen, y = fragCount, color = Histone, group = sampleInfo, linetype = Replicate)) +
  geom_line(size = 1) +
  scale_color_viridis(discrete = TRUE, begin = 0.1, end = 0.9, option = "magma") +
  theme_bw(base_size = 20) +
  xlab("Fragment Length") +
  ylab("Count") +
  coord_cartesian(xlim = c(0, 500))

ggarrange(fig5A, fig5B, ncol = 2)
本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。原始发表:2025-04-11,如有侵权请联系 cloudcommunity@tencent 删除教程数据处理ampcutsize

本文标签: CUTampampTag 数据处理和分析教程(6)