아래와 같은 데이터 세트가 있습니다.
chr1 HAVANA gene 69091 70008 . + . gene_id "ENSG00000186092.4"; transcript_id "ENSG00000186092.4"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "OR4F5"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "OR4F5"; level 2; havana_gene "OTTHUMG00000001094.1";
chr1 ENSEMBL gene 134901 139379 . - . gene_id "ENSG00000237683.5"; transcript_id "ENSG00000237683.5"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "AL627309.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "AL627309.1"; level 3;
chr1 HAVANA gene 367640 368634 . + . gene_id "ENSG00000235249.1"; transcript_id "ENSG00000235249.1"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "OR4F29"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "OR4F29"; level 2; havana_gene "OTTHUMG00000002860.1";
chr1 HAVANA gene 621059 622053 . - . gene_id "ENSG00000185097.2"; transcript_id "ENSG00000185097.2"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "OR4F16"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "OR4F16"; level 2; havana_gene "OTTHUMG00000002581.1";
chr1 ENSEMBL gene 738532 739137 . - . gene_id "ENSG00000269831.1"; transcript_id "ENSG00000269831.1"; gene_type "protein_coding"; gene_status "NOVEL"; gene_name "AL669831.1"; transcript_type "protein_coding"; transcript_status "NOVEL"; transcript_name "AL669831.1"; level 3;
가능하다면 awk를 사용하여 $5와 $4의 차이가 2000 미만인 유전자를 삭제하고 싶습니다. sed도 허용되지만.
따라서 다음을 반환합니다.
chr1 ENSEMBL gene 134901 139379 . - . gene_id "ENSG00000237683.5"; transcript_id "ENSG00000237683.5"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "AL627309.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "AL627309.1"; level 3;
감사해요.
답변1
awk '$5-$4 >= 2000' file
$5가 항상 $4보다 큰 경우