Counting DNA codons in a DNA file

Question 1

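You get that output because the first line of your script starts a new bash shell.

That line should read

#!/bin/bash

(note the # at the start).

You then mix awk syntax and shell code in a way that will never work.

Instead, keep it simple: split the file into groups of three characters, sort them, and count how many times each unique group occurs.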

$ fold -w 3 dnafile | sort | uniq -c
   3 aac
   2 acg
   1 ttt

This approach works as long as the input always contains a multiple of three characters and has no embedded spaces or other stray characters.
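If the input cannot be trusted to meet those conditions, one option is to strip whitespace and check the length before counting. The following is only a minimal sketch: the function name count_codons is hypothetical and not part of the original answer, and it assumes the whole sequence fits comfortably in a shell variable.

```shell
#!/bin/bash
# count_codons is a hypothetical helper name, not from the original answer.
# Usage: count_codons FILE
count_codons() {
    local seq
    # Remove all whitespace (spaces, tabs, newlines, carriage returns).
    seq=$(tr -d '[:space:]' < "$1") || return 1
    # Nothing to count in an empty sequence.
    [ -n "$seq" ] || return 0
    # Refuse input whose length is not a multiple of three bases.
    if [ $(( ${#seq} % 3 )) -ne 0 ]; then
        echo "$1: length ${#seq} is not a multiple of 3" >&2
        return 1
    fi
    # Split into triplets, sort them, and count each distinct codon.
    printf '%s\n' "$seq" | fold -w 3 | sort | uniq -c
}
```

Called as count_codons dnafile, it should print the same kind of table as the one-liner above, and it returns a non-zero status instead of silently counting a short trailing group.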

Question 2

(echo aacacgaactttaacacg ;echo aacacgaactttaacacg ) |
  perl -ne '# Split each input line into triplets (A3),
            # use each triplet as a key in the hash %count,
            #   and increment the value stored under that key
            $count{$_}++ for unpack("(A3)*",$_);
            # Once all input has been read
            END{
                 # Remove the empty key "" that the trailing newline
                 #   of each line produces (it is not a codon)
                 delete $count{""};
                 # For each key (sorted): print key, count
                 print map { "$_ $count{$_}\n" } sort keys %count
            }'

Question 3

A slightly more verbose awk version (BEGINFILE and ENDFILE are GNU awk extensions)

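awk 'BEGINFILE { print FILENAME; delete codon }   # new file: print its name, reset counts
     ENDFILE {
         # $0 still holds the last record read; FNR is the per-file line count
         if (FNR != 1 || NF != 1 || length($0) % 3 != 0) {
             print "is broken"
         } else {
             for (i = 1; i <= length($0); i += 3) codon[substr($0, i, 3)]++
         }
         for (c in codon) print c, codon[c]
         print ""
     }' file*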

For this input

file1: OK

aacacgaactttaacacg

file2: space

aacacgaact ttaacacg

file3: newline

aacacgaact
ttaacacg

file4: not a multiple of 3 bases

aacacgaactttaacac

you get

file1
aac 3
ttt 1
acg 2

file2
is broken

file3
is broken

file4
is broken

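If you just want to repair the files, and none of them is like file4, you can pass them through cat and tr, either piped into awk at the front or supplied at the end of the awk command in place of file*, as in this example:

<<< "$(cat file[1-3] | tr -d "\n ")"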
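To make the two placements concrete, here is a rough sketch of both. The function names are hypothetical, and the counting is a simplified, portable awk loop rather than the BEGINFILE/ENDFILE version above. Note that either way the cleaned files are concatenated into a single sequence, so the counts are totals across all files, not per-file results.

```shell
#!/bin/bash
# Hypothetical helper: count the codons of a sequence arriving on stdin.
# It uses only portable awk features, so BEGINFILE/ENDFILE are not needed.
count_stdin() {
    awk '{ for (i = 1; i <= length($0); i += 3) codon[substr($0, i, 3)]++ }
         END { for (c in codon) print c, codon[c] }' | sort
}

# Cleanup at the front: strip newlines and spaces, then pipe into awk.
clean_then_pipe() {
    cat "$@" | tr -d '\n ' | count_stdin
}

# Cleanup at the back: hand awk the cleaned text as a here-string.
clean_as_herestring() {
    count_stdin <<< "$(cat "$@" | tr -d '\n ')"
}
```

For example, clean_then_pipe file[1-3] and clean_as_herestring file[1-3] should produce the same sorted list of codons and counts.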

Answer