BASH: Group files by name

Answer 1

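Unfortunately I don't have time to write a complete answer, so here are just a few tips that may help.

I would start by printing the files in question together with their Unix modification times and sorting on those (this turned out to work better than the normal, human-readable time).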

find "$PWD" -type f -printf '%T@ %p\n' | sort -n
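
One thing to watch with that output: GNU find's %T@ prints the seconds with a fractional part, and bash arithmetic only handles integers. A minimal sketch of splitting such a line with parameter expansion follows; the helper names line_seconds and line_path are made up for illustration and are not part of any tool.

```shell
#!/bin/bash
# Split one "<mtime> <path>" line as printed by: find -printf '%T@ %p\n'
# (line_seconds and line_path are hypothetical helper names.)

line_seconds() {
    local stamp=${1%% *}            # text before the first space: the timestamp
    printf '%s\n' "${stamp%.*}"     # cut off the fractional part, if there is one
}

line_path() {
    printf '%s\n' "${1#* }"         # text after the first space: the path, spaces included
}
```

Unlike awk '{print $2}', this keeps paths that contain spaces in one piece.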

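You can then store the Unix time of the first member of a 30-minute group as the reference point for the start of that group, and compute the difference between it and the Unix timestamp of the current file. If the difference is greater than 1800 seconds, start a new group; otherwise add the file to the current group. Something like this: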

#!/bin/bash
# 1800 s = 30 min
# Unix time: 86400 s = 1 day

group_start_time=0

# find prints "<mtime> <path>" for each file; sort -n orders the lines by mtime.
# For debugging, use -printf '%T@ %t %p\n' to see a readable time as well.
while IFS= read -r line; do
    current_time=${line%% *}          # first field: the mtime
    current_time=${current_time%.*}   # drop the fractional seconds that %T@ prints
    if [ "$group_start_time" -eq 0 ] || [ $((current_time - group_start_time)) -gt 1800 ]; then
        # first file overall, or more than 30 min after the start of the group
        printf '\nnew group:\n'
        group_start_time=$current_time
    fi
    printf '%s\n' "$line"
done < <(find "$PWD" -type f -printf '%T@ %p\n' | sort -n)

From there you can redirect the file paths into whatever file you like (using >>), and later run mv over each list of files to move them into the corresponding directory.
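
The mv step could look roughly like this. It is only a sketch: move_listed_files is a made-up name, and it assumes each list holds one path per line with no newlines inside file names.

```shell
#!/bin/bash
# move_listed_files LIST DEST
# Move every path named in LIST (one per line) into the directory DEST.
# (Hypothetical helper; adjust to your own layout.)
move_listed_files() {
    local list=$1 dest=$2 path
    mkdir -p -- "$dest" || return 1
    while IFS= read -r path; do
        [ -e "$path" ] || continue    # skip blank lines and files that are already gone
        mv -- "$path" "$dest"/
    done < "$list"
}
```

You would call it once per group, for example move_listed_files group1.list /some/target/dir, where both arguments are placeholders.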

I hope this helps. :)

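Edit: I modified the script so that it takes the groups of log.gz files from the source directory (your /opt/rename/) and records each group in a file in the target directory (/opt/send/combined/, which I assume is yours). The modified code follows: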

#!/bin/bash
# 1800 s = 30 min
# Unix time: 86400 s = 1 day

sourceFolder="/opt/rename/"
target="/opt/send/combined/"

path_to_file=$target
current_file="ORACLE_gprtcp_000.log.gz"
group_start_time=0

# find prints "<mtime> <path>" for each file; sort -n orders the lines by mtime.
while IFS= read -r line; do
    current_time=${line%% *}          # first field: the mtime
    current_time=${current_time%.*}   # drop the fractional seconds that %T@ prints
    if [ "$group_start_time" -eq 0 ] || [ $((current_time - group_start_time)) -gt 1800 ]; then
        # new group: construct the file name from the time of its first member
        hr_time=$(date -d "@$current_time" +%F_%H%M)
        current_file="ORACLE_gprtcp_${hr_time}.log.gz"
        group_start_time=$current_time
        printf '\nnew group:\n'
    fi
    # append the file path (everything after the first space) to current_file;
    # note that current_file is a plain-text list of paths, not a gzip archive
    printf '%s\n' "${line#* }" >> "$path_to_file/$current_file"
    printf '%s\n' "$line"
done < <(find "$sourceFolder" -type f -name '*.log.gz' -printf '%T@ %p\n' | sort -n)
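
If you want to check the naming on its own, the date call can be pulled into a small function. This is a sketch that assumes GNU date (for -d @SECONDS); group_file_name is a made-up name, and the result depends on your time zone.

```shell
#!/bin/bash
# group_file_name SECONDS
# Print the per-group file name for a group whose first member has the given Unix time.
# (Hypothetical helper; relies on GNU date's -d @SECONDS syntax.)
group_file_name() {
    printf 'ORACLE_gprtcp_%s.log.gz\n' "$(date -d "@$1" +%F_%H%M)"
}
```

%F gives YYYY-MM-DD and %H%M gives the zero-padded hour and minute, which is the same result the script gets from %0k%0M.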

Answer 2

This assumes that the file names contain no "\n" (newline) characters.

find . -name '*_*_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*.gz' | perl -le '
    use strict;
    use warnings;
    my %hash;
    while (<>) {
        chomp;
        # group key: the name up to the tens-of-minutes digit of the timestamp
        my ($group) = /^([^_]+_[^_]+_[0-9]{11})/ or next;
        # round the minutes down to the half hour: 0-2 -> 00, 3-5 -> 30
        $group =~ s/[0-2]$/00/;
        $group =~ s/[3-5]$/30/;
        push @{ $hash{$group} }, $_;
    }
    for my $group (sort keys %hash) {
        print "processing group $group";
        for my $file (sort @{ $hash{$group} }) {
            print "processing file $file";
            # do system command calls here; for example
            # system "gzip -cd \"$file\" >> $group.txt";
        }
    }
'
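
For comparison, the rounding step can also be done in plain bash. This is a sketch only: group_key is a made-up name, and it assumes names shaped like ORACLE_gprtcp_201301011247_001.gz (a hypothetical example), where the 12 digits are taken to be YYYYMMDDHHMM.

```shell
#!/bin/bash
# group_key PATH
# Print the half-hour group key for PATH, or return 1 if the name does not fit the pattern.
# (Hypothetical helper; the directory part is stripped before matching.)
group_key() {
    local name=${1##*/}
    if [[ $name =~ ^([^_]+_[^_]+_[0-9]{10})([0-9])[0-9]_ ]]; then
        local prefix=${BASH_REMATCH[1]} tens=${BASH_REMATCH[2]}
        if (( tens < 3 )); then
            printf '%s00\n' "$prefix"     # minutes 00-29 go to the :00 group
        else
            printf '%s30\n' "$prefix"     # minutes 30-59 go to the :30 group
        fi
    else
        return 1
    fi
}
```

Note that this groups by the timestamp in the file name on fixed half-hour boundaries, not by the modification time.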

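Edit: Following Craig's suggestion, I changed a few things. The original idea was to use Perl for its arrays and hashes, and in the end doing everything in Perl turned out to be clearer. @ARGV is the list of paths that find will traverse. For example, if the script is named script.pl:

script.pl "${sourceFolder}"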

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;

my %hash;

sub wanted {
    # inside wanted, $_ is the file name without its directory
    return unless -f $_ && /^([^_]+_[^_]+_[0-9]{11})/;
    my $group = $1;
    # round the minutes down to the half hour: 0-2 -> 00, 3-5 -> 30
    $group =~ s/[0-2]$/00/;
    $group =~ s/[3-5]$/30/;
    # store the full path so the file can still be found after the traversal
    push @{ $hash{$group} }, $File::Find::name;
}

find(\&wanted, @ARGV);

for my $group (sort keys %hash) {
    print "processing group $group\n";
    ### do system command calls here; for example
    # system "rm \"$group.txt\"";
    ### or just use perl
    # unlink $group.'.txt';
    for my $file (sort @{ $hash{$group} }) {
        print "processing file $file\n";
        ### and other system command calls here; for example
        # system "gzip -cd \"$file\" >> \"$group.txt\"";
    }
    ### and here; for example
    # system "gzip \"$group.txt\"";
}
