How can I search for files in directories that are listed in a file?

Question 1

If the directory names are listed one per line, you can use readarray (bash v4+) to avoid problems with directory names that contain spaces, tabs, or wildcards.

readarray -t dirs < subdirs2search.txt
find "${dirs[@]}" ...

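This still does not help if some directory names start with -. GNU find has no option that works around that problem, so such names would have to be rewritten before they are passed to find (for example with a leading ./).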
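As a rough illustration of the readarray approach, here is a minimal sketch that builds a throwaway tree with awkward directory names and searches it. The directory and file names are made up for the demo, and the ./ prefix step for names starting with - is an assumption added here, not part of the answer above.

```shell
#!/usr/bin/env bash
# Build a scratch tree whose directory names contain a space, a glob
# character and a leading dash (all names are hypothetical).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -- "plain" "with space" "star*dir" "-dash"
touch -- "plain/a.txt" "with space/b.txt" "star*dir/c.txt" "-dash/d.txt"
printf '%s\n' "plain" "with space" "star*dir" "-dash" > subdirs2search.txt

# -t strips the trailing newline; each line becomes exactly one element.
readarray -t dirs < subdirs2search.txt

# Assumption: prefix names that start with "-" with "./" so that find
# cannot mistake them for part of its expression.
dirs=("${dirs[@]/#-/./-}")

# Quoting "${dirs[@]}" passes each element as a single argument, with
# no word splitting and no glob expansion.
found=$(find "${dirs[@]}" -type f -name '*.txt')
printf '%s\n' "$found"
```

The order of the printed paths depends on the file system, so pipe the output through sort if a stable order matters.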

Question 2

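I notice that your search is not restricted to text files only.

ack is often a handy tool for recursive grep-type jobs. It does restrict its search to text files by default (decided heuristically from the file name and contents), and it skips directories such as .git/ and .svn by default. That is probably what you want if you are a developer. https://beyondgrep.com/.

Most GNU/Linux distributions package it, so it is easy to install. It is written in Perl, so its regular expressions are Perl regular expressions, similar to those of GNU grep -P.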

ack -- "desired text"  $(<subdirs.txt)

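This should do what you want and be easy to type. It also gives colored output that is pleasant for interactive use.

(The different ways of tokenizing subdirs.txt onto the command line are covered in the other answers. You can let the shell's standard word splitting do it, or use readarray to split on lines and avoid glob expansion.)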
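To make the difference between the two tokenization styles concrete, here is a small sketch. count_args is a hypothetical helper that only reports how many arguments it received, and the directory names are invented for the demo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

tmp=$(mktemp -d)
printf '%s\n' "SubDir1" "Sub Dir 2" > "$tmp/subdirs.txt"

# Unquoted $(<file): the shell splits the contents on spaces, tabs and
# newlines (the default IFS) and then glob-expands every resulting word.
split_count=$(count_args $(<"$tmp/subdirs.txt"))

# readarray -t: one array element per line, with no splitting inside a
# line and no glob expansion.
readarray -t dirs < "$tmp/subdirs.txt"
line_count=$(count_args "${dirs[@]}")

echo "word splitting: $split_count arguments"
echo "readarray:      $line_count arguments"
```

With these two lines of input, word splitting yields four arguments while readarray yields two, which is why the line-based form is the safer choice once names may contain whitespace.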

Question 3

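Of course, posting this question helped me get over my fixation on doing this strictly with find, and got me thinking about letting Bash expand the file instead. I am posting an answer in the hope that it helps others (and to document it for my own future use).

The incantation that makes Bash expand the contents of a file is $(<subdirs2search.txt). So if subdirs2search.txt contains: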

SubDir1 SubDir2 SubDir4

then a command like the following performs the required search:

find $(<subdirs2search.txt) -type f -name "*.txt" -exec grep -H "desired text" {} \;
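Here is a self-contained sketch of that search on a scratch tree, so the effect of the expansion can be checked. The file contents are invented, and the {} + terminator (which hands many files to each grep call instead of starting one grep per file) is a variation on the command above, not something the original relies on. Note that find's -name takes its pattern as a separate argument.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir SubDir1 SubDir2 SubDir3 SubDir4
echo "desired text here"            > SubDir1/a.txt
echo "nothing relevant"             > SubDir2/b.txt
echo "desired text, but not listed" > SubDir3/c.txt
echo "desired text in a log"        > SubDir4/d.log
printf '%s\n' SubDir1 SubDir2 SubDir4 > subdirs2search.txt

# $(<file) is deliberately left unquoted so the shell splits it into one
# argument per directory name. This is only safe while the names contain
# no whitespace and no glob characters.
matches=$(find $(<subdirs2search.txt) -type f -name "*.txt" \
            -exec grep -H "desired text" {} +)
printf '%s\n' "$matches"
```

Only SubDir1/a.txt is reported: SubDir3 is not in the list, and SubDir4/d.log does not match the *.txt name filter.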

Question 4

#!/usr/bin/perl -w

use strict;
use File::Find ();

sub wanted;
sub process_file ($@);

my $dirfile = shift;    # First argument is the filename containing the list
                        # of directories.

my $pattern = shift;    # Second arg is a perl RE containing the pattern to search
                        # for. Remember to single-quote it on the command line.

# Read in the @dirs array from $dirfile
#
# A NUL-separated file is best, just in case any of the directory names
# contained line-feeds.  If you're certain that could never happen, a
# plain-text LF-separated file would do.
#
# BTW, you can easily generate a NUL-separated file from the shell with:
#    printf "%s\0" dir1 dir2 dir3 dir4 $'dir\nwith\n3\nLFs' > dirs.txt

my @dirs=();

{
  local $/="\0";    # delete this line if you want to use a LF-separated file.
                    # In that case, the { ... } block around the code from open to
                    # close is no longer needed.  It's only there so it's possible
                    # to make a local change to the $/ aka $INPUT_RECORD_SEPARATOR
                    # variable.

  open(DIRFILE,"<",$dirfile) or die "Cannot open $dirfile: $!\n";
  while(<DIRFILE>) {
    chomp;
    push @dirs, $_;
  };
  close(DIRFILE);
};

File::Find::find({wanted => \&wanted}, @dirs);
exit;

sub wanted {
    my ($dev,$ino,$mode,$nlink,$uid,$gid);

    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) && -f _ && process_file($_);
}

sub process_file ($@) {

    # This function currently just greps for pattern in the filename passed to
    # it. As the function name implies, it could be used to process the file
    # in any way, not just grep it.

    my $filename = shift;

    # uncomment the return statement below to skip "binary" files.
    # (note this is a workable but fairly crude test.  Perl's File::MMagic
    # module can be used to more accurately identify file types, using the
    # same "magic" file databases as the /usr/bin/file command)

    # return if -B $filename;

    open(FILE,"<",$filename) or return;   # skip files that cannot be opened
    while(<FILE>) {
      print "$filename:$_" if (m/$pattern/o) ;
    };

    close(FILE);
}

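This does the job in perl, using Perl's File::Find module to do the same thing as find ... -exec grep.

There is nothing particularly interesting or special about this script, apart from the fact that the process_file function can easily be modified to do whatever you like with each file: change its owner or permissions, delete it, rename it, insert or delete lines, or anything else you want.

For example, to delete the files that contain text matching the pattern, you could replace the process_file function with something like this: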

sub process_file ($@) {

    my $filename = shift;
    my $found = 0;

    # skip "binary" files (comment this line out to consider them as well):
    return if -B $filename;

    open(FILE,"<",$filename) or return;   # skip files that cannot be opened
    while(<FILE>) {
      if (m/$pattern/o) {
        $found = 1;
        last;
      };
    };

    close(FILE);
    unlink $filename if ($found);
}

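It is also worth mentioning that the wanted function in this script currently looks only for regular files (the -f test). Perl's stat and lstat functions give access to all of the file metadata that find can match on (uid, gid, perms, size, atime, mtime, and so on), so the wanted function can replicate any find predicate. See perldoc -f stat and perldoc -f lstat for details.

By the way, this script was originally generated by find2perl and then heavily modified to a) read the list of directories from a file, b) do the grep in perl code rather than by forking grep, and c) add a lot of comments. Performance should be about the same as find ... -exec grep, because grep cannot open files or do regular expression matching any faster than perl can. It may even be faster.

Also, find2perl used to be included with Perl, but it was removed as of Perl 5.22 and is now available on CPAN as find2perl.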
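For comparison, the NUL-separated list described in the script's comments can also be consumed from bash. The following is only a sketch under assumed names (dirs.txt, the demo directories and the search word needle are all made up); it reads the list with read -d '' and counts matching files by printing one x per match, so the count stays correct even when a path contains a line feed.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
nl_dir=$'dir\nwith\nLFs'          # a directory name containing line feeds
mkdir dir1 "$nl_dir"
echo "needle" > dir1/a.txt
echo "needle" > "$nl_dir/b.txt"
echo "hay"    > dir1/c.txt

# Write the list with a NUL byte after every name, as in the script's comment.
printf '%s\0' dir1 "$nl_dir" > dirs.txt

# read -d '' uses NUL as the record delimiter; IFS= and -r keep each
# name exactly as written.
dirs=()
while IFS= read -r -d '' d; do
  dirs+=("$d")
done < dirs.txt

# grep -q prints nothing; the second -exec runs only for files that
# matched, emitting one "x" per matching file.
marks=$(find "${dirs[@]}" -type f -exec grep -q "needle" {} \; -exec printf 'x' \;)
echo "directories read: ${#dirs[@]}, matching files: ${#marks}"
```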
