How can I find duplicate directory paths, even when their contents differ?

I have searched all over, and the closest match I found was one topic (Find and list duplicate directories). It relates to my situation, but its results were not exactly what I wanted.

Edit: Here is some sample data to help show what I am trying to accomplish. Below are the listings of two sets of directories.


idx1
idx1/defaultdb
idx1/defaultdb/thaweddb
idx1/defaultdb/colddb
idx1/defaultdb/db
idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/test
idx1/defaultdb/db/rb_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558020652_1558020018_8_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558020652_1558020018_8_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/GlobalMetaData
idx1/defaultdb/db/db_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/db_1558020619_1558019927_5_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558020619_1558019927_5_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147/rawdata

idx2
idx2/defaultdb
idx2/defaultdb/thaweddb
idx2/defaultdb/colddb
idx2/defaultdb/db
idx2/defaultdb/db/db_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/db_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/test
idx2/defaultdb/db/rb_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/db_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/GlobalMetaData
idx2/defaultdb/db/5_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/5_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147/rawdata

Say I have something like this:

idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480

I want to check whether defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480 exists in the idx2 directory and, if it does, print it.
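The single check described here could be sketched as below. This is only an illustration: the function name print_if_in_other is made up, and it assumes the script runs from the directory that contains idx1 and idx2.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post):
# print "<top>/<relative path>" only if it exists as a directory.
print_if_in_other() {
    local top=$1 rel=$2
    if [ -d "$top/$rel" ]; then
        printf '%s\n' "$top/$rel"
    fi
}

# Example call, using a path from the sample listing:
# print_if_in_other idx2 defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480
```

Nothing is printed when the path is missing, so the output can be fed straight into a later step.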

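The end goal is for all the leaf directories (directories that have no subdirectories; I do not want defaultdb itself shown, only its subdirectories) to be unique across all the top-level directories, and to list the subdirectories that exist under two different top-level directories. From there I will delete one of each pair.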
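One possible way to list the directories without subdirectories that occur under both top-level directories is sketched below. The function names leaf_dirs and common_leaf_dirs are invented for this example, and the sketch assumes directory names contain no newlines.

```shell
#!/bin/bash
# List directories under "$1" that have no subdirectories,
# as paths relative to "$1" so two listings can be compared.
leaf_dirs() {
    (cd "$1" && find . -type d) | awk '
        # Remember every path, and mark its parent as "has a subdirectory".
        { all[$0] = 1; p = $0; if (sub(/\/[^\/]*$/, "", p)) parent[p] = 1 }
        # A path that was never marked as a parent is a leaf.
        END { for (d in all) if (!(d in parent)) print d }' | sort
}

# Print the leaf paths that occur under both "$1" and "$2".
# Each listing has unique lines, so a second sighting means "in both".
common_leaf_dirs() {
    { leaf_dirs "$1"; leaf_dirs "$2"; } | awk 'seen[$0]++'
}

# Example call: common_leaf_dirs idx1 idx2
```

The comparison uses whole lines ($0), so paths containing spaces are kept intact.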


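Edit2: This is what my current working copy looks like. There may still be bugs to fix. It takes two directories in the current path, finds identical path names in both, and removes those paths from the second directory.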

#!/bin/bash
echo 'Chosen directories must reside in the current directory.'
echo 'This will find duplicate sub directories between the two and delete the ones in the second path.'
echo ''
read -r -p 'First directory to compare: ' DIR1
read -r -p 'Second directory to compare: ' DIR2

# Use an absolute path for the list, so no "../.." bookkeeping is needed
# and the two directories may sit at different depths.
list="$PWD/list.txt"

# Record every directory path, relative to each chosen directory.
# The subshells keep the cd from affecting the rest of the script.
(cd "$DIR1" && find . -type d) > "$list"
(cd "$DIR2" && find . -type d) >> "$list"

# Print paths that appear in both listings, skipping "." and names ending in "db".
dupes() {
    awk 'seen[$1]++ {print $1}' "$list" | grep -v 'db$' | grep -v '\.$'
}

echo 'Paths found:'
echo ''
dupes
echo ''
# Double quotes so that ${DIR2} is expanded in the prompt.
read -r -p "Delete paths in ${DIR2}? (y/n) " bool
case "$bool" in
    y)
    echo 'deleting:'
    dupes
    # Reverse sort puts children before parents; rmdir only removes empty directories.
    (cd "$DIR2" && dupes | sort -r | xargs rmdir)
    echo ''
    ;;
esac

Answer 1

I would go for something like this:

  1. List the directories on the idx1 side: cd idx1/defaultdb; find . -type d > path/to/list.txt

  2. List the directories on the idx2 side: cd idx2/defaultdb; find . -type d >> path/to/list.txt

  3. Find the duplicates: awk 'seen[$1]++ {print $1}' path/to/list.txt

Notes:

  • This is only the general idea. It still needs polishing to become a script ;-)
  • Both find commands must actually write to the same file. Choose the paths accordingly.
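The three steps could be wrapped up roughly as follows. This is a sketch rather than a finished script: the function name find_duplicate_paths and the use of mktemp for the shared list file are choices made for this example, not part of the answer.

```shell
#!/bin/bash
# Hypothetical wrapper around the three steps.
# Arguments: the two directories to compare, e.g. idx1/defaultdb idx2/defaultdb.
find_duplicate_paths() {
    local list
    list=$(mktemp) || return 1
    # Steps 1 and 2: both listings are written to the same file.
    (cd "$1" && find . -type d) > "$list"
    (cd "$2" && find . -type d) >> "$list"
    # Step 3: print a path when it is seen for the second time.
    awk 'seen[$1]++ {print $1}' "$list"
    rm -f "$list"
}

# Example call: find_duplicate_paths idx1/defaultdb idx2/defaultdb
```

Because each find starts at ".", the line "." always shows up as a duplicate and usually needs filtering out. The awk program compares only the first whitespace-separated field ($1), so a path containing spaces would be cut short; comparing $0 instead avoids that.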
