How do I print the differences between two text files using a shell script?

Answer 1

Yes, you can use sed to strip the prefix before piping the result to diff.

$ diff file1 file2
1,3c1,3
< http://google.com/search
< http://www.google.com
< http://example.com
---
> google.com/search
> google.com
> example.com

$ < file1 sed 's|https\{0,1\}://||g' | diff - file2
2c2
< www.google.com
---
> google.com
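
A minimal sketch of the same idea as a reusable POSIX sh function, assuming that either file may carry a scheme and that mktemp is available. The names strip_scheme and url_diff are hypothetical, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helpers: normalise both files the same way, then diff them.

# Remove a leading "http://" or "https://" from every line of the given file.
strip_scheme() {
    sed 's|^https\{0,1\}://||' "$1"
}

# Diff two URL lists while ignoring the scheme in both of them.
# Returns diff's exit status (0 = same, 1 = different).
url_diff() {
    tmp=$(mktemp) || return 2
    strip_scheme "$2" > "$tmp"
    strip_scheme "$1" | diff - "$tmp"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Called as url_diff file1 file2 on the sample files, only the line that differs by its www. prefix should be reported.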

Answer 2

Take the following two files as an example.

$ cat file1
http://google.com
example.com
http://foobar.org

$ cat file2
example.com
google.com
foobar.org
unique.url

You want a tool such as sed to delete everything up to and including the last // on each line. diff also needs both inputs in the same order, so sort them first. Combining the two gives:

$ diff <(sed 's#.*//##' file1 | sort) <(sort file2) 
3a4
> unique.url

Or use comm -3:

$ comm -3 <(sed 's#.*//##' file1 | sort) <(sort file2) 
    unique.url

The leading whitespace can then be removed with sed again:

$ comm -3 <(sed 's#.*//##' file1 | sort) <(sort file2) | sed 's/^\s*//'
unique.url
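
If only one direction is needed, comm can suppress two columns at once, so there is no leading tab to strip. A sketch under the same assumptions as the commands above (the first file carries the prefix, the second does not); only_in_second is a hypothetical name and mktemp is assumed to be available:

```shell
#!/bin/sh
# Hypothetical helper: print the lines that appear only in the second file.
# comm -13 hides column 1 (only in first input) and column 3 (in both),
# so the remaining column 2 is printed without indentation.
only_in_second() {
    tmp=$(mktemp) || return 1
    sort "$2" > "$tmp"
    sed 's#.*//##' "$1" | sort | comm -13 - "$tmp"
    rm -f "$tmp"
}
```

Using comm -23 instead would list the lines that appear only in the first file.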

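Another approach is to strip the http:// prefix from both files, print what remains, and pass it through sort and uniq -u so that only unique lines are printed. A URL that appears in both files is not unique, so only URLs that appear in just one of the files are printed.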

$ sed 's#.*//##' file1 file2 | sort | uniq -u
unique.url
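
One caveat with uniq -u: a URL that is repeated inside a single file is also dropped, even if the other file does not contain it. A sketch that guards against this by de-duplicating each file first (only_in_one is a hypothetical name):

```shell
#!/bin/sh
# Hypothetical helper: print URLs that occur in exactly one of the two files.
# Each file is de-duplicated on its own (sort -u) before the lists are merged,
# so a URL can only appear twice in the merged list if both files contain it.
only_in_one() {
    {
        sed 's#.*//##' "$1" | sort -u
        sed 's#.*//##' "$2" | sort -u
    } | sort | uniq -u
}
```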

Answer 3

Yes, you can skip that part (e.g. http://) by using it as the awk field separator.

From the man page:

NAME
       awk - pattern scanning and processing language

-F fs
       --field-separator fs
              Use fs for the input field separator (the value of the FS predefined variable).

Example:

$ cat file1
http://google.com
http://gnu.org
http://fsf.org
http://linux.stackexchange.com

$ cat file2
google.com
gnu.org
fsf.org
unix.stackexchange.com

$ cat file1 | awk -F "http://" '{print $2}'
google.com
gnu.org
fsf.org
linux.stackexchange.com

$ cat file1 | awk -F "http://" '{print $2}' | diff - file2
4c4
< linux.stackexchange.com
---
> unix.stackexchange.com

Notes:

You can also use -F "://" for the general case, which covers both http:// and https://.
The files should be sorted before they are compared.
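
Printing $2 yields an empty line for any input line that has no http:// prefix. A small sketch that tolerates mixed input by splitting on "://" and printing the last field; strip_any_scheme is a hypothetical name, and a URL that contains "://" again later (for example inside a query string) would be cut too far:

```shell
#!/bin/sh
# Hypothetical helper: drop any "scheme://" prefix, leave other lines untouched.
# With -F '://' a line without a scheme has a single field, so $NF is the
# whole line; a line with a scheme has two fields and $NF is the remainder.
strip_any_scheme() {
    awk -F '://' '{ print $NF }' "$@"
}

# Possible use (file2 is assumed to be sorted already):
#   strip_any_scheme file1 | sort | diff - file2
```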

Answer 4

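For those interested, here is a script that compares the URLs in two or more files, as long as the URLs include a protocol (e.g. https://), regardless of where a URL appears in the file (it does not have to be at the start of a line).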

#!/usr/bin/env bash

# uniql.sh

# UNIQue Links - a "uniq" and "diff" utilities-like .sh 
# script (bash, zsh compatible) that can find 
# different/common URL web links in one file compared to 
# a group of files

# Please note that: for simplicity, in this script, only 
# URLs containing "://" are taken into consideration, 
# although there can be URLs that do not contain it 
# (such as "mailto:" links)

GetOS () {
    
    OS_kernel_name=$(uname -s)
    
    case "$OS_kernel_name" in
        "Linux")
            eval $1="Linux"
        ;;
        "Darwin")
            eval $1="Mac"
        ;;
        "CYGWIN"*|"MSYS"*|"MINGW"*)
            eval $1="Windows"
        ;;
        "")
            eval $1="unknown"
        ;;
        *)
            eval $1="other"
        ;;
    esac
    
}

DetectShell () {
    eval $1=\"\";
    if [ -n "$BASH_VERSION" ]; then
        eval $1=\"bash\";
    elif [ -n "$ZSH_VERSION" ]; then
        eval $1=\"zsh\";
    else
        eval $1=\"undetermined\";
    fi
}

PrintInTitle () {
    printf "\033]0;%s\007" "$1"
}

PrintJustInTitle () {
    PrintInTitle "$1">$where_to_print
}

trap1 () {
    CleanUp
    printf "\nAborted.\n">$where_to_print
    #Stop the script after an interrupt instead of continuing:
    exit 1
}

CleanUp () {
    
    #Restore "INTERRUPT" (CTRL-C) and "TERMINAL STOP" (CTRL-Z) signals:
    trap - INT
    trap - TSTP
    
    #Clear the title:
    printf "\033]0;%s\007" "">/dev/tty
    
    #Restore initial IFS:
    #IFS=$old_IFS
    unset IFS
}

ProcFindLinks () {

    if [ -z "$count" ]; then
        #For files_1 and files_2: initialise next variables
        count="0"
        mask="00000000000000000000"
    fi
    
    #extract number from the end of files_N:
    file_index="$1"
    file_index="${file_index##*"_"}"
    
    #index_amplifier is "1" for the "--subshell1" and "--subshell2" flags, "2" for the "--subshell3" and "--subshell4" flags, and "1" otherwise
    if [ "$file_index" = "1" ]; then
        #files_1:
        title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
        file_group_index="1"
    else
        #files_2..N
        title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
        file_group_index="2"
    fi
    
    for link in $(\
        eval cat "\$$1" |\
        eval "$sed_command1" |\
        eval "$sed_command2"\
    ); do
        count_prev=$count
        count=$((count+1))
        if [ "${#count_prev}" -lt "${#count}" ]; then
            mask="${mask%?}"
        fi
        number="$mask$count"
        printf '%s\n' "$file_group_index $number $link"
        printf "\033]0;%s\007" "Links found [$title_index]: $((count))...">$where_to_print
    done;
    #Prepare initialisation for files_2:
    if [ "$1" = "files_1" ]; then count=""; fi
}

ProcFindLinksShowLineNumbers () {
    
    if [ -z "$count" ]; then
        #For files_1 and files_2: initialise next variables:
        count=0
    fi
    
    #extract number from the end of files_N:
    file_index="$1"
    file_index="${file_index##*"_"}"
    
    if [ "$file_index" = "1" ]; then
        #files_1:
        file_group_index="1"
    else
        #files_2..N
        file_group_index="2"
    fi
    
    for link in $(\
        eval cat "\$$1" |\
        eval "$sed_command1" |\
        eval "$sed_command2"\
    ); do
        count=$((count+1))
        number="$count"
        printf '%s\n' "File: $file_group_index Link: $number"
        printf '%s\n' "$link"
        printf "\033]0;%s\007" "Links found [$file_group_index]: $((count))...">$where_to_print
    done
    #Prepare initialisation for files_2:
    if [ "$1" = "files_1" ]; then count=""; fi
}

DisplayHelp () {
    printf "\n"
    printf "uniql - UNIQue URL web Links - find different/common URL web links in one file compared to a group of files and vice versa\n"
    printf "\n"
    printf "    What it does:\n"
    printf "        - compares the URL web links in the files provided as parameters - <file1>, <file2>, ..., <fileN> - and shows the different/common web links in <file1> compared to the group of files <file2>, ..., <fileN> and vice versa (see flags below):\n"
    printf "    Syntax:\n"
    printf "        <caller_shell> '/path/to/uniql.sh' <file1> <file2> ... <fileN> [flags]\n"
    printf "        - where:\n"
    printf "            - <caller_shell> can be any of the shells: bash, zsh\n"
    printf "            - '/path/to/uniql.sh' represents the path of this script\n"
    printf "            - <file1>, <file2>, ..., <fileN> represent the files to be compared: the web links in <file1> are compared with all the web links in <file2>, ... <fileN> and vice versa\n"
    printf "            - [flags] can be:\n"
    printf "                --help or -h\n"
    printf "                    Displays this help information\n"
    printf "                --common or -c\n"
    printf "                    - compare the URL web links in the files provided as parameters - in the group of files: <file1>, <file2>, ..., <fileN> - and show the common web links that are found in all files\n"
    printf "                --different or -d\n"
    printf "                    - compare the URL web links in the files provided as parameters - <file1> and the group of files: <file2>, ..., <fileN> - and show the missing web links that are found in <file1> but not in the group <file2>, ..., <fileN>, and vice versa\n"
    printf "                --preserve-order or -p\n"
    printf "                    - preserve the order and the occurrences in which the links appear in files 1..N in this order (where file 1 is the first file given as parameter and file N is the last file given as parameter)\n"
    printf "                --domains\n"
    printf "                    - compare and print only the URL domains (plus subdomains) not the full URLs\n"
    printf "                --domains-full\n"
    printf "                    - compare only the URL domains (plus subdomains) but print the full URLs\n"
    printf "    Output:\n"
    printf "        For the \"-d\" flag:\n"
    printf "            - lines starting with '<' signify web links from <file1>\n"
    printf "            - lines starting with '>' signify web links from <file2>, ..., <fileN>\n"
    printf "    Notes:\n"
    printf "               - for simplicity, in this script, only URLs containing \"://\" are taken into consideration, although there can be URLs that do not contain it (such as \"mailto:\" links)\n"
    printf "\n"
}

GetOS OS

#################################################################################
##    Uncomment the next line if your OS is not Linux or Mac (and if needed,   ##
##    modify the commands used (sed, sort, uniq) according to your system):    ##
#################################################################################
#OS="userdefined"

DetectShell current_shell

if [ "$current_shell" = "bash" ]; then
    current_script_path="${BASH_SOURCE}"
elif [ "$current_shell" = "zsh" ]; then
    current_script_path="${(%):-%N}"
elif [ "$current_shell" = "undetermined" ]; then
    printf "\nError: This script was designed to work with bash and zsh shells.\n\n">/dev/stderr
    CleanUp; exit 1
fi

#Get the program parameters into the array "params":
params_count=0
for i; do
    params_count=$((params_count+1))
    eval params_$params_count=\"\$i\"
done
params_0=$((params_count))

if [ "$params_0" = "0" ]; then #if no parameters are provided: display help
    DisplayHelp
    CleanUp && exit 0
fi


#Create a flags array. A flag denotes special parameters:
different_flag="0"
common_flag="0"
domains_flag="0"
domains_full_flag="0"
preserve_order_flag="0"
help_flag="0"
subshell_flag="0"
subshell1_flag="0"
subshell2_flag="0"
subshell3_flag="0"
subshell4_flag="0"
i=1;
j=0;
while [ "$i" -le "$((params_0))" ]; do
    eval params_i=\"\$\{params_$i\}\"
    case "${params_i}" in
        "--different" | "-d" )
            different_flag="1"
        ;;
        "--common" | "-c" )
            common_flag="1"
        ;;
        "--domains" )
            domains_flag="1"
        ;;
        "--domains-full" )
            domains_full_flag="1"
        ;;
        "--preserve-order" | "--preserve_order" | "-p" )
            preserve_order_flag="1"
        ;;
        "--help" | "-h" )
            help_flag="1"
        ;;
        "--subshell1" )
            subshell1_flag="1"
            subshell_flag="1"
        ;;
        "--subshell2" )
            subshell2_flag="1"
            subshell_flag="1"
        ;;
        "--subshell3" )
            subshell3_flag="1"
            subshell_flag="1"
        ;;
        "--subshell4" )
            subshell4_flag="1"
            subshell_flag="1"
        ;;
        * )
            j=$((j+1))
            eval selected_params_$j=\"\$params_i\"
        ;;
    esac
    
    i=$((i+1))
done

error="false"
if [ "$different_flag" = "0" -a "$common_flag" = "0" ]; then
    printf '%s\n' "ERROR: Expected either -c or -d flag!">/dev/stderr
    error="true"
elif [ "$common_flag" = "1" -a "$different_flag" = "1" ]; then
    printf '%s\n' "ERROR: Invalid combination of flags ( -d and -c )!">/dev/stderr
    error="true"
# If the --domains-full flag is set ("1"): the --preserve-order flag is set by default ("1") for both -c and -d flags;
# Otherwise, if the --domains-full flag is not set ("0"):
elif [ ! "$domains_full_flag" = "1" -a "$common_flag" = "1" -a "$preserve_order_flag" = "1" ]; then
    printf '%s\n' "ERROR: The -p flag cannot be used together with the -c flag!">/dev/stderr
    error="true"
fi
if [ "$domains_flag" = "1" -a "$domains_full_flag" = "1" ]; then
    printf '%s\n' "ERROR: Invalid combination of flags ( --domains and --domains-full )!">/dev/stderr
    error="true"
fi
if [ "$error" = "true" ]; then
    CleanUp; exit 1
fi

# When "$domains_full_flag" is "1":
#
# Currently: process substitution cannot be waited in bash, 
# and as a result multiple processes from process substitution
# would display at the same time in title:
# i.e. command1 + command2 or command1 + command3
#
# so - as a workaround:
# we display in the title the most time consuming of the processes
# from process substitution:
# i.e. command1 and command3

# WHEN PROCESS SUBSTITUTION RUNS:

# IN SUBSHELLS 1 and 3 (command 1 and 3):
if [ "$subshell1_flag" = "1" -o "$subshell3_flag" = "1" ]; then
    where_to_print="/dev/tty"
# IN SUBSHELLS 2 and 4 (commands 2 and 4):
elif [ "$subshell2_flag" = "1" -o "$subshell4_flag" = "1" ]; then
    where_to_print="/dev/null"
# OTHERWISE: NO SUBSHELL CONFLICT:
else
    where_to_print="/dev/tty"
fi

# IN SUBSHELLS 1 and 2 (commands 1 and 2):
if [ "$subshell1_flag" = "1" -o "$subshell2_flag" = "1" ]; then
    index_amplifier="1"
# IN SUBSHELLS 3 and 4 (commands 3 and 4):
elif [ "$subshell3_flag" = "1" -o "$subshell4_flag" = "1" ]; then
    index_amplifier="2"
# OTHERWISE: NO SUBSHELL CONFLICT:
else
    index_amplifier="1"
fi


selected_params_0=$j

#Rebuild params array:
for i in $(seq 1 $selected_params_0); do
    eval params_$i=\"\$\{selected_params_$i\}\"
done
params_0=$selected_params_0

if [ "$help_flag" = "1" ]; then
    DisplayHelp
else #Run program:
    
    #A literal newline character; sed_command1 puts a backslash in front of it,
    #which inserts a newline in a sed replacement:
    NL='
'
    
    error1="false"
    error2="false"
    error3="false"
    error4="false"
    error5="false"
    #"command -v" checks that a tool exists without relying on a "--help" option, which some implementations reject:
    command -v sed >/dev/null 2>&1 || error1="true"
    command -v sort >/dev/null 2>&1 || error2="true"
    command -v uniq >/dev/null 2>&1 || error3="true"
    command -v diff >/dev/null 2>&1 || error4="true"
    command -v grep >/dev/null 2>&1 || error5="true"
    if [ "$error1" = "true" -o "$error2" = "true" -o "$error3" = "true" -o "$error4" = "true" -o "$error5" = "true" ]; then
        {
            printf "\n"
            if [ "$error1" = "true" ]; then printf '%s\n' "ERROR: Could not run \"sed\" (necessary in order for this script to function correctly)!"; fi
            if [ "$error2" = "true" ]; then printf '%s\n' "ERROR: Could not run \"sort\" (necessary in order for this script to function correctly)"; fi
            if [ "$error3" = "true" ]; then printf '%s\n' "ERROR: Could not run \"uniq\" (necessary in order for this script to function correctly)"; fi
            if [ "$error4" = "true" ]; then printf '%s\n' "ERROR: Could not run \"diff\" (necessary in order for this script to function correctly)"; fi
            if [ "$error5" = "true" ]; then printf '%s\n' "ERROR: Could not run \"grep\" (necessary in order for this script to function correctly)"; fi
            printf "\n"
        }>/dev/stderr
        CleanUp; exit 1
    fi

    #Copy the parameters to the array "files":
    for i in $(seq 1 $selected_params_0); do
        eval files_$i=\"\$selected_params_$i\"
    done
    files_0=$selected_params_0
    
    error="false"
    if [ "$files_0" -lt "2" ]; then
        printf '\n%s\n' "ERROR: Please provide at least two parameters!">/dev/stderr
        error="true"
    fi
    
    if [ "$error" = "true" ]; then
        printf "\n"
        CleanUp; exit 1
    fi
    
    if [ "$OS" = "Linux" -o "$OS" = "Mac" -o "$OS" = "userdefined" ]; then
        
        #Commands to display common/different domains when --domains-full flag is not set:
        if [ "$domains_full_flag" = "0" ]; then
            # Command to add a NewLine character at the beginning of each URL found (an URL is identified by: "...://..."):
            #   command1: sed -E 's/([a-zA-Z]*\:\/\/)/\\${NL}\1/g'
            sed_command1='sed -E '"'"'s/([a-zA-Z]*\:\/\/)/'"\\${NL}"'\1/g'"'";
            
            # Command to strip anything beside the URL body ("...://...") = for each new line, remove anything after (including) the first SPACE/TAB/NEWLINE found:
            if [ "$domains_flag" = "0" ]; then
                # Do not strip the URL domain (keep the full URL):
                #   command2: sed -n 's/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^ \|^\t\|^\n]*\).*/\4\5\7/p'
                sed_command2='sed -n '"'"'s/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^ \|^\t\|^\n]*\).*/\4\5\7/p'"'"
            elif [ "$domains_flag" = "1" ]; then
                # Filter only URLS that contain "."
                # Strip anything beside the URL domain (the string until the first encountered SLASH/COLON/NEWLINE beside "://"):
                #   command2: sed -n 's/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^/\|^\:\|^\n]*\).*/\4\5\7/p'|sed '/\./!D'
                sed_command2='sed -n '"'"'s/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^/\|^\:\|^\n]*\).*/\4\5\7/p'"'"\|'sed '"'"'/\./!D'"'"
            fi
            
            #Command to remove URLs that contain "file:///" (URLs that contain the file protocol (file:///...) are automatically skipped):
            sed_command2="$sed_command2"\|'sed '"'"'/file\:\/\/\/./D'"'"
            
            # For each line: Command to remove the second column (containing a padded number which was used to preserve initial find order (after sorting)):
            if [ "$different_flag" = "1" ]; then
                #   command3: sed -E 's/(.) [0-9]* (.*)/\1 \2/g'
                sed_command3='sed -E '"'"'s/(.) [0-9]* (.*)/\1 \2/g'"'";
                #   command4: sed -E 's/^2/>/g;s/^1/</g'
                sed_command4='sed -E '"'"'s/^2/>/g;s/^1/</g'"'"
            elif [ "$common_flag" = "1" ]; then
                #   command3: sed -E 's/(.) [0-9]* (.*)/\2/g'
                sed_command3='sed -E '"'"'s/(.) [0-9]* (.*)/\2/g'"'";
            fi
        #Commands to display common/different domains when --domains-full flag is set:
        elif [ "$domains_full_flag" = "1" ]; then
            # Command to find common domains:
            #     command1: 
            uniql_command1="$current_shell $current_script_path -c --domains $(for i in $(seq 1 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"

            # Links that are only in first parameter file (file group 1):
            #     command2: 
            uniql_command2="$current_shell $current_script_path -d '$files_1' \"/dev/null\""
            
            #     command3: 
            uniql_command3="$current_shell $current_script_path -c --domains $(for i in $(seq 1 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"
            # Links that are only in 2..N parameter files (file group 2):
            #     command4: 
            uniql_command4="$current_shell $current_script_path -d \"/dev/null\" $(for i in $(seq 2 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"
        fi
                    
    else
        printf '\n%s\n\n' "Error: Unsupported OS!">/dev/stderr
        CleanUp; exit 1
    fi
    
    error="false"
    for i in $(seq 1 $files_0); do
        eval current_file=\"\$files_$i\"
        # If current file does not exist or is a directory:
        if [ ! -e "$current_file" -o -d "$current_file" ]; then
            printf '\n%s\n' "ERROR: File \"$current_file\" does not exist or is a directory!">/dev/stderr
            error="true"
        else
            # If current file is not readable:
            if [ ! -r "$current_file" ]; then
                printf '\n%s\n' "ERROR: File <file$i> = \"$current_file\" is not accessible!">/dev/stderr
                error="true"
            fi
        fi
    done
    
    if [ "$error" = "true" ]; then
        printf "\n"
        CleanUp; exit 1
    fi
    
    #Proceed to finding and comparing links:
    
    #Trap "INTERRUPT" (CTRL-C) and "TERMINAL STOP" (CTRL-Z) signals:
    trap 'trap1' INT
    trap 'trap1' TSTP
    
    old_IFS="$IFS" #Store initial IFS value
    IFS="
"
    
    if [ "$domains_full_flag" = "0" ]; then
        
        if [ "$different_flag" = "1" -a "$preserve_order_flag" = "0" ]; then
        {
            title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                ProcFindLinks "files_1"
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -u -k 3
            
            title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                for i in $(seq 2 $files_0); do
                    ProcFindLinks "files_$i"
                done
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -u -k 3
            
            PrintJustInTitle "Searching for unique links [$((title_index + 1))]..."
        }|{\
            sort -k 3|uniq -u -f 2|sort|eval "$sed_command3"|eval "$sed_command4"
            
            PrintJustInTitle "Done"
        }
        elif [ "$common_flag" = "1" -a "$preserve_order_flag" = "0" ]; then
        {
            title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                ProcFindLinks "files_1"
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -k 3|uniq -f 2
            
            title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                for i in $(seq 2 $files_0); do
                    ProcFindLinks "files_$i"
                done
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -k 3|uniq -f 2
            
            PrintJustInTitle "Searching for common links [$((title_index + 1))]..."
        }|{\
            sort -k 3|uniq -d -f 2|sort|eval "$sed_command3"
            
            PrintJustInTitle "Done"
        }
        elif [ "$different_flag" = "1" -a "$preserve_order_flag" = "1" ]; then
            PrintJustInTitle "Searching for links..."
            {
                link_count=0
                current_line=""
                for line in $(\
                        diff \
                        <( \
                            PrintJustInTitle "Searching for links [1]..."; \
                            ProcFindLinksShowLineNumbers "files_1"; \
                            printf '%s\n' "### Separator 1";\
                        ) \
                        <( \
                            PrintJustInTitle "Searching for links [2]..."; \
                            for i in $(seq 2 $files_0); do ProcFindLinksShowLineNumbers "files_$i"; done; \
                            printf '%s\n' "### Separator 2";\
                        ) \
                    ); do
                    link_count=$((link_count+1))
                    previous_line="$current_line"
                    current_line="$line"
                    #if ( current line starts with "<" and previous line starts with "<" ) OR ( current line starts with ">" and previous line starts with ">" ):
                    if [ \( \( ! "${current_line#"<"}" = "${current_line}" \) -a \( ! "${previous_line#"<"}" = "${previous_line}" \) \) -o \( \( ! "${current_line#">"}" = "${current_line}" \) -a  \( ! "${previous_line#">"}" = "${previous_line}" \) \) ]; then
                        printf '%s\n' "$previous_line"
                    fi
                done
            }

        fi
    elif [ "$domains_full_flag" = "1" ]; then
        
        if [ "$different_flag" = "1" ]; then
        # Find links (second eval) that are not in the common domains list (first eval):
            # Links in the first file given as parameter (second eval):
            grep -F -vf <( eval $uniql_command1 --subshell1; ) <( eval $uniql_command2 --subshell2; )
            # Links in the files 2..N - given as parameters (second eval):
            grep -F -vf <( eval $uniql_command3 --subshell3; ) <( eval $uniql_command4 --subshell4; )
        elif [ "$common_flag" = "1" ]; then
        # Find links (second eval) that are in the common domains list (first eval):
            # Links in the first file given as parameter (second eval):
            grep -F -f <( eval $uniql_command1 --subshell1; ) <( eval $uniql_command2 --subshell2; )
            # Links in the files 2..N - given as parameters (second eval):
            grep -F -f <( eval $uniql_command3 --subshell3; ) <( eval $uniql_command4 --subshell4; )
        fi
        # grep flags explained:
        #    -F = do not interpret pattern string (treat string literally)
        #    -v = select non-matching lines
        #    -f = obtain pattern strings from next file - each string on a new line (containing site domains in this case)
    fi
    
    CleanUp
fi
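
The core idea of the script, reduced to a short sketch: pull every scheme://... token out of the text, de-duplicate, and compare the two sets. This is not the author's implementation. The names extract_urls and compare_urls are hypothetical, grep -o (supported by GNU and BSD grep, not required by POSIX) and mktemp are assumed to be available, and trailing punctuation next to a URL is not trimmed.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct "scheme://..." token found
# anywhere in the given files, one per line, sorted.
extract_urls() {
    cat "$@" | grep -o -E '[A-Za-z][A-Za-z0-9+.-]*://[^[:space:]]+' | sort -u
}

# Hypothetical helper: compare the URLs of the first file with those of
# all remaining files.
#   "< url"  = found only in the first file
#   "> url"  = found only in the remaining files
compare_urls() {
    first=$1
    shift
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    extract_urls "$first" > "$a"
    extract_urls "$@" > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}
```

For example, compare_urls notes1.txt notes2.txt notes3.txt (hypothetical file names) lists the links that notes1.txt does not share with the other two, marked in the same < and > style that the script uses for its -d flag.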

Answer

관심 있는 분들을 위해 다음은 URL이 파일의 어디에 있는지(반드시 줄의 시작 부분에 있을 필요는 없음)에 관계없이 프로토콜(예: https://)을 포함하는 두 개 이상의 파일에서 URL을 비교하는 스크립트입니다.

#!/bin/sh

# uniql.sh

# UNIQue Links - a "uniq" and "diff" utilities-like .sh 
# script (bash, zsh compatible) that can find 
# different/common URL web links in one file compared to 
# a group of files

# Please note that: for simplicity, in this script, only 
# URLs containing "://" are taken into consideration, 
# although there can be URLs that do not contain it 
# (such as mailto:[email protected])

GetOS () {
    
    OS_kernel_name=$(uname -s)
    
    case "$OS_kernel_name" in
        "Linux")
            eval $1="Linux"
        ;;
        "Darwin")
            eval $1="Mac"
        ;;
        "CYGWIN"*|"MSYS"*|"MINGW"*)
            eval $1="Windows"
        ;;
        "")
            eval $1="unknown"
        ;;
        *)
            eval $1="other"
        ;;
    esac
    
}

DetectShell () {
    eval $1=\"\";
    if [ -n "$BASH_VERSION" ]; then
        eval $1=\"bash\";
    elif [ -n "$ZSH_VERSION" ]; then
        eval $1=\"zsh\";
    else
        eval $1=\"undetermined\";
    fi
}

PrintInTitle () {
    printf "\033]0;%s\007" "$1"
}

PrintJustInTitle () {
    PrintInTitle "$1">$where_to_print
}

trap1 () {
    CleanUp
    printf "\nAborted.\n">$where_to_print
}

CleanUp () {
    
    #Restore "INTERRUPT" (CTRL-C) and "TERMINAL STOP" (CTRL-Z) signals:
    trap - INT
    trap - TSTP
    
    #Clear the title:
    printf "\033]0;%s\007" "">/dev/tty
    
    #Restore initial IFS:
    #IFS=$old_IFS
    unset IFS
}

ProcFindLinks () {

    if [ -z "$count" ]; then
        #For files_1 and files_2: initialise next variables
        count="0"
        mask="00000000000000000000"
    fi
    
    #extract number from the end of files_N:
    file_index="$1"
    file_index="${file_index##*"_"}"
    
    #index_amplifier is "1" for "--subshell1" and "--subshell3" flags or "2" for "--subshell2" and "--subshell4" flags, and otherwise "1"
    if [ "$file_index" = "1" ]; then
        #files_1:
        title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
        file_group_index="1"
    else
        #files_2..N
        title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
        file_group_index="2"
    fi
    
    for link in $(\
        eval cat "\$$1" |\
        eval "$sed_command1" |\
        eval "$sed_command2"\
    ); do
        count_prev=$count
        count=$((count+1))
        if [ "${#count_prev}" -lt "${#count}" ]; then
            mask="${mask%?}"
        fi
        number="$mask$count"
        printf '%s\n' "$file_group_index $number $link"
        printf "\033]0;%s\007" "Links found [$title_index]: $((count))...">$where_to_print
    done;
    #Prepare initialisation for files_2:
    if [ "$1" = "files_1" ]; then count=""; fi
}

ProcFindLinksShowLineNumbers () {
    
    if [ -z "$count" ]; then
        #For files_1 and files_2: initialise next variables:
        count=0
    fi
    
    #extract number from the end of files_N:
    file_index="$1"
    file_index="${file_index##*"_"}"
    
    if [ "$file_index" = "1" ]; then
        #files_1:
        file_group_index="1"
    else
        #files_2..N
        file_group_index="2"
    fi
    
    for link in $(\
        eval cat "\$$1" |\
        eval "$sed_command1" |\
        eval "$sed_command2"\
    ); do
        count=$((count+1))
        number="$count"
        printf '%s\n' "File: $file_group_index Link: $number"
        printf '%s\n' "$link"
        printf "\033]0;%s\007" "Links found [$file_group_index]: $((count))...">$where_to_print
    done
    #Prepare initialisation for files_2:
    if [ "$1" = "files_1" ]; then count=""; fi
}

DisplayHelp () {
    printf "\n"
    printf "uniql - UNIQue URL web Links - find different/common URL web links in one file compared to a group of files and viceversa\n"
    printf "\n"
    printf "    What it does:\n"
    printf "        - compares the URL web links in the files provided as parameters - <file1>, <file2>, ..., <fileN> - and shows the different/common web links in <file1> compared to the group of files <file2>, ..., <fileN> and viceversa (see flags below):\n"
    printf "    Syntax:\n"
    printf "        <caller_shell> '/path/to/uniql.sh' <file1> <file2> ... <fileN> [flags]\n"
    printf "        - where:\n"
    printf "            - <caller_shell> can be any of the shells: bash, zsh\n"
    printf "            - '/path/to/uniql.sh' represents the path of this script\n"
    printf "            - <file1>, <file2>, ..., <fileN> represent the files to be compared: the web links in <file1> are compared with all the web links in <file2>, ... <fileN> and viceversa\n"
    printf "            - [flags] can be:\n"
    printf "                --help or -h\n"
    printf "                    Displays this help information\n"
    printf "                --common or -c\n"
    printf "                    - compare the URL web links in the files provided as parameters - in the group of files: <file1>, <file2>, ..., <fileN> - and show the common web links that are found in all files\n"
    printf "                --different or -d\n"
    printf "                    - compare the URL web links in the files provided as parameters - <file1> and the group of files: <file2>, ..., <fileN> - and show the missing web links that are found in <file1> but not in the group <file2>, ..., <fileN>, and viceversa\n"
    printf "                --preserve-order or -p\n"
    printf "                    - preserve the order and the occurences in which the links appear in files 1..N in this order (where file 1 is the first file given as parameter and file N is the last file given as parameter)\n"
    printf "                --domains\n"
    printf "                    - compare and print only the URL domains (plus subdomains) not the full URLs\n"
    printf "                --domains-full\n"
    printf "                    - compare only the URL domains (plus subdomains) but print the full URLs\n"
    printf "    Output:\n"
    printf "        For the \"-d\" flag:\n"
    printf "            - lines starting with '<' signify web links from <file1>\n"
    printf "            - lines starting with '>' signify web links from <file2>, ..., <fileN>\n"
    printf "    Notes:\n"
    printf "        - for simplicity, in this script, only URLs containing \"://\" are taken into consideration, although there can be URLs that do not contain it (such as mailto:[email protected])\n"
    printf "\n"
}

GetOS OS

#################################################################################
##    Uncomment the next line if your OS is not Linux or Mac (and if needed,   ##
##    modify the commands used (sed, sort, uniq) according to your system):    ##
#################################################################################
#OS="userdefined"

DetectShell current_shell

if [ "$current_shell" = "bash" ]; then
    current_script_path="${BASH_SOURCE}"
elif [ "$current_shell" = "zsh" ]; then
    current_script_path="${(%):-%N}"
elif [ "$current_shell" = "undetermined" ]; then
    # "\n" was mistyped as "\E"; where_to_print is only assigned further below, so print to stderr here:
    printf "\nError: This script was designed to work with bash and zsh shells.\n\n">/dev/stderr
    CleanUp; exit 1
fi

#Get the program parameters into the array "params":
params_count=0
for i; do
    params_count=$((params_count+1))
    eval params_$params_count=\"\$i\"
done
params_0=$((params_count))

if [ "$params_0" = "0" ]; then #if no parameters are provided: display help
    DisplayHelp
    CleanUp; exit 0
fi


#Create a flags array. A flag denotes special parameters:
different_flag="0"
common_flag="0"
domains_flag="0"
domains_full_flag="0"
preserve_order_flag="0"
help_flag="0"
subshell_flag="0"
subshell1_flag="0"
subshell2_flag="0"
subshell3_flag="0"
subshell4_flag="0"
i=1;
j=0;
while [ "$i" -le "$((params_0))" ]; do
    eval params_i=\"\$\{params_$i\}\"
    case "${params_i}" in
        "--different" | "-d" )
            different_flag="1"
        ;;
        "--common" | "-c" )
            common_flag="1"
        ;;
        "--domains" )
            domains_flag="1"
        ;;
        "--domains-full" )
            domains_full_flag="1"
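        ;;
        "--preserve-order" | "--preserve_order" | "-p" ) # the help text documents "--preserve-order"; the old spelling is kept as an alias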
            preserve_order_flag="1"
        ;;
        "--help" | "-h" )
            help_flag="1"
        ;;
        "--subshell1" )
            subshell1_flag="1"
            subshell_flag="1"
        ;;
        "--subshell2" )
            subshell2_flag="1"
            subshell_flag="1"
        ;;
        "--subshell3" )
            subshell3_flag="1"
            subshell_flag="1"
        ;;
        "--subshell4" )
            subshell4_flag="1"
            subshell_flag="1"
        ;;
        * )
            j=$((j+1))
            eval selected_params_$j=\"\$params_i\"
        ;;
    esac
    
    i=$((i+1))
done

error="false"
# Skip the -c/-d requirement when only help was requested, so that "--help" alone works:
if [ "$help_flag" = "0" -a "$different_flag" = "0" -a "$common_flag" = "0" ]; then
    printf '%s\n' "ERROR: Expected either -c or -d flag!">/dev/stderr
    error="true"
elif [ "$common_flag" = "1" -a "$different_flag" = "1" ]; then
    printf '%s\n' "ERROR: Invalid combination of flags ( -d and -c )!">/dev/stderr
    error="true"
# If the --domains-full flag is set ("1"): the --preserve-order flag is set by default ("1") for both -c and -d flags;
# Otherwise, if the --domains-full flag is not set ("0"):
elif [ ! "$domains_full_flag" = "1" -a "$common_flag" = "1" -a "$preserve_order_flag" = "1" ]; then
    printf '%s\n' "ERROR: The -p flag cannot be used together with the -c flag!">/dev/stderr
    error="true"
fi
if [ "$domains_flag" = "1" -a "$domains_full_flag" = "1" ]; then
    printf '%s\n' "ERROR: Invalid combination of flags ( --domains and --domains-full )!">/dev/stderr
    error="true"
fi
if [ "$error" = "true" ]; then
    CleanUp; exit 1
fi

# When "$domains_full_flag" is "1":
#
# Currently: process substitution cannot be waited on in bash,
# and as a result multiple processes from process substitution
# would display at the same time in the title:
# i.e. command1 + command2 or command1 + command3
#
# so - as a workaround:
# we display in the title the most time-consuming of the processes
# from process substitution:
# i.e. command1 and command3

# WHEN PROCESS SUBSTITUTION RUNS:

# IN SUBSHELLS 1 and 3 (command 1 and 3):
if [ "$subshell1_flag" = "1" -o "$subshell3_flag" = "1" ]; then
    where_to_print="/dev/tty"
# IN SUBSHELLS 2 and 4 (commands 2 and 4):
elif [ "$subshell2_flag" = "1" -o "$subshell4_flag" = "1" ]; then
    where_to_print="/dev/null"
# OTHERWISE: NO SUBSHELL CONFLICT:
else
    where_to_print="/dev/tty"
fi

# IN SUBSHELLS 1 and 2 (commands 1 and 2):
if [ "$subshell1_flag" = "1" -o "$subshell2_flag" = "1" ]; then
    index_amplifier="1"
# IN SUBSHELLS 3 and 4 (commands 3 and 4):
elif [ "$subshell3_flag" = "1" -o "$subshell4_flag" = "1" ]; then
    index_amplifier="2"
# OTHERWISE: NO SUBSHELL CONFLICT:
else
    index_amplifier="1"
fi


selected_params_0=$j

#Rebuild params array:
for i in $(seq 1 $selected_params_0); do
    eval params_$i=\"\$\{selected_params_$i\}\"
done
params_0=$selected_params_0

if [ "$help_flag" = "1" ]; then
    DisplayHelp
else #Run program:
    
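    # A literal newline (printf '%s' "\n\n" would store the characters \n\n unexpanded);
    # sed_command1 puts a backslash in front of it so that sed inserts a line break:
    NL=$'\n'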
    
    error1="false"
    error2="false"
    error3="false"
    error4="false"
    error5="false"
    # "command -v" only checks that the tool can be found; unlike "--help" it does not fail on tools without GNU-style options:
    { command -v sed >/dev/null 2>/dev/null; } || { error1="true"; }
    { command -v sort >/dev/null 2>/dev/null; } || { error2="true"; }
    { command -v uniq >/dev/null 2>/dev/null; } || { error3="true"; }
    { command -v diff >/dev/null 2>/dev/null; } || { error4="true"; }
    { command -v grep >/dev/null 2>/dev/null; } || { error5="true"; }
    if [ "$error1" = "true" -o "$error2" = "true" -o "$error3" = "true" -o "$error4" = "true" -o "$error5" = "true" ]; then
        {
            printf "\n"
            if [ "$error1" = "true" ]; then printf '%s\n' "ERROR: Could not run \"sed\" (necessary in order for this script to function correctly)!"; fi
            if [ "$error2" = "true" ]; then printf '%s\n' "ERROR: Could not run \"sort\" (necessary in order for this script to function correctly)!"; fi
            if [ "$error3" = "true" ]; then printf '%s\n' "ERROR: Could not run \"uniq\" (necessary in order for this script to function correctly)!"; fi
            if [ "$error4" = "true" ]; then printf '%s\n' "ERROR: Could not run \"diff\" (necessary in order for this script to function correctly)!"; fi
            if [ "$error5" = "true" ]; then printf '%s\n' "ERROR: Could not run \"grep\" (necessary in order for this script to function correctly)!"; fi
            printf "\n"
        }>/dev/stderr
        CleanUp; exit 1
    fi

    #Copy the parameters to the array "files":
    for i in $(seq 1 $selected_params_0); do
        eval files_$i=\"\$selected_params_$i\"
    done
    files_0=$selected_params_0
    
    error="false"
    if [ "$files_0" -lt "2" ]; then
        printf '\n%s\n' "ERROR: Please provide at least two parameters!">/dev/stderr
        error="true"
    fi
    
    if [ "$error" = "true" ]; then
        printf "\n"
        CleanUp; exit 1
    fi
    
    if [ "$OS" = "Linux" -o "$OS" = "Mac" -o "$OS" = "userdefined" ]; then
        
        #Commands to display common/different domains when --domains-full flag is not set:
        if [ "$domains_full_flag" = "0" ]; then
            # Command to add a NewLine character at the beginning of each URL found (a URL is identified by: "...://..."):
            #   command1: sed -E 's/([a-zA-Z]*\:\/\/)/\\${NL}\1/g'
            sed_command1='sed -E '"'"'s/([a-zA-Z]*\:\/\/)/'"\\${NL}"'\1/g'"'";
            
            # Command to strip anything besides the URL body ("...://...") = for each new line, remove anything after (and including) the first SPACE/TAB/NEWLINE found:
            if [ "$domains_flag" = "0" ]; then
                # Do not strip the URL domain (keep the full URL):
                #   command2: sed -n 's/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^ \|^\t\|^\n]*\).*/\4\5\7/p'
                sed_command2='sed -n '"'"'s/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^ \|^\t\|^\n]*\).*/\4\5\7/p'"'"
            elif [ "$domains_flag" = "1" ]; then
                # Filter only URLS that contain "."
                # Strip anything besides the URL domain (the string up to the first SLASH/COLON/NEWLINE encountered after "://"):
                #   command2: sed -n 's/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^/\|^\:\|^\n]*\).*/\4\5\7/p'|sed '/\./!D'
                sed_command2='sed -n '"'"'s/\(\(.*\([^a-zA-Z+]\)\|\([a-zA-Z]\)\)\)\(\([a-zA-Z]\)*\:\/\/\)\([^/\|^\:\|^\n]*\).*/\4\5\7/p'"'"\|'sed '"'"'/\./!D'"'"
            fi
            
            #Command to remove URLs that contain "file:///" (URLs that contain the file protocol (file:///...) are automatically skipped):
            sed_command2="$sed_command2"\|'sed '"'"'/file\:\/\/\/./D'"'"
            
            # For each line: Command to remove the second column (containing a padded number which was used to preserve initial find order (after sorting)):
            if [ "$different_flag" = "1" ]; then
                #   command3: sed -E 's/(.) [0-9]* (.*)/\1 \2/g'
                sed_command3='sed -E '"'"'s/(.) [0-9]* (.*)/\1 \2/g'"'";
                #   command4: sed -E 's/^2/>/g;s/^1/</g'
                sed_command4='sed -E '"'"'s/^2/>/g;s/^1/</g'"'"
            elif [ "$common_flag" = "1" ]; then
                #   command3: sed -E 's/(.) [0-9]* (.*)/\2/g'
                sed_command3='sed -E '"'"'s/(.) [0-9]* (.*)/\2/g'"'";
            fi
        #Commands to display common/different domains when --domains-full flag is set:
        elif [ "$domains_full_flag" = "1" ]; then
            # Command to find common domains:
            #     command1: 
            # ("seq" has to run inside $(...): a bare "for i in seq 1 N" iterates over the words "seq", "1" and "N")
            uniql_command1="$current_shell $current_script_path -c --domains $(for i in $(seq 1 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"

            # Links that are only in first parameter file (file group 1):
            #     command2: 
            uniql_command2="$current_shell $current_script_path -d '$files_1' \"/dev/null\""
            
            #     command3: 
            uniql_command3="$current_shell $current_script_path -c --domains $(for i in $(seq 1 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"
            # Links that are only in 2..N parameter files (file group 2):
            #     command4: 
            uniql_command4="$current_shell $current_script_path -d \"/dev/null\" $(for i in $(seq 2 $files_0); do eval printf \'%s \' \'\$files_$i\'; done)"
        fi
                    
    else
        printf '\n%s\n\n' "Error: Unsupported OS!">/dev/stderr
        CleanUp; exit 1
    fi
    
    error="false"
    for i in $(seq 1 $files_0); do
        eval current_file=\"\$files_$i\"
        # If current file does not exist or is a directory:
        if [ ! -e "$current_file" -o -d "$current_file" ]; then
            printf '\n%s\n' "ERROR: File \"$current_file\" does not exist or is a directory!">/dev/stderr
            error="true"
        else
            # If current file is not readable:
            if [ ! -r "$current_file" ]; then
                printf '\n%s\n' "ERROR: File <file$i> = \"$current_file\" is not readable!">/dev/stderr
                error="true"
            fi
        fi
    done
    
    if [ "$error" = "true" ]; then
        printf "\n"
        CleanUp; exit 1
    fi
    
    #Proceed to finding and comparing links:
    
    #Trap "INTERRUPT" (CTRL-C) and "TERMINAL STOP" (CTRL-Z) signals:
    trap 'trap1' INT
    trap 'trap1' TSTP
    
    old_IFS="$IFS" #Store initial IFS value
    IFS="
"
    
    if [ "$domains_full_flag" = "0" ]; then
        
        if [ "$different_flag" = "1" -a "$preserve_order_flag" = "0" ]; then
        {
            title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                ProcFindLinks "files_1"
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -u -k 3
            
            title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                for i in $(seq 2 $files_0); do
                    ProcFindLinks "files_$i"
                done
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -u -k 3
            
            PrintJustInTitle "Searching for unique links [$((title_index + 1))]..."
        }|{\
            sort -k 3|uniq -u -f 2|sort|eval "$sed_command3"|eval "$sed_command4"
            
            PrintJustInTitle "Done"
        }
        elif [ "$common_flag" = "1" -a "$preserve_order_flag" = "0" ]; then
        {
            title_index="$(( 1 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                ProcFindLinks "files_1"
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -k 3|uniq -f 2
            
            title_index="$(( 2 + ( $subshell_flag * ( $index_amplifier - 1 ) ) * 2 ))"
            PrintJustInTitle "Searching for links [$title_index]..."
            {
                for i in $(seq 2 $files_0); do
                    ProcFindLinks "files_$i"
                done
                
                PrintJustInTitle "Sorting results [$title_index]..."
            }|sort -k 3|uniq -f 2
            
            PrintJustInTitle "Searching for common links [$((title_index + 1))]..."
        }|{\
            sort -k 3|uniq -d -f 2|sort|eval "$sed_command3"
            
            PrintJustInTitle "Done"
        }
        elif [ "$different_flag" = "1" -a "$preserve_order_flag" = "1" ]; then
            PrintJustInTitle "Searching for links..."
            {
                link_count=0
                current_line=""
                for line in $(\
                        diff \
                        <( \
                            PrintJustInTitle "Searching for links [1]..."; \
                            ProcFindLinksShowLineNumbers "files_1"; \
                            printf '%s\n' "### Separator 1";\
                        ) \
                        <( \
                            PrintJustInTitle "Searching for links [2]..."; \
                            for i in $(seq 2 $files_0); do ProcFindLinksShowLineNumbers "files_$i"; done; \
                            printf '%s\n' "### Separator 2";\
                        ) \
                    ); do
                    link_count=$((link_count + 1)) # "+=" would append the character "1" instead of adding
                    previous_line="$current_line"
                    current_line="$line"
                    #if ( current line starts with "<" and previous line starts with "<" ) OR ( current line starts with ">" and previous line starts with ">" ):
                    if [ \( \( ! "${current_line#"<"}" = "${current_line}" \) -a \( ! "${previous_line#"<"}" = "${previous_line}" \) \) -o \( \( ! "${current_line#">"}" = "${current_line}" \) -a  \( ! "${previous_line#">"}" = "${previous_line}" \) \) ]; then
                        printf '%s\n' "$previous_line"
                    fi
                done
            }

        fi
    elif [ "$domains_full_flag" = "1" ]; then
        
        if [ "$different_flag" = "1" ]; then
        # Find links (second eval) that are not in the common domains list (first eval):
            # Links in the first file given as parameter (second eval):
            grep -F -vf <( eval $uniql_command1 --subshell1; ) <( eval $uniql_command2 --subshell2; )
            # Links in the files 2..N - given as parameters (second eval):
            grep -F -vf <( eval $uniql_command3 --subshell3; ) <( eval $uniql_command4 --subshell4; )
        elif [ "$common_flag" = "1" ]; then
        # Find links (second eval) that are in the common domains list (first eval):
            # Links in the first file given as parameter (second eval):
            grep -F -f <( eval $uniql_command1 --subshell1; ) <( eval $uniql_command2 --subshell2; )
            # Links in the files 2..N - given as parameters (second eval):
            grep -F -f <( eval $uniql_command3 --subshell3; ) <( eval $uniql_command4 --subshell4; )
        fi
        # grep flags explained:
        #    -F = do not interpret pattern string (treat string literally)
        #    -v = select non-matching lines
        #    -f = obtain pattern strings from next file - each string on a new line (containing site domains in this case)
    fi
    
    CleanUp
fi
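
The unordered -d branch above comes down to a tag, sort, and uniq -u pipeline. The following is a minimal, self-contained sketch of that idea, not the script's own code. The names tag_links and diff_links are hypothetical, the URL regex is a simplification, and the "<group> <padded index> <url>" line format is inferred from the sort -k 3, uniq -f 2 and sed commands used above (the output of ProcFindLinks is not shown here). It assumes a grep that supports -o and -h, as GNU and BSD grep do.

```shell
# Hypothetical helper (not part of uniql.sh): tag every "scheme://..." link
# found in the given files as "<group> <padded index> <url>".
tag_links() {
    group="$1"; shift
    grep -Eoh '[a-zA-Z][a-zA-Z0-9+.-]*://[^[:space:]]+' "$@" |
        awk -v g="$group" '{ printf "%s %06d %s\n", g, NR, $0 }'
}

# Print links that occur in only one group:
# "<" marks links from the first file, ">" links from the remaining files.
diff_links() {
    first="$1"; shift
    {
        tag_links 1 "$first" | sort -u -k 3   # de-duplicate inside group 1
        tag_links 2 "$@"     | sort -u -k 3   # de-duplicate inside group 2
    } |
        sort -k 3 |       # bring identical URLs next to each other
        uniq -u -f 2 |    # keep URLs that were seen in exactly one group
        sort |            # order by group, then by discovery index
        sed -E 's/^(.) [0-9]+ (.*)/\1 \2/; s/^1/</; s/^2/>/'
}
```

With a first file that contains http://a.example/x and http://b.example, and a second file that contains http://b.example and http://c.example/y, calling diff_links file1 file2 prints "< http://a.example/x" and "> http://c.example/y".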

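The --domains-full branch filters one list of links against a list of domains with grep -F -f (or -vf). Below is a reduced sketch of that filtering step under stated assumptions: url_domains and links_with_new_domains are hypothetical names, both input files are assumed to hold one URL per line, and a temporary file stands in for the process substitution so that the sketch also runs in a plain POSIX shell.

```shell
# Hypothetical helper: reduce each URL on stdin to "scheme://host".
url_domains() {
    sed -E 's#^([a-zA-Z][a-zA-Z0-9+.-]*://[^/:]*).*#\1#'
}

# Print the lines of links file $1 that contain none of the domains found in $2.
links_with_new_domains() {
    patterns=$(mktemp) || return 1
    url_domains < "$2" | sort -u > "$patterns"
    # -F: fixed strings, -v: invert the match, -f: read patterns from a file
    grep -F -v -f "$patterns" "$1"
    status=$?
    rm -f "$patterns"
    return "$status"
}
```

Because -F matches fixed substrings, a domain such as http://a.example would also match a link like http://a.example.org/page, so this kind of filter is only approximate.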