Latex와 함께 사용할 수 있도록 참고문헌 변환

Question

아래 프로그램이 awk작동해야 합니다. 각 행의 요소를 찾아 ( ... )"author, year" 또는 "author1 year1,author2 year2,..." 패턴과 일치하는지 확인합니다. 그렇다면 참조 명령을 생성하고 그룹을 대체합니다 ( ... ). 그렇지 않으면 그룹은 그대로 유지됩니다.

#!/usr/bin/awk -f


# This small function creates an 'authorYYYY'-style string from
# separate author and year fields. We split the "author" field
# additionally at each space in order to strip leading/trailing
# whitespace and further authors.
function contract(author, year)
{
    split(author,auth_fields," ");
    auth=tolower(auth_fields[1]);
    return sprintf("%s%4d",auth,year);
}



# This function checks if two strings correspond to "author name(s)" and
# "year", respectively.
function check_entry(string1, string2)
{
    if (string1 ~ /^ *([[:alpha:].-]+ *)+$/ && string2 ~ /^ *[[:digit:]]{4} *$/) return 1;
    return 0;
}




# This function creates a 'citation' command from a raw element. If the
# raw element does not conform to the reference syntax of 'author, year' or
# 'author1 year1,author2 year2, ...', we should leave it "as is", and return
# a "0" as indicator.
function create_cite(raw_elem)
{
    cite_argument=""

    # Split at ','. The single elements are either name(list) and year,
    # or space-separated name(list)-year statements.
    n_fields=split(raw_elem,sgl_elem,",");

    if (n_fields == 2 && check_entry(sgl_elem[1],sgl_elem[2]))
    {
        cite_argument=contract(sgl_elem[1],sgl_elem[2]);
    }
    else
    {
        for (k=1; k<=n_fields; k++)
        {
            n_subfield=split(sgl_elem[k],subfield," ");

            if (check_entry(subfield[1],subfield[n_subfield]))
            {
                new_elem=contract(subfield[1],subfield[n_subfield]);
                if (cite_argument == "")
                {
                    cite_argument=new_elem;
                }
                else
                {
                    cite_argument=sprintf("%s,%s",cite_argument,new_elem);
                }
            }
            else
            {
                return 0;
            }
        }
    }


    cite=sprintf("\\{%s}",cite_argument);
    return cite;
}




# Actual program
# For each line, create a "working copy" so we can replace '(...)' pairs
# already processed with different text (here: 'X ... Y'); otherwise 'sub'
# would always stumble across the same opening parentheses.
# For each '( ... )' found, check if it fits the pattern. If so, we replace
# it with a 'cite' command; otherwise we leave it as it is.

{
    working_copy=$0;

    # Allow for unmatched ')' at the beginning of the line:
    # if a ')' was found before the first '(', mark is as processed
    i=index(working_copy,"(");
    j=index(working_copy,")");
    if (i>0 && j>0 && j<i) sub(/\)/,"Y",working_copy);

    while (i=index(working_copy,"("))
    {
        sub(/\(/,"X",working_copy); # mark this '(' as "already processed

        j=index(working_copy,")");
        if (!j)
        {
            continue;
        }
        sub(/\)/,"Y",working_copy); # mark this ')', too


        elem=substr(working_copy,i+1,j-i-1);

        replacement=create_cite(elem);
        if (replacement != "0")
        {
            elem="\\(" elem "\\)"
            sub(elem,replacement);
        }

    }
    print $0;
}

이 프로그램을 호출해

~$ awk -f transform_citation.awk input.tex

노트프로그램은 입력이 "잘 구성된" 것으로 예상합니다. 즉, 줄의 모든 대괄호는 쌍으로 일치해야 합니다(줄 시작 부분에 하나의 오른쪽 대괄호가 허용되고 일치하지 않는 왼쪽 대괄호는 무시됩니다).

또한 참고하시기 바랍니다위 구문 중 일부에는 GNU awk가 필요합니다. 다른 구현으로 이식하려면

if (string1 ~ /^ *([[:alpha:].-]+ *)+$/ && string2 ~ /^ *[[:digit:]]{4} *$/) return 1;

그리고

if (string1 ~ /^ *([a-zA-Z.-]+ *)+$/ && string2 ~ /^ *[0123456789][0123456789][0123456789][0123456789] *$/) return 1;

그리고 데이터 정렬 로캘이 로 설정되어 있는지 확인하세요 C.

Answer 1