이미 실행 중인 nohup 프로세스에 새 명령을 보내거나 nohup에서 두 명령을 함께/동시에 실행하는 방법은 무엇입니까?

이미 실행 중인 nohup 프로세스에 새 명령을 보내거나 nohup에서 두 명령을 함께/동시에 실행하는 방법은 무엇입니까?

nohup 작업을 실행하고 이미 실행 중인 프로세스에서 특정 새 명령(예: kerberos 인증)을 실행하고 싶습니다. 실제로 이상적인 솔루션은 reauth 명령을 먼저 실행한 다음 실제 작업을 실행하는 것입니다.모두 동일한 nohup 프로세스 ID 아래에 있습니다..이렇게 하면 nohup 프로세스에서 kerberos 티켓이 손실되지 않습니다.그래서 Kerberos 티켓을 잃지 않도록 다른 Python 스크립트와 동기식으로 재인증을 실행하고 싶습니다.화면이나 tmux 없음또는 스크립트와 상호 작용해야 하는 모든 것. 나는 기본적으로 Python 작업(또는 모든 작업)을 실행하고 실행이 완료될 때까지 티켓을 잃지 않는 데몬을 구현하고 싶습니다. 어떻게 해야 하나요?

이것이 현재 시도이지만 제대로 작동하는지 확실하지 않습니다.

# - set up this main sh script
source ~/.bashrc
source ~/.bash_profile
source ~/.bashrc.user
echo HOME = $HOME

source cuda11.1

conda init bash
conda activate metalearning_gpu

# - get a job id for this tmux session
export SLURM_JOBID=$(python -c "import random;print(random.randint(0, 1_000_000))")
echo SLURM_JOBID = $SLURM_JOBID
export OUT_FILE=$PWD/main.sh.o$SLURM_JOBID
export ERR_FILE=$PWD/main.sh.e$SLURM_JOBID
export WANDB_DIR=$HOME/wandb_dir

export CUDA_VISIBLE_DEVICES=4
echo CUDA_VISIBLE_DEVICES = $CUDA_VISIBLE_DEVICES
python -c "import torch; print(torch.cuda.get_device_name(0));"

# - CAREFUL, if a job is already running it could do damage to it, rm reauth process, qian doesn't do it so skip it
# top -u brando9
# pkill -9 reauth -u brando9

# - expt python script then inside that python pid attach a reauth process
# should I run rauth within python with subprocess or package both the nohup command and the rauth together in badsh somehow
#python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE &
#nohup (echo $SU_PASSWORD | /afs/cs/software/bin/reauth; python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE) &
nohup (echo $SU_PASSWORD | /afs/cs/software/bin/reauth; python -u ~/main.py > $OUT_FILE 2> $ERR_FILE) &
# other option is to run `echo $SU_PASSWORD | /afs/cs/software/bin/reauth` inside of python, right?
export JOB_PID=$!
echo JOB_PID = $JOB_PID

# - Done
echo "Done with the dispatching (daemon) sh script

아마도 충분히 오래 실행되지 않았기 때문일까요? 확실하지 않다.

또 다른 옵션은 Python 스크립트 자체 내에서 재인증을 실행하는 것입니다. 테스트되지 않았지만 작동할 수도 있나요? 주요 비결은 Kerberos 티켓 누락으로 인해 Nohup 프로세스가 죽지 않는 것입니다.

reauth의 내용은 다음과 같습니다.

(metalearning_gpu)~ $ cat /afs/cs/software/bin/reauth
#!/usr/bin/perl
# $Id: reauth 2737 2011-06-20 18:14:05Z miles $
#
# Original version (C) Martin Schulz, 2'2002
# University Karlsruhe
#
# Modifications by Miles Davis <[email protected]>
#  Super minimal -- call programs rather than functions to reduce dependence
#  on extra perl modules.
#
# Heimdal patches thanks to Georgios Asimenos <[email protected]>
#

# General:
##########

# This little script aims at maintaining a valid AFS token from a
# users password for long running jobs.

# As everybody knows (or should know) Kerberos tickets and AFS tokens
# only have a limited lifetime. This is so by design and is usually
# reasonable. After 12 hours, it is no more obvious that it is really
# that user sitting in front of the computer that once typed the
# correct password in. Furthermore the damage caused by compromized
# AFS tokens is limited to the lifetime of that ticket.

# However, there are situations when users want to use long running
# jobs that will write to AFS filespace for several days. Renewable
# tickets are not so much of help here, since they can only be renewed
# if ....

# Therefore the secret has somehow deposited on the local computer
# that will run the long time job. This can be eiter done by storing a
# keytab on the local disk, maybe with a cron(*) principal with
# reduces priviledges. The approach taken here is to work with the
# original password and keep it in RAM only.

# When starting this program, the user is asked for his principal and
# the corresponding password. Then the TGT and AFS token is obtained
# and displayed, afterwards, a background process is forked and the
# main process will return to the system prompt. The workload program
# can now be started.

# The background process will periodically attempt to obtain krb
# tickets and AFS tokens. If this fails for some reason (Kerberos
# server not available or anything, the program aborts.

# aklog does not create a new pag if not told so. If you want your
# background process have a separate pag, create it beforehand.

# The reauth.pl program will work until eternity if is not stopped
# somehow. The canonical way is kill it by "kill $pid", where $pid is
# the process id printed before the return of the initial call to
# reauth.pl or found in the output of "ps".


# (*) Cron jobs are another issue. Our institute introduced
# user.cron-style principals to enable cron to obtain a token and then
# work on restricted parts of the users home directories.


# Security issues:
##################

# reauth.pl will run forever if you do not stop it, so don't forget that!

# The password is kept in RAM (of the child process). AFAIK, this can
# only be recovered by local root (who you need to trust anyway). It
# will not survive a reboot of the local machine.

# The password is not kept on any disk. Therefore any bootfloppy
# (reboot to single user mode..)  or screwdriver (take disk away..)
# attacks are not promising.

# Be aware that your NSA-, FBI-, MI5-, KGB-, ElQaida-, or (*insert
# your favorite opponent or competitor here*)-sponsored cleaning
# personnel or coworkers might have even more elaborate means... :-)


# BUGS:
#######

# Only mildly tested only on Linux and Solaris.

# Uses kinit, aklog, klist and tokens programs for a KerberosV/ Ken
# Hornstein's migration kit centered AFS setup. Please adjust to your
# config.




###########################################################################
# Configs:


# kinit program, add path if necessary
if ( -e "/usr/kerberos/bin/kinit" ) {
    $kinit="/usr/kerberos/bin/kinit";
} elsif ( -e "/usr/lib/heimdal/bin/kinit" ) {
    $kinit = "/usr/lib/heimdal/bin/kinit";
} elsif ( -e "/usr/bin/kinit" ) {
    $kinit="/usr/bin/kinit";
} else {
    die("Couln't find kinit.\n");
}


# aklog program, add path if necessary
if ( -e "/usr/bin/aklog" ) {
    $aklog="/usr/bin/aklog";
} elsif ( -e "/usr/lib/heimdal/bin/afslog" ) {
    # or, afslog, for heimdal weirdos
    $aklog="/usr/lib/heimdal/bin/afslog";
} else {
    die("Couln't find aklog or afslog.\n");
}

# klist program, add path if necessary
$klist="/usr/kerberos/bin/klist";

# tokens program, add path if necessary
$tokens="/usr/bin/tokens";


#################################################################
# Program:

use Getopt::Long;
use POSIX qw(setuid);
use POSIX qw(setgid);
use POSIX qw(setsid);

# Defaults for command line options.
my $keytab = '';
my $command = '';
my $username = '';
my $debug = 0;
my $verbose = 0;
my $interval=15000; # time interval in seconds: 4+ hours:

my %opts = (
    # Keytab
    'k=s' => sub {
                    $keytab = @_[1];
                    $kinit_opts .= "-k -t $keytab ";
                },
    # Run command
    'c=s' => sub {
                    $command = @_[1];
                },
    # Run command as user
    'u=s' => sub {
                    $username = @_[1];
                },
    # Time interval to sleep
    'i=i' => sub {
                    $interval = @_[1];
                },
    # Debug
    'd'   => sub {
                    $debug++;
                },
    # Be versbose
    'v'   => sub {
                    $verbose++;
                },
);


GetOptions(%opts) or die "Usage: reauth [ -k=keytab ] [ -u user ] [ -i <sleep_interval ] [ -v ] [ -c <command> ]\n";




if(@ARGV) {
    $princ = $ARGV[0];
    debug_print(2, "Principal name provided by argument = $princ");
} else {
   # Assume we want the login name as the principal name
    $princ = getpwuid($<);
    debug_print(2, "Principal name provided by argument = $princ");
}

if ($keytab) {
    # Don't ask for password, a keytab was provided.
    debug_print(1, "Keytab provided = $keytab");
} else {
    # read password, but turn off echo before:
    print "Password for $princ: ";
    system "stty -echo";
    $passwd = <STDIN>;
    system "stty echo";
    printf "\n";
    chomp $passwd;
    # Actually get the tickets/tokens
    if(obtain_tokens()!=0) {
        die "Can't obtain kerberos tickets\n";
    }
    if ($verbose) {
        show_tokens();
    }
}

# fork to go into background:
# a) the parent will exit
# b) the child will work on
$pid = fork();
if ($pid) {
    # I am the parent.
    printf "Background process pid is: $pid\n";
    if ($command) {
        debug_print(1,"Waiting for child to die.");
        wait;
        debug_print(1,"Child is dead.");
    }
    exit 0;
} else {
    # I am the child.
    debug_print(2,"I am process $$");
    print "Can't set session id\n" unless setsid();

    debug_print(2,"KRB5CCNAME: " . $ENV{KRB5CCNAME});
    #if ($ENV{KRB5CCNAME}) {
        #$ENV{KRB5CCNAME} =  $ENV{KRB5CCNAME} . "_reauth_$$";
    #} else {
        #$ENV{KRB5CCNAME} =  "/tmp/krb5cc_reauth_$$";
    #}

    #debug_print(2,"Creating " . $ENV{KRB5CCNAME});
    #system "touch $ENV{KRB5CCNAME}";


    if ($username) {
        debug_print(1, "Looking up UID for $username");
        ($name,$passwd,$UID,$GID, @junk) = getpwnam($username);
        debug_print(1, "Changing to UID $UID, GID $GID");
        print "Can't set group id\n" unless setgid($GID);
        print "Can't set user id\n" unless setuid($UID);
        if ($ENV{KRB5CCNAME}) {
            $ENV{KRB5CCNAME} =  $ENV{KRB5CCNAME} . "_reauth_$$";
        } else {
            $ENV{KRB5CCNAME} =  "/tmp/krb5cc_reauth_$$";
        }
    }

    debug_print(2, "Running as uid " . $<);
    # Actually get the tickets/tokens
    if(obtain_tokens()!=0) {
        die "Can't obtain kerberos tickets\n";
    }

    if ($verbose) {
        show_tokens();
    }

    # If I was told to run a command, do it.
    if ($command) {
        debug_print(1,"About to exec $command");
        exec($command) or die "Can't execute '$command'.\n";
        exit
    }

    debug_print(2,"Going into auth loop (interval is $interval).");

    #close(STDOUT);
    #close(STDERR);

    # Otherwise, work until killed:
    while (1) {
        debug_print(2,"Waking up to obtain new tokens.");
        obtain_tokens();
        if ($verbose) {
            show_tokens();
        }
        sleep $interval;
    };
}

#################################################################


sub obtain_tokens() {

  # ignore sigpipes' (according to perlopentut)
  $SIG{PIPE} = 'IGNORE';

    #debug_print(1,"Running: | $kinit -f $kinit_opts -p $princ 1>/dev/null 2>&1");

  # run kinit
  open(KINIT, "| $kinit -f $kinit_opts -p $princ 1>/dev/null 2>&1");

  # pass password to stdin, password does not show up on command line
  if (! $keytab) {
      print(KINIT "${passwd}\n");
  }

  # close pipe and get status
  close(KINIT); $status=$?;

    debug_print(1,"kinit exited with status $status\n");
  # act on status..
  if($status == 256) {
        if ($verbose) {
            print "WARNING: kinit is not able to obtain Kerberos ticket ($status).\n";
            print "         Possible DNS or network problem. Continuing anyway...\n";
        }
        return 1;
  } elsif($status!=0) {
    print "kinit is not able to obtain Kerberos ticket: $status\n";
     return 2;
  };

    debug_print(1,"Running $aklog...\n");
  $status = system "$aklog >/dev/null" ;
    debug_print(1,"aklog exited with status $status\n");
  if($status!=0) {
    print "aklog is not able to obtain AFS token: $status\n";
     return 3;
  };

  return 0;

};

##################################################################

sub show_tokens() {
    system $klist ;
    system $tokens ;
};

##################################################################

sub debug_print($$) {
    my $level = shift;
    my $message = shift;

    if ($debug >= $level) {
        print "DEBUG$debug: $message\n";
    }
}

##################################################################

Python 재검증 시도:

def run_bash_command(cmd: str) -> Any:
    import subprocess

    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    output, error = process.communicate()
    if error:
        raise Exception(error)
    else:
        return output


def stanford_reauth():
    # def stanford_rauth(password: Optional[str] = None):
    # password: str = os.environ['SU_PASSWORD'] if password is None else None
    # assert password is not None, f'Err: {password=}'
    reauth_cmd: str = f'echo $SU_PASSWORD | /afs/cs/software/bin/reauth'
    out = run_bash_command(reauth_cmd)
    print('Output of reauth (/afs/cs/software/bin/reauth with password)')
    print(f'{out=}')

나는 현재 그들이 통과하는지/죽지 않는지 확인하기 위해 둘 다 실행 중입니다.


새로운 오류:

나는 내가 노력하고 있는 것을 시도하고 있습니다:

nohup sh -c "echo $SU_PASSWORD | /afs/cs/software/bin/reauth; python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_diversity_with_task2vec.py --manual_loads_name diversity_ala_task2vec_hdb1_mio > $OUT_FILE 2> $ERR_FILE" > $PWD/main.sh.nohup.out$SLURM_JOBID &

그러나 그것이 여전히 작동하는지 확실하지 않습니다. 다음 오류가 발생합니다.

Password for brando9: stty: 'standard input': Inappropriate ioctl for device
stty: 'standard input': Inappropriate ioctl for device

Can't obtain kerberos tickets

나는 문제를 이해한다고 생각합니다. 새 명령은 터미널이 비밀번호를 보낼 것으로 예상하므로 에코가 비밀번호를 보낼 수 없습니다. 나는 시도했다:

나는 SSH를 가지고 있고비밀번호를 보내야 합니다. 비밀번호를 보낼 수 있는 방법이 있나요 ssh?


관련된:

답변1

이유는 모르겠지만 여기에서 대안이 작동하지 않는 것 같습니다.tty/stty(터미널)가 아니고 Expect, sshpass가 없을 때 어떻게 명령에 비밀번호를 보내나요?


세 가지 시도가 모두 성공한 것 같습니다.

  • 참조된 sh 스크립트에서 reauth 실행
  • Python에서 인증 실행
  • 위의 두 가지를 모두 수행하세요.

작업을 실행하는 데 사용하는 샘플 스크립트:

# https://unix.stackexchange.com/questions/724902/how-does-one-send-new-commands-to-run-to-an-already-running-nohup-process-e-g-r
# sh ~/diversity-for-predictive-success-of-meta-learning/main_nohup_snap.sh
# - set up this main sh script
export RUN_PWD=$(pwd)
source ~/.bashrc
source ~/.bash_profile
source ~/.bashrc.user
echo HOME = $HOME
# since snap .bash.user cd's me into HOME at dfs
cd $RUN_PWD
echo RUN_PWD = $RUN_PWD
realpath .

source cuda11.1

conda init bash
conda activate metalearning_gpu

# - get a job id for this tmux session
export SLURM_JOBID=$(python -c "import random;print(random.randint(0, 1_000_000))")
echo SLURM_JOBID = $SLURM_JOBID
export OUT_FILE=$PWD/main.sh.o$SLURM_JOBID
export ERR_FILE=$PWD/main.sh.e$SLURM_JOBID
export WANDB_DIR=$HOME/wandb_dir
echo $OUT_FILE
echo $ERR_FILE

export CUDA_VISIBLE_DEVICES=5
echo CUDA_VISIBLE_DEVICES = $CUDA_VISIBLE_DEVICES
python -c "import torch; print(torch.cuda.get_device_name(0));"

# - CAREFUL, if a job is already running it could do damage to it, rm reauth process, qian doesn't do it so skip it
# top -u brando9
# pkill -9 reauth -u brando9

# - expt python script then inside that python pid attach a reauth process
# should I run rauth within python with subprocess or package both the nohup command and the rauth together in badsh somehow
#python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE &
nohup sh -c 'echo $SU_PASSWORD | /afs/cs/software/bin/reauth; python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE' > $PWD/nohup.out$SLURM_JOBID &
#nohup python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE &

# other option is to run `echo $SU_PASSWORD | /afs/cs/software/bin/reauth` inside of python, right?
export JOB_PID=$!
echo JOB_PID = $JOB_PID
echo SLURM_JOBID = $SLURM_JOBID

# - Done
echo "Done with the dispatching (daemon) sh script"

주요 명령:

nohup sh -c 'echo $SU_PASSWORD | /afs/cs/software/bin/reauth; python -u ~/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_sl_with_ddp.py --manual_loads_name sl_hdb1_5cnn_adam_cl_filter_size --filter_size 4 > $OUT_FILE 2> $ERR_FILE' > $PWD/nohup.out$SLURM_JOBID &

나는 옵션 3을 계속 사용할 것이라고 생각하므로 둘 다 실행하십시오. 왜냐하면 분산 작업을 실행하는 경우 해당 프로세스가 내 허락 없이 무작위로 종료되는 것을 원하지 않기 때문입니다. 이를 위해 Python 스크립트는 다음을 수행합니다.

def run_bash_command(cmd: str) -> Any:
    import subprocess

    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    output, error = process.communicate()
    if error:
        raise Exception(error)
    else:
        return output

def stanford_reauth():
    """"
    re-authenticates the python process in the kerberos system so that the
    python process is not killed randomly.

    ref: https://unix.stackexchange.com/questions/724902/how-does-one-send-new-commands-to-run-to-an-already-running-nohup-process-or-run
    """
    reauth_cmd: str = f'echo $SU_PASSWORD | /afs/cs/software/bin/reauth'
    out = run_bash_command(reauth_cmd)
    print('Output of reauth (/afs/cs/software/bin/reauth with password): ')
    print(f'{out=}')

재인증 작업을 자주 확인하는 것을 잊지 마세요.

# - CAREFUL, if a job is already running it could do damage to it, rm reauth process
# pkill -9 reauth -u brando9

관련 정보