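As some of you may have noticed on other sites in the network, I have a few scripts that fix broken images and links on Stack Exchange. Most of these scripts run automatically as cron jobs on my Raspberry Pi 4.
I've found a peculiarity with jstor.org links. I can access them in the browser on both my Mac and my RPi. My script (which fetches pages in much the same way as curl) is blocked by a reCAPTCHA when it runs on the RPi, but not when it runs on the Mac. It makes sense that the site has some anti-crawling protection, but this is the first time I've seen a difference between two computers on the same home network.
Here is a concrete example, with the request taken from the Chromium developer tools on my Raspberry Pi: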
curl 'https://www.jstor.org/stable/2533862' \
-H 'accept-encoding: deflate, gzip' \
-H 'upgrade-insecure-requests: 1' \
-H 'user-agent: Mozilla/5.0 (X11; CrOS armv7l 13597.84.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.98 Safari/537.36' \
-H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
-H 'accept-language: en-US,en;q=0.9' \
--compressed -v
(Note: I removed some sec- headers that did not seem relevant.)
In a terminal, this command works on my Mac but not on my Raspberry Pi. Using the Mac's user agent makes no difference. The resulting HTML is shown below; this is the full output of curl:
pi@raspberrypi:~ $ curl 'https://www.jstor.org/stable/2533862' \
> -H 'accept-encoding: deflate, gzip' \
> -H 'upgrade-insecure-requests: 1' \
> -H 'user-agent: Mozilla/5.0 (X11; CrOS armv7l 13597.84.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.98 Safari/537.36' \
> -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
> -H 'accept-language: en-US,en;q=0.9' \
> --compressed -v
* Expire in 0 ms for 6 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 0 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 1 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 2 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 3 ms for 1 (transfer 0x1e5a950)
* Expire in 3 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 3 ms for 1 (transfer 0x1e5a950)
* Expire in 3 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 4 ms for 1 (transfer 0x1e5a950)
* Expire in 5 ms for 1 (transfer 0x1e5a950)
* Trying 151.101.36.152...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x1e5a950)
* Connected to www.jstor.org (151.101.36.152) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: none
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
* subject: C=US; ST=New York; L=New York; O=Ithaka Harbors, Inc.; CN=jstor.org
* start date: Apr 12 15:57:42 2022 GMT
* expire date: May 14 15:57:41 2023 GMT
* subjectAltName: host "www.jstor.org" matched cert's "*.jstor.org"
* issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign Atlas R3 OV TLS CA 2022 Q2
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x1e5a950)
> GET /stable/2533862 HTTP/2
> Host: www.jstor.org
> accept-encoding: deflate, gzip
> upgrade-insecure-requests: 1
> user-agent: Mozilla/5.0 (X11; CrOS armv7l 13597.84.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.98 Safari/537.36
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
> accept-language: en-US,en;q=0.9
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 403
< server: Varnish
< retry-after: 0
< content-type: text/html
< accept-ranges: bytes
< date: Mon, 18 Apr 2022 07:47:18 GMT
< via: 1.1 varnish
< set-cookie: _pxhd=vW1nDMNFFFI3tkNkjQYWOgLI99ajK-hT6LI4ua0sZy38e1p4v9XUHY6a2DoRXv2CFRxDjEnHFZYyof3sUytsZw==:h1LPuATkQi5XBRiv7qid2Y8pMCDr93JembEBMBbV9Cjwzp3HvjzErajD8VCWHMVi0Cc0FTRhPNO6W3t4pYHs/wawxsyE89qgcX4Ci7BGRyI=; Expires=Tue, 18 Apr 2023 07:47:18 GMT; path=/;
< x-served-by: cache-ams21078-AMS
< x-cache: MISS
< x-cache-hits: 0
< content-length: 3468
<
<!DOCTYPE html>
<html class="popup no-js" lang="en">
<head>
<meta name="robots" content="noarchive,NOODP" />
<meta name="description" content="JSTOR is a digital library of academic journals, books, and primary sources." />
<meta name="viewport" content="width=device-width" />
<meta charset="UTF-8"/>
<link rel="stylesheet" href="/assets/global_20171026T1134/build/global/css/popup.css" />
<link rel="apple-touch-icon" href="/assets/global_20171026T1134/build/images/apple-touch-icon.png" />
<title>JSTOR: Access Check</title>
<!-- Custom CSS -->
</head>
<body>
<div class="logo-container">
<a href="/"><img src="/assets/global_20171026T1134/build/images/jstor-logo.png" srcset="/assets/global_20171026T1134/build/images/jstor-logo.png" class="non-responsive" alt="JSTOR Home" width="65" height="90" /></a>
</div>
<div id="content" role="main" class="row content brdra">
<div class="small-12 columns paxl mtxl">
<div class="row popup-inner">
<div class="small-12 columns noGlobalSrch">
<h2>Access Check</h2>
<p>Our systems have detected unusual traffic activity from your network. Please complete this reCAPTCHA to demonstrate that it's
you making the requests and not a robot. If you are having trouble seeing or completing this challenge,
<a href="https://support.jstor.org/hc/en-us/articles/115011068868-Troubleshooting-CAPTCHA-" target="_blank" title="This link opens in a new window">this page</a> may help.
If you continue to experience issues, you can <a href="https://support.jstor.org/" target="_blank" title="This link opens in a new window">contact JSTOR support</a>.</p>
<div id="px-captcha"> </div>
<p>Block Reference: #c5d172ad-beeb-11ec-8c24-556c625a4161<br/>
VID: #<br/>
IP: [my IP address]<br/>
Date and time: Mon, 18 Apr 2022 07:47:18 GMT<br/>
<noscript>Javascript is disabled</noscript></p>
<p>Go back to <a href="/" title="Go back to JSTOR">JSTOR</a></p>
</div>
</div>
</div>
</div>
<div class="row">
<div class="small-12 columns pts">
<small>©2000-<script type="text/javascript">document.write(new Date().getFullYear());</script> ITHAKA. All Rights Reserved. JSTOR®, the JSTOR logo, JPASS®, and ITHAKA® are registered trademarks of ITHAKA.</small>
</div>
</div>
<!-- Px --> <script> window._pxAppId = 'PXu4K0s8nX'; window._pxJsClientSrc = '/u4K0s8nX/init.js'; window._pxFirstPartyEnabled = true; window._pxVid = ''; window._pxUuid = 'c5d172ad-beeb-11ec-8c24-556c625a4161'; window._pxHostUrl = '/u4K0s8nX/xhr'; </script>
<script> var s = document.createElement('script'); s.src = '/u4K0s8nX/captcha/captcha.js?a=c&u=c5d172ad-beeb-11ec-8c24-556c625a4161&v=&m=0'; var p = document.getElementsByTagName('head')[0]; p.insertBefore(s, null); if (true ){s.onerror = function () {s = document.createElement('script'); var suffixIndex = '/u4K0s8nX/captcha/captcha.js?a=c&u=c5d172ad-beeb-11ec-8c24-556c625a4161&v=&m=0'.indexOf('/captcha.js'); var temperedBlockScript = '/u4K0s8nX/captcha/captcha.js?a=c&u=c5d172ad-beeb-11ec-8c24-556c625a4161&v=&m=0'.substring(suffixIndex); s.src = '//captcha.px-cdn.net/PXu4K0s8nX' + temperedBlockScript; p.parentNode.insertBefore(s, p);};}</script>
<!-- Custom Script -->
</body>
* Connection #0 to host www.jstor.org left intact
</html>
For reference, this is what I get on my Mac (I left out the HTML output because of its length, but it is what I expect):
glorfindel@Glorfindels-MacBook ~ % curl 'https://www.jstor.org/stable/2533862' \
-H 'accept-encoding: deflate, gzip' \
-H 'upgrade-insecure-requests: 1' \
-H 'user-agent: Mozilla/5.0 (X11; CrOS armv7l 13597.84.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.98 Safari/537.36' \
-H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
-H 'accept-language: en-US,en;q=0.9' \
--compressed -v
* Trying 151.101.36.152:443...
* Connected to www.jstor.org (151.101.36.152) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: C=US; ST=New York; L=New York; O=Ithaka Harbors, Inc.; CN=jstor.org
* start date: Apr 12 15:57:42 2022 GMT
* expire date: May 14 15:57:41 2023 GMT
* subjectAltName: host "www.jstor.org" matched cert's "*.jstor.org"
* issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign Atlas R3 OV TLS CA 2022 Q2
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x156012200)
> GET /stable/2533862 HTTP/2
> Host: www.jstor.org
> accept-encoding: deflate, gzip
> upgrade-insecure-requests: 1
> user-agent: Mozilla/5.0 (X11; CrOS armv7l 13597.84.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.98 Safari/537.36
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
> accept-language: en-US,en;q=0.9
>
< HTTP/2 200
< server: Apache/2.4.29 (Ubuntu)
< x-frame-options: SAMEORIGIN
< set-cookie: AccessSession=H4sIAAAAAAAAAK2RSW_UQBCF7_kVls_pUS_VG7exiRPEkXBCKCqXu8HIw4y8RIIo_x2vAxPIjWN_9er1q6qnqyRJ6ypN3iSpQOmdJUtRauBOlpXDCCijJl9yYdPrSUyrukZiiCjEgr-tOCsKlRX-Lejc5pksPOdaFFxxpcw-g3xRt6taQXBWUhXKoIGCcpJLKwyAsRX4qBb1MKxyD8YBj4o56zQDkJ45owNTngyVshRe4NKCQ_91aonYdGEmj9gsLsLo8ROnDFjD51J9WuYHtRPW7oQzOyFgNaoO9fdLp647XoKeLt9I1HcT-pQ8je8_Nsw597PvyNY4KWwgbFN6KJXymkWP45RecuYcBiZ9FZ1wFFWFW0__4xTmptv2OJzO1mecYVfTRW0cpz7UP0PR4JdJ0rdDGCvP1__Iql5mlX9lDY7bKhCLypYMyGqG5CTjJNyYnkqO_P9nTT4vh-jP9zRW_75ng68VhlcK2PftfLF1B3k-J7r5uCXJ72cQ2kNojg-3Nxv_sPB393f79_t0ynb1_AtWUGjGVAMAAA; Path=/; SameSite=Lax; Secure
< set-cookie: AccessSessionSignature=3322ae2ad6c2aca1491af2e0e493b5ab6c9533cd0d0024b488f8cb4904e4b6a3; Path=/; SameSite=Lax; Secure
< set-cookie: AccessSessionTimedSignature=aa6b942b5efb553254984f1935040fef7c65023499c190c6884e3cc79a66b75a; Path=/; SameSite=Lax; Secure
< set-cookie: UUID=946840f3-8785-4429-865e-39c6cb2b191a; expires=Thu, 17 Apr 2025 07:39:25 GMT; Max-Age=94608000; Path=/; SameSite=None; Secure
< set-cookie: csrftoken=BtBwhZoFKH61vmo7vv3CZAs9DrDmaPkXski76lA478b0kEYpLO8P35H0M2ymzpA4; expires=Mon, 17 Apr 2023 07:39:25 GMT; Max-Age=31449600; Path=/; SameSite=Lax; Secure
< set-cookie: ReferringRequestId=excelsior:3ebf19131196bae82406e55730913657; Path=/; SameSite=Lax; Secure
< content-encoding: gzip
< content-type: text/html; charset=utf-8
< x-jstor-restarts: 2
< accept-ranges: bytes
< date: Mon, 18 Apr 2022 07:39:25 GMT
< via: 1.1 varnish
< set-cookie: _pxhd=4X5A9pQYXcrgxAXzUOZVi-aK2X5V-aHyliphZo8MwnOdMZDI-s0-wFgAEPOOhZwLs2bHY6gFurYQD-XHQ8LKTg==:29IO778AT925teKlLC1rJlVwEP2U/dhPyCHtvFGriTKChA-n8uiCGYCX5scjIwh5sTZ478ZG8SGwxd4lmCJM/DO1SZTeMfI/pjaeDtq44OQ=; Expires=Tue, 18 Apr 2023 07:39:25 GMT; path=/;
< x-served-by: cache-ams21083-AMS
< x-cache: MISS
< x-cache-hits: 0
< vary: Cookie,Accept-Encoding,Fastly-SSL,Origin,X-Requested-Host
<
<!DOCTYPE html>
<html class="no-js" lang="en">
</html>
* Connection #0 to host www.jstor.org left intact
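Both traces are long, so it can help to reduce each one to a few lines that are easy to compare side by side. The following is only a sketch: the function name trace_summary is made up for illustration, and the choice of lines to keep (negotiated TLS line, HTTP status line, server header) is an assumption about what is worth comparing, not a diagnosis.

```shell
#!/bin/sh
# trace_summary (hypothetical helper): read `curl -v` output on stdin and
# keep only the negotiated TLS line, the HTTP status line and the
# server response header. Everything else is dropped.
trace_summary() {
    grep -E '^(\* SSL connection using |< HTTP/|< server:)'
}

# Example use (not run here). curl writes the verbose trace to stderr,
# so stderr is sent into the pipe and the page body is discarded:
#   curl -v --compressed 'https://www.jstor.org/stable/2533862' 2>&1 >/dev/null | trace_summary
```

Applied to the two traces above, this leaves TLS_AES_256_GCM_SHA384, HTTP/2 403 and Varnish for the Raspberry Pi, and AEAD-CHACHA20-POLY1305-SHA256, HTTP/2 200 and Apache/2.4.29 (Ubuntu) for the Mac. That narrows down where the traces differ, but it does not by itself explain why.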
My actual script runs in Java, but it has the same problem. So it seems to be something about the Raspberry Pi or its operating system; what could be causing this? I'm running "Raspbian GNU/Linux 10 (buster)" according to /etc/os-release, and curl --version gives:
curl 7.64.0 (arm-unknown-linux-gnueabihf) libcurl/7.64.0 OpenSSL/1.1.1n zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) libssh2/1.8.0 nghttp2/1.36.0 librtmp/2.3
Release-Date: 2019-02-06
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL
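(The Mac's curl is a slightly newer version (7.79.1), but since the behavior seems to be independent of the tool, I don't think that matters.) One of the moderators on Raspberry Pi Stack Exchange, where I originally asked this question, said that curl failed for them on Fedora as well.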
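For anyone who wants to reproduce this from a script, here is a minimal sketch of how the block page could be told apart from a normal page. It assumes the block always looks like the response above: a 403 status whose HTML contains the px-captcha placeholder div. The function name is_block_page is hypothetical, and a different block page would need a different check.

```shell
#!/bin/sh
# is_block_page STATUS BODY (hypothetical helper)
# Exit status 0 if the response looks like the access-check page shown
# above, 1 otherwise. Assumption: block = HTTP 403 + px-captcha div.
is_block_page() {
    status=$1
    body=$2
    # The blocked response above had status 403.
    [ "$status" = "403" ] || return 1
    # Its HTML contained <div id="px-captcha">.
    case $body in
        *'id="px-captcha"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (not run here): save the body to a file and capture the
# status code with curl's -w option.
#   status=$(curl -s --compressed -o body.html -w '%{http_code}' 'https://www.jstor.org/stable/2533862')
#   if is_block_page "$status" "$(cat body.html)"; then echo blocked; fi
```

Checking both the status and the marker keeps an unrelated 403 from being counted as a captcha block.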