I tried "wget --mirror http://tshepang.net/", but only one page is retrieved: tshepang.net/index.html. Is this a bug in wget?
Here is the output with the --debug option:
DEBUG output created by Wget 1.12 on linux-gnu.
Enqueuing http://tshepang.net/ at depth 0
Queue count 1, maxcount 1.
[IRI Enqueuing `http://tshepang.net/' with None
Dequeuing http://tshepang.net/ at depth 0
Queue count 0, maxcount 1.
--2011-01-15 12:32:51-- http://tshepang.net/
Resolving tshepang.net... 66.216.125.32
Caching tshepang.net => 66.216.125.32
Connecting to tshepang.net|66.216.125.32|:80... connected.
Created socket 4.
Releasing 0x089e2be0 (new refcount 1).
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: tshepang.net
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 302 Found
Server: nginx/0.7.65
Date: Sat, 15 Jan 2011 10:33:45 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Status: 302 Found
Location: http://posterous.com/sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F
X-Runtime: 3
Set-Cookie: cookies_enabled=true; path=/
Cache-Control: no-cache
Content-Length: 141
X-Varnish: 419207385
Age: 0
Via: 1.1 varnish
X-Cache: MISS
---response end---
302 Found
Stored cookie tshepang.net -1 (ANY) / <session> <insecure> [expiry none] cookies_enabled true
Registered socket 4 for persistent reuse.
Location: http://posterous.com/sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F [following]
Skipping 141 bytes of body: [<html><body>You are being <a href="http://posterous.com/sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F">redirected</a>.</body></html>] done.
--2011-01-15 12:32:52-- http://posterous.com/sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F
conaddr is: 66.216.125.32
Resolving posterous.com... 184.106.20.99
Caching posterous.com => 184.106.20.99
Releasing 0x089e3e20 (new refcount 1).
Found posterous.com in host_name_addresses_map (0x89e3e20)
Connecting to posterous.com|184.106.20.99|:80... connected.
Created socket 5.
Releasing 0x089e3e20 (new refcount 1).
---request begin---
GET /sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: posterous.com
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 302 Found
Server: nginx/0.7.65
Date: Sat, 15 Jan 2011 10:33:46 GMT
Content-Type: text/html; charset=utf-8
Connection: close
Status: 302 Found
Location: http://tshepang.net/sso/recovery/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F
X-Runtime: 7
Set-Cookie: _sharebymail_session_id=296a636c8ed3cb6e4e7cabb10256008a; domain=.posterous.com; path=/; HttpOnly
Cache-Control: no-cache
Content-Length: 142
X-Varnish: 2019529137
Age: 0
Via: 1.1 varnish
X-Cache: MISS
---response end---
302 Found
cdm: 1 2
Stored cookie posterous.com -1 (ANY) / <session> <insecure> [expiry none] _sharebymail_session_id 296a636c8ed3cb6e4e7cabb10256008a
Location: http://tshepang.net/sso/recovery/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F [following]
Closed fd 5
--2011-01-15 12:32:53-- http://tshepang.net/sso/recovery/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F
Reusing existing connection to tshepang.net:80.
Reusing fd 4.
---request begin---
GET /sso/recovery/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: tshepang.net
Connection: Keep-Alive
Cookie: cookies_enabled=true
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 302 Found
Server: nginx/0.7.65
Date: Sat, 15 Jan 2011 10:33:46 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Status: 302 Found
Location: http://tshepang.net/
X-Runtime: 5
Set-Cookie: _sharebymail_session_id=cab0227db8c38f17e572984ee188dc5e; domain=tshepang.net; path=/; HttpOnly
Cache-Control: no-cache
Content-Length: 86
X-Varnish: 419207606
Age: 0
Via: 1.1 varnish
X-Cache: MISS
---response end---
302 Found
cdm: 1 2
Stored cookie tshepang.net -1 (ANY) / <session> <insecure> [expiry none] _sharebymail_session_id cab0227db8c38f17e572984ee188dc5e
Location: http://tshepang.net/ [following]
Skipping 86 bytes of body: [<html><body>You are being <a href="http://tshepang.net/">redirected</a>.</body></html>] done.
--2011-01-15 12:32:54-- http://tshepang.net/
Reusing existing connection to tshepang.net:80.
Reusing fd 4.
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: tshepang.net
Connection: Keep-Alive
Cookie: _sharebymail_session_id=cab0227db8c38f17e572984ee188dc5e; cookies_enabled=true
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Server: nginx/0.7.65
Date: Sat, 15 Jan 2011 10:33:49 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Status: 200 OK
ETag: "6ec7aeb4e15e3a80e733f7c2b5e00d6f"
X-Runtime: 1680
Cache-Control: private, max-age=0, must-revalidate
Content-Length: 66513
X-Varnish: 419207692
Age: 0
Via: 1.1 varnish
X-Cache: MISS
---response end---
200 OK
Length: 66513 (65K) [text/html]
Saving to: `tshepang.net/index.html'
0K .......... .......... .......... .......... .......... 76% 25.7K 1s
50K .......... .... 100% 39.3K=2.3s
2011-01-15 12:32:58 (27.9 KB/s) - `tshepang.net/index.html' saved [66513/66513]
Deciding whether to enqueue "http://tshepang.net/".
Already on the black list.
Decided NOT to load it.
Redirection "http://tshepang.net/" failed the test.
FINISHED --2011-01-15 12:32:58--
Downloaded: 1 files, 65K in 2.3s (27.9 KB/s)
Answer 1
The --no-cookies option helps (thanks, 그네):
It looks like all the redirects cause wget to abort the request. Try --no-cookies.
This was worked out by reading the attached log.
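To make the log reading concrete, here is a minimal sketch that counts the redirects wget followed and the redirect it refused to enqueue. The sample lines are copied from the debug log in the question. The file name wget-debug.log mentioned in the comment is an assumption, not something the original poster used.

```shell
# Sample lines copied from the debug log above. With a real run, save the log
# first (file name is an assumption):
#   wget --mirror --debug -o wget-debug.log http://tshepang.net/
log=$(cat <<'EOF'
Location: http://posterous.com/sso/verify/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F [following]
Location: http://tshepang.net/sso/recovery/2d35d71b1e728dc99f3c153eaf6f8fa0?jumpto=%2F [following]
Location: http://tshepang.net/ [following]
Already on the black list.
Redirection "http://tshepang.net/" failed the test.
EOF
)

# Count the redirects wget followed, and the ones the recursion check rejected.
redirects=$(printf '%s\n' "$log" | grep -c '\[following\]$')
rejected=$(printf '%s\n' "$log" | grep -c 'failed the test')

echo "redirects followed: $redirects"
echo "redirects rejected: $rejected"

# The suggested workaround from this answer (needs network access):
#   wget --mirror --no-cookies http://tshepang.net/
```

The last redirect points back at the start URL, which the log reports as already blacklisted, and the run finishes with a single file downloaded.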
Answer 2
Assuming wget is on your PATH (otherwise you need to type its full path), run the following commands:
mkdir wget_files
cd wget_files
wget --mirror --wait=2 --page-requisites --html-extension --convert-links --directory-prefix wget_files/example1 http://www.yourdomain.com
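As a rough annotated sketch of the command above, the argument list can be built one option at a time so that each option carries a comment. The option descriptions paraphrase the wget manual. The cmd variable is introduced here only for illustration, and http://www.yourdomain.com is the placeholder from the answer.

```shell
# Build the argument list step by step so each option can be commented.
set -- --mirror                  # recursion plus timestamping, unlimited depth
set -- "$@" --wait=2             # pause 2 seconds between requests
set -- "$@" --page-requisites    # also fetch images, CSS, etc. needed to render pages
set -- "$@" --html-extension     # save HTML with an .html suffix (newer name: --adjust-extension)
set -- "$@" --convert-links      # rewrite links so the copy can be browsed locally
set -- "$@" --directory-prefix=wget_files/example1   # where the files are saved
set -- "$@" http://www.yourdomain.com                # placeholder URL from the answer

# Show the assembled command without running it.
cmd="wget $*"
echo "$cmd"

# To actually run it (needs network access):
#   wget "$@"
```

Printing the command first makes it easy to spot typographic dashes or a wrong prefix before any request is sent.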
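Answer 3
You also need to enable recursion with -r and set the link depth with -l X, where X is an integer. It is also a good idea to set the list of file types to keep with -A (otherwise you will only get HTML files).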
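A minimal sketch of these options combined might look like the following. The depth of 2 and the accept list are assumed values chosen for illustration, and the URL is the one from the question.

```shell
depth=2                      # assumed value for X in -l X
accept="html,css,png,jpg"    # assumed list of suffixes for -A

# -l expects an integer, so reject anything else before building the command.
case "$depth" in
  ''|*[!0-9]*) echo "depth must be a non-negative integer" >&2; cmd="" ;;
  *) cmd="wget -r -l $depth -A $accept http://tshepang.net/" ;;
esac

# Show the command without running it (running it needs network access).
echo "$cmd"
```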