綾小路龍之介の素人思考

[wget] Memory leak when downloading a large number of https URIs with wget

This happens with wget 1.13.4. It was apparently fixed in 1.14.
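
To check whether a given machine is likely affected, the installed version can be compared against 1.14. This is only a minimal sketch: `version_lt` and `wget_version` are names made up here, `sort -V` assumes GNU coreutils, and 1.14 as the first fixed release is the assumption stated above, not something verified here.

```shell
# Hypothetical helper: succeeds if version $1 is strictly older than $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: pull the version number out of `wget --version`,
# whose first line looks like "GNU Wget 1.13.4 built on linux-gnu."
wget_version() {
  wget --version | awk 'NR == 1 { print $3 }'
}

# Usage (assuming 1.14 really is the first fixed release):
# if version_lt "$(wget_version)" 1.14; then
#   echo "this wget may leak memory on large https downloads" >&2
# fi
```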

The wget version that showed the problem is below.

$ wget --version
GNU Wget 1.13.4 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls -ntlm +opie +ssl/gnutls

Wgetrc:
    /etc/wgetrc (system)
Locale: /usr/share/locale
Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
    -DLOCALEDIR="/usr/share/locale" -I. -I../lib -I../lib
    -D_FORTIFY_SOURCE=2 -Iyes/include -g -O2 -fstack-protector
    --param=ssp-buffer-size=4 -Wformat -Werror=format-security
    -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
Link: gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
    -Werror=format-security -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
    -Wl,-z,relro -Lyes/lib -lgnutls -lgcrypt -lgpg-error -lz -lidn -lrt
    ftp-opie.o gnutls.o ../lib/libgnu.a

Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Please send bug reports and questions to <bug-wget@gnu.org>.

The command was as follows.

$ wget -w 0 -i list.txt
(snip)
Total wall clock time: 25m 15s
Downloaded: 189 files, 25M in 20s (1.29 MB/s)

Memory growth while wget was running, sampled once per minute.

$ ps u -p 3385; sleep 60; while true; do ps u --no-headers -p 3385; sleep 60; done
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
******    3385 17.3  1.5  64096 59820 pts/5    R+   05:12   0:10 wget -w 0 -i list.txt
******    3385 17.1  3.0 121956 117540 pts/5   R+   05:12   0:20 wget -w 0 -i list.txt
******    3385 16.4  4.3 170712 166544 pts/5   R+   05:12   0:29 wget -w 0 -i list.txt
******    3385 15.8  5.4 216404 212132 pts/5   R+   05:12   0:37 wget -w 0 -i list.txt
******    3385 15.8  6.9 271896 267616 pts/5   R+   05:12   0:47 wget -w 0 -i list.txt
******    3385 15.8  8.3 326448 322176 pts/5   R+   05:12   0:56 wget -w 0 -i list.txt
******    3385 15.8  9.7 381868 377456 pts/5   R+   05:12   1:06 wget -w 0 -i list.txt
******    3385 15.8 11.1 436756 432480 pts/5   R+   05:12   1:16 wget -w 0 -i list.txt
******    3385 15.8 12.6 492520 488244 pts/5   R+   05:12   1:25 wget -w 0 -i list.txt
******    3385 15.8 14.0 546340 542156 pts/5   R+   05:12   1:35 wget -w 0 -i list.txt
******    3385 15.8 15.4 600680 596404 pts/5   R+   05:12   1:44 wget -w 0 -i list.txt
******    3385 15.8 16.8 654392 650120 pts/5   R+   05:12   1:54 wget -w 0 -i list.txt
******    3385 15.7 18.1 706660 702512 pts/5   R+   05:12   2:03 wget -w 0 -i list.txt
******    3385 15.7 19.5 758500 754236 pts/5   R+   05:12   2:12 wget -w 0 -i list.txt
******    3385 15.7 20.9 811832 807428 pts/5   R+   05:12   2:21 wget -w 0 -i list.txt
******    3385 15.7 22.2 865000 860736 pts/5   R+   05:12   2:31 wget -w 0 -i list.txt
******    3385 15.8 23.7 921636 917468 pts/5   R+   05:12   2:41 wget -w 0 -i list.txt
******    3385 15.8 25.2 980228 975960 pts/5   R+   05:12   2:51 wget -w 0 -i list.txt
******    3385 15.9 26.8 1040636 1036224 pts/5 R+   05:12   3:02 wget -w 0 -i list.txt
******    3385 15.9 28.2 1096052 1091648 pts/5 R+   05:12   3:11 wget -w 0 -i list.txt
******    3385 15.9 29.6 1150944 1146496 pts/5 R+   05:12   3:21 wget -w 0 -i list.txt
******    3385 15.9 31.0 1205368 1200964 pts/5 R+   05:12   3:30 wget -w 0 -i list.txt
******    3385 15.9 32.5 1259740 1255472 pts/5 R+   05:12   3:40 wget -w 0 -i list.txt
******    3385 15.9 33.9 1316696 1312432 pts/5 R+   05:12   3:50 wget -w 0 -i list.txt
******    3385 15.9 35.4 1372976 1368800 pts/5 R+   05:12   3:59 wget -w 0 -i list.txt
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 30095
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 30095
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
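
As a rough worked figure from the ps samples above: RSS went from 59820 KB in the first sample to 1368800 KB in the last, over 25 samples taken 60 seconds apart (24 intervals). The sketch below only does that arithmetic; `rss_growth` is a name made up here, not part of any tool.

```shell
# Hypothetical helper: average RSS growth per sampling interval, in KB.
# $1 = first RSS sample, $2 = last RSS sample, $3 = number of intervals.
rss_growth() {
  echo $(( ($2 - $1) / $3 ))
}

# wget 1.13.4 run: 25 samples, 60 s apart, so 24 intervals.
rss_growth 59820 1368800 24
```

This prints 54540, i.e. roughly 53 MB of additional resident memory per minute, for a run that downloaded only 25M in total.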

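The same experiment with 1.15, where this problem is fixed. Both the time the download took and the memory used differ by orders of magnitude: the total wall clock time drops from 25m 15s to 27s, and RSS is about 6 MB instead of growing past 1.3 GB.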

$ wget --version
GNU Wget 1.15 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +ssl/gnutls

Wgetrc:
    /etc/wgetrc (system)
Locale:
    /usr/share/locale
Compile:
    gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
    -DLOCALEDIR="/usr/share/locale" -I. -I../lib -I../lib
    -D_FORTIFY_SOURCE=2 -I/usr/include -g -O2 -fstack-protector
    --param=ssp-buffer-size=4 -Wformat -Werror=format-security
    -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
Link:
    gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
    -Werror=format-security -DNO_SSLv2 -D_FILE_OFFSET_BITS=64 -g -Wall
    -Wl,-z,relro -L/usr/lib -lnettle -lgnutls -lz -lidn -luuid
    ftp-opie.o gnutls.o http-ntlm.o ../lib/libgnu.a

Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Please send bug reports and questions to <bug-wget@gnu.org>.
$ wget -w 0 -i list.txt
(snip)
Total wall clock time: 27s
Downloaded: 189 files, 25M in 18s (1.43 MB/s)
$ while true; do ps u --no-headers -p 6916; sleep 60; done
*******   6916 13.8  0.1   8916  5960 pts/4    R+   06:18   0:01 wget -w 0 -i list.txt
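
The sampling loops used above are `while true`, so they keep running after wget has exited and have to be stopped by hand. A small sketch of a variant that stops by itself, assuming the procps `ps` used in this post (which exits non-zero when the PID no longer exists); `watch_rss` is a hypothetical name.

```shell
# Hypothetical helper: print one `ps u` sample for PID $1 every $2 seconds
# (default 60) until the process is gone.
watch_rss() {
  pid=$1
  interval=${2:-60}
  # First sample includes the header line; give up if the PID is not there.
  ps u -p "$pid" || return
  while sleep "$interval"; do
    # Later samples omit the header; stop once ps no longer finds the PID.
    ps u --no-headers -p "$pid" || break
  done
}

# Usage, with wget started in the background:
# wget -w 0 -i list.txt &
# watch_rss "$!" 60
```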

References

  1. #642563 - wget needs many memory for recursive https downloads - Debian Bug report logs
  2. wget https memory - Google Search

ChangeLog

  1. Posted: 2008-08-28T04:08:24+09:00
  2. Modified: 2008-08-28T04:08:24+09:00
  3. Generated: 2017-02-01T23:09:19+09:00